Hi Claus,
Before we go on, we should clarify what you are planning to do with Chemical
Inventory (CI). Some design decisions that we made for NMRShiftDB might not
necessarily suit CI. Splitting molecules into atom and bond tables at the SQL
level is probably not what you need or want, and it might make things slow.
If the main purpose is searching for structures, it will be enough to store the
molecule as, say, a CML snippet in one table, together with a fingerprint and
some additional information (name, molecular formula, etc.). Most structure
databases work like this. They don't split up the structure at the SQL level;
they just store the information needed for prescreening.
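To illustrate the prescreening idea, here is a minimal sketch. It assumes the
stored fingerprint is available as a plain java.util.BitSet and that it is a
path-based fingerprint, so that a substructure can only set bits its
superstructure also sets. The class FingerprintPrescreen and the method
mayContain are hypothetical names made up for this example; they are not part
of CDK.

```java
import java.util.BitSet;

/** Hypothetical helper, not part of CDK: fingerprint prescreening on plain BitSets. */
final class FingerprintPrescreen {

    private FingerprintPrescreen() {}

    /**
     * Returns true if every bit set in the query fingerprint is also set in
     * the candidate fingerprint. Candidates that fail this test can be
     * skipped; candidates that pass still need a full structure match.
     */
    static boolean mayContain(BitSet candidate, BitSet query) {
        BitSet missing = (BitSet) query.clone(); // copy, so the query is not modified
        missing.andNot(candidate);               // keep only query bits the candidate lacks
        return missing.isEmpty();
    }

    public static void main(String[] args) {
        // Made-up bit positions, purely for illustration.
        BitSet query = new BitSet(16);
        query.set(1);
        query.set(5);

        BitSet hit = new BitSet(16);
        hit.set(1);
        hit.set(5);
        hit.set(9);

        BitSet miss = new BitSet(16);
        miss.set(1);
        miss.set(9);

        System.out.println("hit passes prescreen:  " + mayContain(hit, query));
        System.out.println("miss passes prescreen: " + mayContain(miss, query));
    }
}
```

Passing the prescreen is necessary but not sufficient: only the rows that pass
need to have their CML parsed and matched atom by atom, which is where the
speed-up comes from.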
However, I have asked my coworker Stefan to dig into the NMRShiftDB code and
show us the relevant snippets for structure searching.
BTW, have you already discovered "CDK News", our quarterly newsletter? It has
a lot of interesting articles that answer questions you might have. Take a look
at http://almost.cubic.uni-koeln.de/cdk/cdk_top/cdk_news/ and get the various
PDFs.
Another valuable source of information is the org.openscience.cdk.test package
of CDK, where each class in CDK, such as the fingerprinter, has its test class.
These tests show how to use and set up each of the functionalities provided by
the CDK. Take a look, for example, at the
test.fingerprint.FingerprinterTest.java file to learn how to read a molfile and
make a fingerprint from it.
Likewise, test.smiles.SmilesGeneratorTest shows the same for SMILES generation.
Cheers,
Christoph
Claus Stie Kallesøe wrote:
Hi Christoph,
Thank you for the answer. I am happy to hear that you would like to help
us out. My idea actually was to reuse some code from NMRShiftDB.
Here is the story:
Today we receive the user input structure as a molfile from Marvin. We
then pass that on to a JChemSearch object. This object (together with
an updatehandler) pretty much takes care of everything related to
substructure searching, duplicate checking, etc.
JChem therefore of course has requirements for the structure table.
We would now like to try to port first to CDK and then to JChemPaint to
make chemicalinventory truly open source.
I read the journal articles about the design of CDK as well as NMRShiftDB, and
my plan was to get inspiration for the structure tables as well as code for
searching and storing.
I have found the SQL statements for the tables, but I really do have a
hard time finding the code. So if you could point me in the right
direction, that would be a great help.
I do understand that there is going to be more coding from our hands in
order to perform searches and inserts using CDK.
All we basically want to do is a duplicate check during insert of new
structures (to keep the structure table unique in 2D) and be able to
perform exact match, similarity and substructure searches depending on
user choices.
Also I think I now (after reading through most of the API yesterday)
understand the concept of ChemObjects, AtomContainer, Bond and Atoms.
But I still don't see how I then break up a drawn molecule (molfile) in
order to store the fingerprints, SMILES, bonds and atoms etc.
Again, if you are able to point me to the code so I can see examples in
order to fully understand, I think we can handle the hard work.
Another option is of course if any of you would like to join us. I have
made a new CVS module (chemicalinventory-cdk), where I will start to
edit the current code to use CDK. Just let me know and I will register
you.
Thank you
claus
Christoph Steinbeck <[EMAIL PROTECTED]> wrote:
Those classes are indeed "prehistoric material" by CDK means.
I guess they have not been used or maintained for years.
Claus, if you are interested in writing structures to and from databases, we
should discuss the issues.
As you will know, we have written quite some code for doing this within the
NMRShiftDB project. The code is part of the NMRShiftDB code base, which uses a
lot of CDK code.
Cheers,
Chris
Egon Willighagen wrote:
> On Wednesday 28 December 2005 16:22, Claus Stie Kallesøe wrote:
>
>> I have been reading through the CDK library here:
>> http://cdk.sourceforge.net/api/
>>
>> in order to find out how I can use the library.
>>
>> As we use a MySQL database to store the structures I wanted to use the
>> DBReader.class as the description tells me: "Reader that can read from a
>> relational database that can be accessed through JDBC."
>>
>> But I can't find the class in the cdk-20050826.jar file I have?
>
>
> The DBReader, DBWriter and DBAdmin classes are in the CDK module 'orphaned',
> meaning that no one was maintaining them, and they did not seem to be used.
>
> The sources, however, can be found in CVS at:
>
> http://cvs.sourceforge.net/viewcvs.py/cdk/cdk/src/org/openscience/cdk/database/
>
>
>> Under org.openscience.cdk.database I only see the XindiceReader.
>>
>> Can you tell me where I can find the DBReader?
>
>
> Note that the DBReader and DBWriter are closely tied together, and store the
> molecule's CML string in the 'molecules' table:
>
> ps = con.prepareStatement("INSERT INTO molecules VALUES('', ?)");
>
> This likely does not match the setup you are working with.
>
> Nevertheless, the source should give you some insight into how I used MySQL in
> the past.
>
> Egon
>
--
Priv. Doz. Dr. Christoph Steinbeck ([EMAIL PROTECTED])
Head of the Research Group for Molecular Informatics
Cologne University BioInformatics Center
(http://almost.cubic.uni-koeln.de)
Zülpicher Str. 47, 50674 Cologne
Tel: +49(0)221-470-7426 Fax: +49 (0) 221-470-7786
What is man but that lofty spirit - that sense of enterprise.
... Kirk, "I, Mudd," stardate 4513.3..
-------------------------------------------------------
This SF.net email is sponsored by: Splunk Inc. Do you grep through log files
for problems? Stop! Download the new AJAX search engine that makes
searching your log files as easy as surfing the web. DOWNLOAD SPLUNK!
http://ads.osdn.com/?ad_idv37&alloc_id865&op=click
_______________________________________________
Cdk-user mailing list
[email protected]
https://lists.sourceforge.net/lists/listinfo/cdk-user