Thanks, guys.
We experimented with OCHEM. It holds promise, but keeping the CDK up to date
doesn't seem to be a priority, and the R-based random forest (apparently written
by a student) doesn't work. Also, the non-openness/licensing makes me cry. So
that's a no.
I don't have the mad skills of most people on this list, so I'll be exploring
all your suggestions over the next semester. The advice is invaluable.
Thanks,
Andrew Lang
Professor of Mathematics
GC1E08
Oral Roberts University
Tulsa, OK 74171
> Date: Sun, 4 Dec 2011 00:24:08 +0200
> Subject: Re: [BlueObelisk-discuss] How to distribute models
> From: [email protected]
> To: [email protected]
> CC: [email protected]
>
> Hi Andrew and others,
>
> > There are plenty of ways I now know to create CDK descriptors and build
> > models but I'm looking for the best way to distribute them as a desktop
> > application.
> >
>
> First of all, are you looking for an ODOSOS-spirited solution for
> distributing models or a ready-to-use desktop application? If you're
> more into the former (and don't mind some Java programming), then
> may I suggest taking a look at the QsarDB project
> (http://qsardb.googlecode.com), which is a proposal for the electronic
> organization and archiving of QSAR/QSPR model information.
>
> Basically, QsarDB enables you to encapsulate a QSAR model (and all of
> its supporting information) into a single so-called QDB file. QDB
> files are easy to distribute and archive. When handled in a proper
> run-time environment they readily lend themselves to programmatic
> execution, such as making a prediction.
>
> > My dream program would be one where you would input a SMILES (GUI), it would
> > generate the CDK descriptors and then report back (user selected) predicted
> > properties based upon Open models (linear, random forest, etc) with the
> > ability to download and add new models (like you can add new functionality
> > with R). A command line option that does batches would be important too.
> >
> > Does this program exist?
> >
>
> Recently I did some research about QSAR model data formats and
> couldn't find anything major except Bioclipse's QSAR-ML data format
> (http://pele.farmbio.uu.se/qsar-ml/). Unfortunately, QSAR-ML appears
> to be limited to the representation of raw datasets (i.e. chemical
> structures, property and descriptor values) and doesn't cover the rest
> of a typical QSAR modelling workflow.
>
> QsarDB handles statistical models in the PMML data format. While
> Rajarshi suggested using the Weka toolkit for loading and storing
> PMML models, our group decided to develop a new lightweight Java PMML
> library called JPMML (http://jpmml.googlecode.com) for this purpose.
> At the moment JPMML supports linear regression, decision tree, and
> neural network models.
>
> Given the QsarDB, JPMML and CDK libraries, it should be pretty
> straightforward to write a command-line application that does exactly
> what you describe. The application would take the input SMILES and the
> list of executable QDB files as its arguments. The calculation of CDK
> descriptors can be performed locally or they can be fetched from a
> remote REST service. As a bonus, it will be possible to quantify the
> goodness of every prediction.
>
> Please let me know if you're interested in exploring the possibilities
> of QsarDB in more detail.
>
>
> Best regards,
> VR