On 25 September 2013 16:14, Joos Kiener <j...@sunrise.ch> wrote:

> It all depends on the actual use case. Yes, creating an AtomContainer is
> more costly than loading molfiles, but if you often load thousands of
> molfiles then that performance becomes relevant too.
>


Yes, it depends on the use case. My point is that I can hardly think of a
use case where one needs thousands of IAtomContainers in memory
simultaneously; calculation workflows can usually be done one molecule at a
time, and a modelling exercise needs a matrix of numbers, not
IAtomContainers. Happy to be proved wrong ;)
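
To illustrate the "one molecule at a time" workflow, here is a minimal sketch
in plain Java (no CDK). It streams an SD file record by record, splitting on
the "$$$$" delimiter, and keeps only a row of numbers per record. The class
SdfRowStreamer and its methods are hypothetical names made up for this
example, and the "descriptor" is just the atom and bond count read from the
V2000 counts line; a real workflow would build one IAtomContainer, compute
its descriptors, and discard it before reading the next record.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.ArrayList;
import java.util.List;

/** Hypothetical helper (not part of CDK): one numeric row per SD record. */
final class SdfRowStreamer {

    /** Streams records separated by "$$$$"; returns {atomCount, bondCount} per record. */
    static List<int[]> countsPerRecord(Reader source) {
        List<int[]> rows = new ArrayList<>();
        List<String> record = new ArrayList<>(); // lines of the current record only
        try (BufferedReader in = new BufferedReader(source)) {
            String line;
            while ((line = in.readLine()) != null) {
                if (line.startsWith("$$$$")) {
                    addRow(rows, record);
                    record.clear(); // the molecule text is dropped, only numbers stay
                } else {
                    record.add(line);
                }
            }
            addRow(rows, record); // last record may lack a trailing delimiter
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return rows;
    }

    private static void addRow(List<int[]> rows, List<String> record) {
        boolean blank = true;
        for (String l : record) {
            if (!l.trim().isEmpty()) blank = false;
        }
        if (blank) return; // nothing but whitespace, e.g. after the final "$$$$"
        if (record.size() < 4 || record.get(3).length() < 6) {
            throw new IllegalArgumentException("record has no V2000 counts line");
        }
        // V2000 counts line is line 4: atoms in columns 1-3, bonds in columns 4-6.
        String counts = record.get(3);
        int atoms = Integer.parseInt(counts.substring(0, 3).trim());
        int bonds = Integer.parseInt(counts.substring(3, 6).trim());
        rows.add(new int[] {atoms, bonds});
    }

    public static void main(String[] args) {
        String sdf = String.join("\n",
            "mol1", "  sketch", "",
            "  2  1  0  0  0  0  0  0  0  0999 V2000",
            "    0.0000    0.0000    0.0000 C   0  0",
            "    1.4000    0.0000    0.0000 O   0  0",
            "  1  2  1  0",
            "M  END",
            "$$$$", "");
        for (int[] row : countsPerRecord(new StringReader(sdf))) {
            System.out.println(row[0] + " atoms, " + row[1] + " bonds");
        }
    }
}
```

Memory use here is bounded by the largest single record plus the numeric
rows, regardless of how many molecules the file holds.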


> This OrChem format solves both issues and will drastically increase
> performance for any use case. So I think it's a good idea to use it.
>
> ORM or JDBC, cartridge or not, is a whole other discussion, and in the end
> everything has its advantages and downsides, and the actual use case
> decides what's best. As I see it, ambit targets a completely different use
> case than my project, so there is no need to be hostile towards it (or
> maybe I just misinterpreted your reply).
>

No intent to be hostile; I am just sharing the experience of a working system.

Nina


> In the end I was only trying to help.
>
> Best Regards,
>
> Joos
>
>
> 2013/9/25 Nina Jeliazkova <jeliazkova.n...@gmail.com>
>
>>
>>
>>
>> On 25 September 2013 08:03, Joos Kiener <j...@sunrise.ch> wrote:
>>
>>> Since I have played around with this for a fairly long time, here are
>>> some of my observations:
>>>
>>> - loading lots (thousands) of molfiles from relational databases is quite
>>> slow
>>>
>>
>>
>> The slow part is not reading the molfile from the database, but loading it
>> into an IAtomContainer. Actually, it is rarely necessary to load a bunch of
>> atom containers in memory, especially in a web-based interface where
>> serialisation is to something web friendly, not Java objects.
>>
>> http://apps.ideaconsult.net:8080/ambit2/dataset?page=0&pagesize=100
>>
>> And of course check http://ambit.sf.net for a full-featured MySQL
>> structure-searchable database + properties (no cartridge, no memory-hog
>> ORM, just JDBC) with a REST web service API (i.e. the OpenTox API).
>>
>> Best regards,
>> Nina
>>
>>
>>> - converting molfiles into fully configured AtomContainers is a
>>> very expensive operation
>>> - AtomContainers use a lot of memory, so it must be tightly controlled
>>> how many are in memory, and hence the previous point comes into play again
>>>
>>> This problem was also observed by the creators of OrChem, the Oracle
>>> cartridge based on CDK. Hence they created a custom serialization
>>> method that takes less space than molfiles and stores the configuration
>>> info of AtomContainers:
>>>
>>>
>>> http://orchem.cvs.sourceforge.net/viewvc/orchem/OrChem/src/uk/ac/ebi/orchem/search/OrchemMoleculeBuilder.java?view=markup
>>>
>>> This is way faster (at least 10 times) than using the molfiles.
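
The idea of a compact serialization that also carries already-computed
configuration can be sketched as follows. This is NOT OrChem's actual format
(see the OrchemMoleculeBuilder source linked above for that); it is a
minimal, hypothetical binary layout written with plain java.io. The class
CompactMolCodec, the Mol record, and the choice of stored fields (atomic
number, implicit hydrogen count, aromaticity flag) are all assumptions made
for illustration.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;

/** Hypothetical codec; neither a CDK nor an OrChem class. */
final class CompactMolCodec {

    /** Minimal stand-in for a configured molecule. */
    static final class Mol {
        final byte[] atomicNumbers;
        final byte[] implicitHydrogens; // precomputed, so no re-perception on load
        final boolean[] aromatic;       // precomputed flag per atom
        final short[][] bonds;          // each entry: {fromAtom, toAtom, order}

        Mol(byte[] atomicNumbers, byte[] implicitHydrogens,
            boolean[] aromatic, short[][] bonds) {
            this.atomicNumbers = atomicNumbers;
            this.implicitHydrogens = implicitHydrogens;
            this.aromatic = aromatic;
            this.bonds = bonds;
        }
    }

    /** Layout: atomCount(2) + 3 bytes per atom + bondCount(2) + 5 bytes per bond. */
    static byte[] encode(Mol m) {
        ByteArrayOutputStream buf = new ByteArrayOutputStream();
        try (DataOutputStream out = new DataOutputStream(buf)) {
            out.writeShort(m.atomicNumbers.length);
            for (int i = 0; i < m.atomicNumbers.length; i++) {
                out.writeByte(m.atomicNumbers[i]);
                out.writeByte(m.implicitHydrogens[i]);
                out.writeBoolean(m.aromatic[i]);
            }
            out.writeShort(m.bonds.length);
            for (short[] b : m.bonds) {
                out.writeShort(b[0]);
                out.writeShort(b[1]);
                out.writeByte(b[2]);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return buf.toByteArray();
    }

    /** Reads the layout written by encode; no parsing of text, no perception step. */
    static Mol decode(byte[] data) {
        try (DataInputStream in = new DataInputStream(new ByteArrayInputStream(data))) {
            int n = in.readUnsignedShort();
            byte[] z = new byte[n];
            byte[] h = new byte[n];
            boolean[] arom = new boolean[n];
            for (int i = 0; i < n; i++) {
                z[i] = in.readByte();
                h[i] = in.readByte();
                arom[i] = in.readBoolean();
            }
            int m = in.readUnsignedShort();
            short[][] bonds = new short[m][];
            for (int i = 0; i < m; i++) {
                bonds[i] = new short[] {in.readShort(), in.readShort(), in.readByte()};
            }
            return new Mol(z, h, arom, bonds);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        // Ethanol heavy atoms: C, C, O with 3, 2 and 1 implicit hydrogens.
        Mol ethanol = new Mol(new byte[] {6, 6, 8}, new byte[] {3, 2, 1},
                new boolean[3], new short[][] {{0, 1, 1}, {1, 2, 1}});
        System.out.println(encode(ethanol).length + " bytes");
    }
}
```

With this layout the three-atom example takes 2 + 3*3 + 2 + 5*2 = 23 bytes,
while the header and counts line alone of a V2000 molfile are already longer
than that. Any speed-up on loading would come from skipping text parsing and
re-configuration; the sketch itself does not measure it.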
>>>
>>>
>>> Also, when talking about storing chemical structures in a database, I can
>>> gladly refer you to the project of mine below:
>>>
>>> https://bitbucket.org/kienerj/moleculedatabaseframework
>>>
>>> Best Regards,
>>>
>>> Joos
>>>
>>>
>>> 2013/9/24 lochana menikarachchi <locha...@yahoo.com>
>>>
>>>> What is the recommended method for storing IAtomContainers in a
>>>> database? Serialize? MDL strings?
>>>> Is there any way to get the MDLV2000 representation as a String from
>>>> an IAtomContainer?
>>>>
>>>>
>>>> ------------------------------------------------------------------------------
>>>> October Webinars: Code for Performance
>>>> Free Intel webinars can help you accelerate application performance.
>>>> Explore tips for MPI, OpenMP, advanced profiling, and more. Get the
>>>> most from
>>>> the latest Intel processors and coprocessors. See abstracts and
>>>> register >
>>>>
>>>> http://pubads.g.doubleclick.net/gampad/clk?id=60133471&iu=/4140/ostg.clktrk
>>>> _______________________________________________
>>>> Cdk-user mailing list
>>>> Cdk-user@lists.sourceforge.net
>>>> https://lists.sourceforge.net/lists/listinfo/cdk-user
>>>>
>>>>
>>>
>>>
>>>
>>>
>>
>
