Tim,

Thanks for your reply. Yes, we have the canonical SMILES strings stored as
properties in our glass.sdf file. I tried writing the search results as
canonical SMILES, but they differ from ours, so ours were probably produced
with a different canonicalization algorithm. This leads to another
question: does the input for substructure or similarity searching have to
be in SDF format, or can it be another format, such as a list of InChI
identifiers? In other words, does the fast search index have to be built
from an SDF file? Many thanks.
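
On recovering the stored strings: one option that sidesteps
canonicalization is to match each hit back to glass.sdf by title and read
the stored property directly. What follows is only a minimal Python sketch
of that idea, not anything taken from the Open Babel documentation. It
assumes every record has a unique title on its first line, that result.smi
carries that title after the SMILES, and that the property is called
CANONICAL_SMILES (a placeholder name; the real field name is not given
here).

```python
def stored_smiles_by_title(sdf_lines, prop_name="CANONICAL_SMILES"):
    """Map each SD record's title to the value of one data field.

    prop_name is a placeholder; use whatever field name glass.sdf really has.
    """
    lookup = {}
    record = []
    for raw in sdf_lines:
        line = raw.rstrip("\r\n")
        if line.strip() != "$$$$":
            record.append(line)
            continue
        # "$$$$" closes a record; the first line of the record is its title.
        title = record[0].strip() if record else ""
        for i, text in enumerate(record[:-1]):
            # A data header looks like "> <NAME>"; its value is the next line.
            if text.startswith(">") and "<%s>" % prop_name in text:
                lookup[title] = record[i + 1].strip()
                break
        record = []
    return lookup


def titles_from_smi(smi_lines):
    """Return the title column of a .smi file (the text after the SMILES)."""
    titles = []
    for line in smi_lines:
        parts = line.split(None, 1)
        if len(parts) == 2:
            titles.append(parts[1].strip())
    return titles
```

Both functions take any iterable of lines, so open file handles for
glass.sdf and result.smi can be passed in as they are. Looking up each
title from titles_from_smi in the dictionary returned by
stored_smiles_by_title then gives the SMILES exactly as stored. Only the
first line of a multi-line property value is read.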


Wallace


On Tue, Jul 22, 2014 at 7:44 PM, Tim Vandermeersch <
tim.vandermeer...@gmail.com> wrote:

> Hi,
>
> I assume you have canonical SMILES strings in glass.sdf stored as titles
> or properties. Correct me if this is incorrect. If so, it depends on what
> program was used to create these canonical SMILES strings. If you used
> openbabel for this, you can convert the molecules in result.smi to
> openbabel canonical SMILES (or write canonical SMILES directly using the
> .can extension).
>
> In the case where another program was used to generate the canonical
> SMILES, it would not be possible to use openbabel to generate the same
> canonical SMILES starting from result.smi. If you have access to the other
> program, you could use it to convert result.smi to these canonical SMILES
> and use them to search glass.sdf.
>
> The reason for this is that there is no universal SMILES canonicalization
> algorithm. Different toolkits produce different canonical SMILES
> (which are canonical only within the same toolkit). InChI, on the other
> hand, has a single reference implementation.
>
> Tim
>
>
> On Wed, Jul 23, 2014 at 12:03 AM, Wallace Chan <walla...@umich.edu> wrote:
>
>> Dr. Hutchison,
>>
>> Yes, this helps. I do have another question about substructure searching.
>> We are building a database with roughly 270,000 molecules and want users to
>> be able to do a substructure and similarity search. I've read the following
>> documentation,
>> http://openbabel.org/docs/dev/Fingerprints/fingerprints.html, and it
>> helps in understanding how this process works. However, I want to ask whether
>> or not the output file from the query can contain the exact same SMILES
>> strings that were generated from the fast search index. Currently, the
>> SMILES strings generated from the query in the result.smi file are not the
>> canonical SMILES that I used to create the fast search index. For example,
>> if I were to look for a benzene substructure with the following command,
>>
>>     babel glass.fs -ifs -sc1ccccc1 result.smi
>>
>> would I be able to retrieve the SMILES string from glass.sdf, which was
>> used to create glass.fs? Many thanks for your patience.
>>
>>
>> Wallace
>>
>>
>>
>> On Tue, Jul 22, 2014 at 2:52 PM, Geoffrey Hutchison <
>> geoff.hutchi...@gmail.com> wrote:
>>
>>> A valid SMILES string is generally a SMARTS. It might not be the best
>>> possible SMARTS, but we do have a set of tests for SMILES to match
>>> themselves as SMARTS.
>>>
>>> Hope that helps,
>>> -Geoff
>>
>>
>>
>>
>> --
>> Wallace Chan
>> PhD Candidate
>> Zhang Lab
>> Department of Biological Chemistry
>> University of Michigan
>> walla...@umich.edu
>>
>>
>> _______________________________________________
>> OpenBabel-discuss mailing list
>> OpenBabel-discuss@lists.sourceforge.net
>> https://lists.sourceforge.net/lists/listinfo/openbabel-discuss
>>
>>
>


-- 
Wallace Chan
PhD Candidate
Zhang Lab
Department of Biological Chemistry
University of Michigan
walla...@umich.edu