Re: [Rdkit-discuss] Windows 10 conda build fails

2017-03-06 Thread Greg Landrum
Hi Bob,

On Tue, 7 Mar 2017 at 02:21, Bob Funchess wrote:

> I am trying to build the latest version of RDKit with Anaconda3-2.5.0 and
> it’s failing with the following error:
>

I will try to look into the specific problem over the next few days, but a
quick question first: do you need to do the conda build yourself? If you
aren't modifying the RDKit then you can probably use one of the pre-built
conda packages.


> PS the latest version of Anaconda has a more recent version of Python,
> which causes the boost build to fail utterly.
>

Yes, the new version includes Python 3.6, which requires a new boost build
and a couple of tweaks to the RDKit itself. We will do those before the
next release (in the next month, hopefully).

-greg



>
>
>
> --
>
> Bob Funchess, Ph.D.
> Kelaroo, Inc
>
> Director of Software Support & Development
> www.kelaroo.com
>
> bfunch...@kelaroo.com (858) 259-7561 x3
>
>
>
>
>
> --
> Announcing the Oxford Dictionaries API! The API offers world-renowned
> dictionary content that is easy and intuitive to access. Sign up for an
> account today to start using our lexical data to power your apps and
> projects. Get started today and enter our developer competition.
> http://sdm.link/oxford___
> Rdkit-discuss mailing list
> Rdkit-discuss@lists.sourceforge.net
> https://lists.sourceforge.net/lists/listinfo/rdkit-discuss
>


[Rdkit-discuss] Windows 10 conda build fails

2017-03-06 Thread Bob Funchess
Hi All,



I am trying to build the latest version of RDKit with Anaconda3-2.5.0 and
it’s failing with the following error:



   Creating library pyAvalonTools.lib and object pyAvalonTools.exp

pyAvalonTools.cpp.obj : error LNK2019: unresolved external symbol "class
ExplicitBitVect const volatile * __cdecl boost::get_pointer(class ExplicitBitVect const volatile *)"
(??$get_pointer@$$CDVExplicitBitVect@@@boost@@YAPEDVExplicitBitVect@@PEDV1@@Z)
referenced in function "public: static struct _object * __cdecl
boost::python::objects::make_instance_impl,class ExplicitBitVect>,struct
boost::python::objects::make_ptr_instance,class ExplicitBitVect> > >::execute >(class std::auto_ptr &)" (??$execute@V?$auto_ptr@VExplicitBitVect@@@std@
@@?$make_instance_impl@VExplicitBitVect@@U?$pointer_holder@V
?$auto_ptr@VExplicitBitVect@@@std@@VExplicitBitVect@@@objects@python@boost@
@U?$make_ptr_instance@VExplicitBitVect@@U?$pointer_holder@V
?$auto_ptr@VExplicitBitVect@@@std@@VExplicitBitVect@@@objects@python@boost@
@@345@@objects@python@boost@@SAPEAU_object@@AEAV?$auto_ptr@VExplicitBitVect
@@@std@@@Z)

..\..\..\rdkit\Avalon\pyAvalonTools.pyd : fatal error LNK1120: 1 unresolved
externals

LINK failed. with 1120

NMAKE : fatal error U1077:
'K:\Anaconda3\conda-bld\rdkit_1488846410627\_b_env\Library\bin\cmake.exe' :
return code '0x'

Stop.

NMAKE : fatal error U1077: '"C:\Program Files (x86)\Microsoft Visual Studio
14.0\VC\BIN\amd64\nmake.exe"' : return code '0x2'

Stop.

NMAKE : fatal error U1077: '"C:\Program Files (x86)\Microsoft Visual Studio
14.0\VC\BIN\amd64\nmake.exe"' : return code '0x2'

Stop.



Has anyone else seen this? It looks like the problem is with boost, but
“conda build boost” APPEARED to work okay.



Thanks,

Bob



PS the latest version of Anaconda has a more recent version of Python,
which causes the boost build to fail utterly.





--

Bob Funchess, Ph.D.
Kelaroo, Inc

Director of Software Support & Development
www.kelaroo.com

bfunch...@kelaroo.com (858) 259-7561 x3


Re: [Rdkit-discuss] delete a substructure

2017-03-06 Thread Chenyang Shi
Hongbin and Greg,
Thank you both for your kind suggestions. I will try both approaches and
report my progress later.
Best,
Chenyang

On Monday, March 6, 2017, Greg Landrum wrote:

> The solution that Hongbin proposes to the double-counting problem is a
> good one. Just be sure to sort your substructure queries in the right order
> so that the more complex ones come first.
>
> Another thing you might think about is making your queries more specific.
> For example, as you pointed out "[OH]" is very general and matches parts of
> carboxylic acids and a number of other functional groups. The RDKit has a
> set of fairly well tested (though certainly not perfect) functional group
> definitions in $RDBASE/Data/Functional_Group_Hierarchy.txt. The alcohol
> definition from there looks like this:
> [O;H1;$(O-!@[#6;!$(C=!@[O,N,S])])]
>
>
> -greg
>
>
> On Mon, Mar 6, 2017 at 7:20 AM, 杨弘宾 wrote:
>
>> Hi, Chenyang,
>> You don't need to delete the substructure from the molecule. Just
>> check whether the mapped atoms have already been matched. For example:
>>
>> from rdkit import Chem
>>
>> m = Chem.MolFromSmiles('CC(=O)O')
>> OH = Chem.MolFromSmarts('[OH]')
>> COOH = Chem.MolFromSmarts('C(O)=O')
>>
>> m.GetSubstructMatches(OH)
>> >> ((3,),)
>> m.GetSubstructMatches(COOH)
>> >> ((1, 3, 2),)
>>
>> Since atom "3" has already been matched, it should be ignored.
>> So you can create a "set" to record the matched atoms and avoid
>> counting them twice.
>>
>> --
>> Hongbin Yang 杨弘宾
>>
>>
>> *From:* Chenyang Shi
>> 
>> *Date:* 2017-03-06 14:04
>> *To:* Greg Landrum
>> 
>> *CC:* RDKit Discuss
>> 
>> *Subject:* Re: [Rdkit-discuss] delete a substructure
>> Hi Greg,
>>
>> Thanks for the prompt reply. I did try "GetSubstructMatches()" and it
>> returns the correct number of substructures for CH3COOH. The potential
>> problem with this approach is that, as the molecule gets more complicated,
>> it may count certain functional groups twice. For example, an --OH
>> (alcohol) group will likely also be counted as part of --COOH. A safer
>> way, in my mind, is to remove each substructure once it has been counted.
>>
>> Greg, you mentioned "chemical reaction functionality". Can you show me a
>> demo script that uses it, with CH3COOH as an example? I will definitely
>> delve into the manual to learn more, but reading your code would be a
>> good start.
>>
>> Thanks,
>> Chenyang
>>
>>
>>
>> On Sun, Mar 5, 2017 at 10:15 PM, Greg Landrum wrote:
>>
>>> Hi Chenyang,
>>>
>>> If you're really interested in counting the number of times the
>>> substructure appears, you can do that much more quickly with
>>> `GetSubstructMatches()`:
>>>
>>> In [1]: from rdkit import Chem
>>> In [2]: m = Chem.MolFromSmiles('CC(C)CCO')
>>> In [3]: len(m.GetSubstructMatches(Chem.MolFromSmarts('[CH3;X4]')))
>>> Out[3]: 2
>>>
>>> Is that sufficient, or do you actually want to sequentially remove all
>>> of the groups in your list?
>>>
>>> If you actually want to remove them, you are probably better off using
>>> the chemical reaction functionality instead of DeleteSubstructs(), which
>>> recalculates the number of implicit Hs on atoms after each call.
>>>
>>> -greg
>>>
>>>
>>> On Mon, Mar 6, 2017 at 4:21 AM, Chenyang Shi wrote:
>>>
>>>> I am new to RDKit but I am already impressed by its vibrant community.
>>>> I have a question about deleting substructures. The RDKit
>>>> documentation has this snippet of code describing how to delete a
>>>> substructure:
>>>>
>>>> >>> from rdkit import Chem
>>>> >>> from rdkit.Chem import AllChem
>>>> >>> m = Chem.MolFromSmiles("CC(=O)O")
>>>> >>> patt = Chem.MolFromSmarts("C(=O)[OH]")
>>>> >>> rm = AllChem.DeleteSubstructs(m, patt)
>>>> >>> Chem.MolToSmiles(rm)
>>>> 'C'
>>>>
>>>> This block of code first loads the molecule CH3COOH from its SMILES,
>>>> then defines the substructure to be deleted, COOH, as a SMARTS pattern.
>>>> After the final line, the program outputs 'C', in SMILES form.
>>>>
>>>> I want to develop a method for counting the functional groups in a
>>>> molecule. In the CH3COOH case, I can count the --CH3 and --COOH groups
>>>> using their respective SMARTS patterns with no problem. However, when the
>>>> molecule becomes more complicated, it is preferable to delete each
>>>> substructure that has been found before moving on to the next SMARTS
>>>> search. In the current case, after finding the --COOH group and deleting
>>>> it, the leftover is 'C', which is essentially CH4 instead of --CH3, so I
>>>> cannot proceed with the SMARTS search for --CH3 ([CH3;A;X4!R]).
>>>>
>>>> Is there any way to work around this?
>>>> Thanks,
>>>> Chenyang
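
A minimal sketch of the bookkeeping suggested in this thread: record the atoms that have already been matched in a set, and handle the more complex patterns first. It is plain Python and works on match tuples of the kind GetSubstructMatches() returns; the two tuples below are the ones quoted in Hongbin's message for CH3COOH. The function name count_groups is hypothetical, and so is the choice to treat "more complex" as "more atoms per match"; neither comes from the RDKit.

```python
def count_groups(matches_by_group):
    """Count matches per group, skipping matches that reuse a claimed atom.

    matches_by_group maps a group name to a tuple of match tuples, in the
    same shape that GetSubstructMatches() returns (atom indices per match).
    """
    def pattern_size(item):
        # Assumption: a pattern that matches more atoms is "more complex".
        _name, matches = item
        return max((len(match) for match in matches), default=0)

    claimed = set()  # atom indices already assigned to some group
    counts = {}
    # Largest patterns first, so COOH claims its atoms before OH is checked.
    ordered = sorted(matches_by_group.items(), key=pattern_size, reverse=True)
    for name, matches in ordered:
        counts[name] = 0
        for match in matches:
            # Accept the match only if none of its atoms has been claimed.
            if claimed.isdisjoint(match):
                claimed.update(match)
                counts[name] += 1
    return counts


# Match tuples quoted earlier in the thread for CH3COOH ('CC(=O)O').
matches_by_group = {
    "OH": ((3,),),
    "COOH": ((1, 3, 2),),
}
counts = count_groups(matches_by_group)
print(counts)
```

With these inputs the hydroxyl oxygen (atom 3) is claimed by the COOH match, so the [OH] match is not counted a second time. Rejecting a match as soon as one of its atoms is claimed is a deliberately strict rule; a real functional-group scheme may need something more selective, or more specific SMARTS as Greg suggests.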