Hi Yingfeng,
In short, no. I don’t think it’s easy to provide a comprehensive solution for
neutralisation. However, approximations such as the RDKit SMARTS you’ve tried
are a good approach for most cases.
It might be easier to help if we understood why you need to neutralise the
compounds.
Anyway, I’m not a chemist but I’ll try my best to explain why it’s not
simple. Firstly, in an InChI string you can tell there is a charge when a
layer starts with /p (protons added or removed) or /q (charge), as in:
> InChI=1S/C5H9NO4/c6-3(5(9)10)1-2-4(7)8/h3H,1-2,6H2,(H,7,8)(H,9,10)/p-1/t3-/m0/s1
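If you only have the InChI string, a rough sketch of that check in plain Java (no CDK needed) could look like the following. The class and method names are made up for illustration, and it only reports the raw text of the /q and /p layers rather than interpreting multi-component values.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper, not part of CDK: finds the /q and /p layers of an InChI.
final class InchiChargeLayers {

    /** Returns the raw text of the /q and /p layers, keyed by layer prefix. */
    static Map<Character, String> chargeLayers(String inchi) {
        Map<Character, String> found = new LinkedHashMap<>();
        String[] layers = inchi.split("/");
        // layers[0] is the version prefix and layers[1] the formula, so start at 2.
        for (int i = 2; i < layers.length; i++) {
            String layer = layers[i];
            if (layer.isEmpty()) continue;
            char prefix = layer.charAt(0);
            if (prefix == 'q' || prefix == 'p') {
                found.put(prefix, layer.substring(1));
            }
        }
        return found;
    }

    /** True if the InChI carries a charge (/q) or proton (/p) layer. */
    static boolean hasChargeLayer(String inchi) {
        return !chargeLayers(inchi).isEmpty();
    }

    public static void main(String[] args) {
        String inchi = "InChI=1S/C5H9NO4/c6-3(5(9)10)1-2-4(7)8"
                     + "/h3H,1-2,6H2,(H,7,8)(H,9,10)/p-1/t3-/m0/s1";
        System.out.println(chargeLayers(inchi));
    }
}
```

For your InChI this finds a /p layer with the value -1, i.e. one proton removed relative to the formula layer.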
You could also check for atoms whose formal charge is not 0. I guess what
you really mean by charged is whether there is an overall (net) charge.
C[C@@H](O)[C@H]([NH3+])C([O-])=O uncharged
C[C@@H]([O-])[C@H]([NH3+])C([O-])=O charged
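Again, a simple procedure of summing all the formal charges in a connected
structure will tell you this:
int sum = 0;
for (IAtom a : m.atoms()) {
    Integer q = a.getFormalCharge(); // may be null if the charge is unset
    if (q != null) sum += q;
}
boolean charged = sum != 0; // a non-zero sum means there is a net charge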
As for neutralising, that’s more tricky. There may be something in a dusty
corner of the CDK code but I’m not aware of it. Neutralising a single atom
is easily done by adding/removing protons or breaking/making bonds. However,
when there are multiple charges it is non-trivial as it involves a decision.
Consider the example from earlier:
C[C@@H]([O-])[C@H]([NH3+])C([O-])=O charged
How do we decide which neutralised form is correct? These both have no
overall charge:
C[C@@H](O)[C@H]([NH3+])C([O-])=O uncharged
C[C@@H]([O-])[C@H]([NH3+])C(O)=O uncharged
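To make the ambiguity concrete, here is a small sketch in plain Java that enumerates every way of reaching zero net charge by neutralising the minimum number of sites. It works on a bare array of formal charges rather than a CDK molecule, assumes every charged site carries +1 or -1, and the class name is invented for this example.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical illustration: enumerates the candidate neutral forms.
final class NeutralFormEnumerator {

    /**
     * Returns every charge assignment with zero net charge that can be
     * reached by neutralising exactly |net| sites of the excess sign.
     * Assumes unit charges (+1/-1) and fewer than 31 sites.
     */
    static List<int[]> candidates(int[] charges) {
        int net = 0;
        for (int c : charges) net += c;
        int sign = Integer.signum(net);
        int needed = Math.abs(net);

        List<int[]> result = new ArrayList<>();
        // Each bit of the mask marks one site chosen for neutralisation.
        for (int mask = 0; mask < (1 << charges.length); mask++) {
            if (Integer.bitCount(mask) != needed) continue;
            int[] copy = charges.clone();
            boolean ok = true;
            for (int i = 0; i < charges.length; i++) {
                if ((mask & (1 << i)) == 0) continue;
                if (charges[i] != sign) { ok = false; break; }
                copy[i] = 0;
            }
            if (ok) result.add(copy);
        }
        return result;
    }

    public static void main(String[] args) {
        // Site order: alkoxide O-, ammonium N+, carboxylate O-
        for (int[] form : candidates(new int[] {-1, +1, -1})) {
            System.out.println(java.util.Arrays.toString(form));
        }
    }
}
```

For the three charged sites above it returns two candidates, one per oxygen, and nothing in the charges alone says which to prefer.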
My guess is that the correct way is to order the neutralisation of charges
by pKa. In that case you need a pKa predictor which, again, is not simple
[1].
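Assuming you did have per-site pKa values from somewhere, the ordering idea might be sketched like this in plain Java. The Site class, the labels and the pKa numbers are all placeholders I made up to show the ordering, not predicted or measured values, and nothing here is CDK API.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

// Hypothetical illustration of pKa-ordered neutralisation.
final class PkaOrderedNeutraliser {

    /** A charged site; pKa refers to the protonated form of the site. */
    static final class Site {
        final String label;
        final int charge;   // +1 or -1
        final double pKa;   // placeholder value, supplied by the caller
        Site(String label, int charge, double pKa) {
            this.label = label;
            this.charge = charge;
            this.pKa = pKa;
        }
    }

    /** Labels of the sites to neutralise, in order, to reach zero net charge. */
    static List<String> sitesToNeutralise(List<Site> sites) {
        int net = 0;
        for (Site s : sites) net += s.charge;

        // Only sites carrying the excess sign are candidates.
        List<Site> pool = new ArrayList<>();
        for (Site s : sites) {
            if (net != 0 && s.charge == Integer.signum(net)) pool.add(s);
        }
        // Net negative: protonate the strongest base first (highest pKa).
        // Net positive: deprotonate the strongest acid first (lowest pKa).
        Comparator<Site> byPka = Comparator.comparingDouble(s -> s.pKa);
        pool.sort(net < 0 ? byPka.reversed() : byPka);

        List<String> chosen = new ArrayList<>();
        for (int i = 0; i < Math.abs(net) && i < pool.size(); i++) {
            chosen.add(pool.get(i).label);
        }
        return chosen;
    }

    public static void main(String[] args) {
        List<Site> sites = new ArrayList<>();
        sites.add(new Site("alkoxide", -1, 15.0));    // placeholder pKa
        sites.add(new Site("ammonium", +1, 9.0));     // placeholder pKa
        sites.add(new Site("carboxylate", -1, 2.0));  // placeholder pKa
        System.out.println(sitesToNeutralise(sites));
    }
}
```

With those placeholder values the alkoxide is protonated before the carboxylate, which leaves the usual zwitterion. The hard part is still getting trustworthy pKa values.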
Neutralisation reduces to finding the ionisation state at a given pH (i.e.
finding the pH at which the compound is neutral). ChemAxon offers this
functionality, but I have been told of examples where, given two ionisation
states of the same compound (one corresponding to a pH above the desired
value, one below), the tool produces different output.
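As a very rough picture of what "ionisation at a given pH" means, the Henderson-Hasselbalch relation gives the deprotonated fraction of a single group as 1 / (1 + 10^(pKa - pH)). The sketch below applies it to each site independently, which is a simplifying assumption (real sites influence each other), and the class and method names are invented for illustration.

```java
// Hypothetical illustration: mean net charge from independent sites.
final class IonisationAtPh {

    /** Fraction of a group that is deprotonated at the given pH. */
    static double fractionDeprotonated(double pKa, double pH) {
        return 1.0 / (1.0 + Math.pow(10.0, pKa - pH));
    }

    /**
     * Mean net charge at the given pH, treating every site independently.
     * acidPkas: groups that are neutral when protonated (HA -> A-).
     * basePkas: groups that are cationic when protonated (BH+ -> B).
     */
    static double meanNetCharge(double[] acidPkas, double[] basePkas, double pH) {
        double q = 0.0;
        for (double pKa : acidPkas) {
            q -= fractionDeprotonated(pKa, pH);        // A- contributes -1
        }
        for (double pKa : basePkas) {
            q += 1.0 - fractionDeprotonated(pKa, pH);  // BH+ contributes +1
        }
        return q;
    }

    public static void main(String[] args) {
        double[] acids = {2.0, 15.0};  // placeholder pKa values
        double[] bases = {9.0};        // placeholder pKa value
        for (double pH = 1.0; pH <= 13.0; pH += 2.0) {
            System.out.println(pH + " -> " + meanNetCharge(acids, bases, pH));
        }
    }
}
```

Scanning pH for the point where the mean net charge crosses zero gives an estimate of where the compound is neutral on average, again only as good as the pKa values you feed in.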
Sorry I can’t be of more help.
Thanks,
John
[1] Lee and Crippen, Predicting pKa
http://pubs.acs.org/doi/abs/10.1021/ci900209w
On 21 Dec 2013, at 14:05, Yingfeng Wang <ywang...@gmail.com> wrote:
> I have a compound with Inchi
>
> InChI=1S/C5H9NO4/c6-3(5(9)10)
> 1-2-4(7)8/h3H,1-2,6H2,(H,7,8)(H,9,10)/p-1/t3-/m0/s1
>
> First of, is there is a way to know whether it is charged?
>
> Secondly, is CDK able to neutralize it if it is charged?
>
> Thanks.
>
> Yingfeng
>
>
> ------------------------------------------------------------------------------
> Rapidly troubleshoot problems before they affect your business. Most IT
> organizations don't have a clear picture of how application performance
> affects their revenue. With AppDynamics, you get 100% visibility into your
> Java,.NET, & PHP application. Start your 15-day FREE TRIAL of AppDynamics Pro!
> http://pubads.g.doubleclick.net/gampad/clk?id=84349831&iu=/4140/ostg.clktrk
> _______________________________________________
> Cdk-user mailing list
> Cdk-user@lists.sourceforge.net
> https://lists.sourceforge.net/lists/listinfo/cdk-user