Chenyang,
I haven't looked at your SMARTS strings yet, but I do have this list of
SMARTS strings for the Joback method that I compiled myself (for use here:
https://www.wolframalpha.com/input/?i=2,3-methano-5,6-dichloroindene&lk=3).

Perhaps this can be of use. If you spot any mistakes, please let me know.

Jason

$JobackSubstructures={

{"Methyl","-CH3", "[CX4H3]"},

{"SecondaryAcyclic", "-CH2-", "[!R;CX4H2]"},

{"TertiaryAcyclic",">CH-", "[!R;CX4H]"},

{"QuaternaryAcyclic", ">C<", "[!R;CX4H0]"},

{"PrimaryAlkene", "=CH2", "[CX3H2]"},

{"SecondaryAlkeneAcyclic", "=CH-", "[!R;CX3H1;!$([CX3H1](=O))]"},

{"TertiaryAlkeneAcyclic", "=C<", "[$([!R;#6X3H0]);!$([!R;#6X3H0]=[#8])]"},

{"CumulativeAlkene", "=C=", "[$([CX2H0](=*)=*)]"},

{"TerminalAlkyne", "\[Congruent]CH","[$([CX2H1]#[!#7])]"},

{"InternalAlkyne","\[Congruent]C-","[$([CX2H0]#[!#7])]"},

{"SecondaryCyclic", "-CH2- (ring)", "[R;CX4H2]"},

{"TertiaryCyclic", ">CH- (ring)", "[R;CX4H]"},

{"QuaternaryCyclic", ">C< (ring)", "[R;CX4H0]"},

{"SecondaryAlkeneCyclic", "=CH- (ring)", "[R;CX3H1,cX3H1]"},

{"TertiaryAlkeneCyclic", "=C< (ring)", "[$([R;#6X3H0]);!$([R;#6X3H0]=[#8])]"},

{"Fluoro", "-F", "[F]"},

{"Chloro", "-Cl", "[Cl]"},

{"Bromo", "-Br", "[Br]"},

{"Iodo", "-I", "[I]"},

{"Alcohol","-OH", "[OX2H;!$([OX2H]-[#6]=[O]);!$([OX2H]-a)]"}, (* alcohol, not matching a carboxylic acid or a phenol *)

{"Phenol","-OH", "[$([OX2H]-a)]"},

{"EtherAcyclic", "-O-", "[OX2H0;!R;!$([OX2H0]-[#6]=[#8])]"},

{"EtherCyclic", "-O- (ring)", "[#8X2H0;R;!$([#8X2H0]~[#6]=[#8])]"},

{"CarbonylAcyclic", ">C=O", "[$([CX3H0](=[OX1]));!$([CX3](=[OX1])-[OX2]);!R]=O"},

{"CarbonylCyclic", ">C=O (ring)", "[$([#6X3H0](=[OX1]));!$([#6X3](=[#8X1])~[#8X2]);R]=O"},

{"Aldehyde","O=CH-","[CX3H1](=O)"},

{"CarboxylicAcid", "COOH", "[OX2H]-[C]=O"},

{"Ester", "-C(=O)O-", "[#6X3H0;!$([#6X3H0](~O)(~O)(~O))](=[#8X1])[#8X2H0]"},

{"OxygenDoubleBondOther", "=O", "[OX1H0;!$([OX1H0]~[#6X3]);!$([OX1H0]~[#7X3]~[#8])]"},

{"PrimaryAmino","NH2", "[NX3H2]"},

{"SecondaryAminoAcyclic",">NH", "[NX3H1;!R]"},

{"SecondaryAminoCyclic",">NH (ring)", "[#7X3H1;R]"},

{"TertiaryAmino", ">N-","[#7X3H0;!$([#7](~O)~O)]"}, (* tertiary amine, excluding the nitro group *)

{"ImineCyclic","=N- (ring)","[#7X2H0;R]"},

{"ImineAcyclic","=N-","[#7X2H0;!R]"},

{"Aldimine", "=NH", "[#7X2H1]"},

{"Cyano", "-C\[Congruent]N","[#6X2]#[#7X1H0]"},

{"Nitro", "NO2", "[$([#7X3,#7X3+][!#8])](=[O])~[O-]"},

{"Thiol", "-SH", "[SX2H]"},

{"ThioetherAcyclic", "-S-", "[#16X2H0;!R]"},

{"ThioetherCyclic", "-S- (ring)", "[#16X2H0;R]"}

};
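Since your project is in RDKit, here is a minimal sketch of how a few of the patterns above could be used to count group occurrences in a molecule. The names JOBACK_SMARTS, PATTERNS and count_groups are only illustrative (they are not part of RDKit), and only five entries from the list are copied over, so treat this as a starting point rather than a complete implementation.

```python
from rdkit import Chem

# A handful of entries copied from the list above (name -> SMARTS).
# The dictionary name and the choice of entries are illustrative only.
JOBACK_SMARTS = {
    "Methyl": "[CX4H3]",
    "SecondaryAcyclic": "[!R;CX4H2]",
    "Alcohol": "[OX2H;!$([OX2H]-[#6]=[O]);!$([OX2H]-a)]",
    "Phenol": "[$([OX2H]-a)]",
    "CarboxylicAcid": "[OX2H]-[C]=O",
}

# Compile each SMARTS string once so repeated queries are cheap.
PATTERNS = {name: Chem.MolFromSmarts(s) for name, s in JOBACK_SMARTS.items()}


def count_groups(smiles):
    """Return {group name: number of matches} for groups that occur at least once."""
    mol = Chem.MolFromSmiles(smiles)
    if mol is None:
        raise ValueError("could not parse SMILES: %r" % smiles)
    counts = {}
    for name, pattern in PATTERNS.items():
        n = len(mol.GetSubstructMatches(pattern))
        if n:
            counts[name] = n
    return counts


if __name__ == "__main__":
    for smi in ("CCO", "CC(=O)O", "Oc1ccccc1"):
        print(smi, count_groups(smi))
```

GetSubstructMatches returns unique atom sets by default, so a multi-atom pattern such as the carboxylic acid one is counted once per group. Running a few small molecules like ethanol, acetic acid and phenol through this is a quick way to check that the alcohol, phenol and acid patterns do not overlap.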

Jason Biggs


On Wed, Nov 8, 2017 at 4:52 PM, Chenyang Shi <cs3...@columbia.edu> wrote:

> Hi everyone,
>
> I have recently been working on a project that implements the Joback method
> (https://en.wikipedia.org/wiki/Joback_method) using RDKit.
>
> I believe the key to this project's success is representing the 41
> functional groups correctly as SMARTS patterns. I have compiled my own
> patterns (see attachment). I would appreciate it if you could review them
> and let me know if you spot any errors.
>
> I think building a robust/well-tested SMARTS database (though small in my
> case) would be helpful to others and other projects.
>
> Thank you,
> Chenyang
>
> PS: The ones highlighted red in the document are robust.
>
> ------------------------------------------------------------
> ------------------
> Check out the vibrant tech community on one of the world's most
> engaging tech sites, Slashdot.org! http://sdm.link/slashdot
> _______________________________________________
> Rdkit-discuss mailing list
> Rdkit-discuss@lists.sourceforge.net
> https://lists.sourceforge.net/lists/listinfo/rdkit-discuss
>
>