Revision: 20768
          http://sourceforge.net/p/jmol/code/20768
Author:   hansonr
Date:     2015-09-10 12:24:51 +0000 (Thu, 10 Sep 2015)
Log Message:
-----------
Jmol.___JmolVersion="14.3.16_2015.09.09"

new feature: SMILES/SMARTS atom designations [C(xxxx)]
 -- allows pointing to the same atom without connection numbers
 -- (xxxx) may be anything, including just ()
 -- definition may be anywhere in bracketed atom specification
 -- any additional primitives in referring expression will be ignored
 -- rationale:

    One basic aspect of SMILES is that it efficiently uses numbers to
    indicate connectivity using a process of "opening" bonds and "closing" them.
    Along with radical (.) notation, this is totally sufficient for describing
    any connected network of atoms, including situations where the ordering
    of connections is critical (e.g., in describing stereochemistry). Basically,
    one can simply list all the atoms in an arbitrary order as single-atom
    components (separated by "."), then assign bonds as desired, in whatever
    order is desired.

    The problem comes when attempting to indicate stereochemistry for
    centers with more than six substituents or with geometries that are
    outside the standard set of AL, TH, TP, SP, and OH. In such cases,
    there may arise situations where the ordering of unbonded substituents
    will be critical. An example is crystal structures of metals and metal
    alloys. In this situation there are no covalent bonds. The need is to
    be able to compare two such crystal structures.

    The solution is to create SMARTS and SMILES strings for complex
    "atomic environments" consisting of a central atom and any number of
    nearby atoms, using a cutoff radius rather than a bonding pattern.
    One possibility is to create a "pseudobond" that connects the central
    atom to all of its connected atoms, but this is not really necessary and
    slows processing significantly. Instead, the [@PHn] syntax proposed here
    allows the polyhedral environment to be specified only for the polyhedron
    itself, exclusive of the central atom.

    By providing a means of referring to a specific previously defined atom in a
    SMILES or SMARTS string, we can allow an atom in such an arrangement
    to be part of two different polyhedra.

    Thus, although isolated polySMARTS can be implemented without atom referents,
    extending that to more complex multi-atom polySMARTS searches requires them.

 -- example, simple branched organic

    $ load $t-butylmethylether
    $ select on search("[O][C(a)H0].[C(a)]C")

    5 atoms selected

    same as

    $ select on search("[O][CH0](C)")

 -- note that [C(2)] and [O(2)] are not sensible and may be disallowed.

code: javajs.util reconciled with swingjs project

Modified Paths:
--------------
    trunk/Jmol/src/org/jmol/shapespecial/Polyhedra.java
    trunk/Jmol/src/org/jmol/smiles/SmilesAtom.java
    trunk/Jmol/src/org/jmol/smiles/SmilesMatcher.java
    trunk/Jmol/src/org/jmol/smiles/SmilesParser.java
    trunk/Jmol/src/org/jmol/smiles/SmilesSearch.java
    trunk/Jmol/src/org/jmol/viewer/Jmol.properties

Modified: trunk/Jmol/src/org/jmol/shapespecial/Polyhedra.java
===================================================================
--- trunk/Jmol/src/org/jmol/shapespecial/Polyhedra.java 2015-09-09 12:52:37 UTC (rev 20767)
+++ trunk/Jmol/src/org/jmol/shapespecial/Polyhedra.java 2015-09-10 12:24:51 UTC (rev 20768)
@@ -360,7 +360,7 @@
           bs.set(polyhedrons[i].centralAtom.i);
       } else if (sm != null) {
         polyhedrons[i].getSymmetry(vwr, false);
-        String smiles0 = polyhedrons[i].smiles;
+        String smiles0 = polyhedrons[i].polySmiles;
         try {
           if (sm.areEqual(smiles, smiles0) > 0)
             bs.set(polyhedrons[i].centralAtom.i);

Modified: trunk/Jmol/src/org/jmol/smiles/SmilesAtom.java
===================================================================
--- trunk/Jmol/src/org/jmol/smiles/SmilesAtom.java 2015-09-09 12:52:37 UTC (rev 20767)
+++ trunk/Jmol/src/org/jmol/smiles/SmilesAtom.java 2015-09-10 12:24:51 UTC (rev 20768)
@@ -61,6 +61,7 @@
   int index;
   String atomName;
+  String referance;
   String residueName;
   String residueChar;
   boolean isBioAtom;
@@ -95,6 +96,11 @@
   private int charge = Integer.MIN_VALUE;
   private int matchingIndex = -1;
   SmilesStereo stereo;
+
+  public int getChiralClass() {
+    return (stereo == null ? 0 : stereo.chiralClass);
+  }
+
   private boolean isAromatic;
   public boolean isDefined() {
@@ -128,35 +134,29 @@
     this.bonds = bonds;
   }
-  public SmilesAtom addAtomOr() {
+  public SmilesAtom appendAtomOr(SmilesAtom sAtom) {
     if (atomsOr == null)
       atomsOr = new SmilesAtom[2];
     if (nAtomsOr >= atomsOr.length)
       atomsOr = (SmilesAtom[]) AU.doubleLength(atomsOr);
-    SmilesAtom sAtom = new SmilesAtom().setIndex(index);
+    sAtom.setIndex(index);
     sAtom.parent = this;
-    atomsOr[nAtomsOr] = sAtom;
-    nAtomsOr++;
+    atomsOr[nAtomsOr++] = sAtom;
     return sAtom;
   }
-  public SmilesAtom addPrimitive() {
+  public SmilesAtom appendPrimitive(SmilesAtom sAtom) {
     if (primitives == null)
       primitives = new SmilesAtom[2];
-    if (nPrimitives >= primitives.length) {
-      SmilesAtom[] tmp = new SmilesAtom[primitives.length * 2];
-      System.arraycopy(primitives, 0, tmp, 0, primitives.length);
-      primitives = tmp;
-    }
-    SmilesAtom sAtom = new SmilesAtom().setIndex(index);
+    if (nPrimitives >= primitives.length)
+      primitives = (SmilesAtom[]) AU.doubleLength(primitives);
+    sAtom.setIndex(index);
     sAtom.parent = this;
-    primitives[nPrimitives] = sAtom;
+    primitives[nPrimitives++] = sAtom;
     setSymbol("*");
     hasSymbol = false;
-    nPrimitives++;
     return sAtom;
   }
-
   /**
    * Constructs a <code>SmilesAtom</code>.
    *
@@ -806,8 +806,4 @@
         + "]";
   }
-  public int getChiralClass() {
-    return (stereo == null ? 0 : stereo.chiralClass);
-  }
-
 }

Modified: trunk/Jmol/src/org/jmol/smiles/SmilesMatcher.java
===================================================================
--- trunk/Jmol/src/org/jmol/smiles/SmilesMatcher.java 2015-09-09 12:52:37 UTC (rev 20767)
+++ trunk/Jmol/src/org/jmol/smiles/SmilesMatcher.java 2015-09-10 12:24:51 UTC (rev 20768)
@@ -414,7 +414,7 @@
     if (!JC.checkFlag(flags, JC.SMILES_NOSTEREO)) {
       s = "//* " + center + " *//\t["
           + Elements.elementSymbolFromNumber(center.getElementNumber()) + "@PH"
-          + atomCount + (details == null ? "" : "/" + details + "/") + "]" + s;
+          + atomCount + (details == null ? "" : "/" + details + "/") + "]." + s;
     }
     return s;
   }

Modified: trunk/Jmol/src/org/jmol/smiles/SmilesParser.java
===================================================================
--- trunk/Jmol/src/org/jmol/smiles/SmilesParser.java 2015-09-09 12:52:37 UTC (rev 20767)
+++ trunk/Jmol/src/org/jmol/smiles/SmilesParser.java 2015-09-10 12:24:51 UTC (rev 20768)
@@ -793,14 +793,12 @@
     if (pattern == null || pattern.length() == 0)
       throw new InvalidSmilesException("Empty atom definition");
-    SmilesAtom newAtom = (atomSet == null ? molecule.addAtom()
-        : isPrimitive ? atomSet.addPrimitive() : atomSet.addAtomOr());
-    if (braceCount > 0)
-      newAtom.selected = true;
-
+    SmilesAtom newAtom = new SmilesAtom();
+    if (atomSet == null)
+      molecule.appendAtom(newAtom);
+    boolean isNewAtom = true;
     if (!checkLogic(molecule, pattern, newAtom, null, currentAtom, isPrimitive, isBranchAtom)) {
-
       int[] ret = new int[1];
       if (isBioSequence && pattern.length() == 1)
@@ -816,7 +814,6 @@
         newAtom.not = isNot = true;
       }
-      int hydrogenCount = Integer.MIN_VALUE;
       int biopt = pattern.indexOf('.');
       int chiralpt = pattern.indexOf('@');
       if (biopt >= 0 && (chiralpt < 0 || biopt < chiralpt)) {
@@ -849,7 +846,8 @@
           ch = '\0';
       }
       newAtom.setBioAtom(bioType);
-      while (ch != '\0') {
+      int hydrogenCount = Integer.MIN_VALUE;
+      while (ch != '\0' && isNewAtom) {
         newAtom.setAtomName(isBioSequence ? "\0" : "");
         if (PT.isDigit(ch)) {
           index = getDigits(pattern, index, ret);
@@ -892,6 +890,18 @@
             else
               newAtom.elementNumber = ret[0];
             break;
+          case '(':
+            // JmolSMARTS, JmolSMILES reference to atom
+            String name = getSubPattern(pattern, index, '(');
+            index += 2 + name.length();
+            newAtom = checkReference(newAtom, name, ret);
+            isNewAtom = (ret[0] == 1); // we are done here
+            if (!isNewAtom) {
+              if (isNot)
+                index = 0; // triggers an error
+              isNot = true; // flags that this must be the end
+            }
+            break;
           case '-':
           case '+':
             index = checkCharge(pattern, index, newAtom);
@@ -899,7 +909,8 @@
           case '@':
             if (molecule.stereo == null)
               molecule.stereo = SmilesStereo.newStereo(null);
-            index = SmilesStereo.checkChirality(pattern, index, molecule.patternAtoms[newAtom.index]);
+            index = SmilesStereo.checkChirality(pattern, index,
+                molecule.patternAtoms[newAtom.index]);
             break;
           default:
             // SMARTS has ambiguities in terms of chaining without &.
@@ -1044,6 +1055,16 @@
         molecule.patternAtoms[newAtom.index]
             .setExplicitHydrogenCount(hydrogenCount);
     }
+    if (braceCount > 0)
+      newAtom.selected = true;
+    if (isNewAtom) {
+      if (atomSet != null) {
+        if (isPrimitive)
+          atomSet.appendPrimitive(newAtom);
+        else
+          atomSet.appendAtomOr(newAtom);
+      }
+    }
     // Final check
@@ -1068,7 +1089,31 @@
     return newAtom;
   }
+  private Map<String, SmilesAtom> atomRefs;
+  /**
+   * allow for [(...)] to indicate a specific pattern atom
+   *
+   * @param newAtom
+   * @param name
+   * @param ret set [0] to 1 for new atom; 0 otherwise
+   * @return new or old atom
+   */
+  private SmilesAtom checkReference(SmilesAtom newAtom, String name, int[] ret) {
+    if (atomRefs == null)
+      atomRefs = new Hashtable<String, SmilesAtom>();
+    SmilesAtom ref = atomRefs.get(name);
+    if (ref == null) {
+      // this is a new atom
+      atomRefs.put(newAtom.referance = name, ref = newAtom);
+      ret[0] = 1;
+    } else {
+      ret[0] = 0;
+    }
+    return ref;
+  }
+
+
   /**
    * Parses a ring definition
    *
    * @param molecule

Modified: trunk/Jmol/src/org/jmol/smiles/SmilesSearch.java
===================================================================
--- trunk/Jmol/src/org/jmol/smiles/SmilesSearch.java 2015-09-09 12:52:37 UTC (rev 20767)
+++ trunk/Jmol/src/org/jmol/smiles/SmilesSearch.java 2015-09-10 12:24:51 UTC (rev 20768)
@@ -139,12 +139,13 @@
   }
   SmilesAtom addAtom() {
+    return appendAtom(new SmilesAtom());
+  }
+
+  SmilesAtom appendAtom(SmilesAtom sAtom) {
     if (ac >= patternAtoms.length)
       patternAtoms = (SmilesAtom[]) AU.doubleLength(patternAtoms);
-    SmilesAtom sAtom = new SmilesAtom().setIndex(ac);
-    patternAtoms[ac] = sAtom;
-    ac++;
-    return sAtom;
+    return patternAtoms[ac] = sAtom.setIndex(ac++);
   }
   int addNested(String pattern) {

Modified: trunk/Jmol/src/org/jmol/viewer/Jmol.properties
===================================================================
--- trunk/Jmol/src/org/jmol/viewer/Jmol.properties 2015-09-09 12:52:37 UTC (rev 20767)
+++ trunk/Jmol/src/org/jmol/viewer/Jmol.properties 2015-09-10 12:24:51 UTC (rev 20768)
@@ -58,26 +58,88 @@
 TODO: image off stops JSmol
-TODO: Although isolated polyhedra are working in SMARTS searching,
-  for a SMILES representation, it will be important to allow a given atom
-  to be in more than one polyhedron (edge- or face-connected polyhedra).
-  But then that will require to references to the same atom. The solution
-  is probabably something like [(3)], indicating that this atom is equated with
-  another in the SMILES string:
+Jmol.___JmolVersion="14.3.16_2015.09.09"
+
+new feature: SMILES/SMARTS atom designations [C(xxxx)]
+ -- allows pointing to the same atom without connection numbers
+ -- (xxxx) may be anything, including just ()
+ -- definition may be anywhere in bracketed atom specification
+ -- any additional primitives in referring expression will be ignored
+ -- rationale:
+
+    One basic aspect of SMILES is that it efficiently uses numbers to
+    indicate connectivity using a process of "opening" bonds and "closing" them.
+    Along with radical (.) notation, this is totally sufficient for describing
+    any connected network of atoms, including situations where the ordering
+    of connections is critical (e.g., in describing stereochemistry). Basically,
+    one can simply list all the atoms in an arbitrary order as single-atom
+    components (separated by "."), then assign bonds as desired, in whatever
+    order is desired.
-  [Fe][O(1)].[Fe][O(1)]
+    The problem comes when attempting to indicate stereochemistry for
+    centers with more than six substituents or with geometries that are
+    outside the standard set of AL, TH, TP, SP, and OH. In such cases,
+    there may arise situations where the ordering of unbonded substituents
+    will be critical. An example is crystal structures of metals and metal
+    alloys. In this situation there are no covalent bonds. The need is to
+    be able to compare two such crystal structures.
-  then bond references to that oxygen from some other source would be to both?
+    The solution is to create SMARTS and SMILES strings for complex
+    "atomic environments" consisting of a central atom and any number of
+    nearby atoms, using a cutoff radius rather than a bonding pattern.
+    One possibility is to create a "pseudobond" that connects the central
+    atom to all of its connected atoms, but this is not really necessary and
+    slows processing significantly. Instead, the [@PHn] syntax proposed here
+    allows the polyhedral environment to be specified only for the polyhedron
+    itself, exclusive of the central atom.
+
+    By providing a means of referring to a specific previously defined atom in a
+    SMILES or SMARTS string, we can allow an atom in such an arrangement
+    to be part of two different polyhedra.
+
+    Thus, although isolated polySMARTS can be implemented without atom referents,
+    extending that to more complex multi-atom polySMARTS searches requires them.
+
+ -- example, simple branched organic
-  ....C(2) ....C(4) ???
+    $ load $t-butylmethylether
+    $ select on search("[O][C(a)H0].[C(a)]C")
+
+    5 atoms selected
+
+    same as
-Jmol.___JmolVersion="14.3.16_2015.09.09"
+    $ select on search("[O][CH0](C)")
+
+ -- note that [C(2)] and [O(2)] are not sensible and may be disallowed.
+
 code: javajs.util reconciled with swingjs project
-bug fix: polyhedra.stereoSmiles --> polyhedra.polySmiles; central atom added in
+bug fix: polyhedra.stereoSmiles --> polyhedra.polySmiles
+new feature: polyhedron.polySmiles adds central atom
+ -- example:
+
+    load SF6.smol -1
+    polyhedra
+    calculate symmetry polyhedra
+    x = {polyhedra}.polyhedra.polySmiles
+    print x
+
+    //* S1 #1 *//  [S@PH6].
+    //* F6 #7 *//  [F]1234.
+    //* F2 #3 *//  [F]5672.
+    //* F3 #4 *//  [F]849%10.
+    //* F4 #5 *//  [F]%11%10%126.
+    //* F1 #2 *//  [F]937%12.
+    //* F5 #6 *//  [F]8%1151
+
+    print polyhedron(x).atomname
+
+    S1
+
 JmolVersion="14.3.16_2015.09.08"
 code: Efficient JSON parser javajs.util.JSONParser
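
Illustration (not part of the commit): the name lookup that checkReference performs in the SmilesParser.java change can be sketched in isolation. The class AtomRefSketch and its resolve method below are hypothetical names, not Jmol code. The sketch handles only bracketed atoms and ignores everything else in the pattern (bonds, ring numbers, primitives). It shows only the idea that the first [X(name)] defines an atom and any later [X(name)] resolves to that same atom.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical sketch of named atom references; this is not Jmol's SmilesParser. */
public class AtomRefSketch {

  /**
   * Returns one atom index per bracketed atom in the pattern. An atom that
   * carries a (name) label reuses the index of the first atom that was
   * defined with that label; every other atom gets a new index.
   */
  public static List<Integer> resolve(String pattern) {
    Map<String, Integer> refs = new HashMap<>();
    List<Integer> indices = new ArrayList<>();
    int nextIndex = 0;
    int pt = 0;
    while ((pt = pattern.indexOf('[', pt)) >= 0) {
      int close = pattern.indexOf(']', pt);
      if (close < 0)
        throw new IllegalArgumentException("Unmatched [ in " + pattern);
      String atom = pattern.substring(pt + 1, close);
      int open = atom.indexOf('(');
      int end = (open < 0 ? -1 : atom.indexOf(')', open));
      if (end < 0) {
        // no (name) label: always a new atom
        indices.add(nextIndex++);
      } else {
        String name = atom.substring(open + 1, end);
        Integer known = refs.get(name);
        if (known == null) {
          // first use of this name defines the atom
          refs.put(name, nextIndex);
          indices.add(nextIndex++);
        } else {
          // later use refers back to the defined atom
          indices.add(known);
        }
      }
      pt = close + 1;
    }
    return indices;
  }

  public static void main(String[] args) {
    // The two [C(a)...] entries share one index.
    System.out.println(resolve("[O][C(a)H0].[C(a)][C]"));
  }
}
```

In this sketch the name is an opaque map key, which fits the log message's statement that (xxxx) may be anything, including just ().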