Hi Andrew,

On Wed, Jun 1, 2011 at 8:45 PM, Andrew Dalke <da...@dalkescientific.com> wrote:
> Also, what does CDK do if there's an error in one of the records? For
> example, some of the other toolkits' readers skip bad records and report to
> stderr. If CDK ignores bad records, is there a way to get the count of
> records skipped? If CDK raises an exception, is there a way to ignore the
> error and keep on processing?

This is what the STRICT and RELAXED modes are about. The first throws an
exception; the latter only reports the error, using a listener. The 'Groovy
Cheminformatics' book has a section on this. Have a look at this
IChemObjectReader method (which the readers inherit):

  public void setErrorHandler(IChemObjectReaderErrorHandler handler)

Note that this only looks at errors in the file format, not at chemical
errors.

>> the CDK-style guidelines prefer longer variable names (though this is really
>> more me being obsessive), so the "IAtom a" would look better as, for
>> example, "IAtom atom" or even "IAtom currentAtom"
>
> Being the type of person I am, I checked the actual practice ;)
>
>   95 matches to "IAtom a[^0-9a-zA-Z]" (56 in *Test.java files)
> 2158 matches to "IAtom atom[^a-zA-Z_]" (870 in *Test.java files) (eg, "atom1")
>  235 matches to "IAtom atom[a-zA-Z]" (61 in tests) (eg, "atomToUpdate")

The guideline is there to encourage developers to think about decent names.
atomToUpdate and atomToTakeInfoFrom are much more informative (though on the
lengthy side) than atom1 and atom2. If the code is written clearly enough
(which is not always the case, e.g. because the algorithm is just long and
hard to split up into methods), then atom1 and atom2 can do.

'a' is really quite bad in some situations, where you have an atom type, an
atom, and an atom container all in scope... a short variable name puts extra
stress on the reader. Now, sometimes 'a' is perhaps acceptable, such as in
really simple iterations like:

  for (IAtom a : mol.atoms()) { print a.getSymbol(); }

but with a mere three chars more, I think it becomes that much more
readable:

  for (IAtom atom : mol.atoms()) { print atom.getSymbol(); }

Code must be readable. The risk of introducing misunderstanding is just too
big. Just browse a random CDK class, try to figure out what the algorithm is
doing, and explain that to the first person sitting next to you. If that
actually works, the source code is OK.
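
Coming back to the STRICT versus RELAXED question for a moment: the listener idea, and how it gives you a count of the problems reported, can be sketched roughly as below. This is a self-contained toy and deliberately does not use any CDK classes. RelaxedReaderSketch, Mode, ErrorHandler, CountingErrorHandler and readRecords are hypothetical names made up for the illustration, and the "blank record is an error" rule is only a placeholder for a real format check. Whether an actual reader skips a bad record or tries to recover from it is also an assumption here; the toy simply skips. With the real toolkit you would instead pass your own IChemObjectReaderErrorHandler to setErrorHandler.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class RelaxedReaderSketch {

    /** Stand-in for the two reader modes; hypothetical, not the CDK type. */
    enum Mode { STRICT, RELAXED }

    /** Hypothetical listener interface that receives format problems. */
    interface ErrorHandler {
        void handleError(String message, int recordNumber);
    }

    /** Listener that remembers every problem, so the caller can count them. */
    static class CountingErrorHandler implements ErrorHandler {
        private final List<String> messages = new ArrayList<>();

        @Override
        public void handleError(String message, int recordNumber) {
            messages.add("record " + recordNumber + ": " + message);
        }

        int getErrorCount() {
            return messages.size();
        }

        List<String> getMessages() {
            return messages;
        }
    }

    /**
     * Toy reader: a blank record counts as a format error (placeholder rule).
     * STRICT throws on the first bad record; RELAXED reports it to the
     * handler, skips it, and keeps going.
     */
    static List<String> readRecords(List<String> records, Mode mode, ErrorHandler handler) {
        List<String> accepted = new ArrayList<>();
        for (int index = 0; index < records.size(); index++) {
            String record = records.get(index);
            int recordNumber = index + 1;
            if (record.trim().isEmpty()) {
                if (mode == Mode.STRICT) {
                    throw new IllegalArgumentException("record " + recordNumber + ": empty record");
                }
                handler.handleError("empty record", recordNumber);
                continue;
            }
            accepted.add(record);
        }
        return accepted;
    }

    public static void main(String[] args) {
        List<String> records = Arrays.asList("CCO", "   ", "c1ccccc1");
        CountingErrorHandler handler = new CountingErrorHandler();
        List<String> accepted = readRecords(records, Mode.RELAXED, handler);
        System.out.println("accepted: " + accepted.size());
        System.out.println("reported: " + handler.getErrorCount());
        for (String message : handler.getMessages()) {
            System.out.println(message);
        }
    }
}
```

The point of the design is that the counting lives in the listener, not in the reader, so the same reader can be used with a handler that logs, counts, or ignores problems.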

The minimum three-character variable name check in the CDK code quality
checker is just a very crude thing we can do automatically... but at the
least it makes people think about code quality.

Egon

-- 
Dr E.L. Willighagen
Postdoctoral Researcher
Institutet för miljömedicin
Karolinska Institutet (http://ki.se/imm)
Homepage: http://egonw.github.com/
LinkedIn: http://se.linkedin.com/in/egonw
Blog: http://chem-bla-ics.blogspot.com/
PubList: http://www.citeulike.org/user/egonw/tag/papers

_______________________________________________
Cdk-user mailing list
Cdk-user@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/cdk-user