Bugs item #988592, was opened at 2004-07-10 13:58
Message generated for change (Comment added) made by hansonr
You can respond by visiting: 
https://sourceforge.net/tracker/?func=detail&atid=379133&aid=988592&group_id=23629

Category: Applet
Group: None
Status: Open
Resolution: None
Priority: 5
Submitted By: Nobody/Anonymous (nobody)
Assigned to: Bob Hanson (hansonr)
Summary: cif issues

Initial Comment:
There is one more important CIF format that Jmol needs 
to be able to read. This comes from the Inorganic 
Crystal Structure Database, 

http://www.fiz-informationsdienste.de/en/DB/icsd/

For an example, see 

http://www.stolaf.edu/people/hansonr/jmol/cif/viewdir.htm

and select "ic1166.cif"

To download the file, check for it at 

http://www.stolaf.edu/people/hansonr/jmol/cif


Basically, this is a slight variant of the Cambridge Crystal 
Structure Database. Mostly the format of the atom 
information is different. I note that in this file there can 
be atoms designated with no coordinates! So we have 
here:

Re1 Re3+ 4 e 0.0736(1) -.00119(7) 0.07901(8) 0. 1.
Re2 Re3+ 4 e 0.4356(1) 0.06192(7) 0.45604(8) 0. 1.
Cs1 Cs1+ 4 e 0.3809(2) 0.3961(2) 0.3240(2) 0. 1.
Cs2 Cs1+ 4 e 0.1175(2) -.3131(1) -.0456(1) 0. 1.
Cl1 Cl1- 4 e 0.2775(7) -.0799(4) 0.0192(5) 0. 1.
Cl2 Cl1- 4 e -.0577(8) 0.0757(5) 0.2110(5) 0. 1.
Cl3 Cl1- 4 e -.2203(7) 0.0564(4) 0.5429(5) 0. 1.
Cl4 Cl1- 4 e 0.3159(7) -.0246(4) 0.3056(5) 0. 1.
Cl5 Cl1- 4 e 0.4982(7) 0.2012(4) 0.5660(5) 0. 1.
Cl6 Cl1- 4 e 0.1978(7) 0.1484(4) 0.0706(5) 0. 1.
Cl7 Cl1- 4 e 0.0166(7) -.1522(4) 0.1619(5) 0. 1.
Cl8 Cl1- 4 e 0.5887(7) 0.1235(5) 0.3295(5) 0. 1.
O1 O2- 4 e 0.272(4) 0.194(2) 0.338(3) 9.6 1.
H1 H1+ 4 e    0. 2.

Note the H1 H1+ line has no coordinate. A single space 
is being used, not just generic white space, to separate 
fields.
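
The difference between "a single space" and "generic white space" can be sketched in Java. This is an illustration only, not Jmol code: the class and method names are hypothetical, and it assumes the H1 row has exactly four spaces between "e" and "0." as it appears in this report (the tracker may have altered the spacing).

```java
import java.util.Arrays;

/** Hypothetical demo, not Jmol code: two ways to split the H1 row quoted above. */
class FieldSplitDemo {
    // Row copied from the report; assumed to have four spaces between "e" and "0.".
    static final String H1_ROW = "H1 H1+ 4 e    0. 2.";

    /** CIF-style tokenizing: any run of white space counts as one separator. */
    static String[] splitOnWhitespace(String line) {
        return line.trim().split("\\s+");
    }

    /** Single-space tokenizing: every space is a separator, so empty fields survive. */
    static String[] splitOnSingleSpace(String line) {
        // Limit -1 keeps trailing empty strings as well.
        return line.split(" ", -1);
    }

    public static void main(String[] args) {
        System.out.println(splitOnWhitespace(H1_ROW).length);
        System.out.println(Arrays.toString(splitOnSingleSpace(H1_ROW)));
    }
}
```

Under that assumption, white-space tokenizing sees only 6 values where the other rows have 9, while single-space splitting yields 9 fields with three empty strings in the coordinate positions.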


The error given is:

Java(TM) Plug-in: Version 1.4.0_01
Using JRE version 1.4.0_01 Java HotSpot(TM) Client VM
User home directory = C:\Documents and Settings\hansonr

Proxy Configuration: Browser Proxy Configuration




FileManager.openFile(icsd_1166.cif)

SmarterModelAdapter:The model resolver thinks:Cif

java.lang.NullPointerException

        at org.jmol.adapter.smarter.ModelReader.parseFloat(ModelReader.java:45)
        at org.jmol.adapter.smarter.CifReader.processAtomSiteLoopBlock(CifReader.java:288)
        at org.jmol.adapter.smarter.CifReader.processLoopBlock(CifReader.java:144)
        at org.jmol.adapter.smarter.CifReader.readModel(CifReader.java:65)
        at org.jmol.adapter.smarter.ModelResolver.resolveModel(ModelResolver.java:57)
        at org.jmol.adapter.smarter.SmarterModelAdapter.openBufferedReader(SmarterModelAdapter.java:55)
        at org.openscience.jmol.viewer.managers.FileManager$FileOpenThread.openReader(FileManager.java:409)
        at org.openscience.jmol.viewer.managers.FileManager$FileOpenThread.openInputStream(FileManager.java:402)
        at org.openscience.jmol.viewer.managers.FileManager$FileOpenThread.run(FileManager.java:379)
        at org.openscience.jmol.viewer.managers.FileManager.openFile(FileManager.java:100)
        at org.openscience.jmol.viewer.JmolViewer.openFile(JmolViewer.java:897)
        at org.openscience.jmol.viewer.script.Eval.load(Eval.java:1566)
        at org.openscience.jmol.viewer.script.Eval.instructionDispatchLoop(Eval.java:337)
        at org.openscience.jmol.viewer.script.Eval.run(Eval.java:281)
        at java.lang.Thread.run(Unknown Source)

error opening file:/D:/js/struc/data/csd/icsd_1166.cif
java.lang.NullPointerException

openFile(icsd_1166.cif) 210 ms

InterruptedException!
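
The trace shows only that a NullPointerException surfaced in ModelReader.parseFloat; the Jmol source itself is not quoted in this report. As a minimal sketch, assuming the reader ends up with a null or empty token when a coordinate is missing, a defensive parser could return NaN instead of throwing. The class and method names below are hypothetical, not Jmol's.

```java
/** Hypothetical helper, not the Jmol ModelReader: parse a CIF numeric token defensively. */
class SafeCifNumber {
    /**
     * Returns the numeric value of a token such as "0.0736(1)" or "-.00119(7)",
     * ignoring the standard uncertainty in parentheses. Returns NaN for a null,
     * empty, or non-numeric token (including the CIF placeholders "." and "?").
     */
    static float parse(String token) {
        if (token == null) {
            return Float.NaN;
        }
        String t = token.trim();
        int paren = t.indexOf('(');
        if (paren >= 0) {
            t = t.substring(0, paren); // drop "(1)" style uncertainty
        }
        try {
            return Float.parseFloat(t);
        } catch (NumberFormatException e) {
            return Float.NaN;
        }
    }

    public static void main(String[] args) {
        System.out.println(parse("0.0736(1)"));
        System.out.println(parse(null));
    }
}
```

The caller would still have to decide what to do with an atom whose coordinates come back as NaN, for example skipping it.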



But there are more problems. The following block seems 
to be causing great difficulty:

loop_
_atom_site_aniso_label
_atom_site_aniso_type_symbol
_atom_site_aniso_U_11
_atom_site_aniso_U_22
_atom_site_aniso_U_33
_atom_site_aniso_U_12
_atom_site_aniso_U_13
_atom_site_aniso_U_23
Re1 Re3+ 0.0046(1) 0.00294(4) 0.00353(6) -.0008(1) 0.0005(1) -.0002(1)
Re2 Re3+ 0.0049(1) 0.00264(4) 0.00456(6) 0.0005(1) 0.0019(1) 0.0005(1)
Cs1 Cs1+ 0.0097(2) 0.00993(14) 0.0068(1) 0.0024(3) 0.0049(3) 0.0037(2)
Cs2 Cs1+ 0.0082(2) 0.00459(9) 0.0080(1) 0.0015(3) 0.0001(3) -.0029(2)
Cl1 Cl1- 0.0059(8) 0.0050(4) 0.0068(5) 0.0028(9) 0.001(1) -.0014(7)
Cl2 Cl1- 0.0104(9) 0.0054(4) 0.0049(5) 0.0007(11) 0.004(1) -.0019(7)
Cl3 Cl1- 0.0062(7) 0.0041(3) 0.0074(5) 0.0015(9) 0.004(1) -.0006(7)
Cl4 Cl1- 0.0088(8) 0.0050(4) 0.0047(4) -.0021(10) -.001(1) 0.0011(6)
Cl5 Cl1- 0.0083(8) 0.0033(3) 0.0070(5) 0.0007(9) 0.000(1) -.0004(7)
Cl6 Cl1- 0.0075(8) 0.0035(3) 0.0062(5) -.0038(9) 0.000(1) -.0002(7)
Cl7 Cl1- 0.0101(9) 0.0033(3) 0.0052(5) -.0028(9) 0.000(1) 0.0020(6)
Cl8 Cl1- 0.0100(9) 0.0053(4) 0.0056(5) -.0039(10) 0.005(1) 0.0029(7)


and the charges on the type symbol are causing the atom 
symbol to be misinterpreted.

see 
http://www.stolaf.edu/people/hansonr/jmol/cif/ic1166b.cif
for what this should (probably) look like. (It's not a great 
data set.)

similarly:

http://www.stolaf.edu/people/hansonr/jmol/cif/ic30516.cif

and

http://www.stolaf.edu/people/hansonr/jmol/cif/ic30516b.cif

Bob Hanson



----------------------------------------------------------------------

>Comment By: Bob Hanson (hansonr)
Date: 2004-07-14 18:51

Message:
Logged In: YES 
user_id=1082841

OK, Peter makes a very good distinction there. 

a) Format: I agree 100% that the file is invalid because of
the improper use of white space. The IUCr's own CIF checker
failed this one, so that's something I'm sure they will look
into.

b) Semantics: I suggest Jmol be somewhat more flexible in
reading atom names. At the very least, we should strip
![A-Z|a-z] from the _atom_site_type_symbol to determine the
element symbol. No need for perfection here. Some of
these files will be very odd, since two atoms can occupy the
same position, as in   "Ni2+Fe3+"  for an atom name. 
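
A minimal sketch of the stripping suggested in (b), with hypothetical class and method names (this is not existing Jmol code). The first method drops every non-letter, as proposed; the second is a variation added here for comparison, which keeps only a leading capital plus optional lowercase letter and so picks one element out of a mixed-site name.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Hypothetical demo, not Jmol code: derive an element symbol from a type symbol. */
class TypeSymbolDemo {
    // One uppercase letter followed by at most one lowercase letter, at the start.
    private static final Pattern LEADING_ELEMENT = Pattern.compile("^[A-Z][a-z]?");

    /** Strip everything that is not a letter, e.g. "Cl1-" becomes "Cl". */
    static String stripNonLetters(String typeSymbol) {
        return typeSymbol.replaceAll("[^A-Za-z]", "");
    }

    /** Variation: keep only the leading element-like prefix, or "" if there is none. */
    static String leadingElement(String typeSymbol) {
        Matcher m = LEADING_ELEMENT.matcher(typeSymbol);
        return m.find() ? m.group() : "";
    }

    public static void main(String[] args) {
        String[] samples = {"Cl1-", "Re3+", "Ni2+Fe3+", "H(SDS)"};
        for (String s : samples) {
            System.out.println(s + " -> " + stripNonLetters(s) + " / " + leadingElement(s));
        }
    }
}
```

Neither approach is perfect: plain stripping turns "Ni2+Fe3+" into "NiFe", and the prefix variation silently ignores the second species.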

I have only looked at a few of these IUCr CIF files; I
suspect that mostly they are just fine and that this one was
an exception. 

Bob Hanson


----------------------------------------------------------------------

Comment By: Peter Murray-Rust (petermr)
Date: 2004-07-14 06:11

Message:
Logged In: YES 
user_id=125666

_atom_type_symbol
Name
'_atom_type_symbol' 
Category: atom_type 

Data type: char 

Must appear in a looped list 
May match a value of '_atom_site_type_symbol' 

Examples:

C 
Cu2+ 
H(SDS) 
dummy 
FeNi 



<bob>
Date: 2004-07-11 20:38
Sender: nobody
Logged In: NO 

I will contact ICSD and enquire, but it is quite possible
that they are doing exactly this. IMHO, their software reads
the file. So should Jmol. Standards aside...
Bob Hanson
</bob>

We have to be very tough on this (I am part of the COMCIFs
committee process and if there is a problem I will take it
up). There are two layered aspects:
- syntax. does it conform with the CIF specification. (This
is similar to whether an XML document is well-formed). The
first example didn't. It is therefore invalid. There are a
range of CIF checker tools on the IUCr site and if any of
them flag the CIF as invalid then IMO ICSD have the problem.
- semantics. This is whether the value is reasonable. (The
second example seems well formed at first glance)
Unfortunately there is less experience in developing
semantics tools for CIF. It sounds as if the "Cl1-" is
causing problems. This refers to an atom_type_symbol which
is defined as

Definition
        
   The code used to identify the atom specie(s) representing this
   atom type. Normally this code is the element symbol. The code
   may be composed of any character except an underline with the
   additional proviso that digits designate an oxidation state and
   must be followed by a + or - character.
 

It would be useful if CIF had used a regular expression for
this but... In any case "Cl1-" appears to be a semantically
valid label. So if this is the problem and if it is in Jmol
and if Jmol wishes to read this type of semantics then Jmol
needs adjusting.
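
As a rough illustration of what such a regular expression might look like, here is one reading of the definition quoted above, written in Java. It is an assumption, not an official IUCr pattern: it accepts any characters except an underline, and requires every run of digits to be followed immediately by + or -. The class name is hypothetical.

```java
import java.util.regex.Pattern;

/** Hypothetical check, not an official CIF rule: one reading of the _atom_type_symbol definition. */
class AtomTypeSymbolCheck {
    // Either a single character that is neither "_" nor a digit,
    // or a run of digits immediately followed by "+" or "-"; repeated one or more times.
    private static final Pattern SYMBOL = Pattern.compile("(?:[^_\\d]|\\d+[+-])+");

    static boolean looksValid(String symbol) {
        return symbol != null && SYMBOL.matcher(symbol).matches();
    }

    public static void main(String[] args) {
        String[] samples = {"C", "Cu2+", "H(SDS)", "dummy", "FeNi", "Cl1-", "Cl1", "Fe_1"};
        for (String s : samples) {
            System.out.println(s + " -> " + looksValid(s));
        }
    }
}
```

Under this reading the dictionary examples and "Cl1-" pass, while "Cl1" (digit without a sign) and "Fe_1" (underline) do not.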

Note that it is not easy to write code for transformation of
CIF semantics which is why I do it all in XML.

PeterMR


----------------------------------------------------------------------

Comment By: Miguel (migueljmol)
Date: 2004-07-14 04:23

Message:
Logged In: YES 
user_id=1050060

Please follow up with the ICSD folks and see what they say. 

Proliferation of support for *invalid* files is a major
problem. We don't do anybody any favors by being *flexible*
... it just comes back to bite you. 

The IUCr has worked very hard to try to promote the CIF
standard. And they claim to enforce (legally) that people
who claim to support CIF actually do so. 

If they have software that reads invalid CIF files then they
should be forced to fix it ... first with a gentle nudge and
then (if necessary) with a report to the CIF-police (IUCr)

Miguel


----------------------------------------------------------------------

Comment By: Miguel (migueljmol)
Date: 2004-07-12 05:01

Message:
Logged In: YES 
user_id=1050060

I am glad that PeterMR saw this ... I was going to send it
to him and ask him if it was valid. 

I do not believe that there is any way that we can reliably
read this file in the context of a CIF reader. I believe
that there is nothing special about newline characters
within the context of a data loop. That is, some files have
newline characters to separate the data values associated
with a single atom. Therefore, the reader cannot reliably
'guess' when to move to a new atom. 
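
To make the point concrete, here is a minimal sketch of a loop reader that treats newlines as ordinary white space and groups values purely by the number of declared field names. It is a simplification with hypothetical names, not the Jmol CifReader, and it ignores quoted strings and semicolon-delimited text fields.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical demo, not Jmol code: group loop values by column count, ignoring newlines. */
class LoopRows {
    /**
     * Splits all lines into white-space-separated tokens and cuts the token
     * stream into rows of columnCount values. Throws if the stream does not
     * divide evenly, which is what happens when a row omits values.
     */
    static List<String[]> group(int columnCount, List<String> lines) {
        List<String> tokens = new ArrayList<>();
        for (String line : lines) {
            String t = line.trim();
            if (t.isEmpty()) {
                continue;
            }
            for (String tok : t.split("\\s+")) {
                tokens.add(tok);
            }
        }
        if (tokens.size() % columnCount != 0) {
            throw new IllegalArgumentException(
                tokens.size() + " values cannot fill rows of " + columnCount);
        }
        List<String[]> rows = new ArrayList<>();
        for (int i = 0; i < tokens.size(); i += columnCount) {
            rows.add(tokens.subList(i, i + columnCount).toArray(new String[0]));
        }
        return rows;
    }

    public static void main(String[] args) {
        // One 8-column row deliberately split over two lines: still one row.
        List<String> wrapped = new ArrayList<>();
        wrapped.add("Re1 Re3+ 0.0046(1) 0.00294(4) 0.00353(6) -.0008(1)");
        wrapped.add("0.0005(1) -.0002(1)");
        System.out.println(group(8, wrapped).size());
    }
}
```

A row that simply leaves values out, like the H1 line, gives such a reader no marker to recover from: either the count check fails, or later rows are silently shifted.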

I understand that you are going to ask ICSD about this file
... let's see what their stance is ...

Miguel


----------------------------------------------------------------------

Comment By: Nobody/Anonymous (nobody)
Date: 2004-07-11 22:38

Message:
Logged In: NO 

I will contact ICSD and enquire, but it is quite possible
that they are doing exactly this. IMHO, their software reads
the file. So should Jmol. Standards aside...
Bob Hanson


----------------------------------------------------------------------

Comment By: Nobody/Anonymous (nobody)
Date: 2004-07-11 12:00

Message:
Logged In: NO 


This is an invalid CIF. CIFs have two data structures, loops 
and items. In a loop like this all rows must have the same 
number of (whitespace-separated) fields. Note that the 
number of fields in each row, and their semantics, are given 
by the list of names in the loop, whose length must equal 
that of each row.

For more information on CIF syntax, from which no 
deviations are allowed, see http://www.iucr.org

PeterMR

NB I doubt that the ICSD is emitting invalid CIFs on a 
systematic basis. 


----------------------------------------------------------------------



_______________________________________________
Jmol-developers mailing list
[EMAIL PROTECTED]
https://lists.sourceforge.net/lists/listinfo/jmol-developers
