This problem can be solved by using the errors='ignore' parameter of codecs.open. It may be that some other codec would decode the characters coming from the Excel spreadsheet correctly, but I could not find the correct one.
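
For anyone comparing the two options, here is a rough sketch of what errors='ignore' does to the two byte values from the question (hex 92 and hex A0), next to one other codec that might be worth trying. It assumes, and this is only a guess that has not been confirmed for this file, that the spreadsheet export was in Windows-1252 (cp1252), where hex 92 is a right single quotation mark and hex A0 is a no-break space. The sample byte string is made up for illustration, and the snippet is written for Python 3 rather than the Python 2.5 shown in the trace.

```python
# Hypothetical sample: ASCII text containing the two problem bytes
# from the question (hex 92 and hex A0). Not taken from the real file.
raw = b"Tool\x92s\xa0list"

# Decoding as UTF-8 with errors='ignore' silently drops the bytes
# that are not valid UTF-8, so the characters are lost.
dropped = raw.decode("utf-8", errors="ignore")

# ASSUMPTION: the data was really written as Windows-1252.
# Under cp1252, 0x92 maps to U+2019 and 0xA0 maps to U+00A0.
kept = raw.decode("cp1252")

# Re-encoding the cp1252 result gives bytes that match an
# encoding='UTF-8' XML declaration.
as_utf8 = kept.encode("utf-8")

print(repr(dropped))
print(repr(kept))
print(repr(as_utf8))
```

If the cp1252 guess holds for the real file, opening it with codecs.open("Test.rdf", 'r', 'cp1252') would keep the characters instead of discarding them, but that would need checking against the actual data.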

    import codecs

    # errors='ignore' drops any bytes that are not valid UTF-8
    store.load(codecs.open("Test.rdf", 'r', 'utf8', errors='ignore'))

Dave J

-----Original Message-----
From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED]
Sent: Tuesday, October 16, 2007 9:00 AM
To: dev@rdflib.net
Subject: Dev Digest, Vol 19, Issue 4

Today's Topics:

   1. SAX invalid token error when parsing property values outside ASCII range. (Jones, David H)

----------------------------------------------------------------------

Message: 1
Date: Mon, 15 Oct 2007 13:10:09 -0700
From: "Jones, David H" <[EMAIL PROTECTED]>
Subject: [rdflib-dev] SAX invalid token error when parsing property values outside ASCII range.
To: <dev@rdflib.net>
Message-ID: <[EMAIL PROTECTED]>
Content-Type: text/plain; charset="us-ascii"

I am attempting to process rdf that has characters outside the ASCII range, and am getting a SAXParseException: not well-formed (invalid token)

Call:

    store = ConjunctiveGraph()
    store.load("ToolsTestA0Removed.rdf")

I thought this might be corrected by adding the encoding to the top of the file:

    <?xml version='1.0' encoding='UTF-8'?>

But this did not correct the problem.

Is there a parsing option that I've missed, or some other error I'm making? Will utf-8 encoding work for characters like hex A0 or hex 92?

Thanks in advance for help

Dave J

Trace:

Traceback (most recent call last):
  File "C:\nbo\rdf2Forms.py", line 18, in <module>
    store.load("endpoint/ToolsTestA0Removed.rdf") # Saved by makeTriples.py.
  File "build\bdist.win32\egg\rdflib\Graph.py", line 665, in load
    self.parse(source, publicID, format)
  File "build\bdist.win32\egg\rdflib\Graph.py", line 828, in parse
    context.parse(source, publicID=publicID, format=format, **args)
  File "build\bdist.win32\egg\rdflib\Graph.py", line 661, in parse
    parser.parse(source, self, **args)
  File "build\bdist.win32\egg\rdflib\syntax\parsers\RDFXMLParser.py", line 37, in parse
    self._parser.parse(source)
  File "c:\python25\lib\xml\sax\expatreader.py", line 107, in parse
    xmlreader.IncrementalParser.parse(self, source)
  File "c:\python25\lib\xml\sax\xmlreader.py", line 123, in parse
    self.feed(buffer)
  File "c:\python25\lib\xml\sax\expatreader.py", line 211, in feed
    self._err_handler.fatalError(exc)
  File "c:\python25\lib\xml\sax\handler.py", line 38, in fatalError
    raise exception
SAXParseException: file:///C|/ToolsTestA0Removed.rdf:373:684: not well-formed (invalid token)

------------------------------

_______________________________________________
Dev mailing list
Dev@rdflib.net
http://rdflib.net/mailman/listinfo/dev

End of Dev Digest, Vol 19, Issue 4
**********************************