This problem can be worked around by using the errors='ignore' parameter
of codecs.open, which skips the bytes that cannot be decoded as UTF-8.
It may be that some other codec would decode the characters coming from
the Excel spreadsheet correctly, but I could not find the correct one.

import codecs
store.load(codecs.open("Test.rdf",'r','utf8',errors='ignore')) 
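
As a rough illustration of the trade-off, here is a minimal sketch in
Python 3 syntax (the call above was written for Python 2.5). It assumes
the bytes come from a Windows-1252 (cp1252) export, which is only a
guess about the spreadsheet; the sample bytes are made up for the
example.

```python
# Hypothetical sample: 0x92 and 0xA0 stored as single bytes, as a
# Windows-1252 export would write them.
raw = b"it\x92s\xa0here"

# errors='ignore' silently drops every byte that is not valid UTF-8.
ignored = raw.decode("utf-8", errors="ignore")

# Assumption: the data is really cp1252. Decoding with that codec keeps
# the characters (right single quote, no-break space).
decoded = raw.decode("cp1252")

# Re-encode as genuine UTF-8 before handing the data to an XML parser.
utf8_bytes = decoded.encode("utf-8")

print(repr(ignored))
print(repr(decoded))
print(utf8_bytes)
```

If the cp1252 guess holds, writing utf8_bytes back to the file would
keep the characters rather than discarding them.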

Dave J

-----Original Message-----
From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED]]
Sent: Tuesday, October 16, 2007 9:00 AM
To: dev@rdflib.net
Subject: Dev Digest, Vol 19, Issue 4

Today's Topics:

   1. SAX invalid token error when parsing property values outside
      ASCII range. (Jones, David H)


----------------------------------------------------------------------

Message: 1
Date: Mon, 15 Oct 2007 13:10:09 -0700
From: "Jones, David H" <[EMAIL PROTECTED]>
Subject: [rdflib-dev] SAX invalid token error when parsing property
        values outside ASCII range.
To: <dev@rdflib.net>
Message-ID: <[EMAIL PROTECTED]>
Content-Type: text/plain; charset="us-ascii"

I am attempting to process rdf that has characters outside the ASCII
range, and am getting a SAXParseException: not well-formed (invalid
token)

Call:

store = ConjunctiveGraph()
store.load("ToolsTestA0Removed.rdf") 

I thought this might be corrected by adding the encoding to the top of
the file:

<?xml version='1.0' encoding='UTF-8'?>

But this did not correct the problem.

Is there a parsing option that I've missed, or some other error I'm
making? Will utf-8 encoding work for characters like hex A0 or hex 92?
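
On the last question, a minimal sketch (Python 3 syntax, with a made-up
one-element document rather than the real file) of why the declaration
alone does not help: it only labels the file and does not change the
bytes in it. A lone 0xA0 or 0x92 byte is not a valid UTF-8 sequence, so
the parser rejects it; the same character encoded as real UTF-8 parses.

```python
import xml.sax
from xml.sax.handler import ContentHandler

def parses_cleanly(data):
    """Return True if the SAX parser accepts the bytes, False otherwise."""
    try:
        xml.sax.parseString(data, ContentHandler())
        return True
    except xml.sax.SAXParseException:
        return False

# Hypothetical document: declared as UTF-8, but U+00A0 is stored as the
# single byte 0xA0, which is not valid UTF-8.
declared_only = b"<?xml version='1.0' encoding='UTF-8'?><a>x\xa0y</a>"

# Same text with U+00A0 actually encoded as UTF-8 (bytes 0xC2 0xA0).
really_utf8 = "<?xml version='1.0' encoding='UTF-8'?><a>x\u00a0y</a>".encode("utf-8")

print(parses_cleanly(declared_only))
print(parses_cleanly(really_utf8))
```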

Thanks in advance for help

Dave J

Trace:


Traceback (most recent call last):
  File "C:\nbo\rdf2Forms.py", line 18, in <module>
    store.load("endpoint/ToolsTestA0Removed.rdf")  # Saved by makeTriples.py.
  File "build\bdist.win32\egg\rdflib\Graph.py", line 665, in load
    self.parse(source, publicID, format)
  File "build\bdist.win32\egg\rdflib\Graph.py", line 828, in parse
    context.parse(source, publicID=publicID, format=format, **args)
  File "build\bdist.win32\egg\rdflib\Graph.py", line 661, in parse
    parser.parse(source, self, **args)
  File "build\bdist.win32\egg\rdflib\syntax\parsers\RDFXMLParser.py", line 37, in parse
    self._parser.parse(source)
  File "c:\python25\lib\xml\sax\expatreader.py", line 107, in parse
    xmlreader.IncrementalParser.parse(self, source)
  File "c:\python25\lib\xml\sax\xmlreader.py", line 123, in parse
    self.feed(buffer)
  File "c:\python25\lib\xml\sax\expatreader.py", line 211, in feed
    self._err_handler.fatalError(exc)
  File "c:\python25\lib\xml\sax\handler.py", line 38, in fatalError
    raise exception
SAXParseException: file:///C|/ToolsTestA0Removed.rdf:373:684: not well-formed (invalid token)

------------------------------

_______________________________________________
Dev mailing list
Dev@rdflib.net
http://rdflib.net/mailman/listinfo/dev

