Re: [rdflib-dev] TriX serializer and a TriX parser fix

Mikael Högqvist Thu, 09 Aug 2007 09:11:53 -0700

Hi,

Since the SPARQL xml-writer didn't work for me either, I re-implemented
it using ElementTree (attached). On Python versions before 2.5 it is
possible to fall back on the separately installed ElementTree package.
This code has been used with rdflib 2.4.0 and is probably broken on
trunk. I could test a variant of this against trunk if anyone is
interested.


Cheers,
Mikael

On 09/08/07, Chimezie Ogbuji <[EMAIL PROTECTED]> wrote:
> On 8/9/07, whit <[EMAIL PROTECTED]> wrote:
> > +1 (to elementtree and to the use of lxml in plugins)
> >
> > I would encourage introducing no new dependencies outside of what can be
> > isolated inside a plugin (dunno if john's changes fall into that category)
> > and disable any related tests if the dependency is unavailable.   Special
> > dependency code also should not hide the fallback implementation from tests;
> > a conditional import rendered the standard sparql xmlwriter and its
> > brokenness hidden from those with Ft installed for quite a while on trunk.
>
> FYI: The brokenness of the 'standard' sparql xmlwriter had all to do
> with the poor capability of native Python SAX writing - which couldn't
> even handle namespaces properly! I was also unable to use RDFLib's
> XMLWriter for that purpose - for reasons similar to John's.
>
> I have been working with an updated version of a harness for the (new)
> W3C DAWG tests for SPARQL but have not checked it in (and probably
> won't) mostly because it required additional XML processing I was
> simply unable to perform with 'native' Python: comparing SPARQL
> results against expected test results, for instance.  In addition, I
> also needed Ft.Lib.Uri to do 'proper' RFC-compliant resolution of URIs
> (for which the 'native' capabilities of urllib/urllib2 are broken in
> many regards).  URI resolution is a *major* component of SPARQL.
>
> I'm not so familiar with ElementTree or lxml (mostly because I've been
> spoiled by 4Suite for so long).  A separate question: a dependence on
> ElementTree bumps the overall dependency to Py2.5, is that where we
> want to be? If so, I would be curious to find out what aspects of XML
> processing can be done with lxml and which of those are crucial to RDF
> processing.
> _______________________________________________
> Dev mailing list
> Dev@rdflib.net
> http://rdflib.net/mailman/listinfo/dev
>
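
For reference, the result comparison mentioned in the quoted message (checking serialized SPARQL results against expected results) can be approximated with ElementTree alone. The following is only a sketch for current Python 3, not the DAWG harness itself: the helper names q, rows and same_results are made up for illustration, rows are compared as an unordered multiset, and blank node labels are compared literally, which a real harness would have to relax.

```python
import xml.etree.ElementTree as ET
from collections import Counter

SPARQL_NS = "http://www.w3.org/2005/sparql-results#"
XML_LANG = "{http://www.w3.org/XML/1998/namespace}lang"


def q(tag):
    """Qualify a tag name with the SPARQL results namespace."""
    return "{%s}%s" % (SPARQL_NS, tag)


def rows(xml_text):
    """Return a multiset of result rows; each row is a frozenset of bindings."""
    root = ET.fromstring(xml_text)
    found = Counter()
    for result in root.iter(q("result")):
        row = set()
        for b in result.findall(q("binding")):
            term = b[0]  # the uri, bnode or literal child element
            kind = term.tag.split("}")[1]
            row.add((b.get("name"), kind, term.text or "",
                     term.get("datatype"), term.get(XML_LANG)))
        found[frozenset(row)] += 1
    return found


def same_results(actual_xml, expected_xml):
    """True if both documents contain the same rows, ignoring row order
    and namespace prefixes. Blank node labels are compared literally
    (a simplification made for this sketch)."""
    return rows(actual_xml) == rows(expected_xml)
```

Because the comparison works on parsed, namespace-qualified names, it does not matter whether a document uses a default namespace or a prefix for the SPARQL results vocabulary.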

# -*- coding: iso-8859-15 -*-
# (c) Mikael Högqvist, ZIB, AstroGrid-D
# This software is licensed under the software license specified at
# http://www.gac-grid.org/

# This is a work-around for the SPARQL XML serialization in rdflib, which
# does not work on all installations because of problems in Python's SAX
# XML writing. We rely on ElementTree, which is part of the standard library
# from Python 2.5 on; older versions need the separate cElementTree package.

from cStringIO import StringIO

try:
    # Python 2.5 and later: ElementTree ships in the standard library
    from xml.etree.cElementTree import Element, SubElement, ElementTree
    import xml.etree.cElementTree as ET
except ImportError:
    # older Pythons: fall back on the standalone cElementTree package
    from cElementTree import Element, SubElement, ElementTree
    import cElementTree as ET

from rdflib import URIRef, BNode, Literal

SPARQL_XML_NAMESPACE = u'http://www.w3.org/2005/sparql-results#'
# namespace bound to the reserved "xml" prefix, needed for xml:lang
XML_NAMESPACE = u'http://www.w3.org/XML/1998/namespace'

name = lambda elem: u'{%s}%s' % (SPARQL_XML_NAMESPACE, elem)
xml_name = lambda elem: u'{%s}%s' % (XML_NAMESPACE, elem)

def variables(results):
    # don't include any variables which are not part of the
    # result set
    #res_vars = set(results.selectionF).intersection(set(results.allVariables))
    
    
    # this means select *, use all variables from the result-set
    if len(results.selectionF) == 0:
        res_vars = results.allVariables
    else:
        res_vars = [v for v in results.selectionF if v in results.allVariables]
        
    return res_vars
    
def header(results, root):
    head = SubElement(root, name(u'head'))
    
    res_vars = variables(results)    
    for var in res_vars:
        v = SubElement(head, name(u'variable'))
        # remove the ?
        v.attrib[u'name'] = var[1:]

        
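def binding(val, var, elem):
    bindingElem = SubElement(elem, name(u'binding'))
    bindingElem.attrib[u'name'] = var
    
    if isinstance(val, URIRef):
        varElem = SubElement(bindingElem, name(u'uri'))
    elif isinstance(val, BNode):
        varElem = SubElement(bindingElem, name(u'bnode'))
    elif isinstance(val, Literal):
        varElem = SubElement(bindingElem, name(u'literal'))
        
        # a literal carries either a language tag or a datatype, not both
        if val.language:
            varElem.attrib[xml_name(u'lang')] = unicode(val.language)
        elif val.datatype:
            # the datatype attribute is unqualified in the SPARQL results format
            varElem.attrib[u'datatype'] = unicode(val.datatype)
    else:
        # previously this fell through to a NameError on varElem
        raise TypeError('cannot serialize %r as a SPARQL result term' % (val,))

    # unicode() rather than str(), which fails on non-ASCII values in Python 2
    varElem.text = unicode(val)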

def res_iter(results):
    res_vars = variables(results)
    
    for row in results.selected:
        if len(res_vars) == 1:
            row = (row, )
        
        yield zip(row, res_vars)
              
def result_list(results, root):
    resultsElem = SubElement(root, name(u'results'))
    
    ordered = results.orderBy
    
    if ordered is None:
        ordered = False
    
    # removed according to the new working draft (2007-06-14)    
    # resultsElem.attrib[u'ordered'] = str(ordered)
    # resultsElem.attrib[u'distinct'] = str(results.distinct)

    for row in res_iter(results):
        resultElem = SubElement(resultsElem, name(u'result'))
        for (val, var) in row:
            # remove the ? from the variable name
            binding(val, var[1:], resultElem)
    
def serializeXML(results):    
    root = Element(name(u'sparql'))
    
    header(results, root)
    result_list(results, root)
    
    out = StringIO()
    tree = ElementTree(root)

    # xml declaration must be written by hand
    # http://www.nabble.com/Writing-XML-files-with-ElementTree-t3433325.html
    out.write('<?xml version="1.0" encoding="utf-8"?>')
    out.write('<?xml-stylesheet type="text/xsl" href="/static/sparql-xml-to-html.xsl"?>')
    tree.write(out, encoding='utf-8')
    
    return out.getvalue()
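
As an illustration of the same approach on current Python 3, where xml.etree.ElementTree is always available, here is a minimal sketch. It is not a port of the module above: the function serialize and the tuple layout (kind, text, lang, datatype) are hypothetical stand-ins for the rdflib result object and its terms, and the stylesheet processing instruction is left out.

```python
import io
import xml.etree.ElementTree as ET

SPARQL_NS = "http://www.w3.org/2005/sparql-results#"
XML_NS = "http://www.w3.org/XML/1998/namespace"

# Ask ElementTree to write the SPARQL namespace as the default namespace.
ET.register_namespace("", SPARQL_NS)


def q(tag):
    """Qualify a tag name with the SPARQL results namespace."""
    return "{%s}%s" % (SPARQL_NS, tag)


def serialize(variables, rows):
    """Build a SPARQL results document and return it as UTF-8 bytes.

    variables: variable names without the leading '?'.
    rows: dicts mapping a variable name to (kind, text, lang, datatype),
          where kind is 'uri', 'bnode' or 'literal' (assumed layout).
    """
    root = ET.Element(q("sparql"))
    head = ET.SubElement(root, q("head"))
    for var in variables:
        ET.SubElement(head, q("variable"), {"name": var})

    results = ET.SubElement(root, q("results"))
    for row in rows:
        result = ET.SubElement(results, q("result"))
        for var in variables:
            if var not in row:
                continue  # unbound variables are omitted from the row
            kind, text, lang, datatype = row[var]
            b = ET.SubElement(result, q("binding"), {"name": var})
            term = ET.SubElement(b, q(kind))
            if lang:
                term.set("{%s}lang" % XML_NS, lang)  # written as xml:lang
            elif datatype:
                term.set("datatype", datatype)  # unqualified attribute
            term.text = text

    out = io.BytesIO()
    ET.ElementTree(root).write(out, encoding="utf-8", xml_declaration=True)
    return out.getvalue()
```

A call such as serialize(["x"], [{"x": ("uri", "http://example.org/a", None, None)})]) returns the document as bytes; on Python 3 the XML declaration no longer has to be written by hand because write() accepts xml_declaration=True.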

