I'm preparing course material about querying DBpedia from a web page using 
Firefox and Greasemonkey, unpacking the payload received and patching the 
information into a web page.  My sample SPARQL query is for the state flowers 
of states of the United States, a query that is listed on the Meow meow meow 
blog at 
http://www.craigethomas.com/blog/2009/02/anatomy-of-a-sparql-query-part-1-select/
  

Strategies for unpacking the payload are complicated by unpredictable 
structural irregularities of the payload.  I was wondering if someone could 
suggest an explanation, or point out explanatory documentation that I could 
provide my students.

Most of the states have a predictable XML payload that is structured like this:

    <result>
      <binding name="state">
        <uri>http://dbpedia.org/resource/Mississippi</uri>
      </binding>
      <binding name="flower">
        <uri>http://dbpedia.org/resource/Magnolia_Blossom</uri>
      </binding>
    </result>

But West Virginia's state flower comes back as a literal rather than a URI, 
and the literal contains an escaped HTML tag:

    <literal xml:lang="en">Rhododendron&lt;br&gt;(''Rhododendron maximum'')</literal>
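
For the course notes, here is a minimal sketch of how such a literal might be 
tidied before it is patched into a page.  It assumes the XML parser has 
already turned &lt;br&gt; into <br> in the text content, and that the doubled 
single quotes are leftover wiki italic markup.  The function name 
cleanLiteral is hypothetical, not part of any DBpedia or Greasemonkey API.

```javascript
// Hypothetical helper: tidy the text content of a <literal> for display.
// Assumption: the parser has already unescaped &lt;br&gt; into "<br>".
function cleanLiteral(text) {
  return text
    .replace(/<[^>]*>/g, " ")  // replace HTML tags such as <br> with a space
    .replace(/'{2,}/g, "")     // drop runs of two or more single quotes (wiki markup)
    .replace(/\s+/g, " ")      // collapse whitespace, including line breaks
    .trim();
}

console.log(cleanLiteral("Rhododendron<br>(''Rhododendron maximum'')"));
// prints "Rhododendron (Rhododendron maximum)"
```

The regular expressions are deliberately crude; they are meant for short 
labels like this one, not for arbitrary HTML.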

And Florida's state flower listing contains percent-encoded characters:

  <uri>http://dbpedia.org/resource/Orange_%28fruit%29</uri>
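
A minimal sketch of one way to turn such a resource URI into a readable 
label, using the standard decodeURIComponent function.  The function name 
labelFromResourceUri is hypothetical, and the sketch assumes that the last 
path segment of the URI is the resource name with underscores standing in 
for spaces.

```javascript
// Hypothetical helper: derive a display label from a DBpedia resource URI.
// Assumption: the label is the last path segment, with "_" used for spaces.
function labelFromResourceUri(uri) {
  const localName = uri.slice(uri.lastIndexOf("/") + 1);
  // decodeURIComponent turns %28 and %29 back into "(" and ")".
  // It throws a URIError if the input is not validly percent-encoded.
  return decodeURIComponent(localName).replace(/_/g, " ");
}

console.log(labelFromResourceUri("http://dbpedia.org/resource/Orange_%28fruit%29"));
// prints "Orange (fruit)"
```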

There is also the general problem of multiple listings.  For example, 
California is listed with the California_Poppy twice.
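
On the client side, repeated rows could be filtered out after unpacking.  A 
minimal sketch, assuming each result has already been reduced to a plain 
object with state and flower strings (that row shape and the name dedupeRows 
are my own assumptions, not something the payload provides):

```javascript
// Hypothetical helper: drop rows whose state/flower pair was already seen.
// Assumption: each row is a plain object like { state: "...", flower: "..." }.
function dedupeRows(rows) {
  const seen = new Set();
  return rows.filter((row) => {
    // Join the two values with a separator that cannot occur in a URI.
    const key = row.state + "\u0000" + row.flower;
    if (seen.has(key)) {
      return false; // duplicate pair, skip it
    }
    seen.add(key);
    return true; // first occurrence, keep it
  });
}

const rows = [
  { state: "California", flower: "California_Poppy" },
  { state: "California", flower: "California_Poppy" },
  { state: "Mississippi", flower: "Magnolia_Blossom" },
];
console.log(dedupeRows(rows).length); // prints 2
```

This keeps the first occurrence of each pair and preserves the original 
order of the remaining rows.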

What is an explanation for these structural irregularities?

Thanks, Terry


Terrence Brooks
Information School
University of Washington
Voice: 206 543-2646
Fax: 206 616-3152
E-mail: [email protected]
Web: http://faculty.washington.edu/tabrooks/


