Robert,
My guess is that it is a problem with parsing templates when they are in
property values, as you already seem to have found out.
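
To make the template issue concrete, here is a minimal Python sketch of what
expanding one micro-template inside a property value could look like. This is
only an illustration under my own assumptions: the names TEMPLATE,
expand_convert and expand_templates are hypothetical, it handles only
non-nested templates and only the inch-to-millimetre case of {{convert}}, and
it is not how the extraction framework actually works.

```python
import re

# Matches a non-nested template call such as {{convert|3.2|in|mm|abbr=on}}.
TEMPLATE = re.compile(r"\{\{([^{}]*)\}\}")


def expand_convert(match):
    """Expand {{convert|<value>|in|mm|...}}; leave every other template as-is."""
    parts = [p.strip() for p in match.group(1).split("|")]
    is_inch_to_mm = (
        len(parts) >= 4
        and parts[0].lower() == "convert"
        and parts[2:4] == ["in", "mm"]
    )
    if not is_inch_to_mm:
        return match.group(0)
    try:
        inches = float(parts[1])
    except ValueError:
        # Not a plain number: keep the raw template text.
        return match.group(0)
    return f"{inches:g} in ({inches * 25.4:.0f} mm)"


def expand_templates(raw_value):
    """Replace every supported template in a raw infobox value."""
    return TEMPLATE.sub(expand_convert, raw_value)


raw = "[[TFT LCD]], {{convert|3.2|in|mm|abbr=on}} diagonal"
print(expand_templates(raw))  # [[TFT LCD]], 3.2 in (81 mm) diagonal
```

Anything the sketch does not recognise is passed through untouched, so the
value stays a raw representation rather than silently losing content.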

About your initial question:

> Where is that "raw" Wikipedia infobox dataset?


I imagine that people decided to stop outputting templates within the
values, as they would look "broken" to the naked eye. I can already see the
kind of questions that would show up on the list saying that the extraction
is broken, when it would actually just be a raw representation of the
content in Wikipedia.

> Would be nice to have some kind of RegexMapping...


Yes, I think that is a great idea! Unless one of the core developers steps
up to say that this is a bad idea, or that it won't work for some reason, I
would encourage you to give it a try and share your results with the list.
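
As a starting point, here is a rough Python sketch of the RegexMapping idea
from your list: several regular expressions with named groups, followed by a
conditional clean-up step and a derived "dpi" value. Everything in it is an
assumption of mine for illustration (the PATTERNS list, the regex_mapping
function, the "long_side" key); none of it exists in the extraction
framework, and a real mapping would need to cover many more value formats.

```python
import math
import re

# Hypothetical patterns; each extracts named groups from a raw property value.
PATTERNS = [
    # Pixel dimensions, e.g. "320×480 px" or "800x480".
    re.compile(r"(?P<first>\d+)\s*[×x]\s*(?P<second>\d+)"),
    # Diagonal in inches, e.g. {{convert|3.2|in|mm|abbr=on}}.
    re.compile(r"\{\{convert\|(?P<diagonal_in>[\d.]+)\|in\b"),
]


def regex_mapping(raw_value):
    """Apply all patterns, then derive cleaned-up values from the groups."""
    values = {}
    for pattern in PATTERNS:
        match = pattern.search(raw_value)
        if match:
            values.update(match.groupdict())

    result = {}
    if "first" in values:
        first, second = int(values["first"]), int(values["second"])
        result["short_side"] = min(first, second)
        result["long_side"] = max(first, second)
        # Conditional clean-up: width smaller than height means portrait.
        result["orientation"] = "portrait" if first < second else "landscape"
    if "diagonal_in" in values and "short_side" in result:
        # dpi = diagonal length in pixels / diagonal length in inches.
        diagonal_px = math.hypot(result["short_side"], result["long_side"])
        result["dpi"] = round(diagonal_px / float(values["diagonal_in"]))
    return result


raw = ("[[TFT LCD]], {{convert|3.2|in|mm|abbr=on}} diagonal <br /> "
       "320×480 px HVGA <br /> 1.5:1 aspect-ratio wide-screen <br /> "
       "256K colors")
print(regex_mapping(raw))
# {'short_side': 320, 'long_side': 480, 'orientation': 'portrait', 'dpi': 180}
```

With values like these in separate properties, your original question
(devices with at least 800x480) would become a simple numeric filter.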

I think you've already gotten a hold of this, but just in case, all of the
code is available from here:
http://dbpedia.hg.sourceforge.net/hgweb/dbpedia/extraction_framework/

Cheers,
Pablo

On Thu, Nov 10, 2011 at 2:50 AM, Robert Siemer
<[email protected]> wrote:

> What I actually want to query:
> -Android devices with a display of at least 800x480
>
>
> For obvious reasons I reduced that to:
> -list all android devices with display information included
>
>
> After a couple of days I came up with this query:
>
> SELECT DISTINCT ?subject ?display {
> { ?subject <http://purl.org/dc/terms/subject>
> <http://dbpedia.org/resource/Category:Android_devices> . }
> UNION { ?subject a <http://dbpedia.org/class/yago/AndroidDevices> . }
> OPTIONAL { ?subject <http://dbpedia.org/property/display> ?display }
> }
>
>
> My problem:
>
> Where is that “raw” infobox dataset, which promises “complete coverage
> of all Wikipedia properties” with minimal clean-up? The downloadable
> infobox_properties file and the http://dbpedia.org/snorql/ sparql
> endpoint return only crap for the display property like: “4” or empty
> values! (Try the query yourself!)
>
> The live.dbpedia.org/sparql endpoint returns more, but still useless.
>
> I’m aware of the missing ontology mappings for the mobile phone infoboxes
> (http://en.wikipedia.org/wiki/Template:Infobox_mobile_phone), but:
> Shouldn't DBpedia Live import the raw values when there are no mappings?
> The Wikipedia template uses micro-templates like
> {{convert|2.1|in|mm|abbr=on}}. How does DBpedia handle those?
>
> How does IntermediateNodeMapping split the property string? By
> spaces alone? If so, how would it handle a value like this?
> | display = [[TFT LCD]], {{convert|3.2|in|mm|abbr=on}} diagonal <br />
> 320×480 px HVGA <br /> 1.5:1 aspect-ratio wide-screen <br /> 256K colors
>
> As far as I understand, CustomMappings are not implemented via
> MediaWiki, is that right? – Would be nice to have some kind of
> RegexMapping, with:
> 1) a regular expression retrieves one or more values (named groups)
> 2) multiple regular expressions can be given
> 3) values retrieved can be subject to some mathematical/conditional
> cleanup (e.g. if first_var < second_var then “short_side” = first_var;
> “orientation” = portrait)
> 3b) and some more examples: if xyGA = HVGA then “short_side” = 320
> 3c) and maybe some extra calculations: “dpi” = sqrt(...+...)/...
>
>
> So, how do I get that display info out of dbpedia at all?
> And how to improve the situation for easy retrieval of both display
> dimensions?
>
>
> Thanks,
> Robert
>
>
> ------------------------------------------------------------------------------
> RSA(R) Conference 2012
> Save $700 by Nov 18
> Register now
> http://p.sf.net/sfu/rsa-sfdev2dev1
> _______________________________________________
> Dbpedia-discussion mailing list
> [email protected]
> https://lists.sourceforge.net/lists/listinfo/dbpedia-discussion
>