Hi Krzysztof,

(adding the dev list in cc)
This has been on my to-do list for some time. The problem splits into two
cases:
a) inner templates that can be mapped from the mappings wiki
b) minor formatting templates

For (a):
At the moment we only extract information from top-level templates. What
we can do is change the mapping extractor so that it extracts information
from all templates that have a mapping.
I think this would work, but we have to test it and check for undesired
output.
The idea here is to map the inner templates to very high-level classes
(probably owl:Thing) in order not to break the typing mechanism.
Something in this direction (though not exactly) is the Authority control
mapping [1]. Although it is not nested, it is mapped to dbo:Agent in order
not to interfere with the possible main infobox template of its page,
which could be either a Person or an organization.
This could also solve some of the problems we still have in the extraction
of Commons.
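
To make the idea for (a) concrete, here is a rough Python sketch (not the
framework's Scala code) of what "extract from all templates that have a
mapping" could look like. The names find_templates, mapped_templates and
MAPPED_CLASSES, and the class ex:Road, are made up for illustration. The
sample is a shortened version of the template from the quoted message.
The scanner only tracks {{ and }} and ignores things like {{{parameter}}}
syntax.

```python
def find_templates(wikitext):
    """Return (name, depth) for every {{...}} template, nested ones included.

    Depth 0 is a top-level template. Templates are reported in the order
    in which they close, so inner templates come before their parent.
    """
    found = []
    open_starts = []  # offsets of the "{{" of templates that are still open
    i = 0
    while i < len(wikitext) - 1:
        pair = wikitext[i:i + 2]
        if pair == "{{":
            open_starts.append(i)
            i += 2
        elif pair == "}}" and open_starts:
            start = open_starts.pop()
            # The template name is everything before the first "|".
            name = wikitext[start + 2:i].split("|", 1)[0].strip()
            found.append((name, len(open_starts)))
            i += 2
        else:
            i += 1
    return found


# Hypothetical mapping table: the inner template goes to a very generic class
# so that it does not interfere with the type given by the main infobox.
MAPPED_CLASSES = {"Super infobox": "ex:Road", "ABC": "owl:Thing"}


def mapped_templates(wikitext, top_level_only):
    """List (template name, class) for the templates that have a mapping."""
    result = []
    for name, depth in find_templates(wikitext):
        if top_level_only and depth > 0:
            continue  # current behaviour: nested templates are skipped
        if name in MAPPED_CLASSES:
            result.append((name, MAPPED_CLASSES[name]))
    return result


SAMPLE = """{{Super infobox
 |type = DK
 |country = PL
{{Legend|red|Highway 5}}
|points =
{{ABC|ok|A|1|e=175}}
{{ABC|ok|K|91}}
}}"""
```

With top_level_only=True only "Super infobox" is returned, which mirrors
the current behaviour. With top_level_only=False the two ABC templates
are returned as well, while Legend is skipped because it has no mapping.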

In this case "templateProperty = 1" could be used, as well as
ConditionalMappings [2].
I think this would give a very big boost in extracted data, but I am not
sure about side effects.
Any opinions?
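
For the positional case, the sketch below (Python, illustrative only) shows
how unnamed parameters of an inner template could be keyed by position, so
that a key such as "1" has something to refer to. parse_parameters is a
made-up helper, not a function of the extraction framework, and it assumes
a flat template: values that themselves contain "|" (piped links, further
nested templates) are not handled.

```python
def parse_parameters(template_text):
    """Split a flat {{Name|a|b|key=value}} template into name and parameters.

    Unnamed parameters are keyed "1", "2", ... in order of appearance,
    named parameters are keyed by their name.
    """
    inner = template_text.strip()[2:-2]  # drop the surrounding {{ and }}
    name, *parts = inner.split("|")
    params = {}
    position = 0
    for part in parts:
        if "=" in part:
            key, value = part.split("=", 1)
            params[key.strip()] = value.strip()
        else:
            position += 1
            params[str(position)] = part.strip()
    return name.strip(), params


name, params = parse_parameters("{{ABC|ok|A|1|e=175}}")
```

Here name is "ABC" and params maps "1" to "ok", "2" to "A", "3" to "1"
and "e" to "175".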

For (b):
An easy option is to handle very common templates in the code. This is
what we did with Commons and some templates in English [3], but it does
not scale well.
Moving this functionality to the mappings wiki will be difficult because
each case is quite different, but everything is possible ;)
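
As a rough illustration of the code-level option for (b), the Python sketch
below unwraps formatting templates before the property values are parsed.
It is not the logic of TemplateTransformConfig [3]; the set
UNWRAP_FIRST_ARGUMENT and the function strip_formatting_templates are
assumptions made for this example, and every new formatting template would
need its own entry, which is exactly why this does not scale.

```python
import re

# Illustrative list of formatting templates whose first argument is the
# text to keep. Names are compared in lower case.
UNWRAP_FIRST_ARGUMENT = {"nowrap", "small"}

# Matches a template with at least one argument and no template inside it.
INNERMOST = re.compile(r"\{\{([^{}|]+)\|([^{}]*)\}\}")


def strip_formatting_templates(wikitext):
    """Replace known formatting templates by their first argument."""

    def unwrap(match):
        name = match.group(1).strip().lower()
        if name in UNWRAP_FIRST_ARGUMENT:
            return match.group(2).split("|", 1)[0]
        return match.group(0)  # unknown template: leave it untouched

    # Repeat until nothing changes, so nested formatting is unwrapped
    # from the inside out.
    previous = None
    while previous != wikitext:
        previous = wikitext
        wikitext = INNERMOST.sub(unwrap, wikitext)
    return wikitext


cleaned = strip_formatting_templates("| length = {{nowrap|{{small|120 km}}}}")
```

After this call cleaned is "| length = 120 km", while a template that is
not in the list, such as {{ABC|ok|A|1}}, is left as it is.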

Cheers,
Dimitris

[1] http://mappings.dbpedia.org/index.php/Mapping_en:Authority_control
[2] http://mappings.dbpedia.org/index.php/Template:ConditionalMapping
[3] https://github.com/jimkont/extraction-framework/blob/live_features/core/src/main/scala/org/dbpedia/extraction/config/transform/TemplateTransformConfig.scala#L76-115

On Wed, Sep 24, 2014 at 2:49 PM, Krzysztof Wecel <[email protected]>
wrote:

> Hi Dimitris,
>
> can you advise on possibilities of extraction of nested templates in
> DBpedia?
> Details in my e-mail below.
>
>
> Best regards,
> Krzysztof
>
>
>
> -------- Original Message --------
> Subject:        Re: DBpedia - sophisticated extraction
> Date:   Wed, 24 Sep 2014 12:35:29 +0200
> From:   Alexandru Todor <[email protected]>
> Reply-To:       [email protected]
> To:     Krzysztof Wecel <[email protected]>
> References:     <[email protected]>
>
>
>
> Hi Krzysztof,
>
> DBpedia can't handle nested templates, it's been an issue for years.
> Regarding the Commons extraction, you should ask this question on the
> mailing list, maybe Dimitris has more info on this issue.
>
> Cheers,
> Alexandru
>
>
> On 09/24/2014 07:24 AM, Krzysztof Wecel wrote:
> > Hi,
> >
> > I've quite a challenging template to extract. What I have found is that
> > you somehow managed to overcome license extraction problem from Commons
> > (mentioned by Dimitris), which looks similar to my problem.
> >
> > There are two issues:
> > 1. I have an embedded template
> > 2. The template is using positions of attributes, not names.
> >
> > For the second I assume one can use index instead of name:
> > "templateProperty = 1" (though it does not seem to work)
> >
> > Please let me know if the following is possible to extract using current
> > extraction framework:
> >
> > {{Super infobox
> >  |type                      = DK
> >  |country                     = PL
> > {{Legend|red|Highway 5}}
> > |points                    =
> > {{ABC|ok|A|1|e=175}}
> > {{ABC|ok|A|1|e=86}}
> > {{ABC|ok|K|91}}
> > {{XYZ|ok|WA|0|[[Oxygen]]|A|1|86}}
> > ...
> > }}
> >
> >
> >
> > Best regards,
> > Krzysztof
> >
>
>
>
> --
> Dr Krzysztof Wecel http://kie.ue.poznan.pl/en/member/krzysztof-wecel
> Department of Information Systems
> Poznan University of Economics, al. Niepodleglosci 10, 61-875 Poznan
> [email protected]    Tel:+48(61)854-3632  Fax:+48(61)854-3633
>
>


-- 
Dimitris Kontokostas
Department of Computer Science, University of Leipzig
Research Group: http://aksw.org
Homepage:http://aksw.org/DimitrisKontokostas
_______________________________________________
Dbpedia-developers mailing list
[email protected]
https://lists.sourceforge.net/lists/listinfo/dbpedia-developers