I have started writing the code for the extractor and I have a problem with a
regex in Scala. The string is:
http://fr.wikipedia.org/w/index.php?title=Mod%C3%A8le:Donn%C3%A9es/Antony/%C3%A9volution_population&action=edit
My regex is: val populationRegex = """|pop=(\d+)""".r
I use it in this piece of code:
populationRegex findAllIn page.children.toString foreach (_ match {
  case populationRegex(pop) => println(page.title.decoded + " : pop : " + param)
  case _ =>
})
Instead of getting "Données/Antony/évolution population : pop : 61793"
just once,
I get many copies of "Données/Antony/évolution population : pop : null", as
many as there are lines in the string.
Any idea what I am doing wrong?
I'm a total beginner in Scala :-( sorry.
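
One likely cause, offered as a sketch rather than a confirmed diagnosis: in a regex, an unescaped | is the alternation operator, so """|pop=(\d+)""" reads as "the empty string OR pop= followed by digits". The empty alternative can match at every position and leaves the capture group null, which would explain the repeated "pop : null". Escaping the pipe as \| makes it a literal character. The snippet above also binds pop but prints param; the sketch below assumes pop was intended. The object name, the helper populations, and the sample text are hypothetical and are not taken from the extractor code.

```scala
import scala.util.matching.Regex

object PopulationRegexSketch {
  // Escaping the pipe makes it a literal '|' instead of the alternation operator.
  val populationRegex: Regex = """\|pop=(\d+)""".r

  // Returns the captured digits of every "|pop=<digits>" occurrence in the text.
  def populations(text: String): List[String] =
    populationRegex.findAllMatchIn(text).map(_.group(1)).toList

  def main(args: Array[String]): Unit = {
    // Hypothetical sample text; the real template content may look different.
    val sample = "{{Some template\n|pop=61793\n}}"
    populations(sample).foreach(pop => println("pop : " + pop))
  }
}
```

Using findAllMatchIn gives direct access to the capture group of each match, so there is no need to match every result against the regex a second time.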
Best.
Julien.
2013/4/22 Jona Christopher Sahnwaldt <[email protected]>
> The templates where data is stored are not used directly in the main
> pages. It's a complicated process: page Toulouse uses template X, X uses Y,
> Y uses Z, and Z contains the data. Something like that; I'm not 100% sure, but
> the details don't matter. This means that wikiPageUsesTemplate and
> InfoboxExtractor won't help.
>
> Generating a separate file is probably the best idea. We could also send
> these new triples to the main mapping-based file, but that might be
> confusing: first, they're not mapping-based; second, new triples about a
> city would be added in a completely different place in the file. (That's
> not a big problem though.)
>
> Cheers,
> JC
>