Hi Julien,
This is the code for DBpedia Live. The English DBpedia uses an old branch
[1] but this code is improved and internationalized. For now it runs fine
for DBpedia Dutch (live.nl.dbpedia.org) and will be deployed for English
once I fix a few rough edges (statistics generation and a changeset
generation bug).
I didn't have time to write a step-by-step guide for this. However, I could
help you deploy this for your language (btw, what language do you have in
mind?) and maybe you can help me document all the needed steps.
Cheers,
Dimitris
[1] https://github.com/dbpedia/extraction-framework/tree/live
On Mon, Apr 15, 2013 at 4:06 PM, Julien Plu <
[email protected]> wrote:
> It was exactly that, thank you very much Andrea :-)
>
> Now I have another question. During the Maven build I saw a tool
> called "Build DBPedia Live extraction 3.8" or something like that. Is it
> the tool used to build a DBpedia Live version?
>
> If yes, where can I find some details on how to use it, and is it
> possible to use it for a single language only?
>
> Best.
>
> Julien.
>
>
>
>
> 2013/4/15 Andrea Di Menna <[email protected]>
>
>> Can you reach the above URL?
>> Are you behind a proxy which might block the connection? If this is the
>> case then you might want to uncomment the jvmArgs in the dump/pom.xml file
>> where the download launcher is defined.
>>
>> Cheers
>> Andrea
>>
>>
>> 2013/4/15 Julien Plu <[email protected]>
>>
>>> Yes, it was that, so I think the doc should be revised to add this
>>> detail. But now I have another problem when the tool tries to download the
>>> file. Here is the output:
>>>
>>> [INFO] launcher 'download' selected =>
>>> org.dbpedia.extraction.dump.download.Download
>>> done: 0 -
>>> todo: 2 - wiki=commons,locale=en,wiki=fr,locale=fr
>>> downloading 'http://dumps.wikimedia.org/commonswiki/' to
>>> '/home/jplu/extraction-framework/dump/dumps/commonswiki/index.html'
>>> 1 of 5 attempts to download 'http://dumps.wikimedia.org/commonswiki/'
>>> to '/home/jplu/extraction-framework/dump/dumps/commonswiki/index.html'
>>> failed - java.net.UnknownHostException: dumps.wikimedia.org
>>> downloading 'http://dumps.wikimedia.org/commonswiki/' to
>>> '/home/jplu/extraction-framework/dump/dumps/commonswiki/index.html'
>>> 2 of 5 attempts to download 'http://dumps.wikimedia.org/commonswiki/'
>>> to '/home/jplu/extraction-framework/dump/dumps/commonswiki/index.html'
>>> failed - java.net.UnknownHostException: dumps.wikimedia.org
>>> downloading 'http://dumps.wikimedia.org/commonswiki/' to
>>> '/home/jplu/extraction-framework/dump/dumps/commonswiki/index.html'
>>> 3 of 5 attempts to download 'http://dumps.wikimedia.org/commonswiki/'
>>> to '/home/jplu/extraction-framework/dump/dumps/commonswiki/index.html'
>>> failed - java.net.UnknownHostException: dumps.wikimedia.org
>>> downloading 'http://dumps.wikimedia.org/commonswiki/' to
>>> '/home/jplu/extraction-framework/dump/dumps/commonswiki/index.html'
>>> 4 of 5 attempts to download 'http://dumps.wikimedia.org/commonswiki/'
>>> to '/home/jplu/extraction-framework/dump/dumps/commonswiki/index.html'
>>> failed - java.net.UnknownHostException: dumps.wikimedia.org
>>> downloading 'http://dumps.wikimedia.org/commonswiki/' to
>>> '/home/jplu/extraction-framework/dump/dumps/commonswiki/index.html'
>>> 5 of 5 attempts to download 'http://dumps.wikimedia.org/commonswiki/'
>>> to '/home/jplu/extraction-framework/dump/dumps/commonswiki/index.html'
>>> failed - java.net.UnknownHostException: dumps.wikimedia.org
>>> java.lang.reflect.InvocationTargetException
>>> at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>>> at
>>> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
>>> at
>>> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>>> at java.lang.reflect.Method.invoke(Method.java:601)
>>> at
>>> org_scala_tools_maven_executions.MainHelper.runMain(MainHelper.java:161)
>>> at
>>> org_scala_tools_maven_executions.MainWithArgsInFile.main(MainWithArgsInFile.java:26)
>>> Caused by: java.net.UnknownHostException: dumps.wikimedia.org
>>> at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
>>> at
>>> sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:57)
>>> at
>>> sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
>>> at java.lang.reflect.Constructor.newInstance(Constructor.java:525)
>>> at
>>> sun.net.www.protocol.http.HttpURLConnection$6.run(HttpURLConnection.java:1674)
>>> at
>>> sun.net.www.protocol.http.HttpURLConnection$6.run(HttpURLConnection.java:1672)
>>> at java.security.AccessController.doPrivileged(Native Method)
>>> at
>>> sun.net.www.protocol.http.HttpURLConnection.getChainedException(HttpURLConnection.java:1670)
>>> at
>>> sun.net.www.protocol.http.HttpURLConnection.getInputStream(HttpURLConnection.java:1243)
>>> at
>>> org.dbpedia.extraction.dump.download.FileDownloader$class.inputStream(FileDownloader.scala:65)
>>> at org.dbpedia.extraction.dump.download.Download$Downloader$1.org$dbpedia$extraction$dump$download$Counter$$super$inputStream(Download.scala:29)
>>> at
>>> org.dbpedia.extraction.dump.download.Counter$class.inputStream(Counter.scala:23)
>>> at
>>> org.dbpedia.extraction.dump.download.Download$Downloader$1.inputStream(Download.scala:29)
>>> at
>>> org.dbpedia.extraction.dump.download.FileDownloader$class.downloadFile(FileDownloader.scala:49)
>>> at org.dbpedia.extraction.dump.download.Download$Downloader$1.org$dbpedia$extraction$dump$download$LastModified$$super$downloadFile(Download.scala:29)
>>> at
>>> org.dbpedia.extraction.dump.download.LastModified$class.downloadFile(LastModified.scala:21)
>>> at
>>> org.dbpedia.extraction.dump.download.Download$Downloader$1.downloadFile(Download.scala:29)
>>> at
>>> org.dbpedia.extraction.dump.download.FileDownloader$class.downloadFile(FileDownloader.scala:36)
>>> at org.dbpedia.extraction.dump.download.Download$Downloader$1.org$dbpedia$extraction$dump$download$Retry$$super$downloadFile(Download.scala:29)
>>> at
>>> org.dbpedia.extraction.dump.download.Retry$class.downloadFile(Retry.scala:28)
>>> at
>>> org.dbpedia.extraction.dump.download.Download$Downloader$1.downloadFile(Download.scala:29)
>>> at
>>> org.dbpedia.extraction.dump.download.FileDownloader$class.downloadTo(FileDownloader.scala:26)
>>> at
>>> org.dbpedia.extraction.dump.download.Download$Downloader$1.downloadTo(Download.scala:29)
>>> at
>>> org.dbpedia.extraction.dump.download.LanguageDownloader.downloadDates(LanguageDownloader.scala:35)
>>> at
>>> org.dbpedia.extraction.dump.download.Download$$anonfun$main$3.apply(Download.scala:67)
>>> at
>>> org.dbpedia.extraction.dump.download.Download$$anonfun$main$3.apply(Download.scala:62)
>>> at
>>> scala.collection.immutable.TreeSet$$anonfun$foreach$1.apply(TreeSet.scala:114)
>>> at
>>> scala.collection.immutable.TreeSet$$anonfun$foreach$1.apply(TreeSet.scala:114)
>>> at
>>> scala.collection.immutable.RedBlack$NonEmpty.foreach(RedBlack.scala:164)
>>> at
>>> scala.collection.immutable.RedBlack$NonEmpty.foreach(RedBlack.scala:163)
>>> at scala.collection.immutable.TreeSet.foreach(TreeSet.scala:114)
>>> at
>>> org.dbpedia.extraction.dump.download.Download$.main(Download.scala:62)
>>> at org.dbpedia.extraction.dump.download.Download.main(Download.scala)
>>> ... 6 more
>>> Caused by: java.net.UnknownHostException: dumps.wikimedia.org
>>> at
>>> java.net.AbstractPlainSocketImpl.connect(AbstractPlainSocketImpl.java:178)
>>> at java.net.SocksSocketImpl.connect(SocksSocketImpl.java:391)
>>> at java.net.Socket.connect(Socket.java:579)
>>> at java.net.Socket.connect(Socket.java:528)
>>> at sun.net.NetworkClient.doConnect(NetworkClient.java:180)
>>> at sun.net.www.http.HttpClient.openServer(HttpClient.java:378)
>>> at sun.net.www.http.HttpClient.openServer(HttpClient.java:473)
>>> at sun.net.www.http.HttpClient.<init>(HttpClient.java:203)
>>> at sun.net.www.http.HttpClient.New(HttpClient.java:290)
>>> at sun.net.www.http.HttpClient.New(HttpClient.java:306)
>>> at
>>> sun.net.www.protocol.http.HttpURLConnection.getNewHttpClient(HttpURLConnection.java:995)
>>> at
>>> sun.net.www.protocol.http.HttpURLConnection.plainConnect(HttpURLConnection.java:931)
>>> at
>>> sun.net.www.protocol.http.HttpURLConnection.connect(HttpURLConnection.java:849)
>>> at
>>> sun.net.www.protocol.http.HttpURLConnection.getInputStream(HttpURLConnection.java:1299)
>>> at
>>> sun.net.www.protocol.http.HttpURLConnection.getHeaderField(HttpURLConnection.java:2677)
>>> at
>>> java.net.HttpURLConnection.getHeaderFieldDate(HttpURLConnection.java:539)
>>> at java.net.URLConnection.getLastModified(URLConnection.java:569)
>>> at
>>> org.dbpedia.extraction.dump.download.LastModified$class.downloadFile(LastModified.scala:17)
>>> ... 23 more
>>>
>>> Apparently the tool can't find the website.
>>>
>>> Best.
>>>
>>> Julien.
>>>
>>>
>>> 2013/4/15 Andrea Di Menna <[email protected]>
>>>
>>>> Hi Julien,
>>>> If I am not mistaken, you need to prefix the config file name with the
>>>> "config" keyword, i.e. config=download.minimal.properties
>>>>
>>>> Can you check? In that case we should review the documentation.
>>>>
>>>> Regards
>>>> Andrea
>>>> On 15 Apr 2013 at 13:46, "Julien Plu" <
>>>> [email protected]> wrote:
>>>>
>>>>> Hi,
>>>>>
>>>>> I am trying to use the DBpedia extraction framework as explained here:
>>>>> https://github.com/dbpedia/extraction-framework/wiki/Dbpedia-Abstract-Extraction--Step-by-step
>>>>>
>>>>> I get an error when the tool tries to download the dump: "invalid
>>>>> argument 'download.minimal.properties'". I don't see what is wrong in
>>>>> this file, so here is a copy of it:
>>>>>
>>>>>
>>>>> # NOTE: format is not java.util.Properties, but
>>>>> # org.dbpedia.extraction.dump.download.DownloadConfig
>>>>>
>>>>> # Default download server. It lists mirrors which may be faster.
>>>>> base-url=http://dumps.wikimedia.org/
>>>>>
>>>>> # Replace by your target folder.
>>>>> base-dir=/home/jplu/extraction-framework/dump/dumps
>>>>>
>>>>> # Replace xx by your language.
>>>>> download=fr:pages-articles.xml.bz2
>>>>>
>>>>> # Only needed for the ImageExtractor
>>>>> download=commons:pages-articles.xml.bz2
>>>>>
>>>>> # Unzip files while downloading? Not necessary, extraction will unzip
>>>>> # on the fly. Let's save space.
>>>>> unzip=false
>>>>>
>>>>> # Sometimes connecting to the server fails, so we try five times with
>>>>> # pauses of 10 seconds.
>>>>> retry-max=5
>>>>> retry-millis=10000
>>>>>
>>>>> I only want to download the fr version of the dump along with the
>>>>> commons version. What did I do wrong?
>>>>>
>>>>> Thanks.
>>>>>
>>>>> Best.
>>>>>
>>>>> Julien.
>>>>>
>>>>>
>>>>> ------------------------------------------------------------------------------
>>>>> Precog is a next-generation analytics platform capable of advanced
>>>>> analytics on semi-structured data. The platform includes APIs for
>>>>> building
>>>>> apps and a phenomenal toolset for data science. Developers can use
>>>>> our toolset for easy data analysis & visualization. Get a free account!
>>>>> http://www2.precog.com/precogplatform/slashdotnewsletter
>>>>> _______________________________________________
>>>>> Dbpedia-discussion mailing list
>>>>> [email protected]
>>>>> https://lists.sourceforge.net/lists/listinfo/dbpedia-discussion
>>>>>
>>>>>
>>>
>>
>
--
Kontokostas Dimitris