@Andrea: Thanks for this update, it's a bit clearer now :-)
@Dimitris: I have the French version in mind, and I don't know whether
Julien Cojan has enough time to set it up. Since I have time, I can take
care of this project, and of course I will help you write step-by-step
documentation on "How to deploy a live version for a language".
@Julien: What do you think of my proposal?
Best.
Julien.
2013/4/15 Andrea Di Menna <[email protected]>
> For your reference, I have edited a bit the Abstract Extraction
> step-by-step guide
>
>
> https://github.com/dbpedia/extraction-framework/wiki/Dbpedia-Abstract-Extraction-step-by-step-guide
>
> Cheers
> Andrea
>
>
> 2013/4/15 Dimitris Kontokostas <[email protected]>
>
>> Hi Julien,
>>
>> This is the code for DBpedia Live. The English DBpedia uses an old branch
>> [1] but this code is improved and internationalized. For now it runs fine
>> for DBpedia Dutch (live.nl.dbpedia.org) and will be deployed for English
>> once I fix a few minor rough edges (Statistics generation & a changeset
>> generation bug).
>>
>> I didn't have time to write a step-by-step guide for this. However, I
>> could help you deploy this for your language (btw, what language do you
>> have in mind?) and maybe you can help me document all the needed steps.
>>
>> Cheers,
>> Dimitris
>>
>>
>> [1] https://github.com/dbpedia/extraction-framework/tree/live
>>
>>
>> On Mon, Apr 15, 2013 at 4:06 PM, Julien Plu <
>> [email protected]> wrote:
>>
>>> It was exactly that, thank you very much Andrea :-)
>>>
>>> Now I have another question: during the Maven build I saw a module
>>> called "Build DBPedia Live extraction 3.8" or something like that. Is it
>>> the tool used to build a DBpedia Live version?
>>>
>>> If so, where can I find details on how to use it, and is it possible
>>> to use it for a single language only?
>>>
>>> Best.
>>>
>>> Julien.
>>>
>>>
>>>
>>>
>>> 2013/4/15 Andrea Di Menna <[email protected]>
>>>
>>>> Can you reach the above URL?
>>>> Are you behind a proxy which might block the connection? If this is the
>>>> case then you might want to uncomment the jvmArgs in the dump/pom.xml file
>>>> where the download launcher is defined.
>>>>
>>>> Cheers
>>>> Andrea
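For anyone hitting the same UnknownHostException, here is a minimal sketch (in Python, not part of the extraction framework) of the check suggested above: does the host name resolve at all, and are there signs of a proxy on the machine? The function names `can_resolve` and `proxy_settings` are made up for this illustration. Note that a JVM does not read the `http_proxy` style environment variables by default; it takes proxy settings from system properties such as `http.proxyHost` and `http.proxyPort`, which is presumably what the commented-out jvmArgs are for.

```python
import os
import socket


def can_resolve(host, resolver=socket.gethostbyname):
    """Return True if `host` resolves to an address, False on a lookup failure."""
    try:
        resolver(host)
        return True
    except OSError:  # socket.gaierror is a subclass of OSError
        return False


def proxy_settings(environ=os.environ):
    """Return the proxy-related environment variables that are set and non-empty."""
    names = ("http_proxy", "https_proxy", "HTTP_PROXY", "HTTPS_PROXY")
    return {name: environ[name] for name in names if environ.get(name)}
```

Calling `can_resolve("dumps.wikimedia.org")` on the affected machine and looking at `proxy_settings()` should tell you whether the problem is DNS or proxy related rather than something in the framework itself.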
>>>>
>>>>
>>>> 2013/4/15 Julien Plu <[email protected]>
>>>>
>>>>> Yes, that was it, so I think the doc should be revised to add this
>>>>> detail. But now I have another problem when the tool tries to download
>>>>> the file. Here is the output:
>>>>>
>>>>> [INFO] launcher 'download' selected => org.dbpedia.extraction.dump.download.Download
>>>>> done: 0 -
>>>>> todo: 2 - wiki=commons,locale=en,wiki=fr,locale=fr
>>>>> downloading 'http://dumps.wikimedia.org/commonswiki/' to '/home/jplu/extraction-framework/dump/dumps/commonswiki/index.html'
>>>>> 1 of 5 attempts to download 'http://dumps.wikimedia.org/commonswiki/' to '/home/jplu/extraction-framework/dump/dumps/commonswiki/index.html' failed - java.net.UnknownHostException: dumps.wikimedia.org
>>>>> downloading 'http://dumps.wikimedia.org/commonswiki/' to '/home/jplu/extraction-framework/dump/dumps/commonswiki/index.html'
>>>>> 2 of 5 attempts to download 'http://dumps.wikimedia.org/commonswiki/' to '/home/jplu/extraction-framework/dump/dumps/commonswiki/index.html' failed - java.net.UnknownHostException: dumps.wikimedia.org
>>>>> downloading 'http://dumps.wikimedia.org/commonswiki/' to '/home/jplu/extraction-framework/dump/dumps/commonswiki/index.html'
>>>>> 3 of 5 attempts to download 'http://dumps.wikimedia.org/commonswiki/' to '/home/jplu/extraction-framework/dump/dumps/commonswiki/index.html' failed - java.net.UnknownHostException: dumps.wikimedia.org
>>>>> downloading 'http://dumps.wikimedia.org/commonswiki/' to '/home/jplu/extraction-framework/dump/dumps/commonswiki/index.html'
>>>>> 4 of 5 attempts to download 'http://dumps.wikimedia.org/commonswiki/' to '/home/jplu/extraction-framework/dump/dumps/commonswiki/index.html' failed - java.net.UnknownHostException: dumps.wikimedia.org
>>>>> downloading 'http://dumps.wikimedia.org/commonswiki/' to '/home/jplu/extraction-framework/dump/dumps/commonswiki/index.html'
>>>>> 5 of 5 attempts to download 'http://dumps.wikimedia.org/commonswiki/' to '/home/jplu/extraction-framework/dump/dumps/commonswiki/index.html' failed - java.net.UnknownHostException: dumps.wikimedia.org
>>>>> java.lang.reflect.InvocationTargetException
>>>>> at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>>>>> at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
>>>>> at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>>>>> at java.lang.reflect.Method.invoke(Method.java:601)
>>>>> at org_scala_tools_maven_executions.MainHelper.runMain(MainHelper.java:161)
>>>>> at org_scala_tools_maven_executions.MainWithArgsInFile.main(MainWithArgsInFile.java:26)
>>>>> Caused by: java.net.UnknownHostException: dumps.wikimedia.org
>>>>> at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
>>>>> at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:57)
>>>>> at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
>>>>> at java.lang.reflect.Constructor.newInstance(Constructor.java:525)
>>>>> at sun.net.www.protocol.http.HttpURLConnection$6.run(HttpURLConnection.java:1674)
>>>>> at sun.net.www.protocol.http.HttpURLConnection$6.run(HttpURLConnection.java:1672)
>>>>> at java.security.AccessController.doPrivileged(Native Method)
>>>>> at sun.net.www.protocol.http.HttpURLConnection.getChainedException(HttpURLConnection.java:1670)
>>>>> at sun.net.www.protocol.http.HttpURLConnection.getInputStream(HttpURLConnection.java:1243)
>>>>> at org.dbpedia.extraction.dump.download.FileDownloader$class.inputStream(FileDownloader.scala:65)
>>>>> at org.dbpedia.extraction.dump.download.Download$Downloader$1.org$dbpedia$extraction$dump$download$Counter$$super$inputStream(Download.scala:29)
>>>>> at org.dbpedia.extraction.dump.download.Counter$class.inputStream(Counter.scala:23)
>>>>> at org.dbpedia.extraction.dump.download.Download$Downloader$1.inputStream(Download.scala:29)
>>>>> at org.dbpedia.extraction.dump.download.FileDownloader$class.downloadFile(FileDownloader.scala:49)
>>>>> at org.dbpedia.extraction.dump.download.Download$Downloader$1.org$dbpedia$extraction$dump$download$LastModified$$super$downloadFile(Download.scala:29)
>>>>> at org.dbpedia.extraction.dump.download.LastModified$class.downloadFile(LastModified.scala:21)
>>>>> at org.dbpedia.extraction.dump.download.Download$Downloader$1.downloadFile(Download.scala:29)
>>>>> at org.dbpedia.extraction.dump.download.FileDownloader$class.downloadFile(FileDownloader.scala:36)
>>>>> at org.dbpedia.extraction.dump.download.Download$Downloader$1.org$dbpedia$extraction$dump$download$Retry$$super$downloadFile(Download.scala:29)
>>>>> at org.dbpedia.extraction.dump.download.Retry$class.downloadFile(Retry.scala:28)
>>>>> at org.dbpedia.extraction.dump.download.Download$Downloader$1.downloadFile(Download.scala:29)
>>>>> at org.dbpedia.extraction.dump.download.FileDownloader$class.downloadTo(FileDownloader.scala:26)
>>>>> at org.dbpedia.extraction.dump.download.Download$Downloader$1.downloadTo(Download.scala:29)
>>>>> at org.dbpedia.extraction.dump.download.LanguageDownloader.downloadDates(LanguageDownloader.scala:35)
>>>>> at org.dbpedia.extraction.dump.download.Download$$anonfun$main$3.apply(Download.scala:67)
>>>>> at org.dbpedia.extraction.dump.download.Download$$anonfun$main$3.apply(Download.scala:62)
>>>>> at scala.collection.immutable.TreeSet$$anonfun$foreach$1.apply(TreeSet.scala:114)
>>>>> at scala.collection.immutable.TreeSet$$anonfun$foreach$1.apply(TreeSet.scala:114)
>>>>> at scala.collection.immutable.RedBlack$NonEmpty.foreach(RedBlack.scala:164)
>>>>> at scala.collection.immutable.RedBlack$NonEmpty.foreach(RedBlack.scala:163)
>>>>> at scala.collection.immutable.TreeSet.foreach(TreeSet.scala:114)
>>>>> at org.dbpedia.extraction.dump.download.Download$.main(Download.scala:62)
>>>>> at org.dbpedia.extraction.dump.download.Download.main(Download.scala)
>>>>> ... 6 more
>>>>> Caused by: java.net.UnknownHostException: dumps.wikimedia.org
>>>>> at java.net.AbstractPlainSocketImpl.connect(AbstractPlainSocketImpl.java:178)
>>>>> at java.net.SocksSocketImpl.connect(SocksSocketImpl.java:391)
>>>>> at java.net.Socket.connect(Socket.java:579)
>>>>> at java.net.Socket.connect(Socket.java:528)
>>>>> at sun.net.NetworkClient.doConnect(NetworkClient.java:180)
>>>>> at sun.net.www.http.HttpClient.openServer(HttpClient.java:378)
>>>>> at sun.net.www.http.HttpClient.openServer(HttpClient.java:473)
>>>>> at sun.net.www.http.HttpClient.<init>(HttpClient.java:203)
>>>>> at sun.net.www.http.HttpClient.New(HttpClient.java:290)
>>>>> at sun.net.www.http.HttpClient.New(HttpClient.java:306)
>>>>> at sun.net.www.protocol.http.HttpURLConnection.getNewHttpClient(HttpURLConnection.java:995)
>>>>> at sun.net.www.protocol.http.HttpURLConnection.plainConnect(HttpURLConnection.java:931)
>>>>> at sun.net.www.protocol.http.HttpURLConnection.connect(HttpURLConnection.java:849)
>>>>> at sun.net.www.protocol.http.HttpURLConnection.getInputStream(HttpURLConnection.java:1299)
>>>>> at sun.net.www.protocol.http.HttpURLConnection.getHeaderField(HttpURLConnection.java:2677)
>>>>> at java.net.HttpURLConnection.getHeaderFieldDate(HttpURLConnection.java:539)
>>>>> at java.net.URLConnection.getLastModified(URLConnection.java:569)
>>>>> at org.dbpedia.extraction.dump.download.LastModified$class.downloadFile(LastModified.scala:17)
>>>>> ... 23 more
>>>>>
>>>>> Apparently the tool can't find the website.
>>>>>
>>>>> Best.
>>>>>
>>>>> Julien.
>>>>>
>>>>>
>>>>> 2013/4/15 Andrea Di Menna <[email protected]>
>>>>>
>>>>>> Hi Julien,
>>>>>> If I am not mistaken, you need to prefix the config file name with the
>>>>>> "config" keyword, i.e. config=download.minimal.properties
>>>>>>
>>>>>> Can you check? If that is the case, we should review the documentation.
>>>>>>
>>>>>> Regards
>>>>>> Andrea
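To illustrate why a bare file name can be rejected as an "invalid argument", here is a minimal sketch in Python. It is not the framework's actual DownloadConfig code; it only assumes that every argument must have the form key=value, that `config=<file>` pulls in the lines of that file, and that repeated keys (such as several `download=` lines) are all kept. `parse_args` and the `read_lines` callable are hypothetical names for this example.

```python
def parse_args(args, read_lines):
    """Parse `key=value` arguments; `config=<file>` expands to that file's lines.

    `read_lines` is a hypothetical callable that returns the lines of a file.
    Repeated keys are collected in order instead of overwriting each other.
    """
    settings = {}
    for arg in args:
        arg = arg.strip()
        if not arg or arg.startswith("#"):
            continue  # skip blank lines and comments
        key, sep, value = arg.partition("=")
        if not sep:
            # No '=' at all, e.g. a bare file name.
            raise ValueError(f"invalid argument '{arg}'")
        if key == "config":
            nested = parse_args(read_lines(value), read_lines)
            for name, values in nested.items():
                settings.setdefault(name, []).extend(values)
        else:
            settings.setdefault(key, []).append(value)
    return settings
```

Under these assumptions, passing `download.minimal.properties` alone raises the error, while `config=download.minimal.properties` loads the file and yields both `download` entries.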
>>>>>> On 15 Apr 2013 13:46, "Julien Plu" <
>>>>>> [email protected]> wrote:
>>>>>>
>>>>>>> Hi,
>>>>>>>
>>>>>>> I am trying to use the DBpedia extraction framework as explained here:
>>>>>>> https://github.com/dbpedia/extraction-framework/wiki/Dbpedia-Abstract-Extraction--Step-by-step
>>>>>>>
>>>>>>> And I get an error when the tool tries to download the dump:
>>>>>>> "invalid argument 'download.minimal.properties'". I don't see what
>>>>>>> is wrong in this file, so here is a copy of it:
>>>>>>>
>>>>>>>
>>>>>>> # NOTE: format is not java.util.Properties, but org.dbpedia.extraction.dump.download.DownloadConfig
>>>>>>>
>>>>>>> # Default download server. It lists mirrors which may be faster.
>>>>>>> base-url=http://dumps.wikimedia.org/
>>>>>>>
>>>>>>> # Replace by your target folder.
>>>>>>> base-dir=/home/jplu/extraction-framework/dump/dumps
>>>>>>>
>>>>>>> # Replace xx by your language.
>>>>>>> download=fr:pages-articles.xml.bz2
>>>>>>>
>>>>>>> # Only needed for the ImageExtractor
>>>>>>> download=commons:pages-articles.xml.bz2
>>>>>>>
>>>>>>> # Unzip files while downloading? Not necessary, extraction will unzip on the fly. Let's save space.
>>>>>>> unzip=false
>>>>>>>
>>>>>>> # Sometimes connecting to the server fails, so we try five times with pauses of 10 seconds.
>>>>>>> retry-max=5
>>>>>>> retry-millis=10000
>>>>>>>
>>>>>>> I only want to download the fr version of the dump along with the
>>>>>>> commons version. What did I do wrong?
>>>>>>>
>>>>>>> Thanks.
>>>>>>>
>>>>>>> Best.
>>>>>>>
>>>>>>> Julien.
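The `retry-max=5` and `retry-millis=10000` settings in the config describe a simple retry loop: try up to five times and pause ten seconds between failed attempts. A minimal sketch in Python, not taken from the framework, could look like this. `download_with_retry` and its `fetch` and `sleep` parameters are hypothetical names, and the printed message only loosely imitates the "N of 5 attempts" lines in the tool's output.

```python
import time


def download_with_retry(fetch, retry_max=5, retry_millis=10000, sleep=time.sleep):
    """Call `fetch()` up to `retry_max` times, pausing between failed attempts."""
    for attempt in range(1, retry_max + 1):
        try:
            return fetch()
        except OSError as error:
            print(f"{attempt} of {retry_max} attempts failed - {error}")
            if attempt == retry_max:
                raise  # give up after the last attempt
            sleep(retry_millis / 1000.0)
```

Retrying helps with transient network failures only. An unresolvable host name fails the same way on every attempt, so the loop simply ends by re-raising the last error.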
>>>>>>>
>>>>>>>
>>>>>>> ------------------------------------------------------------------------------
>>>>>>> Precog is a next-generation analytics platform capable of advanced
>>>>>>> analytics on semi-structured data. The platform includes APIs for
>>>>>>> building
>>>>>>> apps and a phenomenal toolset for data science. Developers can use
>>>>>>> our toolset for easy data analysis & visualization. Get a free
>>>>>>> account!
>>>>>>> http://www2.precog.com/precogplatform/slashdotnewsletter
>>>>>>> _______________________________________________
>>>>>>> Dbpedia-discussion mailing list
>>>>>>> [email protected]
>>>>>>> https://lists.sourceforge.net/lists/listinfo/dbpedia-discussion
>>>>>>>
>>>>>>>
>>>>>
>>>>
>>>
>>>
>>>
>>>
>>
>>
>> --
>> Kontokostas Dimitris
>>
>
>