For your reference, I have edited the Abstract Extraction step-by-step
guide a bit:

https://github.com/dbpedia/extraction-framework/wiki/Dbpedia-Abstract-Extraction-step-by-step-guide

Cheers
Andrea


2013/4/15 Dimitris Kontokostas <jimk...@gmail.com>

> Hi Julien,
>
> This is the code for DBpedia Live. The English DBpedia uses an old branch
> [1] but this code is improved and internationalized. For now it runs fine
> for DBpedia Dutch (live.nl.dbpedia.org) and will be deployed for English
> once I fix a few minor rough edges (Statistics generation & a changeset
> generation bug).
>
> I didn't have time to write a step-by-step guide for this. However, I
> could help you deploy this for your language (btw, what language do you
> have in mind?) and maybe you can help me document all the needed steps.
>
> Cheers,
> Dimitris
>
>
> [1] https://github.com/dbpedia/extraction-framework/tree/live
>
>
> On Mon, Apr 15, 2013 at 4:06 PM, Julien Plu <
> julien....@redaction-developpez.com> wrote:
>
>> It was exactly that, thank you very much Andrea :-)
>>
>> Now I have another question: during the Maven build I saw a tool
>> called "Build DBPedia Live extraction 3.8" or something like that. Is it
>> the tool used to build a DBpedia Live version?
>>
>> If so, where can I find details on how to use it, and is it possible
>> to use it for a single language only?
>>
>> Best.
>>
>> Julien.
>>
>>
>>
>>
>> 2013/4/15 Andrea Di Menna <ninn...@gmail.com>
>>
>>> Can you reach the above URL?
>>> Are you behind a proxy which might block the connection? If this is the
>>> case then you might want to uncomment the jvmArgs in the dump/pom.xml file
>>> where the download launcher is defined.
>>>
>>> Cheers
>>> Andrea
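
A minimal sketch of the proxy advice above, assuming the launcher's JVM is given the standard Java networking properties http.proxyHost and http.proxyPort. The class name ProxyArgs, the host proxy.example.org and the port 8080 are placeholders, not values from this thread, and the jvmArgs actually present in dump/pom.xml may differ.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical helper: builds the -D flags that route a JVM's HTTP traffic through a proxy.
final class ProxyArgs {

    // http.proxyHost and http.proxyPort are standard JVM networking properties.
    static List<String> jvmArgs(String host, int port) {
        return Arrays.asList("-Dhttp.proxyHost=" + host, "-Dhttp.proxyPort=" + port);
    }

    public static void main(String[] args) {
        // Placeholder proxy address (assumption); replace it with your own.
        for (String arg : jvmArgs("proxy.example.org", 8080)) {
            System.out.println(arg);
        }
    }
}
```

Each printed flag would correspond to one jvmArg entry of the download launcher.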
>>>
>>>
>>> 2013/4/15 Julien Plu <julien....@redaction-developpez.com>
>>>
>>>> Yes, it was that, so I think the documentation should be revised to add
>>>> this detail. But now I have another problem when the tool tries to
>>>> download the file. Here is the output:
>>>>
>>>> [INFO] launcher 'download' selected => org.dbpedia.extraction.dump.download.Download
>>>> done: 0 -
>>>> todo: 2 - wiki=commons,locale=en,wiki=fr,locale=fr
>>>> downloading 'http://dumps.wikimedia.org/commonswiki/' to '/home/jplu/extraction-framework/dump/dumps/commonswiki/index.html'
>>>> 1 of 5 attempts to download 'http://dumps.wikimedia.org/commonswiki/' to '/home/jplu/extraction-framework/dump/dumps/commonswiki/index.html' failed - java.net.UnknownHostException: dumps.wikimedia.org
>>>> downloading 'http://dumps.wikimedia.org/commonswiki/' to '/home/jplu/extraction-framework/dump/dumps/commonswiki/index.html'
>>>> 2 of 5 attempts to download 'http://dumps.wikimedia.org/commonswiki/' to '/home/jplu/extraction-framework/dump/dumps/commonswiki/index.html' failed - java.net.UnknownHostException: dumps.wikimedia.org
>>>> downloading 'http://dumps.wikimedia.org/commonswiki/' to '/home/jplu/extraction-framework/dump/dumps/commonswiki/index.html'
>>>> 3 of 5 attempts to download 'http://dumps.wikimedia.org/commonswiki/' to '/home/jplu/extraction-framework/dump/dumps/commonswiki/index.html' failed - java.net.UnknownHostException: dumps.wikimedia.org
>>>> downloading 'http://dumps.wikimedia.org/commonswiki/' to '/home/jplu/extraction-framework/dump/dumps/commonswiki/index.html'
>>>> 4 of 5 attempts to download 'http://dumps.wikimedia.org/commonswiki/' to '/home/jplu/extraction-framework/dump/dumps/commonswiki/index.html' failed - java.net.UnknownHostException: dumps.wikimedia.org
>>>> downloading 'http://dumps.wikimedia.org/commonswiki/' to '/home/jplu/extraction-framework/dump/dumps/commonswiki/index.html'
>>>> 5 of 5 attempts to download 'http://dumps.wikimedia.org/commonswiki/' to '/home/jplu/extraction-framework/dump/dumps/commonswiki/index.html' failed - java.net.UnknownHostException: dumps.wikimedia.org
>>>> java.lang.reflect.InvocationTargetException
>>>>     at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>>>>     at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
>>>>     at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>>>>     at java.lang.reflect.Method.invoke(Method.java:601)
>>>>     at org_scala_tools_maven_executions.MainHelper.runMain(MainHelper.java:161)
>>>>     at org_scala_tools_maven_executions.MainWithArgsInFile.main(MainWithArgsInFile.java:26)
>>>> Caused by: java.net.UnknownHostException: dumps.wikimedia.org
>>>>     at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
>>>>     at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:57)
>>>>     at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
>>>>     at java.lang.reflect.Constructor.newInstance(Constructor.java:525)
>>>>     at sun.net.www.protocol.http.HttpURLConnection$6.run(HttpURLConnection.java:1674)
>>>>     at sun.net.www.protocol.http.HttpURLConnection$6.run(HttpURLConnection.java:1672)
>>>>     at java.security.AccessController.doPrivileged(Native Method)
>>>>     at sun.net.www.protocol.http.HttpURLConnection.getChainedException(HttpURLConnection.java:1670)
>>>>     at sun.net.www.protocol.http.HttpURLConnection.getInputStream(HttpURLConnection.java:1243)
>>>>     at org.dbpedia.extraction.dump.download.FileDownloader$class.inputStream(FileDownloader.scala:65)
>>>>     at org.dbpedia.extraction.dump.download.Download$Downloader$1.org$dbpedia$extraction$dump$download$Counter$$super$inputStream(Download.scala:29)
>>>>     at org.dbpedia.extraction.dump.download.Counter$class.inputStream(Counter.scala:23)
>>>>     at org.dbpedia.extraction.dump.download.Download$Downloader$1.inputStream(Download.scala:29)
>>>>     at org.dbpedia.extraction.dump.download.FileDownloader$class.downloadFile(FileDownloader.scala:49)
>>>>     at org.dbpedia.extraction.dump.download.Download$Downloader$1.org$dbpedia$extraction$dump$download$LastModified$$super$downloadFile(Download.scala:29)
>>>>     at org.dbpedia.extraction.dump.download.LastModified$class.downloadFile(LastModified.scala:21)
>>>>     at org.dbpedia.extraction.dump.download.Download$Downloader$1.downloadFile(Download.scala:29)
>>>>     at org.dbpedia.extraction.dump.download.FileDownloader$class.downloadFile(FileDownloader.scala:36)
>>>>     at org.dbpedia.extraction.dump.download.Download$Downloader$1.org$dbpedia$extraction$dump$download$Retry$$super$downloadFile(Download.scala:29)
>>>>     at org.dbpedia.extraction.dump.download.Retry$class.downloadFile(Retry.scala:28)
>>>>     at org.dbpedia.extraction.dump.download.Download$Downloader$1.downloadFile(Download.scala:29)
>>>>     at org.dbpedia.extraction.dump.download.FileDownloader$class.downloadTo(FileDownloader.scala:26)
>>>>     at org.dbpedia.extraction.dump.download.Download$Downloader$1.downloadTo(Download.scala:29)
>>>>     at org.dbpedia.extraction.dump.download.LanguageDownloader.downloadDates(LanguageDownloader.scala:35)
>>>>     at org.dbpedia.extraction.dump.download.Download$$anonfun$main$3.apply(Download.scala:67)
>>>>     at org.dbpedia.extraction.dump.download.Download$$anonfun$main$3.apply(Download.scala:62)
>>>>     at scala.collection.immutable.TreeSet$$anonfun$foreach$1.apply(TreeSet.scala:114)
>>>>     at scala.collection.immutable.TreeSet$$anonfun$foreach$1.apply(TreeSet.scala:114)
>>>>     at scala.collection.immutable.RedBlack$NonEmpty.foreach(RedBlack.scala:164)
>>>>     at scala.collection.immutable.RedBlack$NonEmpty.foreach(RedBlack.scala:163)
>>>>     at scala.collection.immutable.TreeSet.foreach(TreeSet.scala:114)
>>>>     at org.dbpedia.extraction.dump.download.Download$.main(Download.scala:62)
>>>>     at org.dbpedia.extraction.dump.download.Download.main(Download.scala)
>>>>     ... 6 more
>>>> Caused by: java.net.UnknownHostException: dumps.wikimedia.org
>>>>     at java.net.AbstractPlainSocketImpl.connect(AbstractPlainSocketImpl.java:178)
>>>>     at java.net.SocksSocketImpl.connect(SocksSocketImpl.java:391)
>>>>     at java.net.Socket.connect(Socket.java:579)
>>>>     at java.net.Socket.connect(Socket.java:528)
>>>>     at sun.net.NetworkClient.doConnect(NetworkClient.java:180)
>>>>     at sun.net.www.http.HttpClient.openServer(HttpClient.java:378)
>>>>     at sun.net.www.http.HttpClient.openServer(HttpClient.java:473)
>>>>     at sun.net.www.http.HttpClient.<init>(HttpClient.java:203)
>>>>     at sun.net.www.http.HttpClient.New(HttpClient.java:290)
>>>>     at sun.net.www.http.HttpClient.New(HttpClient.java:306)
>>>>     at sun.net.www.protocol.http.HttpURLConnection.getNewHttpClient(HttpURLConnection.java:995)
>>>>     at sun.net.www.protocol.http.HttpURLConnection.plainConnect(HttpURLConnection.java:931)
>>>>     at sun.net.www.protocol.http.HttpURLConnection.connect(HttpURLConnection.java:849)
>>>>     at sun.net.www.protocol.http.HttpURLConnection.getInputStream(HttpURLConnection.java:1299)
>>>>     at sun.net.www.protocol.http.HttpURLConnection.getHeaderField(HttpURLConnection.java:2677)
>>>>     at java.net.HttpURLConnection.getHeaderFieldDate(HttpURLConnection.java:539)
>>>>     at java.net.URLConnection.getLastModified(URLConnection.java:569)
>>>>     at org.dbpedia.extraction.dump.download.LastModified$class.downloadFile(LastModified.scala:17)
>>>>     ... 23 more
>>>>
>>>> Apparently the tool can't find the website.
>>>>
>>>> Best.
>>>>
>>>> Julien.
>>>>
>>>>
>>>> 2013/4/15 Andrea Di Menna <ninn...@gmail.com>
>>>>
>>>>> Hi Julien,
>>>>> If I am not mistaken, you need to prefix the config file name with the
>>>>> "config" keyword, i.e. config=download.minimal.properties
>>>>>
>>>>> Can you check? In that case we should review the documentation.
>>>>>
>>>>> Regards
>>>>> Andrea
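
A minimal sketch of why a bare file name could be rejected, assuming the downloader expects every argument in key=value form. This is an illustration only, not the framework's actual parser; the class name ArgSketch and its error text are hypothetical.

```java
import java.util.AbstractMap.SimpleEntry;
import java.util.Map;

// Illustrative only: a key=value argument parser, not the framework's real code.
final class ArgSketch {

    // Splits "key=value" at the first '='; a token without a key and '=' is rejected.
    static Map.Entry<String, String> parse(String arg) {
        int eq = arg.indexOf('=');
        if (eq <= 0) {
            throw new IllegalArgumentException("invalid argument '" + arg + "'");
        }
        return new SimpleEntry<>(arg.substring(0, eq), arg.substring(eq + 1));
    }

    public static void main(String[] args) {
        // Accepted: the file name is introduced by the "config" key.
        System.out.println(parse("config=download.minimal.properties"));
        try {
            // Rejected: no key, so the parser cannot tell what the token means.
            parse("download.minimal.properties");
        } catch (IllegalArgumentException e) {
            System.out.println(e.getMessage());
        }
    }
}
```
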
>>>>> On 15 Apr 2013 13:46, "Julien Plu" <
>>>>> julien....@redaction-developpez.com> wrote:
>>>>>
>>>>>> Hi,
>>>>>>
>>>>>> I am trying to use the DBpedia extraction framework as explained here:
>>>>>> https://github.com/dbpedia/extraction-framework/wiki/Dbpedia-Abstract-Extraction--Step-by-step
>>>>>>
>>>>>> I get an error when the tool tries to download the dump: "invalid
>>>>>> argument 'download.minimal.properties'". I don't see where the problem
>>>>>> is in this file. Here is a copy of my file:
>>>>>>
>>>>>>
>>>>>> # NOTE: format is not java.util.Properties, but org.dbpedia.extraction.dump.download.DownloadConfig
>>>>>>
>>>>>> # Default download server. It lists mirrors which may be faster.
>>>>>> base-url=http://dumps.wikimedia.org/
>>>>>>
>>>>>> # Replace by your target folder.
>>>>>> base-dir=/home/jplu/extraction-framework/dump/dumps
>>>>>>
>>>>>> # Replace xx by your language.
>>>>>> download=fr:pages-articles.xml.bz2
>>>>>>
>>>>>> # Only needed for the ImageExtractor
>>>>>> download=commons:pages-articles.xml.bz2
>>>>>>
>>>>>> # Unzip files while downloading? Not necessary, extraction will unzip on the fly. Let's save space.
>>>>>> unzip=false
>>>>>>
>>>>>> # Sometimes connecting to the server fails, so we try five times with pauses of 10 seconds.
>>>>>> retry-max=5
>>>>>> retry-millis=10000
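
A minimal sketch related to the NOTE at the top of the file above. Whatever the original reason for the custom format, one practical difference is that this file repeats the download key, and java.util.Properties keeps only the last value of a repeated key. The class name RepeatedKeys and the line-by-line reader are hypothetical and are not the DownloadConfig implementation.

```java
import java.io.IOException;
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import java.util.Properties;

// Illustrative comparison: java.util.Properties versus a simple line-by-line reader.
final class RepeatedKeys {

    // Shortened copy of the config above, with the repeated "download" key.
    static final String CONFIG =
        "# Replace xx by your language.\n"
        + "download=fr:pages-articles.xml.bz2\n"
        + "download=commons:pages-articles.xml.bz2\n"
        + "unzip=false\n";

    // Properties is a map, so the second "download" line replaces the first.
    static String viaProperties(String text) throws IOException {
        Properties p = new Properties();
        p.load(new StringReader(text));
        return p.getProperty("download");
    }

    // Reading line by line keeps every value of a repeated key, in file order.
    static List<String> allValues(String text, String key) {
        List<String> values = new ArrayList<>();
        for (String line : text.split("\n")) {
            String t = line.trim();
            if (t.isEmpty() || t.startsWith("#")) continue; // skip blanks and comments
            int eq = t.indexOf('=');
            if (eq > 0 && t.substring(0, eq).equals(key)) {
                values.add(t.substring(eq + 1));
            }
        }
        return values;
    }

    public static void main(String[] args) throws IOException {
        System.out.println(viaProperties(CONFIG));
        System.out.println(allValues(CONFIG, "download"));
    }
}
```
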
>>>>>>
>>>>>> I only want to download the fr version of the dump along with the
>>>>>> commons version. What did I do wrong?
>>>>>>
>>>>>> Thanks.
>>>>>>
>>>>>> Best.
>>>>>>
>>>>>> Julien.
>>>>>>
>>>>>>
>>>>>> ------------------------------------------------------------------------------
>>>>>> Precog is a next-generation analytics platform capable of advanced
>>>>>> analytics on semi-structured data. The platform includes APIs for
>>>>>> building
>>>>>> apps and a phenomenal toolset for data science. Developers can use
>>>>>> our toolset for easy data analysis & visualization. Get a free
>>>>>> account!
>>>>>> http://www2.precog.com/precogplatform/slashdotnewsletter
>>>>>> _______________________________________________
>>>>>> Dbpedia-discussion mailing list
>>>>>> Dbpedia-discussion@lists.sourceforge.net
>>>>>> https://lists.sourceforge.net/lists/listinfo/dbpedia-discussion
>>>>>>
>>>>>>
>>>>
>>>
>>
>>
>>
>>
>
>
> --
> Kontokostas Dimitris
>