Re: [Dbpedia-discussion] .bz2 problem

Ahmed Ktob Sun, 21 Apr 2013 13:32:13 -0700

OK Jona, I will try this. Thank you.


On 21 April 2013 20:18, Jona Christopher Sahnwaldt <[email protected]> wrote:

> On 21 April 2013 19:46, Ahmed Ktob <[email protected]> wrote:
> > Well, first I should mention that I am using IntelliJ IDEA on Windows 7.
> > I can't try on Linux right now because my work is on Windows and I don't
> > have enough free space ))
> >
> > Also, I am following this tutorial [1] to do the abstract extraction.
> > I followed it up to the data import step, which failed for me with the
> > error:
> >
> > java.lang.IllegalArgumentException: found no directory
> > C:\Users\AHMED\Desktop\arwiki/[YYYYMMDD] containing file
> > arwiki-[YYYYMMDD]-pages-articles.xml
> >
> > So I started reading the Import.scala code and thought that maybe if I
> > changed the code:
> >
> > val tagFile = if (requireComplete) Download.Complete else
> > "pages-articles.xml"
> > val date = finder.dates(tagFile).last
> > val file = finder.file(date, "pages-articles.xml")
> >
> > to "pages-articles.xml.bz2", it would work. I did so and it worked (I got
> > past this step).
>
> Please pull the latest version from GitHub. Let git overwrite your
> changes in Import.scala. Maybe git can merge your changes in pom.xml
> (your folder) with the new parameter (dump file name).
>
> >
> > After Dimitris's answer, I redid my changes and uncommented the source
> > line he mentioned in both extraction.abstracts.properties and
> > extraction.default.properties, but I couldn't get past this step (same
> > error as above).
> >
> > I am using Maven 3.0.4, and to start Maven I just followed the guide:
> > clean -> install (on the parent POM of the DBpedia framework)
> > scala:run (on DBpedia Dump Extraction)
> >
> > Currently, I just want the default extraction, not the abstracts, but I
> > can't find a guide. Any suggestions?
>
>
> https://github.com/dbpedia/extraction-framework/wiki/Extraction-Instructions
>
> >
> > Thank you so much.
> >
> > Cheers,
> > Ahmed.
> >
> > [1]
> > https://github.com/dbpedia/extraction-framework/wiki/Dbpedia-Abstract-Extraction-step-by-step-guide
> >
> >
> > On 21 April 2013 18:19, Jona Christopher Sahnwaldt <[email protected]> wrote:
> >>
> >> Ahmed,
> >>
> >> if things still don't work for you, please tell us exactly what you
> >> are trying to do: which Maven launcher? How do you start it? Please
> >> attach a copy of the configuration files and Scala files that you
> >> edited and a text file containing the complete Maven output.
> >>
> >> Cheers,
> >> JC
> >>
> >> On 21 April 2013 19:17, Jona Christopher Sahnwaldt <[email protected]>
> >> wrote:
> >> > Hi,
> >> >
> >> > Dimitris is right. Ahmed was referring to Import.scala, but that's
> >> > probably not what's causing the problem.
> >> >
> >> > Ahmed, please try to edit the config file as Dimitris said and the
> >> > extraction should work. You only need Import.scala if you want to
> >> > extract abstracts.
> >> >
> >> > Anyway, I just added some code to make Import.scala more flexible. I
> >> > also added a new argument in dump/pom.xml: users can now specify the
> >> > name of the XML dump file, and Import.scala will automatically unzip
> >> > if the suffix is .gz or .bz2.
> >> >
> >> > If you encounter any problems, let us know.
> >> >
> >> > Cheers,
> >> > JC
> >> >
> >> > On 21 April 2013 18:08, Jona Christopher Sahnwaldt <[email protected]>
> >> > wrote:
> >> >> Hi,
> >> >>
> >> >> hm, no, sorry, in this case that won't work. The Import class is not
> >> >> configurable enough. I think Import.scala can't handle zipped files at
> >> >> all, so changing the name won't help either. I'll have a look, maybe I
> >> >> can fix this quickly.
> >> >>
> >> >> Cheers,
> >> >> JC
> >> >>
> >> >> On 21 April 2013 18:00, Dimitris Kontokostas <[email protected]> wrote:
> >> >>> Hi Ahmed,
> >> >>>
> >> >>> in the default configuration files you will find the following lines:
> >> >>> # default:
> >> >>> # source=pages-articles.xml
> >> >>>
> >> >>> # alternatives:
> >> >>> # source=pages-articles.xml.bz2
> >> >>> # source=pages-articles.xml.gz
> >> >>>
> >> >>> You should comment / uncomment the ones that suit you.
> >> >>>
> >> >>> Best,
> >> >>> Dimitris
> >> >>>
> >> >>>
> >> >>>
> >> >>> On Sun, Apr 21, 2013 at 2:24 AM, Ahmed Ktob <[email protected]> wrote:
> >> >>>>
> >> >>>> Hello guys,
> >> >>>>
> >> >>>> Today I was trying to use the extraction framework to extract data
> >> >>>> for the Arabic language. When it came to finding the dump file in the
> >> >>>> download directory, it didn't work, so after a while I figured out that
> >> >>>> part of the code in Import.scala is written as follows:
> >> >>>>
> >> >>>> try {
> >> >>>>   for (language <- languages) {
> >> >>>>     val finder = new Finder[File](baseDir, language, "wiki")
> >> >>>>     val tagFile = if (requireComplete) Download.Complete else "pages-articles.xml"
> >> >>>>     val date = finder.dates(tagFile).last
> >> >>>>     val file = finder.file(date, "pages-articles.xml")
> >> >>>>
> >> >>>> I tried changing the name to "pages-articles.xml.bz2" and the
> >> >>>> extraction successfully passed this point.
> >> >>>>
> >> >>>> My point is, don't you think we should make the change I mentioned
> >> >>>> above? When we download the dump file, it comes with ".bz2" in the
> >> >>>> name.
> >> >>>>
> >> >>>> Best regards,
> >> >>>> Ahmed.
> >> >>>> --
> >> >>>> ------------------------------------------------
> >> >>>> Ahmed Ktob
> >> >>>> Dr. Taher Moulay University
> >> >>>> Department of Computer Science
> >> >>>> Saida , Algeria
> >> >>>> Tel : +213 554 811 151
> >> >>>> ------------------------------------------------
> >> >>>>
> >> >>>>
> >> >>>>
> >> >>>>
> >> >>>> _______________________________________________
> >> >>>> Dbpedia-discussion mailing list
> >> >>>> [email protected]
> >> >>>> https://lists.sourceforge.net/lists/listinfo/dbpedia-discussion
> >> >>>>
> >> >>>
> >> >>>
> >> >>>
> >> >>> --
> >> >>> Kontokostas Dimitris
> >> >>>
> >> >>>
> >> >>>
> >> >>>
>

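For reference, the suffix-based unzipping that Jona describes above could look roughly like the sketch below. This is not the actual Import.scala code: DumpStreams, compressionOf, open and the bzip2 hook are hypothetical names used only for illustration. The JDK ships a gzip decoder but no bzip2 decoder, so the sketch assumes the caller passes one in (for example one built on BZip2CompressorInputStream from Apache Commons Compress, if that library is on the classpath).

```scala
import java.io.{BufferedInputStream, File, FileInputStream, InputStream}
import java.util.zip.GZIPInputStream

// Hypothetical helper, not part of the DBpedia extraction framework.
object DumpStreams {
  sealed trait Compression
  case object Plain extends Compression
  case object Gzip extends Compression
  case object Bzip2 extends Compression

  // Decide how a dump file is compressed purely from its name suffix.
  def compressionOf(fileName: String): Compression =
    if (fileName.endsWith(".gz")) Gzip
    else if (fileName.endsWith(".bz2")) Bzip2
    else Plain

  // Open the file and wrap the raw stream in a matching decompressor.
  // `bzip2` is an assumed hook: the caller supplies the bzip2 decoder,
  // because the JDK standard library does not include one.
  def open(file: File, bzip2: InputStream => InputStream): InputStream = {
    val raw: InputStream = new BufferedInputStream(new FileInputStream(file))
    compressionOf(file.getName) match {
      case Gzip  => new GZIPInputStream(raw)
      case Bzip2 => bzip2(raw)
      case Plain => raw
    }
  }
}
```

With something like this, "pages-articles.xml", "pages-articles.xml.gz" and "pages-articles.xml.bz2" can all be read through the same code path, and only the configured file name changes.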

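The error message above suggests that the importer expects a date directory under the wiki folder, i.e. a file at [baseDir]/arwiki/[YYYYMMDD]/arwiki-[YYYYMMDD]-pages-articles.xml. A rough sketch of such a lookup, assuming exactly that layout (DumpLayout and latestDumpFile are hypothetical names, not the framework's Finder API):

```scala
import java.io.File

// Hypothetical helper, not the framework's Finder class.
object DumpLayout {
  // Returns the dump file from the most recent YYYYMMDD directory that
  // actually contains "<wiki>-<date>-<suffix>", or None if there is none.
  def latestDumpFile(baseDir: File, wiki: String, suffix: String): Option[File] = {
    val wikiDir = new File(baseDir, wiki)
    // listFiles returns null when the directory does not exist.
    val dateDirs = Option(wikiDir.listFiles)
      .getOrElse(Array.empty[File])
      .filter(d => d.isDirectory && d.getName.matches("\\d{8}"))
      .sortBy(_.getName) // YYYYMMDD names sort chronologically
    dateDirs.reverseIterator
      .map(d => new File(d, s"$wiki-${d.getName}-$suffix"))
      .find(_.isFile)
  }
}
```

If only the .bz2 download is on disk, a lookup for "pages-articles.xml" finds nothing, which matches the failure reported above, while a lookup for "pages-articles.xml.bz2" finds the file.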

-- 
------------------------------------------------
Ahmed Ktob
Dr. Taher Moulay University
Department of Computer Science
Saida, Algeria
Tel : +213 554 811 151
------------------------------------------------
