Hello all,
I hope everyone is enjoying the summer.
I've written a Scala script
<https://github.com/hadyelsahar/extraction-framework/blob/lang-link-extract/scripts/src/main/scala/org/dbpedia/extraction/scripts/LanguageSpecificLinksGenerator.scala>
that generates the language-specific language links (LL) files to be
uploaded, as mentioned by JC here:
<http://www.mail-archive.com/[email protected]/msg00148.html>
Option 0 in the script extracts the master LL file.
Option 1 extracts the language-specific links files.
The first iteration of the code has complexity O(n^2), where n is the
number of lines in the master LL file. That is clearly naive and would take
a lot of time when run on the big dump. There are a lot of easy ways to
optimize this, but I had some questions:
1- Can we rely on the RDF triples dump being in order? I.e. will all the
triples of an entity (for example Q1000) come one after another, so that we
don't need to scan the rest of the file for related triples?
2- For this task, which is better to optimize: memory or time? Loading the
file into a HashMap would speed things up a lot, but it may take a fair
amount of memory.
3- Just out of curiosity and to set expectations: how long does the
language links extraction process for Wikipedia take, and do we dedicate a
special server to it? Or is it just a small process that doesn't need one?
4- Any suggestions would be great.
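To make questions 1 and 2 concrete, here is a rough Scala sketch of the two
options. It is not the code in the script linked above. It assumes the master
LL file is N-Triples-like, with the subject as the first
whitespace-separated token on each line; the names LLGroupingSketch,
subjectOf, groupWithHashMap and groupConsecutive are made up for
illustration.

```scala
import scala.collection.mutable

object LLGroupingSketch {

  // Assumption: the subject is the first whitespace-separated token of a line.
  def subjectOf(line: String): String =
    line.trim.takeWhile(c => !c.isWhitespace)

  // Question 2: a single pass over the file, O(n) time, but every line is
  // held in memory until the whole file has been read.
  def groupWithHashMap(
      lines: Iterator[String]
  ): mutable.HashMap[String, mutable.ArrayBuffer[String]] = {
    val groups = mutable.HashMap.empty[String, mutable.ArrayBuffer[String]]
    for (line <- lines if line.trim.nonEmpty)
      groups.getOrElseUpdate(subjectOf(line), mutable.ArrayBuffer.empty[String]) += line
    groups
  }

  // Question 1: only correct if all triples of one subject are adjacent in
  // the dump. Holds just one entity's triples in memory at a time.
  def groupConsecutive(lines: Iterator[String]): Iterator[(String, Vector[String])] =
    new Iterator[(String, Vector[String])] {
      private val in = lines.filter(_.trim.nonEmpty).buffered

      def hasNext: Boolean = in.hasNext

      def next(): (String, Vector[String]) = {
        val subject = subjectOf(in.head)
        val group = Vector.newBuilder[String]
        // Consume lines while they still belong to the same subject.
        while (in.hasNext && subjectOf(in.head) == subject) group += in.next()
        (subject, group.result())
      }
    }
}
```

If the dump turns out not to be ordered, groupConsecutive would silently
split one entity into several groups, so it is only safe once question 1 is
answered with a yes. The HashMap version works either way, at the cost of
memory.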
Thanks,
Regards
-------------------------------------------------
Hady El-Sahar
Research Assistant
Center of Informatics Sciences | Nile University<http://nileuniversity.edu.eg/>
email : [email protected]
Phone : +2-01220887311
http://hadyelsahar.me/
<http://www.linkedin.com/in/hadyelsahar>
_______________________________________________
Dbpedia-developers mailing list
[email protected]
https://lists.sourceforge.net/lists/listinfo/dbpedia-developers