Hello everyone!

I am Nilesh Chakraborty, a senior-year B.Tech undergraduate from India, majoring in Computer Science and Engineering. This year I'll be working on the DBpedia GSoC project, "Distributed extraction of Wikipedia data dumps for DBpedia", with my mentors: Sang, Nicolas, Dimitris and Andrea.

The project aims to parallelize the download of Wikipedia data dumps and to run parallel, distributed extractions on the downloaded dumps using Apache Spark (or gracefully degrade to Scala multiprocessing on a single node, if configured that way). I have been working on a proof-of-concept while writing my proposal, and shared some details about it on the dbpedia-gsoc mailing list [1].

I have a lot of code-related issues to discuss (features, what to include, code/class design) along with daily workflow and process questions. For example: whether I should use GitHub Wikis to keep track of my weekly/bi-weekly progress reports (and blog in more detail at nileshc.com/blog), which repository I should use, fork-pull vs. branch-merge, the mode of regular meetings (IRC?), availability, etc.

So let's decide on a time this week that's convenient for all of us and set up our first meeting! :) My time zone is UTC+5:30, and 1:30pm UTC to 6:30pm UTC this week is okay for me. Sang, Dimitris, Andrea, Nicolas - please let me know some dates and time ranges when you can be available for the meeting. We can use IRC if that's okay?

I have a flexible schedule and can adjust my hours according to my mentors' availability. The first three weeks of May will be a bit more packed for me than usual, with end-semester exams and project deadlines, but after that I'm pretty much free for the rest of the summer.

Cheers,
Nilesh

[1] https://www.mail-archive.com/[email protected]/msg00486.html

--
A quest eternal, a life so small! So don't just play the guitar, build one.
You can also email me at [email protected] or visit my website <http://www.nileshc.com/>

_______________________________________________
Dbpedia-developers mailing list
[email protected]
https://lists.sourceforge.net/lists/listinfo/dbpedia-developers
