Re: [Wikitech-l] Parallel computing project

2010-10-27 Thread James Salsman
Aryeh Gregor writes: > To clarify, the subject needs to 1) be reasonably doable in a short timeframe, 2) not build on top of something that's already too optimized. Integrating a subset of RTMP (e.g. the http://code.google.com/p/rtmplite subset) into the chunk-based file upload API -- htt…
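
For context, a minimal sketch of driving a chunk-based upload loop against MediaWiki's action=upload API in Python. The parameter names (stash, filekey, offset, filesize, chunk) follow the chunked-upload API as it later stabilized; the experimental 2010 variant may have differed, and the RTMP side of the proposal is left out entirely:

    # Hedged sketch: upload a file to MediaWiki in 1 MiB chunks.
    # Endpoint, login and error handling are omitted; treat the
    # parameter names as assumptions to verify against the API docs.
    import os
    import requests

    API = "https://example.org/w/api.php"  # placeholder wiki
    CHUNK = 1 << 20

    def upload_in_chunks(path, session, token):
        size = os.path.getsize(path)
        filekey = None
        with open(path, "rb") as f:
            offset = 0
            while offset < size:
                piece = f.read(CHUNK)
                data = {
                    "action": "upload", "format": "json", "stash": 1,
                    "filename": os.path.basename(path),
                    "filesize": size, "offset": offset, "token": token,
                }
                if filekey:
                    data["filekey"] = filekey
                r = session.post(API, data=data,
                                 files={"chunk": ("chunk", piece)})
                filekey = r.json()["upload"]["filekey"]
                offset += len(piece)
        return filekey  # caller commits the stashed file afterwards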

Re: [Wikitech-l] Parallel computing project

2010-10-26 Thread Ariel T. Glenn
On Wed, 27-10-2010 at 00:05 +0200, Ángel González wrote: > Ariel T. Glenn wrote: >> If one were clever (and I have some code that would enable one to be clever), one could seek to some point in the (bzip2-compressed) file and uncompress from there before processing. Run…
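
A minimal sketch of that seek-and-uncompress idea, assuming CPython's bz2 module. It only finds byte-aligned block boundaries; real bzip2 blocks are bit-aligned, so code like Ariel's must also test the seven bit-shifted variants of the magic:

    # Hedged sketch: find bzip2 block starts by scanning for the
    # 48-bit block magic 0x314159265359, then decompress from one.
    import bz2

    BLOCK_MAGIC = bytes.fromhex("314159265359")

    def byte_aligned_block_offsets(path, bufsize=1 << 20):
        """Yield file offsets of byte-aligned bzip2 block magics."""
        with open(path, "rb") as f:
            consumed, tail = 0, b""
            while True:
                data = f.read(bufsize)
                if not data:
                    return
                buf = tail + data
                pos = buf.find(BLOCK_MAGIC)
                while pos != -1:
                    yield consumed - len(tail) + pos
                    pos = buf.find(BLOCK_MAGIC, pos + 1)
                tail = buf[-(len(BLOCK_MAGIC) - 1):]
                consumed += len(data)

    def decompress_from(path, offset, bufsize=1 << 20):
        """Decompress from a byte-aligned block to end of stream."""
        out, d = [], bz2.BZ2Decompressor()
        with open(path, "rb") as f:
            f.seek(offset)
            # Re-create a stream header; assumes level 9 ("BZh9").
            data = b"BZh9" + f.read(bufsize)
            while data and not d.eof:
                try:
                    out.append(d.decompress(data))
                except OSError:
                    # The footer's combined CRC covers the whole
                    # stream, so a mid-file start fails at the very
                    # end; keep whatever was recovered.
                    break
                data = f.read(bufsize)
        return b"".join(out)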

Re: [Wikitech-l] Parallel computing project

2010-10-26 Thread Ángel González
Ariel T. Glenn wrote: > If one were clever (and I have some code that would enable one to be clever), one could seek to some point in the (bzip2-compressed) file and uncompress from there before processing. Running a bunch of jobs each decompressing only their small piece then becomes feasib…

Re: [Wikitech-l] Parallel computing project

2010-10-26 Thread Robert Rohde
On Tue, Oct 26, 2010 at 8:25 AM, Ariel T. Glenn wrote: > On Tue, 26-10-2010 at 16:25 +0200, Platonides wrote: >> Robert Rohde wrote: >>> Many of the things done for the statistical analysis of database dumps should be suitable for parallelization (e.g. break the dump into…

Re: [Wikitech-l] Parallel computing project

2010-10-26 Thread Tim Starling
On 24/10/10 17:42, Aryeh Gregor wrote: > This term I'm taking a course in high-performance computing, and I have to pick a topic for a final project. According to the assignment…

Re: [Wikitech-l] Parallel computing project

2010-10-26 Thread Tisza Gergő
Aryeh Gregor writes: > To clarify, the subject needs to 1) be reasonably doable in a short timeframe, 2) not build on top of something that's already too optimized. It should probably either be a new project, or an effort to parallelize something that already exists, isn't paral…

Re: [Wikitech-l] Parallel computing project

2010-10-26 Thread Ariel T. Glenn
On Tue, 26-10-2010 at 16:25 +0200, Platonides wrote: > Robert Rohde wrote: >> Many of the things done for the statistical analysis of database dumps should be suitable for parallelization (e.g. break the dump into chunks, process the chunks in parallel and sum the result…

Re: [Wikitech-l] Parallel computing project

2010-10-26 Thread Jyothis Edathoot
Develop a new bot framework (maybe interwiki processing to start with) for a high-performance GPU cluster (Nvidia or AMD), similar to what BOINC-based projects do. Nvidia is more popular, while AMD has more cores for the same price :) Regards, Jyothis. http://www.Jyothis.net http://ml.wikipedi…

Re: [Wikitech-l] Parallel computing project

2010-10-26 Thread Platonides
Robert Rohde wrote: > Many of the things done for the statistical analysis of database dumps should be suitable for parallelization (e.g. break the dump into chunks, process the chunks in parallel and sum the results). You could talk to Erik Zachte. I don't know if his code has already been…

Re: [Wikitech-l] Parallel computing project

2010-10-25 Thread Robert Rohde
Many of the things done for the statistical analysis of database dumps should be suitable for parallelization (e.g. break the dump into chunks, process the chunks in parallel and sum the results). You could talk to Erik Zachte. I don't know if his code has already been designed for parallel proce…
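
A minimal sketch of that chunk/process/sum pattern with multiprocessing, using a purely illustrative tag-count statistic and a hypothetical dump-chunk-*.xml file layout:

    # Hedged sketch: process dump chunks in parallel, sum results.
    import glob
    from collections import Counter
    from multiprocessing import Pool

    def count_tags(path):
        """Toy per-chunk statistic: count pages and revisions."""
        counts = Counter()
        with open(path, encoding="utf-8") as f:
            for line in f:
                for tag in ("<page>", "<revision>"):
                    if tag in line:
                        counts[tag] += 1
        return counts

    if __name__ == "__main__":
        chunks = sorted(glob.glob("dump-chunk-*.xml"))
        with Pool() as pool:
            totals = sum(pool.map(count_tags, chunks), Counter())
        print(totals)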

Re: [Wikitech-l] Parallel computing project

2010-10-25 Thread George Herbert
On Mon, Oct 25, 2010 at 4:24 PM, Aryeh Gregor wrote: > On Mon, Oct 25, 2010 at 7:15 PM, George Herbert wrote: >> (suppressing grumbles about diff engine for edit conflicts, probably an algorithm rather than speed issue) > Diffs are fast enough if you use wikidiff2, no? Yeah; what I really…

Re: [Wikitech-l] Parallel computing project

2010-10-25 Thread Aryeh Gregor
On Mon, Oct 25, 2010 at 7:15 PM, George Herbert wrote: > (suppressing grumbles about diff engine for edit conflicts, probably an algorithm rather than speed issue) Diffs are fast enough if you use wikidiff2, no? > I suspect that the MySQL engine is the one place where parallelism is most app…

Re: [Wikitech-l] Parallel computing project

2010-10-25 Thread George Herbert
On Mon, Oct 25, 2010 at 4:04 PM, Aryeh Gregor wrote: > On Mon, Oct 25, 2010 at 6:09 PM, Ariel T. Glenn wrote: >> I am following this discussion and happy to bat around ideas if there's something that might be appropriate for your course, Aryeh. > Well, is there? Anything that needs to be pa…

Re: [Wikitech-l] Parallel computing project

2010-10-25 Thread Aryeh Gregor
On Mon, Oct 25, 2010 at 6:09 PM, Ariel T. Glenn wrote: > I am following this discussion and happy to bat around ideas if there's something that might be appropriate for your course, Aryeh. Well, is there? Anything that needs to be parallelized, or that's already parallelized but needs signific…

Re: [Wikitech-l] Parallel computing project

2010-10-25 Thread Ariel T. Glenn
On Mon, 25-10-2010 at 22:47 +0200, Platonides wrote: > Make the best dump compressor ever? :) > The page http://www.mediawiki.org/wiki/Dbzip2 is worth looking at just for the available options. Continuing dbzip2 is the first idea but not the only one. I'm sure many things…

Re: [Wikitech-l] Parallel computing project

2010-10-25 Thread Aryeh Gregor
On Mon, Oct 25, 2010 at 12:38 PM, Paul Houle wrote: > I want Wikipedia converted into facts in a representation system that supports modal, temporal, and "microtheory" reasoning. You know, in the "real" world, :James_T_Kirk is a :Fictional_Character, but in the Star Trek universe, …

Re: [Wikitech-l] Parallel computing project

2010-10-25 Thread Platonides
Make the best dump compressor ever? :) The page http://www.mediawiki.org/wiki/Dbzip2 is worth looking at just for the available options. Continuing dbzip2 is the first idea but not the only one. I'm sure many things can be dug up from there. Also worth noting, Ariel has been doing the last en dumps i…
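
As a rough illustration of the dbzip2/pbzip2 idea, here is a minimal sketch that compresses fixed-size chunks on a process pool and concatenates the resulting streams. It leans on the fact that bunzip2 accepts concatenated bzip2 streams, and makes no attempt at dbzip2's distributed, networked operation; file names are hypothetical:

    # Hedged sketch of parallel bzip2 compression: each chunk is
    # compressed as its own stream, and the streams are written out
    # in order. Decompressors handle concatenated streams, at a
    # small cost in compression ratio.
    import bz2
    from multiprocessing import Pool

    CHUNK = 9 * 100 * 1024  # roughly one level-9 block of input

    def read_chunks(path):
        with open(path, "rb") as f:
            while True:
                chunk = f.read(CHUNK)
                if not chunk:
                    return
                yield chunk

    def parallel_bzip2(src, dst):
        with Pool() as pool, open(dst, "wb") as out:
            # imap preserves input order, so the output is a valid
            # sequence of complete bzip2 streams.
            for stream in pool.imap(bz2.compress, read_chunks(src)):
                out.write(stream)

    if __name__ == "__main__":
        parallel_bzip2("pages.xml", "pages.xml.bz2")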

Re: [Wikitech-l] Parallel computing project

2010-10-25 Thread Paul Houle
On 10/24/2010 8:42 PM, Aryeh Gregor wrote: > My first thought was to write a GPU program to crack MediaWiki password hashes as quickly as possible, then use what we've studied in class about GPU architecture to design a hash function that would be as slow as possible to crack on a GPU relat…
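
For reference, a CPU-side sketch of what such a cracker computes per guess. MediaWiki's salted "B"-type hash of the period was, as I recall it, md5(salt . '-' . md5(password)) stored as :B:salt:hash (verify against includes/User.php); a GPU cracker runs this pair of MD5s for millions of candidates in parallel:

    # Hedged sketch: check one candidate against a MediaWiki :B:
    # entry. The :B:salt:hash layout is an assumption to verify.
    import hashlib

    def check_b_hash(stored, candidate):
        """Return True if candidate matches a ':B:salt:hash' entry."""
        _, kind, salt, digest = stored.split(":")
        assert kind == "B"
        inner = hashlib.md5(candidate.encode("utf-8")).hexdigest()
        outer = hashlib.md5(
            f"{salt}-{inner}".encode("utf-8")).hexdigest()
        return outer == digest

    # Usage (hypothetical salt/hash pair):
    #   check_b_hash(":B:1a2b3c4d:<32 hex digits>", "hunter2")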