Tom Metro wrote:
> TOPIC: A recap of the DFW Perl Mongers Deduplication Hackathon
> SPEAKERS: Joel Berger, Tommy Butler, Yanick Champoux, Bruce Gray, Tim King
> 
> How fast can Perl find duplicate files on a 100 GB file system? That's
> the question the Dallas/Fort Worth Perl Mongers Winter 2013
> Deduplication Hackathon set out to answer. There was a diverse range of
> contest entries, from the expected procedural Perl 5 code to ones that
> used Moose, MOP, and even one Perl 6 entry.

If you couldn't make it to the meeting, you can find most of the
materials online.

The original DFW.pm meeting recording is at:
http://www.youtube.com/watch?v=HvqqRFGMZS0

In that video, Bruce Gray discusses his Perl 6 entry from 17:29 to
35:30. Joel Berger, who won for lowest memory, fewest lines of code,
and best Perl::Critic score, introduces himself at 10:54 - 12:48 and
then discusses his code at 1:06:10 - 1:21:01.

Tim King, who won for best documentation, most features, and best
effort, discusses his code after Bruce, but I don't have the time index.

In a separate video, Yanick Champoux, the winner for fastest run time,
discusses his code:
http://www.youtube.com/watch?v=7MNuNupAvvg

Tommy Butler, the contest organizer, discussed his unofficial entry
towards the end of the first video above. (At the Boston.pm meeting he
elaborated on some of the optimizations he added post-contest, leading
to a design that beats Yanick's entry by 40 seconds.)

The original contest specification can be found at:
http://dfw.pm.org/hackathon.html

The results, including a leaderboard of winners, are at:
http://dfw.pm.org/winners.html

And finally, the code...

reference code          https://github.com/tommybutler/dupfind
Joel Berger             https://github.com/jberger/DeDuperizer
Tommy Butler            https://github.com/tommyprivate/dupfind
Yanick Champoux         https://github.com/yanick/dfw-contest
Tim King                https://github.com/JTimothyKing/Data-Dedup
Joakim Lagerqvist       https://github.com/jokke/Dedup-Hackaton
Reini Urban             https://github.com/rurban/App-finddups-bloom


 -Tom

-- 
Tom Metro
The Perl Shop, Newton, MA, USA
"Predictable On-demand Perl Consulting."
http://www.theperlshop.com/

_______________________________________________
Boston-pm mailing list
[email protected]
http://mail.pm.org/mailman/listinfo/boston-pm
