Re: Searching a drive and copying files
Many of the images will have the same name. I somehow managed to copy the
same files several times when trying to do backups and restores. At this
point they are going from my Apple laptop to my FreeBSD server. I am going
to start looking for an inexpensive tape drive to back up my data. I have
been using iPhoto to manage my images.

Sincerely,
Joshua Lewis
[EMAIL PROTECTED]

On Jul 23, 2006, at 3:41 AM, Michael Hughes wrote:

> Joshua,
>
> On the dups, will the names of the files be the same or different? Do
> you have plans for how you will store the images after you get rid of
> the dups? Have the images been edited, and if so, did you edit them with
> an EXIF-aware program?
>
> I use a program called epinfo to rename my images; it is part of the
> photopc utility. I have just over 10,000 digital images and store them
> by year, month, day, and time. epinfo uses the EXIF data to rename the
> files and set their time stamps.
>
> I have written some PHP programs that let me display the images through
> a web browser. They use a MySQL database to organize the images into
> categories. I also store a checksum of each picture in the database so I
> can check whether the images have become damaged; a script I wrote
> compares the checksum in the database against the image. I do backups
> whenever I add new images to the hard drive. This is still a work in
> progress.
>
> If you can send me a little more data on your files and how you want to
> store the images, I could help you with your task.
>
> On Sat, 22 Jul 2006 10:47:13 -0400 Joshua Lewis [EMAIL PROTECTED] wrote:
>
>> Hello List,
>>
>> I have a two-part question for anyone who may be able to help. I need
>> to search my drive for all pictures on my system and copy them to a
>> networked system using sftp or ssh or what not. There will be duplicate
>> names on the drive, so I was hoping to have dups placed in a separate
>> folder.
>> Due to my, for lack of a better term, stupidity when I first got my
>> camera, I will probably have instances where there are three or four
>> duplicates. If anyone can help me out with that, it would be great.
>>
>> Second, is there a resource online I can use to learn how to do my own
>> shell scripting? My goal is to find all my pictures, compare them, and
>> then delete the dups that don't look that good. A daunting task, as I
>> have 20 GB of data; I bet 10 GB are dups.
>>
>> Thanks for any help.
>>
>> Sincerely,
>> Joshua Lewis
>> [EMAIL PROTECTED]
>
> --
> Michael Hughes
> Log Home living is the best
> [EMAIL PROTECTED]
> Temperatures: Outside: 60.6  House: 70.9  Computer room: 69.5

___
freebsd-questions@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-questions
To unsubscribe, send any mail to [EMAIL PROTECTED]
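Michael's database-checksum scheme can be sketched without MySQL, using a
plain manifest file. This is only an illustrative sketch: the paths are
placeholders, and it assumes the GNU md5sum tool (on FreeBSD itself, the
base-system equivalent is `md5 -r`, whose output format differs slightly).

```shell
# Sketch of the checksum-verification idea above, with a flat manifest
# file standing in for the MySQL database. ~/photos and ~/photos.md5
# are placeholder paths; md5sum is the GNU tool (FreeBSD: md5 -r).

# Record a checksum for every image:
find ~/photos -type f -name '*.jpg' -print0 | xargs -0 md5sum > ~/photos.md5

# Later, verify that no image has silently become damaged:
md5sum --quiet -c ~/photos.md5 && echo "all images intact"
```

Re-running the first command after adding new images rebuilds the manifest;
the second command prints only the files whose checksums no longer match.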
Re: Searching a drive and copying files
In message [EMAIL PROTECTED], Joshua Lewis wrote thusly...

> I need to search my drive for all pictures on my system and copy them to
> a networked system using sftp or ssh or what not. There will be duplicate
> names on the drive so I was hoping to have dups placed in a separate
> folder.

Unison (the net/unison port) should be able to handle the duplicates based
on file checksum. (I personally have not used it much, so I cannot answer
any other queries about it; refer to its fine man page.)

> Due to my for lack of a better term stupidity when I first got my camera
> I will probably have instances when there will be three or four
> duplicates. If anyone can help me out with that it would be great.
...
> My goal is to find all my pictures and compare them then delete the dups
> that don't look that good. A daunting task as I have 20 GB of data. I bet
> 10 GB are dups.

Checksum-based management of duplicates will help with files whose contents
are identical, but not with files that differ by even a bit. The Perl
program below -- a modified version of Randal Schwartz's[0] -- uses md5(1)
to identify exact duplicates (identical files) and, failing that, falls
back on Image::Magick with a fuzz factor. When it finds duplicates, it
asks you to enter the item number, from the file list it shows, of the
file to be deleted.

[0] Article "Finding similar images",
    http://www.stonehenge.com/merlyn/LinuxMag/col50.html

To run, it needs Image::Magick (graphics/ImageMagick port), Cache::FileCache
(devel/p5-Cache-Cache), List::Util (lang/p5-Scalar-List-Utils), File::Copy,
and File::Path. Mind that it -- rather, Image::Magick -- may consume all of
your memory and/or temporary file system if you run it on all the files at
once.

If you are good with Perl, you could modify the program to move the
duplicates into a directory (instead of deleting them), and possibly not to
ask before taking the particular action (since, as you say, you would have
a boat load of duplicates).

Without further interruptions, the program follows ...

#!perl
# This is a modified version of Randal Schwartz's ...
#
#   http://www.stonehenge.com/merlyn/LinuxMag/col50.html
#
# ... as it uses a checksum (MD5 for now) to detect identical files and,
# failing that, uses Image::Magick.

use warnings;
use strict;
$|++;

use Image::Magick;
use Cache::FileCache;
use File::Copy qw( move );
use File::Path qw( mkpath );
use List::Util qw( reduce );
use Carp qw( carp );
use Getopt::Long qw( :config gnu_compat no_ignore_case no_debug );

# User option; permitted average deviation in the vector elements.
my $fuzz = 15;
# User option; if defined, rename corrupt images into this dir.
my $corrupt_dir = 'CORRUPT';

{
  my $usage;
  GetOptions
    ( 'h|usage|help' => \$usage
    , 'f|fuzz=i'     => \$fuzz
    , 'c|corrupt=s'  => \$corrupt_dir
    , 'nc|nocorrupt' => sub { undef $corrupt_dir; }
    )
    or usage( 1 );

  usage( 0 ) if $usage;

  # Check if any arguments remain; these will be the file names.
  usage( 1, 'No file(s) or directory(ies) given.' ) unless scalar @ARGV;
}

sub warnif;

my $cache = Cache::FileCache->new
  ( { namespace  => 'image.cache'
    , cache_root => ( glob( '~/log/misc' ) )[ 0 ]
    }
  );

my @buckets;

FILE:
while ( @ARGV )
{
  my $file = shift;
  next FILE if -l $file;

  if ( -d $file )
  {
    opendir DIR, $file or next FILE;
    unshift @ARGV, map { m/^\./ ? () : "$file/$_"; } sort readdir DIR;
    next FILE;
  }
  next FILE unless -f _ or -d _;

  my ( @stat ) = stat _ or die "should not happen: $!";
  # dev/ino/mtime
  my $key = "@stat[ 0, 1, 9 ]";

  my @vector;
  #print "$file ";
  if ( my $data = $cache->get( $key ) )
  {
    #print "... is cached\n";
    @vector = @$data;
  }
  else
  {
    my $image = Image::Magick->new;
    if ( my $x = $image->Read( $file ) )
    {
      if ( defined $corrupt_dir
           and $x =~ m/corrupt|unexpected end-of-file/i )
      {
        print "$file ";
        print "... renaming into $corrupt_dir\n";

        -d $corrupt_dir
          or mkpath $corrupt_dir, 0, 0700
          or die "Cannot mkpath $corrupt_dir: $!";

        move $file, $corrupt_dir or warn "Cannot rename: $!";
      }
      else
      {
        print "$file ";
        print "... skipping ( $x )\n";
      }
      next FILE;
    }

    #print "is ", join( 'x', $image->Get( 'width', 'height' ) ), "\n";

    warnif $image->Normalize();
    warnif $image->Resize( geometry => '4x4!' );
    warnif $image->Set( magick => 'rgb' );

    @vector = unpack 'C*', $image->ImageToBlob();
    $cache->set( $key, [ @vector ] );
  }

  BUCKET:
  for my $bucket ( @buckets )
  {
    my $error = 0;
    INDEX:
    for my $index ( 0 .. $#vector )
    {
      $error += abs( $bucket->[ 0 ][ $index ] - $vector[ $index ] );
      next BUCKET if $error > $fuzz * @vector;
    }
    push @$bucket, $file;
    #print "linked ", join( ', ', @$bucket[ 1 .. $#$bucket ] ), "\n";
    next FILE;
  }

  push @buckets,
Searching a drive and copying files
Hello List,

I have a two-part question for anyone who may be able to help. I need to
search my drive for all pictures on my system and copy them to a networked
system using sftp or ssh or what not. There will be duplicate names on the
drive, so I was hoping to have dups placed in a separate folder. Due to my,
for lack of a better term, stupidity when I first got my camera, I will
probably have instances where there are three or four duplicates. If anyone
can help me out with that, it would be great.

Second, is there a resource online I can use to learn how to do my own
shell scripting? My goal is to find all my pictures, compare them, and then
delete the dups that don't look that good. A daunting task, as I have 20 GB
of data; I bet 10 GB are dups.

Thanks for any help.

Sincerely,
Joshua Lewis
[EMAIL PROTECTED]
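The "copy everything, diverting same-named files into a separate folder"
part of the question can be sketched in plain sh. This is only a rough
illustration: the directory names are placeholders, and colliding names get
a numeric prefix so several same-named duplicates can coexist.

```shell
# Rough sketch: gather every .jpg into one flat directory, diverting
# name collisions into a dups/ subdirectory. $HOME/photos and
# $HOME/collected are placeholder paths.
mkdir -p "$HOME/collected/dups"
i=0
find "$HOME/photos" -type f -name '*.jpg' | while read -r f; do
    base=${f##*/}                       # file name without directories
    if [ -e "$HOME/collected/$base" ]; then
        i=$((i + 1))
        cp "$f" "$HOME/collected/dups/$i.$base"   # collision: divert
    else
        cp "$f" "$HOME/collected/$base"
    fi
done
```

The collected tree could then be pushed to the server with scp -r, or with
rsync over ssh to avoid re-copying unchanged files.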
Re: Searching a drive and copying files
On Sat, 2006-07-22 at 10:47 -0400, Joshua Lewis wrote:

> Hello List,
>
> I have a two-part question for anyone who may be able to help. I need to
> search my drive for all pictures on my system and copy them to a
> networked system using sftp or ssh or what not. There will be duplicate
> names on the drive, so I was hoping to have dups placed in a separate
> folder. Due to my, for lack of a better term, stupidity when I first got
> my camera, I will probably have instances where there are three or four
> duplicates. If anyone can help me out with that, it would be great.
>
> Second, is there a resource online I can use to learn how to do my own
> shell scripting? My goal is to find all my pictures, compare them, and
> then delete the dups that don't look that good. A daunting task, as I
> have 20 GB of data; I bet 10 GB are dups.
>
> Thanks for any help.
>
> Sincerely,
> Joshua Lewis
> [EMAIL PROTECTED]

I have a perl script that does part of this, using MD5 hashes to identify
duplicates. I posted it at

  http://ca.geocities.com/[EMAIL PROTECTED]/treeprune.pl

Use at your own risk!
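The MD5-hash approach the script takes can also be sketched as a shell
pipeline. This is an illustrative sketch, not the posted script: ~/photos
is a placeholder, and it assumes GNU md5sum (FreeBSD's base-system
equivalent is `md5 -r`, with a slightly different output format).

```shell
# Sketch of MD5-based duplicate detection. After sorting by hash, the
# awk filter prints every file whose checksum equals the previous
# line's, i.e. the redundant copies (the first copy in each group is
# kept). md5sum output is "HASH  NAME", so the name starts at column 35.
find ~/photos -type f -exec md5sum {} + \
    | sort \
    | awk 'h == $1 { print substr($0, 35) } { h = $1 }'
```

Piping that list into `xargs -d '\n' mv -t ~/dups` (or a read loop) would
move the duplicates aside instead of merely listing them.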