On 09.12.2011 22:53, Roland Häder wrote:
On Fri, 2011-12-09 at 18:29 +0100, ThorstenB wrote:

Here's a full list of sound duplicates:
http://pastebin.com/DvCT6AT9
Can you please release this file? Or is it possible to send it to me
directly? I would like to clean up some of my archives.

Please find the script attached. It does both: it reports stereo files and all duplicates. Run it from the directory you want to search. It requires Python and is Linux only (it calls find, soxi and md5sum). Don't ask for documentation. Don't use the bug tracker if it doesn't work. :)

cheers,
Thorsten
#!/usr/bin/python
# Python 2 script. Needs find, soxi (part of SoX) and md5sum on the PATH.

import os
import string

def doShell(Command):
    # Run a shell command and return its output as a list of stripped lines.
    Pipe = os.popen(Command)
    Lines = Pipe.readlines()
    Lines = map(string.strip, Lines)
    Pipe.close()
    return Lines

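def getSoundChannels(x):
    # Channel count as a string, e.g. '1' or '2'.
    # Files soxi cannot read produce no output and are treated as mono.
    Lines = doShell("soxi -c \""+x+"\"")
    if len(Lines)<1:
        return '1'
    return string.strip(Lines[0])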

def getMD5(x):
    # MD5 checksum as a hex string, or '' if md5sum produced no output.
    Lines = doShell("md5sum \""+x+"\"")
    if len(Lines)<1:
        return ''
    return string.strip(Lines[0]).split()[0]

def getFileList(Pattern):
    # Search the current directory recursively; the match is case-insensitive.
    Lines = doShell("find . -type f -iname \""+Pattern+"\"")
    return Lines

def checkStereoFiles(FileList):
    BadFiles = []
    for FileName in FileList:
        Channels = getSoundChannels(FileName)
        if (Channels!='1'):
            BadFiles.append((FileName,Channels))

    BadFiles.sort()
    BadAircraft = {}
    print "Stereo files:",len(BadFiles),"/",len(FileList)
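    for (FileName,Channels) in BadFiles:
        print FileName
        # The third path component is taken as the aircraft name
        # (presumably paths look like ./Aircraft/<name>/...).
        # Shallower paths are skipped instead of raising an IndexError.
        Parts = FileName.split("/")
        if len(Parts)>2:
            Aircraft = Parts[2]
            BadAircraft[Aircraft]=BadAircraft.get(Aircraft,0)+1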

    AircraftList=BadAircraft.keys()
    AircraftList.sort()
    print "Aircraft with stereo files:",len(AircraftList)
    print string.join(AircraftList,"\n")

def findIdenticalFiles(FileList):
    Md5Dict = {}
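    for FileName in FileList:
        Checksum = getMD5(FileName)
        # Skip files md5sum could not read; otherwise they would all be
        # reported as identical under the empty checksum.
        if Checksum=='':
            continue
        IdenticalFiles = Md5Dict.get(Checksum,[])
        IdenticalFiles.append(FileName)
        Md5Dict[Checksum] = IdenticalFiles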

    ChecksumList = Md5Dict.keys()
    SortList=[]
    for Checksum in ChecksumList:
        SortList.append((len(Md5Dict[Checksum]),Checksum))
    SortList.sort()
    SortList.reverse()
    for (Count,Checksum) in SortList:
        FileNameList = Md5Dict[Checksum]
        if (len(FileNameList)>1):
            print "%i identical files, MD5: %s\n    %s" % (len(FileNameList),Checksum,string.join(FileNameList,"\n    "))

FileList = getFileList("*.wav")
print "Found",len(FileList),"files."
checkStereoFiles(FileList)
findIdenticalFiles(FileList)
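
The script above is Python 2 and shells out to find, soxi and md5sum. For readers without those tools, the same two checks can be sketched in Python 3 with only the standard library. This is a rough sketch, not the attached script: all function names below are made up for illustration, and the wave module only reads uncompressed PCM WAV files, so it may reject files that soxi can handle (those are reported as unreadable here rather than counted as mono).

```python
import hashlib
import os
import wave
from collections import defaultdict


def find_wav_files(root="."):
    """Return all *.wav paths below root (case-insensitive), sorted."""
    found = []
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            if name.lower().endswith(".wav"):
                found.append(os.path.join(dirpath, name))
    return sorted(found)


def channel_count(path):
    """Return the number of channels, or None if the file cannot be parsed."""
    try:
        with wave.open(path, "rb") as wav:
            return wav.getnchannels()
    except (wave.Error, EOFError, OSError):
        return None


def md5_of_file(path, chunk_size=1 << 16):
    """Hash the file in chunks so large files need not fit in memory."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def group_duplicates(paths):
    """Return lists of byte-identical files, largest group first."""
    by_digest = defaultdict(list)
    for path in paths:
        by_digest[md5_of_file(path)].append(path)
    groups = [sorted(group) for group in by_digest.values() if len(group) > 1]
    return sorted(groups, key=len, reverse=True)


def report(root="."):
    """Print stereo files and duplicate groups found below root."""
    files = find_wav_files(root)
    print("Found", len(files), "files.")
    stereo = [f for f in files if channel_count(f) not in (1, None)]
    print("Stereo files:", len(stereo), "/", len(files))
    for path in stereo:
        print(path)
    for group in group_duplicates(files):
        print(len(group), "identical files:")
        for path in group:
            print("    " + path)
```

Calling report(".") from the directory to be searched gives output roughly comparable to the original, minus the per-aircraft summary.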

_______________________________________________
Flightgear-devel mailing list
Flightgear-devel@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/flightgear-devel
