here is a custom tag i wrote a while back to do just what you want i believe
if anyone wants to workaround the evalute()'s feel free and let me see : )
someone i know had a huge DB table full of emails with duplicates and this
is what i came up with to help... then i said set all email fiields to NO
DUPLICATES when you make the table : )
<!--- {REQUIRED) The data source to use in the queries --->
<CFPARAM NAME="ATTRIBUTES.DataSource" DEFAULT="" TYPE="STRING">
<!--- {REQUIRED} The table name to query for duplicates --->
<CFPARAM NAME="ATTRIBUTES.TableName" DEFAULT="" TYPE="STRING">
<!--- {REQUIRED} The primary key of the table defined in
ATTRIBUTES.tbl_name" --->
<CFPARAM NAME="ATTRIBUTES.PrimaryKey" DEFAULT="" TYPE="STRING">
<!--- {REQUIRED} The name of the column to check duplicates for --->
<CFPARAM NAME="ATTRIBUTES.Column" DEFAULT="" TYPE="STRING">
<!--- {OPTIONAL} To NOT display output to the screen (OF DUPLICATES ONLY)
set this to NO --->
<CFPARAM NAME="ATTRIBUTES.DisplayOutput" DEFAULT="Yes" TYPE="STRING">
<!--- {OPTIONAL} Set Mode to "ViewOnly" to see the duplicates and
information that will be deleted Without Deleting anything --->
<CFPARAM NAME="ATTRIBUTES.Mode" DEFAULT="Process" TYPE="STRING">
<!---***********************************************// ERROR CHECKS
//*******************************************************--->
<!--- If "ATTRIBUTES.Column is undefined, inform the user that it is
required --->
<cfif ATTRIBUTES.Column EQ "">
<cfthrow message="The attribute ''<b>Column</b>'' is required. It is the
column you want to check duplicates for.">
</cfif>
<!--- If "ATTRIBUTES.DisplayOutput is not equal to Yes or No, inform
the user that these are the only acceptable values --->
<cfif ATTRIBUTES.DisplayOutput NEQ "YES" AND ATTRIBUTES.DisplayOutput NEQ
"No">
<cfthrow message="''Yes'' and ''No'' are the only acceptable values for
''<b>DisplayOutput</b>''. Default is Yes">
</cfif>
<!--- If ATTRIBUTES.Mode is not equal to "Process" or "ViewOnly", inform
the user that these are the only acceptable values --->
<cfif ATTRIBUTES.Mode NEQ "Process" AND ATTRIBUTES.Mode NEQ "ViewOnly">
<cfthrow message="''Process'' and ''ViewOnly'' are the only acceptable
values for ''<b>Mode</b>''. Default is Process">
</cfif>
<!--- If ATTRIBUTES.DataSource was not defined, inform the user --->
<cfif ATTRIBUTES.DataSource IS "">
<cfthrow message="The attribute, ''<b>DataSource</b>'' is required and
must be a valid datasource.">
</cfif>
<!--- If ATTRIBUTES.TableName was not defined, inform the user --->
<cfif ATTRIBUTES.TableName IS "">
<cfthrow message="The attribute, ''<b>TableName</b>'' is required and
must be a valid table name in the datasource defined in
''<b>DataSource</b>''.">
</cfif>
<!--- If ATTRIBUTES.PrimaryKey was not defined, inform the user --->
<cfif ATTRIBUTES.PrimaryKey IS "">
<cfthrow message="The attribute, ''<b>PrimaryKey</b>'' is required and
must be a the tables Unique Identifier.">
</cfif>
<!--- If ATTRIBUTES.Mode is ViewOnly but ATTRIBUTES.DisplayOutput is No,
Inform the user that this accomplishes nothing --->
<cfif ATTRIBUTES.Mode IS "ViewOnly" AND ATTRIBUTES.DisplayOutput NEQ
"YES">
<cfthrow message="Displaying <b>No</b> output in ''<b>ViewOnly</b>'' mode
will not accomplish anything.">
</cfif>
<!---*******************************************// END ERROR CHECKS
//******************************************************--->
<!--- The Final list of Unique identifiers that will be deleted from the
database will be stored in "FinalList"--->
<CFPARAM NAME="FinalList" DEFAULT="">
<!--- A temp list to hold the UID's of the current item's duplicates --->
<CFPARAM NAME="TempList" DEFAULT="">
<!--- Get ALL specified fields and Primary keys to loop over, checking for
duplicates --->
<cfquery name="GetAll" datasource="#ATTRIBUTES.DataSource#"
blockfactor="100">
SELECT #ATTRIBUTES.PrimaryKey#, #ATTRIBUTES.Column#
FROM #ATTRIBUTES.TableName#
ORDER by #ATTRIBUTES.Column#
</cfquery>
<!--- Select UNIQUE Entries to compare against the first query to get a
count of extras (This is used for display only)--->
<cfquery name="GetDistinct" datasource="#ATTRIBUTES.DataSource#"
blockfactor="50">
SELECT DISTINCT #ATTRIBUTES.Column#
FROM #ATTRIBUTES.TableName#
ORDER BY #ATTRIBUTES.Column#
</cfquery>
<!--- If displayOutput is YES output a list of duplicates (if any) --->
<cfif ATTRIBUTES.DisplayOutput IS "YES">
<!--- Extra ID's will be the first queries recordcount minus the second
queries recordcount --->
<cfset extras = GetAll.RecordCount - GetDistinct.RecordCount>
<cfoutput>
Before Purge Of Duplicates:<br>
------------------------------------------------------<br>
<b>#GetAll.RecordCount#</b> entries in the database.
<b>#GetDistinct.RecordCount#</b> are unique <i>(<b>#extras#</b>
extras)</i><br>
------------------------------------------------------<br><br>
<cfif GetAll.RecordCount EQ GetDistinct.RecordCount>
<b>There were no duplicates in the database. No data will be deleted.</b>
</cfif>
</cfoutput>
</cfif>
<!--- loop through grouping by the field chosen to check duplicates for --->
<cfoutput query="GetAll" group="#ATTRIBUTES.Column#">
<!--- get duplicate emails --->
<cfquery name="GetDups" datasource="#ATTRIBUTES.DataSource#"
blockfactor="50">
SELECT #ATTRIBUTES.PrimaryKey#, #ATTRIBUTES.Column#
FROM #ATTRIBUTES.TableName#
WHERE #ATTRIBUTES.Column# = <cfif NOT
IsNumeric(ATTRIBUTES.Column)>'</cfif>#evaluate(ATTRIBUTES.COLUMN)#<cfif NOT
IsNumeric(ATTRIBUTES.Column)>'</cfif>
</cfquery>
<!--- if the Information is entered more than once --->
<cfif GetDups.RecordCount GT 1>
<!--- Set all of the Uniquie identifiers, that are associated with the
current return, to a temporary list --->
<cfset TempList = Evaluate("ValueList(GetDups.#Attributes.PrimaryKey#)")>
<cfif ATTRIBUTES.DisplayOutput IS "YES">
<b>#Evaluate(ATTRIBUTES.Column)#</b> is in the database
<b>#getdups.recordcount#</b> times<br><br>
</cfif>
<!--- Remove ONE of the current Itmes PrimaryKeys from the list, so we
keep ONE of their entries in the database when we loop over the list to
delete --->
<cfset TempList = ListDeleteAt(TempList, 1, ",")>
</cfif>
<!--- Append the temp list we made during THIS loop to the larger
collection of Primary Keys if it was not empty--->
<cfif templist NEQ "">
<cfset FinalList = listappend(FinalList, TempList, ",")>
<!--- reset the temp list so we can start over --->
<cfset templist="">
</cfif>
</cfoutput>
<cfif ATTRIBUTES.DisplayOutput IS "YES" AND len(FinalList)>
<cfoutput>
<br>
<b>The Following list is a list of Primary Keys that will need to be
deleted to rid the table of duplicate entries</b>
<br><br>
#FinalList#
</cfoutput>
</cfif>
<!--- If ATTRIBUTES.Mode is Process, then delete all entries with A primary
key that is found in our final list --->
<cfif ATTRIBUTES.Mode IS "Process">
<!--- I did not use "IN(List)" so each delete query could be seen if debug
output is on --->
<cfloop list="#FinalList#" index="i">
<cfquery name="DeleteEntries" datasource="#ATTRIBUTES.DataSource#">
DELETE
FROM #ATTRIBUTES.TableName#
WHERE #ATTRIBUTES.PrimaryKey# = <cfif NOT IsNumeric(I)>'</cfif>#i#<cfif
NOT IsNumeric(I)>'</cfif>
</cfquery>
</cfloop>
</cfif>
"Paul Hastings" <[EMAIL PROTECTED]> wrote in message
news:[EMAIL PROTECTED]
> > Hi list, with the following script (below) I am
> > checking duplicate eMails and I delete
> > them after checking them.
>
> you don't say what db BUT if its sql server (or another db with similar
> index functionality), i'd just let the db do the hard work:
> create another table duplicating your original table but put a unique
index
> with ignore duplicate key property on your email column.
>
> CREATE UNIQUE INDEX emailIDX
> ON blabla(email)
> WITH IGNORE_DUP_KEY
>
> then simply INSERT/SELECT from the old table into the new one. sql server
> will accept the first email, but throw away (ignore) all the duplicate
ones.
>
>
>
>
>
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~|
Archives: http://www.houseoffusion.com/cf_lists/index.cfm?forumid=4
Subscription:
http://www.houseoffusion.com/cf_lists/index.cfm?method=subscribe&forumid=4
FAQ: http://www.thenetprofits.co.uk/coldfusion/faq
Your ad could be here. Monies from ads go to support these lists and provide more
resources for the community.
http://www.fusionauthority.com/ads.cfm
Unsubscribe:
http://www.houseoffusion.com/cf_lists/unsubscribe.cfm?user=89.70.4