I am in the process of learning this as well (BlueDragon/Lucene), so I can help you get started. It was a hurdle for me too, so hopefully everyone can benefit from what I have learned. CF uses basically the same methods, only it uses Verity instead of Lucene.
You can do a query index that gives you the exact same results as an HTML spider does if you want. Even better really, since you control what is actually indexed: instead of every item on the page being indexed, you can grab just the content you need. We used the HTML spider and did not like the results, so we now create our indexes dynamically on a local machine, then upload the index.

The main hurdle is getting the query to provide exactly the information you need. Since you are limited to one query field in the BODY attribute of the cfindex tag, you need to do some fancy footwork to get that field to contain everything.

What I do when I need to index multiple fields:

1. Create a query that contains all the fields from the products tables combined into one query, grouped by the SKU and the attributes.
2. Create a new, empty query using the fields that cfindex needs.
3. Loop over the products query and combine the fields I need indexed into the fields of the new query. Concatenate them with VAR & " " & VAR, setting BODY in the dynamic query to all of those fields.

After you build the new query, BODY will contain all the fields you need. The only thing to watch out for is to make sure "KEY" is a unique value, so you only index each item once.

You can also do grouped queries and nest the output, so you can have a SHIRT with multiple SIZES etc. As long as it is combined into the BODY field in the dynamic query, it will be added to the index.

This is not complete, but it's the basics.

One warning: TEST LOCALLY!! And don't try to do a cfsearch in the same script as the one that creates the collection. You can create a race condition on a multi-threaded server that can freeze resources and the JVM.

Also, I find it fastest to delete the collection each time I test instead of updating it. You should also consider creating indexes locally and then uploading them complete, since this is pretty intensive on huge datasets.

----------

This is a good UDF to have too:
http://www.cflib.org/udf/collectionExists

//---------------CODE -----------------//
<cfscript>
// Empty query holding the columns that cfindex will read
indexQuery = QueryNew("KEY,URLPATH,TITLE,CATEGORY,BODY,CUSTOM1,CUSTOM2,CUSTOM3,CUSTOM4");
</cfscript>

<cfoutput query="mydata">
<cfscript>
// Add a row first; QuerySetCell without a row number writes to the last row
QueryAddRow(indexQuery);
// BODY combines every field that should be searchable
QuerySetCell(indexQuery, "BODY", mydata.DESCRIPTION & " " & mydata.CATEGORY & " " & mydata.TYPE & " " & mydata.OTHER);
QuerySetCell(indexQuery, "KEY", mydata.SKU); // HAS TO BE UNIQUE TO THIS INDEXED ROW
QuerySetCell(indexQuery, "URLPATH", mydata.URL);
QuerySetCell(indexQuery, "TITLE", mydata.TITLE & " " & mydata.GARMENT & " " & mydata.COLOR);
QuerySetCell(indexQuery, "CATEGORY", "RETAIL");
QuerySetCell(indexQuery, "CUSTOM1", mydata.TCODE);
QuerySetCell(indexQuery, "CUSTOM2", mydata.PRICE);
QuerySetCell(indexQuery, "CUSTOM3", mydata.GARMENT);
QuerySetCell(indexQuery, "CUSTOM4", mydata.SIZE);
</cfscript>
</cfoutput>

Then you can just pass the dynamic query to cfindex and it will use your combined BODY and custom fields:

<cfindex
    query="indexQuery"
    collection="#collection#"
    action="Update"
    type="Custom"
    urlpath="URLPATH"
    key="KEY"
    title="TITLE"
    custom1="CUSTOM1"
    custom2="CUSTOM2"
    custom3="CUSTOM3"
    custom4="CUSTOM4"
    body="BODY"
    category="RETAIL"
/>
//---------------E/O CODE -----------------//

--
/Kevin Pepperman

"They who can give up essential liberty to obtain a little temporary safety, deserve neither liberty nor safety." - Benjamin Franklin

Archive: http://www.houseoffusion.com/groups/cf-talk/message.cfm/messageid:332480

