Hi all,

I'm looking for some general guidance on how to approach this task.

If I have two datasets containing items, what is currently the best way to detect duplicates between them using Mahout? To begin with, I intend to match on the text similarity of item names. I'm willing to write Java wherever necessary, but I want to avoid "re-coding the wheel", so to speak.

Cheers,
-dcf
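
P.S. To make "item name text similarity" concrete, here is a minimal plain-Java sketch of the kind of matching I have in mind. It does not use Mahout at all. The class name `NameDedupSketch`, the choice of character trigrams with Jaccard similarity, and the threshold are all my own assumptions for illustration, not anything taken from a library.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Locale;
import java.util.Set;

// Hypothetical helper, for illustration only; not part of Mahout.
class NameDedupSketch {

    // Normalise a name (lower-case, punctuation collapsed to single spaces)
    // and break it into character trigrams, e.g. "ipod" -> {ipo, pod}.
    static Set<String> trigrams(String name) {
        String s = name.toLowerCase(Locale.ROOT)
                       .replaceAll("[^a-z0-9]+", " ")
                       .trim();
        Set<String> grams = new HashSet<>();
        if (s.length() < 3) {
            // Very short names are kept whole so they can still match exactly.
            if (!s.isEmpty()) {
                grams.add(s);
            }
            return grams;
        }
        for (int i = 0; i + 3 <= s.length(); i++) {
            grams.add(s.substring(i, i + 3));
        }
        return grams;
    }

    // Jaccard similarity of the two trigram sets: |A ∩ B| / |A ∪ B|, in [0, 1].
    static double jaccard(String a, String b) {
        Set<String> ga = trigrams(a);
        Set<String> gb = trigrams(b);
        if (ga.isEmpty() && gb.isEmpty()) {
            return 0.0; // assumption: two empty names are not treated as a match
        }
        Set<String> intersection = new HashSet<>(ga);
        intersection.retainAll(gb);
        int unionSize = ga.size() + gb.size() - intersection.size();
        return (double) intersection.size() / unionSize;
    }

    // Brute-force comparison of every left name against every right name.
    // Returns {leftName, rightName} pairs whose similarity meets the threshold.
    static List<String[]> candidateDuplicates(List<String> left,
                                              List<String> right,
                                              double threshold) {
        List<String[]> pairs = new ArrayList<>();
        for (String l : left) {
            for (String r : right) {
                if (jaccard(l, r) >= threshold) {
                    pairs.add(new String[] {l, r});
                }
            }
        }
        return pairs;
    }
}
```

The all-pairs loop is quadratic in the number of items, so it is only workable for small datasets. What I am really after is whether Mahout already provides a scalable way to do this sort of comparison.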
