Hi all,

Just looking for some general guidance on how I would approach this task.

If I have two datasets containing items, what is currently the best way to
detect duplicates between them using Mahout? To begin with, I intend to
match on the text similarity of item names.

I'm willing to write Java wherever necessary, but I want to be sure to
avoid "re-coding the wheel", so to speak.

Cheers,
-dcf
