You can do this in multiple ways.

In a shell script, sort the file so identical rows end up next to each other, then count the repeats; any line reported with a count greater than 1 is a duplicate:

    sort <filename> | uniq -c

If you have to use Python, the faster way would be to load the CSV into a pandas DataFrame, which lets you use DataFrame.duplicated().
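A minimal sketch of that, assuming the data sits in a file called people.csv with a header row (the file name is a placeholder):

    import pandas as pd

    # "people.csv" is a placeholder; use your actual file name.
    df = pd.read_csv("people.csv")

    # duplicated(keep=False) marks every copy of a repeated row as True,
    # so both Kumar rows from your sample would be selected.
    print(df[df.duplicated(keep=False)])

If you only want the second and later copies, drop keep=False: the default keep="first" leaves the first occurrence unmarked.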
If you don't want to use pandas, you can loop through the CSV and build a dict (hash map) with each row as the key and its count as the value.
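A sketch of that approach using collections.Counter, which is just a dict specialised for counting (again assuming the placeholder file name people.csv):

    import csv
    from collections import Counter

    counts = Counter()
    with open("people.csv", newline="") as f:
        reader = csv.reader(f)
        next(reader)                  # skip the header row
        for row in reader:
            counts[tuple(row)] += 1   # tuples are hashable, lists are not

    # Any row seen more than once is a duplicate.
    for row, n in counts.items():
        if n > 1:
            print(n, row)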
Hope this helps.

On Sun, Feb 11, 2018 at 9:07 AM, Saravanan Muthu <saravana4...@gmail.com> wrote:
> Hello All,
> I have a CSV with multiple columns, and I need to figure out the
> duplicate entries. I have imported the CSV and assigned the rows to a
> dictionary. Please share a logic to find the duplicates. Sample data is
> below:
>
> Name   Age  Employer
> Kumar  28   133678
> Kumar  28   133678
> Anil   42   133567
>
> The Kumar entry needs to be found.

_______________________________________________
Chennaipy mailing list
Chennaipy@python.org
https://mail.python.org/mailman/listinfo/chennaipy