Pareshkumar Panchal wrote:

> I am trying to merge two csv files with following condition.
> filea.csv column( a1,a2,a3)
> fileb.csv column( b1,b2,b3)
> i wanted to merge if a1 & b1 reference (as the title is different but
> contents are valid for comparison). 

If the column title is the problem you can rename it.

> also the result i need is
> a1,a2,a3,b1,b2 (not b3).

Then delete that column either before or after the merge.
> i am using openpyxl,csv modules at this moment.
> any help appreciated.

Here's a demo using pandas:

>>> import pandas as pd

Read the original files:

>>> left = pd.read_csv("left.csv")
>>> left
  alpha   beta  gamma
0   foo    one    two
1   bar  three   four
2   baz   five    six
3   ham  seven  eight

[4 rows x 3 columns]
>>> right = pd.read_csv("right.csv")
>>> right
  delta epsilon   zeta
0   foo     the  quick
1   bar   brown    fox
2   baz   jumps   over
3  spam     the   lazy

[4 rows x 3 columns]

Merge by alpha/delta column:

>>> left.merge(right, left_on="alpha", right_on="delta", how="outer")
  alpha   beta  gamma delta epsilon   zeta
0   foo    one    two   foo     the  quick
1   bar  three   four   bar   brown    fox
2   baz   five    six   baz   jumps   over
3   ham  seven  eight   NaN     NaN    NaN
4   NaN    NaN    NaN  spam     the   lazy

[5 rows x 6 columns]
>>> both = _
>>> import sys

Write to file (or stream as I did for demonstration purposes):

>>> both.to_csv(sys.stdout)

Delete a column you don't need (could have done that earlier):

>>> del both["zeta"]
>>> both.to_csv(sys.stdout)

Tutor maillist  -
To unsubscribe or change subscription options:

Reply via email to