Re: [Tutor] csv merge with different column title

2018-04-14 Thread Peter Otten
Pareshkumar Panchal wrote:

> I am trying to merge two csv files with following condition.
> 
> filea.csv column( a1,a2,a3)
> fileb.csv column( b1,b2,b3)
> 
> i wanted to merge if a1 & b1 reference (as the title is different but
> contents are valid for comparison). 

If the column title is the problem you can rename it.

> also the result i need is
> a1,a2,a3,b1,b2 (not b3).

Then delete that column either before or after the merge.
 
> i am using openpyxl,csv modules at this moment.
> 
> any help appreciated.

Here's a demo using pandas:

>>> import pandas as pd

Read the original files:

>>> left = pd.read_csv("left.csv")
>>> left
  alpha   beta  gamma
0   fooonetwo
1   bar  three   four
2   baz   fivesix
3   ham  seven  eight

[4 rows x 3 columns]
>>> right = pd.read_csv("right.csv")
>>> right
  delta epsilon   zeta
0   foo the  quick
1   bar   brownfox
2   baz   jumps   over
3  spam the   lazy

[4 rows x 3 columns]

Merge by alpha/delta column:

>>> left.merge(right, left_on="alpha", right_on="delta", how="outer")
  alpha   beta  gamma delta epsilon   zeta
0   fooonetwo   foo the  quick
1   bar  three   four   bar   brownfox
2   baz   fivesix   baz   jumps   over
3   ham  seven  eight   NaN NaNNaN
4   NaNNaNNaN  spam the   lazy

[5 rows x 6 columns]
>>> both = _
>>> import sys

Write to file (or stream as I did for demonstration purposes):

>>> both.to_csv(sys.stdout)
,alpha,beta,gamma,delta,epsilon,zeta
0,foo,one,two,foo,the,quick
1,bar,three,four,bar,brown,fox
2,baz,five,six,baz,jumps,over
3,ham,seven,eight,,,
4spam,the,lazy

Delete a column you don't need (could have done that earlier):

>>> del both["zeta"]
>>> both.to_csv(sys.stdout)
,alpha,beta,gamma,delta,epsilon
0,foo,one,two,foo,the
1,bar,three,four,bar,brown
2,baz,five,six,baz,jumps
3,ham,seven,eight,,
4spam,the


___
Tutor maillist  -  Tutor@python.org
To unsubscribe or change subscription options:
https://mail.python.org/mailman/listinfo/tutor


Re: [Tutor] csv merge with different column title

2018-04-14 Thread Alan Gauld via Tutor
On 14/04/18 04:32, Pareshkumar Panchal wrote:

> I am trying to merge two csv files with following condition.
> 
> filea.csv column( a1,a2,a3)
> fileb.csv column( b1,b2,b3)
> 
> i wanted to merge if a1 & b1 reference (as the title is different but
> contents are valid for comparison). also the result i need is
> a1,a2,a3,b1,b2 (not b3).

This doesn't have much to do with the data being CSV but
applies to merging any data structures. Lets start by
assuming the data is small enough to read both filers
into memory as two lists. (If not the same principles
apply but you have to do it from disk which will be
slower)

First read your two lists A and B.
Sort both lists based on the first field.
Start at the top of list A and search for a corresponding item in B.
step through B processing each item until the field no longer matches
step onto the next item in A and repeat the above process

When you reach the end of either list you are done.

There are some other (cleverer) ways to do this but that
simple approach is how batch data processing has worked
for the last 60 years. It's simple to write and debug.

Try it, if it doesn't work for you show us what you've
tried and any errors.

-- 
Alan G
Author of the Learn to Program web site
http://www.alan-g.me.uk/
http://www.amazon.com/author/alan_gauld
Follow my photo-blog on Flickr at:
http://www.flickr.com/photos/alangauldphotos


___
Tutor maillist  -  Tutor@python.org
To unsubscribe or change subscription options:
https://mail.python.org/mailman/listinfo/tutor


[Tutor] csv merge with different column title

2018-04-14 Thread Pareshkumar Panchal
Hi,

I am trying to merge two csv files with following condition.

filea.csv column( a1,a2,a3)
fileb.csv column( b1,b2,b3)

i wanted to merge if a1 & b1 reference (as the title is different but
contents are valid for comparison). also the result i need is
a1,a2,a3,b1,b2 (not b3).

i am using openpyxl,csv modules at this moment.

any help appreciated.

Thank you,
P
___
Tutor maillist  -  Tutor@python.org
To unsubscribe or change subscription options:
https://mail.python.org/mailman/listinfo/tutor