[gentoo-user] join two tab-separate-value files without join field

Zhang Weiwu Fri, 23 May 2008 21:19:13 -0700

Hi.

I got a datasheet from a colleague in MS Excel format, and I intend to
process it with awk and sed. The problem is that he sent me two Excel
files, each with 2134 records. There should really be only one Excel
file with 2134 rows and 295 columns, but MS Excel can handle only 256
data columns, so he split the datasheet vertically in order to send it
to me.


I have saved both files in tab-separated-value format. How do I join them?

I could have used join(1), but that requires a join field, an ID of some
sort. I thought of this:

$ grep -n '' left.tsv | sed 's/:/\t/' > left.forjoin
$ grep -n '' right.tsv | sed 's/:/\t/' > right.forjoin
$ join -t "    " left.forjoin right.forjoin > result.tsv
(Note that join's -t parameter needs a literal tab between the quotes,
which I somehow have to manage to type. The line number used as the join
field also ends up as the first column of result.tsv.)
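
For reference, here is a minimal sketch of those steps on two tiny sample
files. The sample data and the scratch directory are made up for
illustration, and the trailing cut step is an addition that drops the
line-number column again. The tab is kept in a shell variable so it does
not have to be typed between quotes. join(1) expects its inputs to be
sorted on the join field, so this relies on both files carrying the same
line numbers in the same order; I have only tried to reason about it for
inputs like these.

```shell
# Work in a scratch directory so the sample files do not clobber real data.
workdir=$(mktemp -d)
cd "$workdir" || exit 1

# Two tiny stand-ins for left.tsv and right.tsv (hypothetical sample data).
printf 'a1\ta2\nb1\tb2\nc1\tc2\n' > left.tsv
printf 'a3\nb3\nc3\n' > right.tsv

# Keep a literal tab in a variable instead of typing it between quotes.
tab=$(printf '\t')

# Prefix every line with its line number followed by a tab.
grep -n '' left.tsv  | sed "s/:/$tab/" > left.forjoin
grep -n '' right.tsv | sed "s/:/$tab/" > right.forjoin

# Join on the line-number field, then remove that field with cut.
join -t "$tab" left.forjoin right.forjoin | cut -f 2- > result.tsv

cat result.tsv
```

Each line of result.tsv should then hold the columns of left.tsv followed
by the columns of right.tsv for the same row.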

Yes, this achieves what I want, but it looks complex. Is there a simpler
way? Thanks in advance.

I know OpenOffice 3.0 can handle up to 1024 data columns. It's difficult
to convince anyone to switch to OOo because here in China MS Office
costs only $0. I could also use OOo 3.0 to do the join, but I wish to
know the command-line way :)

-- 

Real Softservice

Huateng Tower, Unit 1788
Jia 302 3rd area of Jinsong, Chao Yang

Tel: +86 (10) 8773 0650 ext 603
Mobile: 135 9950 2413
http://www.realss.com

