Tim Chase wrote:
grep 'rs10946398' chr6.txt | grep -v 'rs10946398.*rs10946398' > out.txt
Sed might allow it in one pass with something like
sed -e '/rs10946398/!d' -e '/rs10946398.*rs10946398/d' chr6.txt > out.txt
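For anyone who wants to check that the two-pass grep and the one-pass sed agree before running either on the full file, here is a small sketch on a toy file. The sample file name, the made-up IDs rs0000001 and rs0000002, and the three-column layout are assumptions for illustration only, not the real chr6.txt format:

```sh
# Build a tiny stand-in for chr6.txt (invented sample data).
tmpdir=$(mktemp -d)
sample="$tmpdir/chr6_sample.txt"
printf '%s\n' \
  'rs10946398 rs0000001 0.91' \
  'rs10946398 rs10946398 1.00' \
  'rs0000002 rs10946398 0.45' \
  'rs0000002 rs0000001 0.12' > "$sample"

# Two passes: keep lines containing the ID, then drop lines where it occurs twice.
grep 'rs10946398' "$sample" | grep -v 'rs10946398.*rs10946398' > "$tmpdir/out_grep.txt"

# One pass: delete lines without the ID, then delete lines where it occurs twice.
sed -e '/rs10946398/!d' -e '/rs10946398.*rs10946398/d' "$sample" > "$tmpdir/out_sed.txt"

# cmp is silent and returns 0 when the two outputs are byte-identical.
cmp "$tmpdir/out_grep.txt" "$tmpdir/out_sed.txt" && echo "outputs match"
```

On this toy input both commands should keep only the first and third lines, since the second line contains the ID twice and the fourth does not contain it at all.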
Still trying to migrate from Windows to Linux, but hopefully I will get it
done someday!
Since you have a fixed pattern (as Tony mentioned when suggesting
fgrep), you can do at least the first variant in native
Windows/DOS with
C:\Temp> find "rs10946398" chr6.txt > out.txt
without the need for sed/grep at all. The DOS "find" command is
a bit like "grep" with all the cool functionality removed. The
resulting out.txt would hopefully be small enough that vim/ed could
handle it.
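On the Unix side, the fixed-string search mentioned above can be sketched with grep's -F option (the modern spelling of fgrep). This is only an illustration on a toy file; the sample file name and the IDs rs0000001 and rs0000002 are invented, and the real chr6.txt layout may differ:

```sh
# Build a tiny stand-in for chr6.txt (invented sample data).
tmpdir2=$(mktemp -d)
sample2="$tmpdir2/chr6_sample.txt"
printf '%s\n' \
  'rs10946398 rs0000001 0.91' \
  'rs0000002 rs0000001 0.12' \
  'rs0000002 rs10946398 0.45' > "$sample2"

# -F treats the pattern as a literal string rather than a regular expression.
grep -F 'rs10946398' "$sample2" > "$tmpdir2/out_fixed.txt"

# -c prints the number of matching lines instead of the lines themselves.
grep -c -F 'rs10946398' "$sample2"
```

Like the "find" variant, this only does the first step (lines containing the ID at all); lines where the ID appears twice are still in the output.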
You can learn more by issuing
C:\Temp> find /?
No patterns other than fixed text, but sometimes that's all you
need. And 640k oughta be enough for anyone ;)
-tim
(now only running Windows at work, but Linux, OpenBSD and Mac OS
X at home)
I never thought of using find under cmd. I'm not very into computers, but
as far as I remember, cmd (and DOS) can generally handle only 640K of data
per batch, which would mean many hours (even days) to run the above
command on a 3.5GB file. Anyway, a test is always better than a
hypothesis, so I opened a cmd prompt and ran the command as soon as I got
your e-mail. 5-6 minutes have passed and there is still no result (I am
monitoring the out.txt file size as well). sed finished in about 70
seconds, so cmd will probably take many hours.
Nikos