On May 1, 5:32 pm, "Kam-Hung Soh" <[EMAIL PROTECTED]> wrote:
> On Fri, 02 May 2008 09:13:35 +1000, John Bartley K7AAY
> <[EMAIL PROTECTED]> wrote:
> > On May 1, 4:02 pm, [EMAIL PROTECTED] (Bob Proulx) wrote:
> >> John Bartley K7AAY wrote:
> >> > I need to print $1 lines from a file, and then delete that number of
> >> > lines.
> >> > $1 has been derived in the prior line with
> >> > wc -l sourcefile.txt | awk '{$1 /= 4 ; $1 = int($1) ; print $1 }'
> >>
> >> That all looks okay. But I would probably personally do it all in the
> >> shell. Try this:
> >>
> >> echo $(( $(wc -l < sourcefile.txt) / 4 ))
> >>
> >> > I've tried numerous awk and sed statements, a la:
> >> >
> >> > sed -e -n "$1,p" sourcefile.txt > list.1
> >> > sed -i "$1d" sourcefile.txt
> >> > sed $1q list.txt > list.1 & sed -i $1d sourcefile.txt
> >> > awk "{(FNR < $1); print}" sourcefile.txt > list.1
> >>
> >> Try this:
> >>
> >> l=$(( $(wc -l < sourcefile.txt) / 4 ))
> >> sed --in-place "1,${l}d" sourcefile.txt
> >>
> >> Bob
> >
> > Dangit, those don't work as you expected with XP and GNUwin32
> >
> > echo $(( $(wc -l < sourcefile.txt) / 4 ))
> > The system cannot find the file specified.
> >
> > l=$(( $(wc -l < sourcefile.txt) / 4 ))
> > The system cannot find the file specified.
> >
> > sed --in-place "1,${l}d" sourcefile.txt
> > sed: -e expression #1, char 7: extra characters after command
>
> To set the output of a command to a variable in cmd.exe, use this hack:
>
> for /f %i in ('"<command>"') do set VARIABLE=%i
>
> So for your problem, here's a solution:
>
> for /f %i in ('"wc -l sourcefile.txt"') do set /a LINES=%i/4
> split -l %LINES% sourcefile.txt
>
> "set /a" instructs cmd.exe to evaluate a numerical expression.
>
> PS. John, I replied to your original message in alt.msdos.batch.nt (I was
> reading newsgroups in alphabetical order).
>
> --
> Kam-Hung Soh <a href="http://kamhungsoh.com/blog">Software Salariman</a>

Thank you, Software Salariman! You're my hero du jour!

Now, my eight-CPU box can go run two processes each using the four
outfile-n.txt lists....

Here's the final module with a few tweaks, changing to GNUwin32 csplit
(which works) from split (which returned this error):

c:/progra~1/gnuwin32/bin/split.exe: invalid number
Try 'c:/progra~1/gnuwin32/bin/split.exe --help' for more information

Hope someone else can use this:

rem SPLITTER.BAT which lives in C:\Program Files\GNUwin32\bin - that directory is in the PATH
cls
echo y | time | grep current | awk '{print $5,$6,$7}' > start
echo off
c:
cd\
net use V: /delete
net use V: \\server.division.corporation.com\area\directory\subdir /persistent:NO

REM obtain list of files with fully qualified UNC filenames and use sed to strip off leading blanks
attrib \\server.division.corporation.com\area\directory\subdir\*.* /s | sed "s/^...........//" > rawlist.txt

REM Grep looks for lines with specific extensions and copies them to LIST.TXT
grep "\.doc$" rawlist.txt > list.txt
grep "\.docx$" rawlist.txt >> list.txt
grep "\.wp$" rawlist.txt >> list.txt
grep "\.rtf$" rawlist.txt >> list.txt
grep "\.wpd$" rawlist.txt >> list.txt
grep "\.wks$" rawlist.txt >> list.txt
grep "\.txt$" rawlist.txt >> list.txt
grep "\.odb$" rawlist.txt >> list.txt

REM Count all the lines in list.txt, divide by 4 and put value in LINES
for /f %%i in ('"wc -l list.txt"') do set /a LINES=%%i/4

REM use GNUwin32 to divide list.txt into four chunks, not splitting lines
csplit -f outfile- -n 1 list.txt %LINES% {2}

REM rename divided list files because I don't grok CSPLIT's syntax for how to do that
ren outfile-? outfile-?.txt

REM Show original list length, filtered list length and length of each divided list file
wc -l rawlist.txt
wc -l list.txt
wc -l outfile-?.txt

REM Show start & end time for the run: 4.5 secs to find 3,000 files, filter and divide
echo y | time | grep current | awk '{print $5,$6,$7}'
type start
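
For anyone reading this on a POSIX shell rather than cmd.exe, the same
count / divide / csplit steps might look roughly like the sketch below.
It is only an illustration under assumptions: the twelve-line list.txt
is made-up sample data, the temporary work directory is hypothetical,
and nothing here was run on the XP box. Note that csplit cuts just
before the given line number, so the first piece comes out one line
shorter than the computed value and the last piece takes the remainder.

```shell
#!/bin/sh
# Sketch only: POSIX-shell take on the count / divide / csplit steps.
# The work directory and the sample list.txt are hypothetical.
workdir=$(mktemp -d)
cd "$workdir" || exit 1

# Sample data standing in for the filtered file list (12 lines)
awk 'BEGIN { for (i = 1; i <= 12; i++) print "file" i ".doc" }' > list.txt

# Count the lines and divide by 4 (integer division), as Bob suggested
lines=$(( $(wc -l < list.txt) / 4 ))

# Cut before line $lines, then repeat twice more: four pieces in all,
# named outfile-0 through outfile-3; -s suppresses the byte counts
csplit -s -f outfile- -n 1 list.txt "$lines" '{2}'

# Add the .txt extension, like the REN step in the batch file
for f in outfile-?; do
    mv "$f" "$f.txt"
done

# Show the length of each piece
wc -l outfile-?.txt
```

The quotes around {2} just keep the shell from touching the braces;
cmd.exe does not need them.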