Many thanks to Stephan Fabel for sharing his version, which goes about this differently and, I might add, more efficiently.
/brian chee

---------- Forwarded message ----------
From: Stephan Fabel <sfa...@hawaii.edu>
Date: Thu, Jul 17, 2014 at 2:51 PM
Subject: Re: [UHM-NETADMIN] Sharing a cute script for FTP based file grooming
To: Brian Chee <c...@hawaii.edu>

-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

Hi Brian,

On 07/17/2014 02:07 PM, Brian Chee wrote:
> *<Insert weird file listing off cRIO>*
[..snip..]
> *</Insert weird file listing off cRIO>* **NOTE: I'm searching for a
> lessthan and then a greaterthan symbol since I don't have spaces to
> search for in the string.*

I've had good success using html2text.py
(http://www.aaronsw.com/2002/html2text/) in conjunction with hearty sed
and awk. For example, to extract the file names:

html2text.py weirdfilelisting.html | sed '/^ *$/d;/^\*/d;/^\#/d' | awk 'NR>1{print $1}' RS='[' FS=']'

In essence, define the exclude filter with sed, and extract with awk (or
do both in perl...;-)).

Your wget outputs to a file, not sure if that's needed. You could
probably just keep it in the pipe:

wget --no-remove-listing --no-verbose -O - \
  ftp://User:password@ipaddress/MainData/. 2>/dev/null | \
  html2text.py | \
  sed '/^ *$/d;/^\*/d;/^\#/d' | \
  awk 'NR>1{print $1}' RS='[' FS=']' | \
  sort

If you're just worried about filling up, you may not want to go by days,
either. An alternative would be to just take the last 5 logs, no matter
how old they are, using tail (note I'm reversing the sort):

wget --no-remove-listing --no-verbose -O - \
  ftp://User:password@ipaddress/MainData/. 2>/dev/null | \
  html2text.py | \
  sed '/^ *$/d;/^\*/d;/^\#/d' | \
  awk 'NR>1{print $1}' RS='[' FS=']' | \
  sort -r | \
  tail -n 5 | \
  paste -sd" " -

You could now iterate over them:

for i in $(wget --no-remove-listing --no-verbose -O - \
  ftp://User:password@ipaddress/MainData/. 2>/dev/null | \
  html2text.py | \
  sed '/^ *$/d;/^\*/d;/^\#/d' | \
  awk 'NR>1{print $1}' RS='[' FS=']' | \
  sort | \
  tail -n 5 | \
  paste -sd" " -); do
lftp ftp://User:password@IPaddress/MainData << EOLSTUFF
rm $i
EOLSTUFF
done

I realize that this is very similar to your script, but I found HTML
incredibly stupid to work with, and hate deleting temporary files (I
find it good practice not to do a 'rm' call in a script that I hacked
together in 5 minutes).

Anyway, just for your consideration.

Cheers,
Stephan

-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1
Comment: Using GnuPG with Thunderbird - http://www.enigmail.net/

iQEcBAEBAgAGBQJTyG+IAAoJEJs+ZpPmKxIxFmYH+wbnEqEWtJ3A2glWEOlAvACF
IDYkNeS0MyiWZg8iF4V6DJd4/54YhHC7ZeuqzA54PpKTiwhZMvtZ98HzXSC7T7CL
ehCXZw+wD3mp8k2CsmjXPjzPfghtte/bqGC/BCrJglQpPfCmJeZeF29y7TFWYLHL
xXbGNpQNqiDrJLjNzN2EF3nFuBIfFfU1gmNtf9Ay0m/wW5ArHkfV8R0YUpUUKL+I
bl9noVp2rOxV0D+mLuXATRilHl0yNJGXESSQMEQQNNPBHtr1G0jDIYFKG/lBk7oP
NInz7eZQAgF0hUNCQyPWtUHhsY+nsBaGr4DIbcAHGaxplw7rB4ndSyjW1qT3sYU=
=DbKj
-----END PGP SIGNATURE-----

-- 
********************************************
University of Hawaii SOEST
Advanced Network Computing Laboratory (ANCL)
Brian Chee <c...@hawaii.edu>
2525 Correa Road, HIG 500
Honolulu, HI 96822
Office: 808-956-5797
_______________________________________________
LUAU@lists.freesoftwarehawaii.org mailing list
http://lists.freesoftwarehawaii.org/listinfo.cgi/luau-freesoftwarehawaii.org
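
For anyone who wants to try the sed/awk filtering from the message above without an FTP server or html2text.py at hand, here is a minimal sketch. It assumes html2text.py turns the listing into Markdown-like text with a "#" heading, a "* * *" rule and one "[name](url)" link per file. The sample listing, the file names and the helper functions extract_names and oldest are hypothetical and only there for illustration. The sed pattern /^#/d is used in place of /^\#/d, which should match the same lines.

```shell
#!/bin/sh
# Hypothetical sample of what html2text.py might emit for an FTP listing.
# The layout and the file names are assumptions, not real cRIO output.
sample_listing='# Index of /MainData

* * *

2014 Jul 15 10:00  File  [log_20140715.csv](ftp://host/MainData/log_20140715.csv) (1024 bytes)
2014 Jul 16 10:00  File  [log_20140716.csv](ftp://host/MainData/log_20140716.csv) (2048 bytes)
2014 Jul 17 10:00  File  [log_20140717.csv](ftp://host/MainData/log_20140717.csv) (4096 bytes)'

# Drop blank lines, "*" rules and "#" headings, then split records on "["
# and fields on "]" so that $1 of every record after the first is the
# link label, i.e. the file name.
extract_names() {
    sed '/^ *$/d;/^\*/d;/^#/d' |
    awk 'NR>1{print $1}' RS='[' FS=']'
}

# Reverse-sort the names and keep the last N lines, which are the names
# that sort first (the oldest ones when the name embeds a date).
oldest() {
    LC_ALL=C sort -r | tail -n "$1"
}

printf '%s\n' "$sample_listing" | extract_names | oldest 2
```

Swapping the printf line for the wget and html2text.py stages from the message would feed the same two helpers from a live listing; whether the real listing matches the assumed layout is worth checking before anything gets deleted.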