On Thu, May 07, 2015 at 11:58:14AM -0400, Jason Woofenden wrote:
> On 2015-05-07 05:07PM, Jochen Sprickerhof wrote:
> > * Jason Woofenden <[email protected]> [2015-05-07 10:09]:
> > > pdftohtml -stdout foo.pdf | sed -ne 's/href="\([^"]\+\)"/\n\1\n/g' -e
> > > 's/\(^[^\n]*\n\|\(\n\)\)\([^\n]*\)\n[^\n]*/\2\3/gp'
> >
> > I would use grep ;). Using my urlselct from [1] I would write:
> >
> > pdftotext foo.pdf - | urlselect
> >
> > Cheers Jochen
> >
> > [1] http://lists.suckless.org/dev/1504/26641.html
>
> Ooh, grep -o is great!

It is also a non-standard extension.
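
Where portability matters, the same kind of extraction can be sketched with awk, whose match() function and RSTART/RLENGTH variables are specified by POSIX. This is only a rough sketch: the function name extract_urls and the URL pattern are assumptions made for illustration and are not taken from urlselect.

```shell
# Hypothetical helper: print every http(s) URL found on stdin, one per line.
# Uses only POSIX awk features (match, RSTART, RLENGTH, substr).
extract_urls() {
    awk '{
        line = $0
        # Repeatedly take the leftmost match, print it, then continue
        # scanning the remainder of the line.
        while (match(line, /https?:\/\/[^ "<>]+/)) {
            print substr(line, RSTART, RLENGTH)
            line = substr(line, RSTART + RLENGTH)
        }
    }'
}
```

It would be used in the same position as the grep-based filter, e.g. pdftotext foo.pdf - | extract_urls. The pattern is deliberately simple and will miss or over-match some URLs.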
