I don't know of a tool that does this, but URL parsing comes up in a lot of programming tasks. If you know Python, writing a small script that returns specific pieces of a URL is trivial.
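
As a rough sketch of what such a script could look like with the standard library's urllib.parse: the function name `strip_tracking` and the list of tracking parameters below are my own assumptions for illustration, not an existing tool, and the list is certainly not exhaustive.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit

# Assumed, non-exhaustive examples of tracking parameters; extend as needed.
TRACKING_NAMES = {"fbclid", "gclid"}
TRACKING_PREFIXES = ("utm_",)


def strip_tracking(url):
    """Return url with the tracking query parameters listed above removed."""
    parts = urlsplit(url)
    # Keep every query parameter that is not recognized as a tracking tag.
    kept = [
        (name, value)
        for name, value in parse_qsl(parts.query, keep_blank_values=True)
        if name not in TRACKING_NAMES and not name.startswith(TRACKING_PREFIXES)
    ]
    # Rebuild the URL with only the surviving parameters.
    return urlunsplit(parts._replace(query=urlencode(kept)))


if __name__ == "__main__":
    # Prints https://foo/bar/baz?id=7
    print(strip_tracking("https://foo/bar/baz?id=7&fbclid=abc&utm_source=mail"))
```

Unlike deleting everything after the question mark, this keeps query parameters that the page may actually need. One caveat: the query string is decoded and re-encoded along the way, so the escaping of the remaining parameters may come out normalized rather than byte-for-byte identical.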
https://docs.python.org/3/library/urllib.parse.html#module-urllib.parse

Qt5 (and probably GTK too) has similar URL parsing mechanisms, and you could probably find similar functionality in most high-level scripting languages through the appropriate module or library.

Now, whether a tool already exists that does this in a production-friendly way... probably not, just example apps and code. The 'QUrl' object within Qt5 does a nice job of abstracting the components of a network location in C++, so someone may have put up a quick little demo app on GitHub.

On Tue, Feb 5, 2019 at 8:50 PM David Barr <[email protected]> wrote:

> Hey, Randall,
>
> To be pedantic, the tracking tags and such are all stuff that appear
> after the question mark delimiting character in the HTTP PUT request,
> right? `https://foo/bar/baz?evil_tag=evil`
>
> The trick then, is to select only the lines containing question marks,
> and then delete from the question mark to the end of the line. Try this:
>
> ```
> sed -e '/\?/ s/\?.*$//' <file>
> ```
>
> Pedantry again: That's "select lines containing a (backslash escaped)
> question mark," followed by "substitute all characters from and
> including that (backslash escaped) question mark to the end of the line
> ($) with nothing."
>
> I haven't tested this on a file, so I deserve whatever mockery I get if
> I missed something.
>
> Cheers!
> David
>
> On 2/5/19 2:48 PM, logical american wrote:
> > Hi:
> >
> > Is there a linux tool which cleans up the URLs in a text file (I
> > believe Western unicode encoding) so that all the tracking tags,
> > fbclid, etc are removed and the pure URL is left in the text?
> >
> > In one recent email I received, there were 28 govdelivery.com tags and
> > others embedded inside the URLs, and I don't wish the posted material
> > to provide an easy access for the website to be tracked.
> >
> > Thanks
> >
> > Randall
> >
>
> _______________________________________________
> PLUG mailing list
> [email protected]
> http://lists.pdxlinux.org/mailman/listinfo/plug
