jj, that seems to work for processing the matching lines, so i can break down the file of 200000 ftp links by letter for easier downloading
thanks

On Sunday, February 6, 2022 at 6:02:19 AM UTC-5 jj wrote:

> Find:
>
> (?<=/)(?:d([^\s/]|\\\x20)*?\.zip)\b
>
> Or commented:
>
> (?x)            (?# Free-spacing mode: allow whitespace and comments)
> (?<=/)          (?# Look behind a slash not including it in the match)
> (               (?# Start of capture \1)
>   d             (?# Literal 'd')
>   (?:           (?# Start non capturing parentheses)
>     [^\s/]      (?# NOT [whitespace or slash] character)
>     |           (?# or)
>     \\\x20      (?# Backslash escaped space)
>   )             (?# End non capturing parentheses)
>   *?            (?# Match 0 or more lazily)
>   \.            (?# Literal '.')
>   zip           (?# Literal 'zip')
> )               (?# End of capture \1)
> \b              (?# Word boundary)
>
> Should match:
> ftp://ftp.scene.org/pub/demos/artists/0xf/drunkchessboard.zip
> ftp://ftp.scene.org/pub/demos/artists/0xf/d.zip
> "ftp://ftp.scene.org/pub/demos/artists/0xf/d0xf+==&.zip"
> /path/to/unicode/files/d你好.zip
> /path/to/document\ with_escaped_space.zip
>
> Should NOT match:
> ftp://ftp.scene.org/pub/demos/artists/0xf/d.zipped -- Wrong extension.
> /path/to/document with_unescaped_space.zip -- Has unescaped space.
> document.zip -- Missing /.
>
> HTH
>
> Jean Jourdain
>
> On Sunday, February 6, 2022 at 10:39:09 AM UTC+1 Kaveh wrote:
>
>> not clear for me what you want to do. can you put a sample of input lines
>> and output needed?
>>
>> On Sat, 5 Feb 2022 at 23:09, ejonesss <[email protected]> wrote:
>>
>>> i was wondering what is the grep i would need to find all occurrences of
>>> a word that begins with
>>>
>>> ftp://ftp.scene.org/pub/demos/artists/0xf/drunkchessboard.zip
>>>
>>> for example i want to find all lines who has file of ".zip" and begins
>>> with "d"
>>>
>>> drunkchessboard.zip
>>>
>>> i got the finding .zip part ok that is how i extracted all the zips from
>>> a massive 600000 line list
>>>
>>> now the tricky part is detecting the "/d" part
>>>
>>> --
>>> This is the BBEdit Talk public discussion group. If you have a feature
>>> request or need technical support, please email "[email protected]"
>>> rather than posting here.
>>> Follow @bbedit on Twitter: <https://twitter.com/bbedit>
>>> ---
>>> You received this message because you are subscribed to the Google
>>> Groups "BBEdit Talk" group.
>>> To unsubscribe from this group and stop receiving emails from it, send
>>> an email to [email protected].
>>> To view this discussion on the web visit
>>> https://groups.google.com/d/msgid/bbedit/9029aa96-cd05-4724-8126-4ea34ef23e99n%40googlegroups.com
>>
>> --
>> Kaveh Bazargan PhD
>> Director
>> River Valley Technologies <http://rivervalley.io> ● Twitter
>> <https://twitter.com/rivervalley1000> ● LinkedIn
>> <https://www.linkedin.com/in/bazargankaveh/> ● ORCID
>> <https://orcid.org/0000-0002-1414-9098>
>> *Accelerating the Communication of Research*
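
For the per-letter split mentioned at the top of this message, here is a rough Python sketch in case it helps to do the whole thing in one pass outside BBEdit. The one-line pattern from jj's reply is copied as-is and checked against the samples given in the thread. Everything else is an assumption and not from the thread: `ANY_ZIP` is a hypothetical variant that captures whatever first character the file name has instead of a fixed `d`, and `bucket_by_first_letter` is a made-up helper name.

```python
import re
from collections import defaultdict

# Pattern from the thread, unchanged: after a slash, a name that
# starts with "d", contains no whitespace or slash (except a
# backslash-escaped space), and ends in ".zip" at a word boundary.
D_ZIP = re.compile(r"(?<=/)(?:d([^\s/]|\\\x20)*?\.zip)\b")

# Hypothetical generalisation (not from the thread): same shape, but
# the first character of the name is captured instead of fixed to "d".
ANY_ZIP = re.compile(r"(?<=/)([^\s/])(?:[^\s/]|\\\x20)*?\.zip\b")

# Sample lines quoted in the thread.
SHOULD_MATCH = [
    "ftp://ftp.scene.org/pub/demos/artists/0xf/drunkchessboard.zip",
    "ftp://ftp.scene.org/pub/demos/artists/0xf/d.zip",
    '"ftp://ftp.scene.org/pub/demos/artists/0xf/d0xf+==&.zip"',
    "/path/to/unicode/files/d你好.zip",
    r"/path/to/document\ with_escaped_space.zip",
]
SHOULD_NOT_MATCH = [
    "ftp://ftp.scene.org/pub/demos/artists/0xf/d.zipped",
    "/path/to/document with_unescaped_space.zip",
    "document.zip",
]


def bucket_by_first_letter(lines):
    """Group lines by the lower-cased first character of the first
    path segment that looks like a .zip file name."""
    buckets = defaultdict(list)
    for line in lines:
        match = ANY_ZIP.search(line)
        if match:
            buckets[match.group(1).lower()].append(line)
    return dict(buckets)


if __name__ == "__main__":
    for line in SHOULD_MATCH + SHOULD_NOT_MATCH:
        print(bool(D_ZIP.search(line)), line)
    for letter, group in sorted(bucket_by_first_letter(SHOULD_MATCH).items()):
        print(letter, len(group))
```

Each bucket could then be written to its own file (one per letter), which avoids running a separate search for every letter. Note that Python's `re` is not the same engine BBEdit uses, so this only shows that the pattern behaves the same way on these samples.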
