On 01/26/2018, at 20:56, Doug Lerner <d...@lerner.net> wrote:
> What I would like to do is find everything between the ID_User_ and *-find 
> (e.g. .5a82483a in this example) and be left with a file where each line 
> contains just that userId. After that I can sort it, remove duplicates, etc.
> 
> Is there a sequence of things I can do, using grep search patterns, and so 
> on, to create a file from this file containing just the userIds?


Hey Doug,

This is the sort of job Perl is very good at.


#!/usr/bin/env perl -sw
use v5.010;
use open qw(:std :utf8);
use utf8;
# ----------------------------------------------------------------
# Auth: Christopher Stone
# dCre: 2018/02/04 22:00
# dMod: 2018/02/04 22:16 
# Task: Find a regular expression per line, sort, and remove duplicates.
# Tags: @Shell, @Script, @Find, @Regular, @Expression, @Per, @Line, @Sort,
#       @Remove, @Duplicates
# ----------------------------------------------------------------

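my %Hash;    # Keys are the captured user IDs; hash keys are unique by nature.

while (<>) {
    # Capture the shortest run of text between "ID_User=" and "-find".
    if ( /ID_User=(.*?)-find/ ) {
        $Hash{$1} = 1;
    }
}

# Print each distinct ID once, in sorted order.
foreach my $Key (sort keys %Hash) {
    say $Key;
}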


The script processes a 100,000-line file in about a second on my old 2010 
MacBook Pro.


Now just for fun let's try that with a 1-line Bash script.


#!/usr/bin/env bash
# Export so that sed and sort actually inherit the C locale.
export LC_ALL='C'

# Replace the whole line with just the text between "ID_User=" and "-find".
sed -En 's!.*ID_User=(.*)-find.*!\1!p' | sort -u


Run either one of these as a BBEdit text-filter 
<http://bbeditextras.org/wiki/index.php?title=Text_Filters>.

If you want to keep the original data file, run the filter on a copy.
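
To try the idea outside BBEdit first, here is a small sketch you can paste into a terminal. The sample lines are invented for illustration and simply assume the "ID_User=<id>-find" shape the scripts above look for. The function name extract_ids is hypothetical, and the sed expression is a variant that replaces the whole line with the captured ID, so any text around the match is dropped.

```shell
#!/usr/bin/env bash
# Sketch only: the sample data below is made up; real log lines may differ.
export LC_ALL=C

extract_ids() {
    # Print the text between "ID_User=" and "-find" for each matching line,
    # then sort the results and drop duplicates.
    sed -En 's!.*ID_User=(.*)-find.*!\1!p' | sort -u
}

# Four hypothetical lines: two distinct IDs, one repeat, one without an ID.
sample='GET /app?ID_User=.5a82483a-find HTTP/1.1
GET /app?ID_User=.1b2c3d4e-find HTTP/1.1
GET /static/logo.png HTTP/1.1
GET /app?ID_User=.5a82483a-find HTTP/1.1'

result=$(printf '%s\n' "$sample" | extract_ids)
printf '%s\n' "$result"
```

With that sample the output should be two lines, .1b2c3d4e and then .5a82483a. Note that both patterns in the sed expression are greedy, so a line containing more than one "ID_User=" or more than one "-find" may not give the same result as the Perl script's non-greedy match.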

--
Best Regards,
Chris

-- 
This is the BBEdit Talk public discussion group. If you have a 
feature request or would like to report a problem, please email
"supp...@barebones.com" rather than posting to the group.
Follow @bbedit on Twitter: <http://www.twitter.com/bbedit>