On 01/26/2018, at 20:56, Doug Lerner wrote:
> What I would like to do is find everything between the ID_User_ and *-find
> (e.g. .5a82483a in this example) and be left with a file where each line
> contains just that userId. After that I can sort it, remove duplicates, etc.
>
> Is there a sequence of things I can do, using grep search patterns, and so
> on, to create a file from this file containing just the userIds?
Hey Doug,
This is the sort of job Perl is very good at.
#!/usr/bin/env perl -sw
use v5.010;
use open qw(:std :utf8);
use utf8;
# ----------------------------------------------------------------
# Auth: Christopher Stone
# dCre: 2018/02/04 22:00
# dMod: 2018/02/04 22:16
# Task: Find a regular expression per line, sort, and remove duplicates.
# Tags: @Shell, @Script, @Find, @Regular, @Expression, @Per, @Line, @Sort,
#       @Remove, @Duplicates
# ----------------------------------------------------------------
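my %Hash;    # keys are the user IDs seen so far; hash keys are unique

while (<>) {
    # Capture the shortest run of text between "ID_User=" and "-find".
    if ( /ID_User=(.*?)-find/ ) {
        $Hash{$1} = 1;
    }
}

# Print each distinct ID once, in sorted order.
foreach my $Key (sort keys %Hash) {
    say $Key;
}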
The script processes a 100,000 line file in about a second on my old 2010
MacBook Pro.
Now just for fun let's try that with a 1-line Bash script.
#!/usr/bin/env bash
export LC_ALL='C'
# Match the whole line so the text around the ID is dropped as well.
sed -En '/ID_User=.*-find/{ s!.*ID_User=(.*)-find.*!\1!;p; }' | sort -u
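For a quick sanity check of the sed approach, here is a minimal sketch run against a few made-up lines. The sample log format, the IDs, and the helper name extract_ids are all invented for illustration; only the ID_User=...-find pattern comes from the scripts above. The substitution here matches the whole line, so text before and after the ID is dropped, and it is greedy, so it assumes "-find" appears only once per line.

```shell
#!/usr/bin/env bash
export LC_ALL='C'

# Hypothetical helper: reads lines on stdin, prints the unique IDs, sorted.
extract_ids() {
  sed -En 's!.*ID_User=(.*)-find.*!\1!p' | sort -u
}

# Made-up sample input: two lines share an ID, one line has no ID at all.
sample='GET /x?ID_User=5a82483a-find HTTP/1.1
GET /y?ID_User=19c3f00b-find HTTP/1.1
GET /z HTTP/1.1
GET /w?ID_User=5a82483a-find HTTP/1.1'

result=$(printf '%s\n' "$sample" | extract_ids)
printf '%s\n' "$result"
```

With this sample the output should be two lines, one per distinct ID; the line without an ID is skipped because -n suppresses anything the substitution did not touch.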
Run either one of these as a BBEdit text-filter
<http://bbeditextras.org/wiki/index.php?title=Text_Filters>.
If you want to keep the original data file, run the filter on a copy.
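Outside of BBEdit, the same idea can be run from a terminal so that the original is only ever read and the IDs land in a separate file. The sketch below is one way to do that with grep -o, under some loud assumptions: the file names data.log and userids.txt are hypothetical, the sample data is made up, and the pattern [^-]* assumes the IDs themselves contain no hyphen.

```shell
#!/usr/bin/env bash
export LC_ALL='C'

# Work in a throwaway directory so the demo touches nothing real.
workdir=$(mktemp -d)
src="$workdir/data.log"       # hypothetical name for the original data file
out="$workdir/userids.txt"    # hypothetical name for the extracted IDs

# Made-up sample data standing in for the real file.
printf '%s\n' \
  'a ID_User=5a82483a-find b' \
  'c ID_User=19c3f00b-find d' \
  'e ID_User=5a82483a-find f' > "$src"

before=$(cksum < "$src")

# grep -o prints only the matched text; sed then trims the two markers.
# Output goes to a different file, so the source is never modified.
grep -Eo 'ID_User=[^-]*-find' "$src" \
  | sed -E 's!^ID_User=(.*)-find$!\1!' \
  | sort -u > "$out"

after=$(cksum < "$src")
ids=$(cat "$out")
printf '%s\n' "$ids"

rm -rf "$workdir"
```

Comparing the checksum taken before and after is just a cheap way to confirm the source file was left alone.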
--
Best Regards,
Chris