On Mon, Aug 16, 2021 at 7:25 PM wes wrote:
> To get the count of unique callsigns, you can just feed this same command
> into wc -l.
>
> find Processed -type f -printf '%f\n' | sed "s/@.*//" | sort | uniq -c | wc -l
>
> -wes
>
>
> On Mon, Aug 16, 2021 at 7:21 PM wes wrote:
>
> > if the @ is consistent with all the files, that makes it relatively easy.
> >
> > find Processed -type f -printf '%f\n' | sed "s/@.*//" | sort | uniq -c
> >
> > -wes
> >
> > On Mon, Aug 16, 2021 at 7:17 PM Michael Barnes wrote:
> >
> >> On Mon, Aug 16, 2021 at 5:29 PM David Fleck wrote:
> >>
> >> > As Wes said, an example or two would help greatly.
> >> >
> >> > --- David Fleck
> >> >
> >> > ‐‐‐ Original Message ‐‐‐
> >> >
> >> > On Monday, August 16th, 2021 at 7:17 PM, wes wrote:
> >> >
> >> > > are firstnames and lastnames always separated by the same character in
> >> > > each filename?
> >> > >
> >> > > are the names separated from the rest of the info in the filename the
> >> > > same way for each file?
> >> > >
> >> > > are you doing this once, or will this be a repeating task that would be
> >> > > handy to automate?
> >> > >
> >> > > would you be able to provide a few sample filenames, perhaps with the
> >> > > personal info obfuscated?
> >> > >
> >> > > generally, the way I would approach this is to pare the filenames down
> >> > > to the people's names, and then sort that list and run uniq against it.
> >> > > uniq -c will provide a count of how many times a given line appears in
> >> > > sorted input. if I'm doing this once, I would generate a text file
> >> > > containing the list of filenames I will be working with, for example:
> >> > >
> >> > > find Processed -type f > processed-files.txt
> >> > >
> >> > > then use a text editor to pare down the entries as described above,
> >> > > using find and replace functions to remove the extra data, so only the
> >> > > people's names remain. then simply sort that file, run uniq -c on the
> >> > > result, and you're done. I personally use vi for this, but just about
> >> > > any editor will do. I like this approach for a number of reasons, not
> >> > > the least of which is that I can spot-check random samples after each
> >> > > editing step to try to spot unexpected results.
> >> > >
> >> > > if you want to automate this, it may be a little more complicated, and
> >> > > the answers to my initial questions become important. if you can
> >> > > provide a little more context, I will try to help further.
> >> > >
> >> > > -wes
> >> > >
> >> > > On Mon, Aug 16, 2021 at 5:01 PM Michael Barnes barnmich...@gmail.com wrote:
> >> > >
> >> > > > Here's a fun trivia task. For an activity I am involved in, I get
> >> > > > files from members to process. The filename starts with the member's
> >> > > > name and has other info to identify the file. After processing, the
> >> > > > file goes in the ./Processed folder. There are thousands of files now
> >> > > > in that folder. Right now, I'm looking for a couple basic pieces of
> >> > > > information. First, I want to know how many unique names I have in
> >> > > > the list. Second, I'd like a list of names and how many files go with
> >> > > > each name.
> >> > > >
> >> > > > I'm sure this is trivial, but my mind is blanking out on it. A couple
> >> > > > simple examples would be nice. Non-answers, like "easy to do with
> >> > > > 'xxx'" or references to man pages or George's Book, etc. are not
> >> > > > helpful right now.
> >> > > >
> >> > > > Thanks,
> >> > > >
> >> > > > Michael
> >> >
> >>
> >> Actually, they are callsigns instead of names. A couple of examples:
> >>
> >> w7...@k-0496-20210526.txt
> >> wa7...@k-0497-20210714.txt
> >> n8...@k-4386-20210725.txt
> >>
> >> I would like a simple count of the unique callsigns on a random basis and
> >> possibly an occasional report listing each callsign and how many files are
> >> in the folder for each.
> >>
> >> Michael
> >>
> >
>
Thanks Everybody,
This has been educational for me. It looks like there were several working
options. I started with Wes' option refined by Robert.
$ find -type f | cut -d @ -f1 | sort | uniq -c
Since I was working from within the Processed folder, I did not specify it
on the command line.
Then, I discovered some of the callsigns were not capitalized, so I added
the ignore case option.
$ find -type f | cut -d @ -f1 | sort | uniq -i -c
That gave me usable output #1.
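One caveat on the ignore-case step: uniq -i only merges lines that sit next to each other, and depending on the locale a plain sort can put W7ABC and w7abc far apart, so the same callsign may show up twice. Below is a minimal sketch of one way around that, folding everything to upper case before sorting. The function name count_by_callsign is made up for illustration, and it assumes every filename looks like callsign@rest, as in the examples earlier in the thread. The sed step strips the directory part, much like -printf '%f\n' in Wes' command, without needing GNU find.

```shell
# Sketch only: count files per callsign, ignoring case.
# Assumes filenames of the form CALLSIGN@rest (hypothetical helper name).
count_by_callsign() {
    dir="${1:-.}"                        # folder to scan; defaults to the current one
    find "$dir" -type f |
        sed 's|.*/||' |                  # drop the leading directory, keep the filename
        cut -d @ -f 1 |                  # keep only the part before the @
        tr '[:lower:]' '[:upper:]' |     # fold case so w7abc and W7ABC become identical
        LC_ALL=C sort |                  # byte-order sort; identical lines end up adjacent
        uniq -c                          # count each run of identical lines
}
```

Run it as count_by_callsign Processed, or with no argument from inside the folder.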
To count the unique callsigns, I piped that into wc -l:
$ find -type f | cut -d @ -f1 | sort | uniq -i -c | wc -l
That gave me output #2.
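If only the number of unique callsigns is needed, the per-callsign counts can be skipped. A small sketch under the same assumptions as before (callsign@rest filenames; the function name count_unique_callsigns is hypothetical): fold the case, let sort -u drop the duplicates, and count the lines that are left.

```shell
# Sketch only: print the number of distinct callsigns, ignoring case.
# Assumes filenames of the form CALLSIGN@rest (hypothetical helper name).
count_unique_callsigns() {
    dir="${1:-.}"                        # folder to scan; defaults to the current one
    find "$dir" -type f |
        sed 's|.*/||' |                  # drop the leading directory, keep the filename
        cut -d @ -f 1 |                  # keep only the part before the @
        tr '[:lower:]' '[:upper:]' |     # fold case before comparing
        LC_ALL=C sort -u |               # sort and keep one copy of each callsign
        wc -l                            # the remaining line count is the answer
}
```

For example, count_unique_callsigns Processed prints a single number.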
Finally, I added a numeric sort to order the list by file count, which gave me output #3.
$ find -type f | cut -d @ -f1 | sort | uniq -i -c | sort -n
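Note that sort -n lists the busiest callsigns last. If it is handier to have them at the top, sort -rn reverses the order. A sketch along the same lines as the commands above, again assuming callsign@rest filenames and using a made-up function name, callsign_frequency:

```shell
# Sketch only: list callsigns with their file counts, most files first.
# Assumes filenames of the form CALLSIGN@rest (hypothetical helper name).
callsign_frequency() {
    dir="${1:-.}"                        # folder to scan; defaults to the current one
    find "$dir" -type f |
        sed 's|.*/||' |                  # drop the leading directory, keep the filename
        cut -d @ -f 1 |                  # keep only the part before the @
        tr '[:lower:]' '[:upper:]' |     # fold case so variants are counted together
        LC_ALL=C sort |                  # group identical callsigns together
        uniq -c |                        # prefix each callsign with its file count
        sort -rn                         # numeric sort on the count, largest first
}
```

Piping the result into head -n 10 would show just the ten callsigns with the most files.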
I gave Wes'