BeeRich <[email protected]> wrote:
> You can copy duplicates to a new file or the clipboard, then count the lines
> in BBEdit.
Yes, but using uniq -c provides a count followed by a single example
of each repeated line.
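
As a rough illustration of that behaviour (the sample lines and the function name count_lines are made up for this sketch), note that uniq only collapses adjacent duplicates, so the input has to be sorted first:

```shell
# Hypothetical helper: count how often each distinct line occurs.
# uniq only merges adjacent duplicates, so the input is sorted first.
count_lines() {
    sort | uniq -c
}

# Made-up sample input: b.html appears three times, a.html once.
printf '%s\n' b.html a.html b.html b.html | count_lines
```

This prints one line per distinct input line, each prefixed by its count (1 for a.html, 3 for b.html). The amount of padding in front of the count varies between uniq implementations.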
For example, the Apache logs for my website for the first 6 months of
2013 contain 214,049 entries, one per file access. These are processed
in BBEdit using grep-based search & replace to reduce each entry to a
simple file path, e.g.: "JACQCAD.COM/DOCS/INDEX.HTM". These entries
are then sorted using BBEdit's Sort.
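
For anyone who prefers to do the reduction step on the command line, here is a minimal sketch. It is not the grep pattern I use in BBEdit; it assumes the logs are in Apache's Common Log Format, where the request path is the 7th whitespace-separated field, and the function name to_paths, the site argument, and the sample log lines are all hypothetical:

```shell
# Hypothetical helper: reduce Common Log Format entries to "SITE/PATH" lines.
# Assumes the request path is the 7th whitespace-separated field.
to_paths() {
    awk -v site="$1" '{
        path = $7
        sub(/\?.*$/, "", path)      # drop any query string
        print site toupper(path)    # uppercase, matching the paths shown above
    }' | sort
}

# Made-up log entries, for illustration only.
printf '%s\n' \
  '192.0.2.1 - - [01/Jan/2013:00:00:01 -0500] "GET /docs/index.htm HTTP/1.1" 200 5120' \
  '192.0.2.7 - - [01/Jan/2013:00:00:09 -0500] "GET /contact.html?ref=1 HTTP/1.1" 200 2048' \
  | to_paths JACQCAD.COM
```

The output is already sorted, so it can be fed straight to uniq -c.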
uniq -c takes this file and in a couple of seconds produces a text
listing of the 4,156 different files, prefixed by their counts, e.g.:
60 JACQCAD.COM/BETA_FILES/4_11B0__4_33B5_README.ZIP
5 JACQCAD.COM/BETA_FILES/JACQCAD_4_31B1_16JUL10.SITX
241 JACQCAD.COM/BETA_FILES/JACQCAD_4_33B1_30JUL11.SIT
A BBEdit sort of this file lists the files in order of frequency of
access - which makes it easy to find the most "popular" files.
563 JACQCAD.COM/IMAGEEDIT.HTML
573 JACQCAD.COM/MAINMENU.HTML
592 JACQCAD.COM/CONTACT.HTML
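
The same ordering can be had from the command line with sort -n, which compares the leading counts numerically and so does not depend on the counts being padded to equal width. A small sketch using a few of the lines above (the function name by_frequency is just for illustration):

```shell
# Hypothetical helper: order "count path" lines by their leading count,
# smallest first. Use sort -rn instead to list the most popular files first.
by_frequency() {
    sort -n
}

printf '%s\n' \
  ' 592 JACQCAD.COM/CONTACT.HTML' \
  '  60 JACQCAD.COM/BETA_FILES/4_11B0__4_33B5_README.ZIP' \
  ' 563 JACQCAD.COM/IMAGEEDIT.HTML' \
  | by_frequency
```

This lists the entry with 60 accesses first, then 563, then 592.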
BBEdit's very efficient Search & Replace and Sort, in combination with
uniq, makes it possible to quickly process the voluminous Apache logs
into useful data.
--
Garth Fletcher
--
This is the BBEdit Talk public discussion group. If you have a
feature request or would like to report a problem, please email
"[email protected]" rather than posting to the group.
Follow @bbedit on Twitter: <http://www.twitter.com/bbedit>