BeeRich <[email protected]> wrote:
> You can copy duplicates to a new file or the clipboard, then count the lines
> in BBEdit.

Yes, but using  uniq -c  provides a count followed by a single example
of each repeated line.

For example, the Apache logs for my website for the first 6 months of
2013 contain 214,049 entries, one per file access.  These are processed
in BBEdit using grep-based search & replace to reduce each entry to a
simple file path, e.g.: "JACQCAD.COM/DOCS/INDEX.HTM". These entries
are then sorted using BBEdit's Sort.
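
For anyone who would rather do the reduction step in the shell, here is a
rough sketch. It is not the grep pattern I use in BBEdit; it assumes the
standard Apache common/combined log format, where the request path is the
7th whitespace-separated field, and the function name and site argument
are made up for illustration.

```shell
# log_to_paths SITE: hypothetical helper, not the BBEdit grep pattern.
# Reads Apache access-log lines on stdin (common/combined format assumed)
# and prints one upper-cased "SITE/PATH" entry per line.
log_to_paths() {
  awk -v site="$1" '{
    # $7 is the request path in: host ident user [date zone] "METHOD path PROTO" ...
    print toupper(site $7)
  }'
}
```

Usage would be something like  log_to_paths jacqcad.com < access_log
with the output saved to a file for sorting.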

uniq -c  takes this sorted file (uniq only compares adjacent lines, so
the sort has to come first) and in a couple of seconds produces a text
listing of the 4,156 different files, prefixed by their counts, e.g.:

  60 JACQCAD.COM/BETA_FILES/4_11B0__4_33B5_README.ZIP
   5 JACQCAD.COM/BETA_FILES/JACQCAD_4_31B1_16JUL10.SITX
 241 JACQCAD.COM/BETA_FILES/JACQCAD_4_33B1_30JUL11.SIT
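
As a minimal sketch, the sort and count steps can also be chained in the
shell instead of sorting in BBEdit first. The function name is only
illustrative, and the width of the count column in the output differs
between uniq implementations.

```shell
# count_paths: read one path per line on stdin and print "COUNT PATH"
# for each distinct path.
count_paths() {
  # uniq only merges adjacent duplicates, so the input is sorted first.
  # LC_ALL=C makes sort use plain byte order regardless of locale.
  LC_ALL=C sort | uniq -c
}
```

For example:  count_paths < paths.txt > counts.txt  (both file names
are placeholders).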

A BBEdit sort of this file lists the files in order of frequency of
access - which makes it easy to find the most "popular" files.

 563 JACQCAD.COM/IMAGEEDIT.HTML
 573 JACQCAD.COM/MAINMENU.HTML
 592 JACQCAD.COM/CONTACT.HTML
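
The same ranking can be sketched in the shell with a numeric sort. This
is an alternative to the BBEdit sort, not what I actually ran, and the
function name is made up. Unlike the listing above, it puts the highest
counts first.

```shell
# top_paths N: read "COUNT PATH" lines (uniq -c output) on stdin and
# print the N most frequently accessed paths, highest count first.
top_paths() {
  # -n compares the leading count numerically (leading blanks are skipped),
  # -r reverses the order so the largest counts come first.
  sort -rn | head -n "$1"
}
```

For example:  top_paths 20 < counts.txt  (file name is a placeholder).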

BBEdit's very efficient Search & Replace and Sort, in combination with
uniq, makes it possible to quickly process the voluminous Apache logs
into useful data.
--
Garth Fletcher

--
This is the BBEdit Talk public discussion group. If you have a feature request or would like to report a problem, please email
"[email protected]" rather than posting to the group.
Follow @bbedit on Twitter: <http://www.twitter.com/bbedit>