I realized that LibreOffice Calc cannot handle more than a million rows ...
Spreadsheets are only meant for computation on small amounts of data. To store
large amounts of data, use text files or a database management system.
I decided to sort the file, first on Column 3, then Column 1, and then on
Column 2.
Accordingly, I rearranged the columns thusly: $3, $1, $2, $4 and sorted with:
"sort -nrk 1,4" where "nr" puts the biggest numbers at the top of the column,
but sort evidently did not reach to the third column, resulting in an
ordering of only hostname and visits-per-domain.
'sort -k 1,4' uses the part of the line from column 1 through column 4 as one
single key string. That is not what you want, and neither is 'sort -k 1,3'.
What you want ("sort the file, first on Column 3, then Column 1, and then on
Column 2") is achieved by giving option -k three times (the order matters):
'sort -k 3,3 -k 1,1 -k 2,2'.
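To make the multi-key form concrete, here is a minimal sketch on a made-up four-column sample (the data is hypothetical, not taken from your file). It also assumes that columns 2 and 3 hold numbers, hence the 'n' modifier on those two keys; drop it for columns that hold text.

```shell
# Hypothetical four-column sample; columns 2 and 3 are assumed numeric.
sample='b 2 10 x
a 1 10 y
a 3 5 z
a 2 10 w'

# One -k option per key, in priority order: column 3, then 1, then 2.
# The "n" modifier is an assumption that the column holds numbers.
sorted=$(printf '%s\n' "$sample" | LC_ALL=C sort -k 3,3n -k 1,1 -k 2,2n)
printf '%s\n' "$sorted"
```

Each key is limited to a single column by writing it as 'N,N', so ties on column 3 fall through to column 1 and then to column 2. LC_ALL=C is only there to make the text comparison independent of the locale.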
Again: read 'info sort'. At the end of it, there are even well-explained
examples with multiple -k options, starting with this one:
Sort numerically on the second field and resolve ties by sorting
alphabetically on the third and fourth characters of field five.
Use ‘:’ as the field delimiter.
sort -t : -k 2,2n -k 5.3,5.4
Note that if you had written ‘-k 2n’ instead of ‘-k 2,2n’, ‘sort’ would have
used all characters beginning in the second field and extending to the end of
the line as the primary _numeric_ key. For the large majority of
applications, treating keys spanning more than one field as numeric will not
do what you expect.

Also note that the ‘n’ modifier was applied to the field-end specifier for
the first key. It would have been equivalent to specify ‘-k 2n,2’ or
‘-k 2n,2n’. All modifiers except ‘b’ apply to the associated _field_,
regardless of whether the modifier character is attached to the field-start
and/or the field-end part of the key specifier.
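The quoted example can be tried on a few made-up records (hypothetical data, chosen only so that the tie-breaker is actually needed). This is a small sketch, not part of the 'info sort' text:

```shell
# Made-up colon-delimited records; only fields 2 and 5 matter here.
records='x:10:a:b:zzab
y:2:a:b:zzcd
z:10:a:b:zzaa'

# Primary key: field 2 compared numerically (2 sorts before 10).
# Tie-breaker: characters 3-4 of field 5, compared as plain text.
result=$(printf '%s\n' "$records" | LC_ALL=C sort -t : -k 2,2n -k 5.3,5.4)
printf '%s\n' "$result"
```

The two records with 10 in field 2 tie on the first key, so they are ordered by "aa" versus "ab" taken from field 5.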