Ken Hornstein wrote:
> Exactly HOW many messages are in mhe-index?
>
> Ah, I think I see what's happening. That line is this:
>
> mp->msgstats = mh_xmalloc (MSGSTATSIZE(mp));
>
> MSGSTATSIZE is defined as:
>
> #define MSGSTATSIZE(mp) ((mp)->num_msgstats * sizeof *(mp)->msgstats)
>
> num_msgstats is set by the previous line:
>
> mp->num_msgstats = MSGSTATNUM (mp->lowoff, mp->hghoff);
>
> Which is defined as:
>
> #define MSGSTATNUM(lo, hi) ((size_t) ((hi) - (lo) + 1))
>
> So ... the summary here is that nmh (and MH before it) allocates a
> "message status" element for every possible message. The possible
> number of messages is the range between the LOWEST message number and
> the HIGHEST message number. So if you just had 1000000 and 1000002 in
> a folder, it would allocate 3 elements. But if you had 1 and 1000000,
> it would allocate a million elements. A msgstat structure is an array
> of "struct bvector" which might be ... 8 + 8 + 16 bytes per message on
> a 64 bit platform. That suggests there are either 1320920404 messages
> in that folder (1.3 billion) or there's a huge message number gap (that
> has come up before when someone had a huge gap; my memory is that the
> consensus was you just had to deal).

Possibly somewhat related, Greg mentioned he uses mairix for search.
mairix produces very "sparse" results folders. For example:

    thoreau 52115> mairix caffeine
    Matched 111 messages
    thoreau 52116> f +vfolder
    vfolder+ has 111 messages (47-782143); cur=650783.

From Ken's description above, these 111 messages would allocate
782143 - 47 + 1 = 782,097 msgstat structures, almost 800,000. I don't
know how large the message numbers get in the results folder, but six
digits is common. I don't recall seeing seven-digit or larger message
numbers.
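
To put rough numbers on that, here is a minimal sketch of the arithmetic
in C. It is not nmh's actual code: only the MSGSTATNUM macro is copied
from Ken's message. The helper names msgstat_slots and msgstat_bytes and
the 32-byte element size (Ken's "8 + 8 + 16" guess for a 64 bit
platform) are my own assumptions for illustration.

```c
#include <assert.h>
#include <stddef.h>

/* Range arithmetic as quoted from the nmh source above. */
#define MSGSTATNUM(lo, hi) ((size_t) ((hi) - (lo) + 1))

/* ASSUMPTION: 8 + 8 + 16 bytes per element, per Ken's estimate.
   The real size depends on the nmh version and the platform. */
#define ASSUMED_ELEM_BYTES ((size_t) 32)

/* Hypothetical helper: number of msgstat slots allocated for a folder
   whose lowest and highest message numbers are lo and hi. */
size_t msgstat_slots(int lo, int hi)
{
    assert(hi >= lo);   /* a folder range never runs backwards */
    return MSGSTATNUM(lo, hi);
}

/* Hypothetical helper: approximate allocation size in bytes,
   under the assumed element size above. */
size_t msgstat_bytes(int lo, int hi)
{
    return msgstat_slots(lo, hi) * ASSUMED_ELEM_BYTES;
}
```

For the vfolder above, msgstat_slots(47, 782143) gives 782097 slots for
111 messages, and msgstat_bytes(47, 782143) gives about 25 million bytes
if the 32-byte guess holds. The count of messages never enters the
calculation; only the gap between the lowest and highest numbers does.
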
Cheers,
Simon.