page. how much memory is used to manage this?
I am not sure how deep an enumeration you want, but to a first
approximation:
one struct vm_map_entry
one struct vm_object
one pv_entry

Actually, I don't need a precise answer, just the algorithms.


Page table structures need four pages for directories and page table proper.

2) Suppose we have a 1TB file on disk without holes, and 100000 processes
mmap this file into their address space. Are just the pages shared, or can
page tables be shared too? How much memory is used to manage such a
situation?
Only pages are shared. Pagetables are not.

This is what I really asked; thank you for the answer. My example was rather extreme, but datasets of tens of gigabytes would be used.

superpages are due to more efficient use of TLB.
Actually this did not really work, at least a while ago (already in FreeBSD 8) when I tested it. Even with a 1GB squid process and no swapping, it did not allocate them often.

Even in the working case it probably will not help much here, unless absolutely all of the data is in RAM, and the following explains why:

accurate tracking of the accesses and writes, which can result in better
pageout performance.

For the situation 1TB/100000 processes, you will probably need to tune
the amount of pv entries, see sysctl vm.pmap.pv*.

So there is a workaround, but it causes lots of soft page faults, as there would be no more than a few hundred instructions between touches of different pages.

What I want to do is a database library (but no SQL!). It will be something like CA-Clipper/Harbour (but definitely not the same, and NOT compatible), only with higher performance, and usable including heavy cases.

With this system, one user is one process, one thread. If used as a WWW front end or something similar, it will be this plus some other program providing the WWW interface, but still exactly one process per logged-in user.


As properly planned database tables should not be huge, I assume most of them (possibly excluding parts that are mostly unused) will be kept in memory by the VM subsystem. So hard faults and disk I/O will not be the deciding factor.

To avoid system calls I just want to mmap tables and indexes. All semaphores can be done from userspace too, and I already know how to avoid lock contention well.

Using indexes means doing lots of memory reads from different pages, but each process will usually touch only a small subset of the pages, not all of them.

So it MAY work well this way, or it may end with 95% system CPU time, mostly spent on soft faults.

But a question for the future: is anything planned in FreeBSD for this case? I
think I am not the only one with this problem; not everyone on earth uses computers for a few processes or personal use, and there are IMHO many cases where programs need to share a huge dataset via mmap while doing heavy timesharing.

I understand that mmap works the way it does because a file may be mapped at different addresses in different processes, even with parts of a single file at different places, as this is what mmap allows.

But would it be possible to add a different mmap-like call to the kernel, say

mmap_fullfile(fd, maxsize)

which (taking the amd64 case) maps the file at a 2MB boundary if maxsize <= 2MB, at a 1GB boundary if maxsize <= 1GB, and at a 512GB boundary otherwise, with subsequent multiple 512GB address blocks if needed, and sharing everything?

It is no problem at all that things like madvise from one process would clobber the madvise settings of another process, or similar issues, as only one type of program, written with this in mind, would use it.

This way there would be practically no page-table mapping overhead, and actually simpler/faster OS duties.

I don't really know exactly how the VM subsystem works under FreeBSD, but if it is not hard I may do this with some help from you.

And no - I don't want to use any popular database system, for good reasons.


_______________________________________________
freebsd-hackers@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-hackers
To unsubscribe, send any mail to "freebsd-hackers-unsubscr...@freebsd.org"
