Re: [sqlite] MMap On Solaris

2007-06-14 Thread John Stanton
The behaviour depends on whether you map shared or not.  If you map
shared, multiple users can read and write to the file simultaneously.  If
you have a situation where you access the same bytes you need to use some
form of synchronization, just as you do with read and write.
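
A minimal sketch of the shared case, in Python for brevity. It assumes a Unix-like system where Python's mmap module maps with MAP_SHARED for ACCESS_WRITE and ACCESS_READ. The function name and the use of an fcntl advisory lock are illustrative choices, not something prescribed in this thread; any synchronization the cooperating processes agree on would do.

```python
import fcntl
import mmap
import os
import tempfile

def shared_mapping_demo():
    """Write through one shared mapping and read the bytes back through another."""
    fd, path = tempfile.mkstemp()
    try:
        os.write(fd, b"\0" * mmap.PAGESIZE)  # a mapping needs a non-empty file
        writer = mmap.mmap(fd, mmap.PAGESIZE, access=mmap.ACCESS_WRITE)  # shared, read/write
        reader = mmap.mmap(fd, mmap.PAGESIZE, access=mmap.ACCESS_READ)   # shared, read-only
        # Advisory lock on the five bytes being changed (assumption: every
        # cooperating process takes the same lock before touching them).
        fcntl.lockf(fd, fcntl.LOCK_EX, 5, 0)
        writer[0:5] = b"hello"
        fcntl.lockf(fd, fcntl.LOCK_UN, 5, 0)
        seen = bytes(reader[0:5])  # the second mapping sees the change
        reader.close()
        writer.close()
        return seen
    finally:
        os.close(fd)
        os.unlink(path)
```

Both mappings here live in one process only to keep the sketch self-contained; with two processes the mapping calls are the same.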


You can map for exclusive access and also for private.  In the private
case other users do not see your changes to the file.
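
The private case can be sketched like this, again in Python and assuming a Unix-like system, where ACCESS_COPY corresponds to a copy-on-write (MAP_PRIVATE) mapping. The function name is hypothetical.

```python
import mmap
import os
import tempfile

def private_mapping_demo():
    """Modify a private (copy-on-write) mapping and show the file is untouched."""
    fd, path = tempfile.mkstemp()
    try:
        os.write(fd, b"original")
        private = mmap.mmap(fd, 0, access=mmap.ACCESS_COPY)  # length 0 maps the whole file
        private[0:8] = b"changed!"          # only this mapping's copy of the page changes
        in_mapping = bytes(private[0:8])
        on_disk = os.pread(fd, 8, 0)        # what any other reader of the file sees
        private.close()
        return in_mapping, on_disk
    finally:
        os.close(fd)
        os.unlink(path)
```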


If the file has been extended by another user past the area you have 
mapped you will not access it unless you mmap to the new length.  If the 
file is growing fast that could make using read and write more appropriate.
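
As a rough illustration of the remapping point, the sketch below maps a file, lets a writer extend it with pwrite, and then maps again to the new length. It is Python on a Unix-like system; the function name is made up, and a single process plays both the writer and the reader to keep it self-contained.

```python
import mmap
import os
import tempfile

def read_after_growth():
    """Map a file, extend it with pwrite, then remap to reach the new bytes."""
    fd, path = tempfile.mkstemp()
    try:
        os.write(fd, b"first ")
        view = mmap.mmap(fd, 0, access=mmap.ACCESS_READ)  # maps the current 6 bytes
        os.pwrite(fd, b"second", 6)                       # the writer extends the file
        old_len = len(view)                               # the mapping is still 6 bytes long
        view.close()
        new_size = os.fstat(fd).st_size
        view = mmap.mmap(fd, new_size, access=mmap.ACCESS_READ)  # map to the new length
        data = bytes(view[:])
        view.close()
        return old_len, data
    finally:
        os.close(fd)
        os.unlink(path)
```

If the writer appends often, this unmap and remap cycle is the cost to weigh against plain read calls.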


Ken wrote:
John, 
 
You seem pretty knowledgeable regarding MMAP. I was wondering if you could help me with this MMAP scenario:

I'm curious as to how the OS and multiple processes interact regarding file I/O and mmap.

Process A writes to a file sequentially using either pwrite or kaio.

I would like to write a process B that reads what was written by A.

I'm able to coordinate where to stop the read; in other words, I don't want to read more
than what has been written by A.  Currently I'm just using OS calls to "read",
but I thought that maybe MMAP might give better performance, especially if the OS would
just make the buffers written by Process A available in Process B's address space
that is MMAPed.
 
 Thanks for any guidance.

 Ken
 

- 


To unsubscribe, send email to [EMAIL PROTECTED]
-



[sqlite] MMap On Solaris

2007-06-13 Thread Ken
John, 
 
You seem pretty knowledgeable regarding MMAP. I was wondering if you could help
me with this MMAP scenario:

I'm curious as to how the OS and multiple processes interact regarding file
I/O and mmap.

Process A writes to a file sequentially using either pwrite or kaio.

I would like to write a process B that reads what was written by A.

I'm able to coordinate where to stop the read; in other words, I don't want to
read more than what has been written by A.  Currently I'm just using OS calls
to "read", but I thought that maybe MMAP might give better performance,
especially if the OS would just make the buffers written by Process A
available in Process B's address space that is MMAPed.
 
 Thanks for any guidance.
 Ken
 

John Stanton <[EMAIL PROTECTED]> wrote:
MMAP just lets you avoid one or two layers of buffering and APIs.  If
you use fopen/fread you go through function calls, then open/read plus
buffering and more function calls, and then to the VM to actually access the
data.  Going direct to the VM and getting a pointer to the VM pages is
more efficient.  I got about 30% better speed out of one of my compilers
just by removing the reads and local buffering and using mmap.  A b-tree
index almost doubled in speed by removing local reads and buffering and
using mmap.





Re: [sqlite] MMap On Solaris

2007-06-13 Thread John Stanton
MMAP just lets you avoid one or two layers of buffering and APIs.  If
you use fopen/fread you go through function calls, then open/read plus
buffering and more function calls, and then to the VM to actually access the
data.  Going direct to the VM and getting a pointer to the VM pages is
more efficient.  I got about 30% better speed out of one of my compilers
just by removing the reads and local buffering and using mmap.  A b-tree
index almost doubled in speed by removing local reads and buffering and
using mmap.
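
To make the "no local buffer" idea concrete, here is a small sketch in Python that scans a file through a mapping instead of reading it into a buffer first. The function names and the newline-counting task are invented for illustration; the speed-ups quoted above came from the author's own C programs and are not reproduced or measured by this snippet.

```python
import mmap
import os
import tempfile

def count_lines_mapped(path):
    """Count newlines by searching the mapping directly, with no read() into a local buffer."""
    with open(path, "rb") as f:
        if os.fstat(f.fileno()).st_size == 0:
            return 0  # an empty file cannot be mapped
        with mmap.mmap(f.fileno(), 0, access=mmap.ACCESS_READ) as mm:
            count = 0
            pos = mm.find(b"\n")
            while pos != -1:
                count += 1
                pos = mm.find(b"\n", pos + 1)
            return count

def demo():
    """Create a three-line temporary file and count its lines through the mapping."""
    fd, path = tempfile.mkstemp()
    try:
        os.write(fd, b"a\nb\nc\n")
        os.close(fd)
        return count_lines_mapped(path)
    finally:
        os.unlink(path)
```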


Mitchell Vincent wrote:

Hi John! Thanks for the reply!

I think that makes a good point that the vm page fault is probably
faster than the overhead of copying the data to a local buffer.  So, page
fault or not, I think that's the way I'm going to do it.

Again, thanks very much for your input!




Re: [sqlite] MMap On Solaris

2007-06-12 Thread Mitchell Vincent

Hi John! Thanks for the reply!

I think that makes a good point that the vm page fault is probably
faster than the overhead of copying the data to a local buffer.  So, page
fault or not, I think that's the way I'm going to do it.

Again, thanks very much for your input!

On 6/12/07, John Stanton <[EMAIL PROTECTED]> wrote:

In general it means that the file is mapped into virtual memory.  How
much of it remains in actual memory depends upon the memory demands on
the OS at the time.  If the OS honours the sequential or random advice,
it will most likely implement read-ahead for sequential access.
Not all OSes pay attention to those advisory settings.

What you are doing is to access the file as if it were an executing
program image.  Similar rules apply.

The answer is that you cannot assume that pages you have read are in
actual memory and you cannot assume that they are not.  When you access
a page not currently in memory the OS will read it in and find space for
it somehow, maybe by discarding some other page.

This is an excellent way to read files because you avoid one level of
buffer shadowing and get caching adjusted to currently available memory.

--
- Mitchell Vincent
- K Software - Innovative Software Solutions
- Visit our website and check out our great software!
- http://www.ksoftware.net




Re: [sqlite] MMap On Solaris

2007-06-12 Thread John Stanton

Mitchell Vincent wrote:

Working with some data conversion here (that will eventually go into
an SQLite database). I'm hoping you IO wizards can offer some help on
a question that I've been trying to get answered.

I'm using Solaris 10 for this.

If I mmap a large file and use madvise with MADV_SEQUENTIAL and
MADV_WILLNEED, then start processing the file, when will the system
discard pages that have been referenced? I guess what I'm wondering is
if there is any retention of "back" pages?

Say for example I start reading the file, and after consuming 24,576
bytes, will the first or second pages still be in memory (assuming
8192 byte pages)?

Thanks!

In general it means that the file is mapped into virtual memory.  How 
much of it remains in actual memory depends upon the memory demands on 
the OS at the time.  If the OS honours the sequential or random advice,
it will most likely implement read-ahead for sequential access.
Not all OSes pay attention to those advisory settings.


What you are doing is to access the file as if it were an executing 
program image.  Similar rules apply.


The answer is that you cannot assume that pages you have read are in 
actual memory and you cannot assume that they are not.  When you access 
a page not currently in memory the OS will read it in and find space for 
it somehow, maybe by discarding some other page.


This is an excellent way to read files because you avoid one level of
buffer shadowing and get caching adjusted to currently available memory.
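
A sketch of the sequential scan being discussed, in Python (3.8 or later for madvise) rather than C. The advice call is guarded because madvise and its constants are platform-dependent, and the function names and the 24,576-byte test file (three 8192-byte pages, as in the question) are illustrative. Nothing in the code depends on whether earlier pages are still resident, which is the point made above.

```python
import mmap
import os
import tempfile

def sequential_scan(path, chunk=mmap.PAGESIZE):
    """Sum every byte of a file through a read-only mapping, front to back."""
    total = 0
    with open(path, "rb") as f:
        with mmap.mmap(f.fileno(), 0, access=mmap.ACCESS_READ) as mm:
            # Advice is only a hint; the OS may ignore it entirely.
            if hasattr(mm, "madvise") and hasattr(mmap, "MADV_SEQUENTIAL"):
                mm.madvise(mmap.MADV_SEQUENTIAL)
            for offset in range(0, len(mm), chunk):
                # Touching a page that is not resident simply faults it back in.
                total += sum(mm[offset:offset + chunk])
    return total

def scan_demo():
    """Scan a temporary file of 24,576 one-valued bytes."""
    fd, path = tempfile.mkstemp()
    try:
        os.write(fd, b"\x01" * 24576)
        os.close(fd)
        return sequential_scan(path)
    finally:
        os.unlink(path)
```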





[sqlite] MMap On Solaris

2007-06-12 Thread Mitchell Vincent

Working with some data conversion here (that will eventually go into
an SQLite database). I'm hoping you IO wizards can offer some help on
a question that I've been trying to get answered.

I'm using Solaris 10 for this.

If I mmap a large file and use madvise with MADV_SEQUENTIAL and
MADV_WILLNEED, then start processing the file, when will the system
discard pages that have been referenced? I guess what I'm wondering is
if there is any retention of "back" pages?

Say for example I start reading the file, and after consuming 24,576
bytes, will the first or second pages still be in memory (assuming
8192 byte pages)?

Thanks!

--
- Mitchell Vincent
- K Software - Innovative Software Solutions
- Visit our website and check out our great software!
- http://www.ksoftware.net
