From: "ITAGAKI Takahiro" <[EMAIL PROTECTED]>
> You were running the test on a very memory-dependent machine.
>> shared_buffers = 4GB / The scaling factor is 50, 800MB of data.
> That would be why the patch did not work. I tested it with DBT-2,
> data and 2GB of memory. Storage is always the main part of
> even not in checkpoints.
Yes, I used half the size of RAM as the shared buffers, which is
reasonable. And I cached all the data. The effect of fsync() is a
heavier offence, isn't it? System administrators would say "I have
enough memory. The data hasn't exhausted the DB cache yet. But the
users complain to me about the response. Why? What should I do?
What? Checkpoint?? Why doesn't PostgreSQL take care of frontend
BTW, is DBT-2 an OLTP benchmark which randomly accesses some parts of
data, or a batch application which accesses all data? I'm not
familiar with it. I know that IPA makes it publicly available.
> If you use Linux, it has very unpleasant behavior in fsync(); It
> metadata of the file being fsync-ed. We have to wait for the
> fsync when we do read(), write(), and even lseek().
> Almost all of your data is in the accounts table and it was stored in a
> file. All of transactions must wait for fsync to the single largest
> so you saw the bottleneck was in the fsync.
Oh, really, what an evil fsync is! Yes, I sometimes saw a backend
waiting for lseek() to complete when it committed. But why does the
backend which is syncing WAL/pg_control have to wait for syncing the
data file? They are, not to mention, different files, and WAL and
data files are stored on separate disks.
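
One rough way to observe this from user space is to time small
write+fsync calls on one file while another thread writes and fsyncs a
larger file. The sketch below is only an illustration: the function name
time_small_syncs and its parameters are hypothetical, both files live in
the same temporary directory (so, unlike my setup, on the same disk), and
nothing guarantees that the two syncs actually overlap on a given run.

```python
import os
import tempfile
import threading
import time


def time_small_syncs(big_bytes=8 * 1024 * 1024, rounds=5):
    """Time small write+fsync calls on one file while another thread
    writes and fsyncs a larger file. Returns latencies in seconds."""
    latencies = []
    with tempfile.TemporaryDirectory() as d:
        big_path = os.path.join(d, "data")   # stands in for a data file
        small_path = os.path.join(d, "wal")  # stands in for a WAL segment

        def sync_big():
            with open(big_path, "wb") as f:
                f.write(b"\0" * big_bytes)
                f.flush()
                os.fsync(f.fileno())

        worker = threading.Thread(target=sync_big)
        worker.start()
        with open(small_path, "wb") as f:
            for _ in range(rounds):
                start = time.monotonic()
                f.write(b"x" * 512)
                f.flush()
                os.fsync(f.fileno())
                latencies.append(time.monotonic() - start)
        worker.join()
    return latencies


if __name__ == "__main__":
    for i, seconds in enumerate(time_small_syncs()):
        print(f"small fsync {i}: {seconds * 1000:.2f} ms")
```

Comparing these latencies with a run where the large write is skipped
would give a first hint of how much one file's fsync delays another's on
a particular kernel and filesystem.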
>> I believe that the problem cannot be solved in a real sense by
>> avoiding fsync/fdatasync().
> I think so, too. However, I assume we can resolve a part of the
> checkpoint spikes with smoothing of write() alone.
First, what's the goal (numerically, if possible)? Have you explained
to community members why the patch would help many people? At least,
I haven't heard that fsync() can be seriously bad and we would close
our eyes to what fsync() does.
By the way, what good results did you get with DBT-2? If you don't
mind, can you show us?
> BTW, can we use the same way to fsync? We call fsync()s to all
> files without rest in mdsync(), but it's not difficult at all to insert
> sleeps between fsync()s. Do you think it helps us? One of the issues is
> that we have to sleep per file, which may be too coarse a granularity.
No, it definitely won't help us. There is no reason why it would help.
It might help in some limited environments, but then how can we
characterize such environments? Can we say "our approach helps our
environments, but it won't help you. The kernel VM settings may help
you. Good luck!"?
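
For concreteness, the quoted idea of sleeping between fsync()s over a
list of files could be sketched roughly as below. This is Python rather
than the C of mdsync(), and paced_fsync, paths and pause_s are
hypothetical names for illustration, not anything in the PostgreSQL
source. Note that the pause only spaces the calls out; each fsync() still
has to flush whatever is dirty for its own file, however large.

```python
import os
import time


def paced_fsync(paths, pause_s=0.1, sleep=time.sleep):
    """fsync each file in turn, sleeping between files (not after the last).

    Returns the per-file fsync durations in seconds.
    """
    durations = []
    for i, path in enumerate(paths):
        fd = os.open(path, os.O_RDWR)
        try:
            start = time.monotonic()
            os.fsync(fd)
            durations.append(time.monotonic() - start)
        finally:
            os.close(fd)
        # Pause only between files, which is the coarse per-file granularity.
        if i < len(paths) - 1:
            sleep(pause_s)
    return durations
```

With one very large file, as with the accounts table here, the list has
effectively one expensive entry and the sleeps change nothing for it.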
We have to consider this seriously. I think it's time to face the problem
and we should follow the approaches of experts like Jim Gray and DBMS
vendors, unless we have a new clever idea like them.