John, thanks a lot for your excellent reply. 

Especially, I think this sentence is very convincing, 

> "Well, you _can_ be a lot better since you know what
you're 
> doing. You can also be a _lot_ worse when you get it
wrong.

With such a high risk, probably I should try other
tricks to improve the system performance, before
rushing into the implementation of cache. 

thanks again,
Kan



--- John Haxby <[EMAIL PROTECTED]> wrote:

> Kan Deng wrote:
> 
> >1. Performance. 
> >
> >   Since all the cached disk data resides outside
> JVM
> >heap space, the access efficiency from Java object
> to
> >those cached data cannot be too high.
> >  
> >
> True, but you need to compare the relative speeds.  
> If data has to be 
> pulled from a file, then you're talking several
> milliseconds to fetch 
> from the disk.  If it's in the OS's cache (and here
> I'm rather assuming 
> Linux since that's what I know about) you're talking
> about microseconds 
> rather than milliseconds to fetch the data from the
> OS.   Once the data 
> is in the JVM, but not in the CPU cache, then you're
> down to nanosecods 
> to get the data from main memory (how many depends
> on the hardware; some 
> platforms take a while to get the data moving but
> when it comes, it's 
> very quick; some systems are fast to get going but
> don't have the 
> throughput).   It's not the absolute times that are
> important though: 
> once you've got the data in the OS's cache then
> things like network 
> latency, display update speed and scheduling
> overheads begin to make 
> themselves felt and you won't make these any less by
> getting data into 
> the JVM's memory.   Well, not much anyway.
> 
> >2. Volatile.
> >
> >   Since the OS caches the disk data in a common
> area
> >shared by multiple processes, but not only JVM. If
> >there are other processes doing disk IO at the same
> >time, chances are the cached Lucene index data from
> >disk may be wiped. 
> >  
> >
> What you can do by hanging on to a lot of memory is
> make the overall 
> machine performance worse.  In fact by denying other
> processes memory, 
> you're going to force up the I/O rate and when you
> do need to go to the 
> disk then it'll take much longer -- net result,
> things run slower.    
> Generally speaking, because the OS has a more
> holistic view of resource 
> management, you'll get better overall performance.
> 
> >Therefore, a more reliable and efficient cache
> should
> >reside inside JVM heap space. But due to the
> crowded
> >JVM heap space, we have to manually "evict" the
> less
> >frequently used data from the cache. 
> >  
> >
> It's that last sentence that is the critical one.  
> Yes, you can do your 
> own cache management, but how much better are you
> going to be than the 
> OS?    Well, you _can_ be a lot better since you
> know what you're 
> doing.   You can also be a _lot_ worse when you get
> it wrong.   Choosing 
> the right point to flush data from the cache
> ("evict") is not all that 
> straightforward: the OS buffer cache was introduced
> into BSD unix in the 
> early '80s and we're still seeing work going on to
> improve the basic 
> strategy 20-odd years later.
> 
> If you find that you're spending an inordinate
> amount of time waiting 
> for I/O for the index from the OS, then that it the
> time to start 
> looking at caching strategies.   My own feeling is
> that you're going to 
> find easier things to fix before you get that far.
> 
> >Did I mis-understand anything?
> >  
> >
> Probably not, it's just that performance is more of
> an holistic approach 
> and an obvious, isolated, change isn't going to have
> the effect that you 
> want.
> 
> jch
> 
> 
>
---------------------------------------------------------------------
> To unsubscribe, e-mail:
> [EMAIL PROTECTED]
> For additional commands, e-mail:
> [EMAIL PROTECTED]
> 
> 


__________________________________________________
Do You Yahoo!?
Tired of spam?  Yahoo! Mail has the best spam protection around 
http://mail.yahoo.com 

---------------------------------------------------------------------
To unsubscribe, e-mail: [EMAIL PROTECTED]
For additional commands, e-mail: [EMAIL PROTECTED]

Reply via email to