RE: How to setup SimpleFSDirectoryFactory

2012-07-23 Thread Uwe Schindler
Hi Geetha Anjali,

Lucene will not use MMapDirectory by default on 32-bit platforms or if you
are not using an Oracle/Sun JVM. On 64-bit platforms Lucene will use it and
accepts the risk of segfaulting when unmapping the buffers - Lucene does try
its best to prevent this. It is a risk, but one accepted by the Lucene
developers.
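
For readers who want the rule above spelled out, here is a minimal sketch in plain Java. It is not Lucene code: the class, enum, and method names are hypothetical, the 64-bit detection is best effort, and the fallback choices (SimpleFSDirectory on Windows, NIOFSDirectory elsewhere) are an assumption that should be checked against the FSDirectory.open() source of the Lucene version in use.

```java
/**
 * Sketch of the default-selection rule described in the mail above.
 * NOT Lucene code: all names here are hypothetical.
 */
final class DirectoryChoiceSketch {

    enum Kind { MMAP, SIMPLE_FS, NIO_FS }

    /**
     * @param is64BitJvm     whether the JVM has a 64-bit address space
     * @param unmapSupported whether the JVM supports the unmap workaround
     *                       (per the mail, only on Oracle/Sun JVMs)
     * @param isWindows      whether the operating system is Windows
     */
    static Kind choose(boolean is64BitJvm, boolean unmapSupported, boolean isWindows) {
        if (is64BitJvm && unmapSupported) {
            return Kind.MMAP;
        }
        // Assumption: fallback choices, not confirmed by the thread.
        return isWindows ? Kind.SIMPLE_FS : Kind.NIO_FS;
    }

    /** Best-effort 64-bit detection from standard system properties. */
    static boolean looksLike64Bit() {
        // "sun.arch.data.model" is HotSpot-specific and may be absent.
        String model = System.getProperty("sun.arch.data.model");
        if (model != null) {
            return model.contains("64");
        }
        return System.getProperty("os.arch", "").contains("64");
    }

    public static void main(String[] args) {
        boolean windows = System.getProperty("os.name", "").startsWith("Windows");
        // unmapSupported is passed as true purely for illustration.
        System.out.println(choose(looksLike64Bit(), true, windows));
    }
}
```

Passing the three conditions in as booleans keeps the decision itself easy to test, independent of the machine it runs on.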

To come back to your issue: it is perfectly fine for Solr/Lucene not to unmap
the buffers as long as the index is open. The number of open file handles is
another discussion and not related to MMap at all. If you are using an old
Lucene version (like 3.0.2), you should upgrade in any case. The most recent
release is 3.6.1.

Uwe

-
Uwe Schindler
H.-H.-Meier-Allee 63, D-28213 Bremen
http://www.thetaphi.de
eMail: u...@thetaphi.de


Re: How to setup SimpleFSDirectoryFactory

2012-07-23 Thread geetha anjali
Thanks a lot Uwe, I will check this in the new 3.6.1.



Re: RE: How to setup SimpleFSDirectoryFactory

2012-07-22 Thread geetha anjali
It happens in 3.6; for this reason I thought of moving to Solandra.
If I do a commit, all the documents are persisted without any issues.
There are no issues in terms of functionality. The only thing that happens is
that physical RAM usage increases: it goes higher and higher, stops at the
maximum, and never comes down.

Thanks

On Sun, Jul 22, 2012 at 3:38 AM, Lance Norskog goks...@gmail.com wrote:

 Interesting. Which version of Solr is this? What happens if you do a
 commit?

 On Sat, Jul 21, 2012 at 8:01 AM, geetha anjali anjaliprabh...@gmail.com
 wrote:
  Hi Uwe,
  Great to know. We index files at a rate of 1/min. After 30 mins I see all of
  my physical memory reported as 100 percent used (Windows). On deeper
  investigation I found that mmap is not releasing OS file handles. Do you
  see this behaviour as well?
 
  Thanks
 
  On 20 Jul 2012 14:04, Uwe Schindler u...@thetaphi.de wrote:
 
  Hi Bill,
 
  MMapDirectory uses the file system cache of your operating system, which has
  the following consequences: on Linux, top and free should normally report
  only *little* free memory, because the O/S uses all memory not allocated by
  applications to cache disk I/O (and shows it as allocated, so having 0% free
  memory is just normal on Linux and also on Windows). If other applications,
  or Lucene/Solr itself, allocate lots of heap space or malloc() a lot, they
  reduce free physical memory and therefore the FS cache. This also depends on
  your swappiness parameter: if swappiness is higher, inactive processes are
  swapped out more easily (the default is 60 on Linux), freeing more space for
  the FS cache. The downside is of course that in-memory structures of Lucene
  and other applications may get paged out.

  You will only see no paging at all if all memory allocated by all
  applications plus all mmapped files fits into memory. But paging the mmapped
  Lucene index in and out is much cheaper than using SimpleFSDirectory or
  NIOFSDirectory. If you use SimpleFS or NIO and your index is not in the FS
  cache, it will also be read from physical disk again, so where is the
  difference? Paging is actually cheaper, as no syscalls are involved.

  If you want as much as possible of your index in physical RAM, copy it to
  /dev/null regularly and buy more RUM :-)
 
 
  -
  Uwe Schindler
  H.-H.-Meier-Allee 63, D-28213 Bremen
  http://www.thetaphi.de
  eMail: uwe@thetaphi...
 
  From: Bill Bell [mailto:billnb...@gmail.com]
  Sent: Friday, July 20, 2012 5:17 AM
  Subject: Re: ...
  stop using it? The least used memory will be removed from the OS
  automatically? I see some paging. Wouldn't paging slow down the querying?

  My index is 10gb and every 8 hours we get most of it in shared memory. The
  memory is 99 percent used, and that does not leave any room for other
  apps.

  Other implications?

  Sent from my mobile device
  720-256-8076

  On Jul 19, 2012, at 9:49 A...
  Heap space or free system RAM:

   http://blog.thetaphi.de/2012/07/use-lucenes-mmapdirectory-on-64bit.html

   Uwe
  ...
   use it since you might run out of memory on large indexes right?

   Here is how I got SimpleFSDirectoryFactory to work. Just set
   -Dsolr.directoryFactor...
   set it all up with a helper in solrconfig.xml...

   if (Constants.WINDOWS) {
   if (MMapDirectory.UNMAP_SUPPORTED && Constants.JRE_IS_64...



 --
 Lance Norskog
 goks...@gmail.com



RE: RE: How to setup SimpleFSDirectoryFactory

2012-07-22 Thread Uwe Schindler
Hi,

It seems that both of you simply don't understand what's happening in your
operating system kernel. Please read the blog post again!

 It happens in 3.6; for this reason I thought of moving to Solandra.
 If I do a commit, all the documents are persisted without any issues.
 There are no issues in terms of functionality. The only thing that happens
 is that physical RAM usage increases: it goes higher and higher, stops at
 the maximum, and never comes down.

This is perfectly fine on Windows and Linux (and any other operating
system). If an operating system did not use *all* available physical
memory, it would waste costly hardware resources. Why not use resources that
are otherwise unused? As said before:

The O/S kernel uses *all* available physical RAM for caching file system
accesses. The memory used for that is always reported as not free, because
it is in use (very simple, right?). But if some other application wants to
use it, it is free for malloc(), so it is not permanently occupied. That is
always the case, whether you use MMapDirectory or not (the same holds for
SimpleFSDirectory or NIOFSDirectory).

Of course, when you have freshly booted your kernel, it reports free memory,
but definitely not on a server that has been running 24/7 for weeks.

For all people who don't want to understand that, here is the easy
explanation page:
http://www.linuxatemyram.com/
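
To make the "reported as not free, but still available" point concrete, here is a minimal sketch that adds up the free, buffer, and page-cache figures from text in the format of Linux's /proc/meminfo. The sample numbers are invented for illustration, the class and method names are hypothetical, and the sum is only a rough estimate of what applications could claim (newer kernels expose their own MemAvailable figure, which is more precise).

```java
import java.util.HashMap;
import java.util.Map;

/** Sketch: how little "free" memory can coexist with plenty of reclaimable cache. */
final class MemInfoSketch {

    // Invented sample in /proc/meminfo format; not measured on a real machine.
    static final String SAMPLE =
            "MemTotal:       16384000 kB\n"
          + "MemFree:          163840 kB\n"
          + "Buffers:          204800 kB\n"
          + "Cached:         12288000 kB\n";

    /** Parses "Key:   12345 kB" lines into a map of key to kilobytes. */
    static Map<String, Long> parse(String meminfo) {
        Map<String, Long> values = new HashMap<>();
        for (String line : meminfo.split("\n")) {
            String[] parts = line.trim().split("\\s+");
            if (parts.length >= 2 && parts[0].endsWith(":")) {
                String key = parts[0].substring(0, parts[0].length() - 1);
                values.put(key, Long.parseLong(parts[1]));
            }
        }
        return values;
    }

    /** Rough estimate: truly free memory plus buffers plus page cache. */
    static long reclaimableKb(Map<String, Long> m) {
        return m.getOrDefault("MemFree", 0L)
             + m.getOrDefault("Buffers", 0L)
             + m.getOrDefault("Cached", 0L);
    }

    public static void main(String[] args) {
        Map<String, Long> m = parse(SAMPLE);
        System.out.println("free kB:        " + m.get("MemFree"));
        System.out.println("reclaimable kB: " + reclaimableKb(m));
    }
}
```

In the sample, only about 1% of RAM is "free", yet roughly three quarters of it is cache that the kernel can hand back to applications on demand.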

   all of my physical memory reported as 100 percent used (Windows). On deeper
   investigation I found that mmap is not releasing OS file handles. Do
   you see this behaviour as well?

One comment: the file handles are not freed as long as the index is open.
The number of used file handles has nothing to do with memory mapping; the
two are completely unrelated.

Uwe

 



Re: How to setup SimpleFSDirectoryFactory

2012-07-22 Thread Bill Bell
I get a similar situation using Windows 2008 and Solr 3.6. Memory used by mmap
is never released, even if I turn off traffic, commit, and do a manual GC. If
the size of the index is 3gb, then memory used will be heap + 3gb of shared
memory. If I use a 6gb index I get heap + 6gb. If I turn off
MMapDirectoryFactory it goes back down. When is the MMap supposed to release
memory? It only does it on JVM restart now.

Bill Bell
Sent from mobile


 


RE: How to setup SimpleFSDirectoryFactory

2012-07-22 Thread Uwe Schindler
It is hopeless to talk to both of you, you don't understand virtual memory:

 I get a similar situation using Windows 2008 and Solr 3.6. Memory used by
 mmap is never released, even if I turn off traffic, commit, and do a manual
 GC. If the size of the index is 3gb, then memory used will be heap + 3gb of
 shared memory. If I use a 6gb index I get heap + 6gb.

That is expected, but we are not talking about allocated physical memory; we
are talking about allocated ADDRESS SPACE, and you have 2^47 bytes of that on
64-bit platforms. No physical memory is wasted or allocated - please read the
blog post a third, fourth, fifth... or tenth time, until it is obvious. You
should also go back to school and take a course on system programming and
operating system kernels. Every CS student is taught that in the first
year (at least in Germany).

Java's GC has nothing to do with that - as long as the index is open,
ADDRESS SPACE is assigned. We are talking about neither physical memory nor
Java heap space.
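
To put the 2^47 figure in proportion, here is a small arithmetic sketch. It assumes the "3gb" and "6gb" index sizes mentioned in the thread are binary gigabytes (GiB); the class and method names are hypothetical.

```java
/** Sketch: how much of a 2^47-byte address space a mapped index occupies. */
final class AddressSpaceSketch {

    /** 2^47 bytes, the figure quoted in the mail for 64-bit platforms. */
    static final long ADDRESS_SPACE_BYTES = 1L << 47;

    static final long GIB = 1L << 30;
    static final long TIB = 1L << 40;

    /** Number of complete mappings of an index of the given size that would fit. */
    static long indexesThatFit(long indexSizeGiB) {
        return ADDRESS_SPACE_BYTES / (indexSizeGiB * GIB);
    }

    public static void main(String[] args) {
        System.out.println(ADDRESS_SPACE_BYTES / TIB + " TiB of address space");
        System.out.println(indexesThatFit(3) + " mappings of a 3 GiB index would fit");
        System.out.println(indexesThatFit(6) + " mappings of a 6 GiB index would fit");
    }
}
```

2^47 bytes is 128 TiB, so mapping a 6 GiB index uses well under a hundredth of a percent of the available address space, regardless of how much physical RAM the machine has.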

 If I turn off MMapDirectoryFactory it goes back down. When is the MMap
 supposed to release memory? It only does it on JVM restart now.

Can you please stop spreading nonsense about MMapDirectory without any
knowledge behind it? http://www.linuxatemyram.com/ - this also applies to
Windows.

Uwe

 




Re: How to setup SimpleFSDirectoryFactory

2012-07-22 Thread geetha anjali
Hi Uwe,
Thanks Uwe. Have you checked the bug in the JRE that affects MMapDirectory?
This is what I was referring to. It is posted on the Oracle site and in the
API doc. They accept this as a bug; have you seen it?

“MMapDirectory
(http://lucene.apache.org/java/3_0_2/api/core/org/apache/lucene/store/MMapDirectory.html)
uses memory-mapped IO when reading. This is a good choice if you have
plenty of virtual memory relative to your index size, eg if you are running
on a 64 bit JRE, or you are running on a 32 bit JRE but your index sizes
are small enough to fit into the virtual memory space. Java has currently
the limitation of not being able to unmap files from user code. The files
are unmapped, when GC releases the byte buffers. *Due to this bug
(http://bugs.sun.com/bugdatabase/view_bug.do?bug_id=4724038) in Sun's JRE,
MMapDirectory's IndexInput.close()
(http://lucene.apache.org/java/3_0_2/api/core/org/apache/lucene/store/IndexInput.html#close%28%29)
is unable to close the underlying OS file handle. Only when GC finally
collects the underlying objects, which could be quite some time later, will
the file handle be closed*. *This will consume additional transient disk
usage*: on Windows, attempts to delete or overwrite the files will result
in an exception; on other platforms, which typically have a delete on last
close semantics, while such operations will succeed, the bytes are still
consuming space on disk. For many applications this limitation is not a
problem (e.g. if you have plenty of disk space, and you don't rely on
overwriting files on Windows) but it's still an important limitation to be
aware of. This class supplies a (possibly dangerous) workaround mentioned
in the bug report, which may fail on non-Sun JVMs.”


Thanks,
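
The limitation quoted above can be observed with the plain JDK, without Lucene. The following is a minimal sketch, assuming a small temporary file stands in for an index file; the class and method names are hypothetical. It relies on the documented behaviour of FileChannel.map() that a mapping stays valid after the channel that created it is closed, and on the fact that the public API offers no call to unmap it.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.MappedByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

/** Sketch: a mapped region outlives the FileChannel that created it. */
final class MappingLifetimeSketch {

    /** Writes content to a temp file, maps it, closes the channel, then reads one byte. */
    static byte readAfterChannelClose(byte[] content, int index) {
        try {
            Path file = Files.createTempFile("mmap-sketch", ".bin");
            // Best effort: on Windows the delete may fail while the file is mapped.
            file.toFile().deleteOnExit();
            Files.write(file, content);

            MappedByteBuffer buffer;
            try (FileChannel channel = FileChannel.open(file, StandardOpenOption.READ)) {
                buffer = channel.map(FileChannel.MapMode.READ_ONLY, 0, channel.size());
            }
            // The channel is closed here, yet the mapping is still readable.
            // It is released only when the buffer object is garbage collected.
            return buffer.get(index);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(readAfterChannelClose(new byte[] {10, 20, 30}, 1));
    }
}
```

This is why closing an input alone does not free the mapping, and why deleting or overwriting a still-mapped file fails on Windows.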



Re: RE: How to setup SimpleFSDirectoryFactory

2012-07-21 Thread geetha anjali
Hi Uwe,
Great to know. We have files indexing 1/min. After 30 mins I see all
my physical memory reported as 100 percent used (Windows). On deep
investigation I found that mmap is not releasing OS file handles. Do you
find this behaviour?

Thanks

On 20 Jul 2012 14:04, Uwe Schindler u...@thetaphi.de wrote:

Hi Bill,

MMapDirectory uses the file system cache of your operating system, which has
the following consequences: In Linux, top and free should normally report only
*little* free memory, because the O/S uses all memory not allocated by
applications to cache disk I/O (and shows it as allocated, so having 0% free
memory is just normal on Linux and also Windows). If you have other
applications or Lucene/Solr itself that allocate lots of heap space or
malloc() a lot, then you are reducing free physical memory, so reducing fs
cache. This also depends on your swappiness parameter (if swappiness is
higher, inactive processes are swapped out more easily; the default is 60 on
Linux), which frees more space for FS cache. The downside is of course that
in-memory structures of Lucene and other applications may get paged out.

You will only see no paging at all if all memory allocated by all applications
plus all mmapped files fit into memory. But paging in/out the mmapped Lucene
index is much cheaper than using SimpleFSDirectory or NIOFSDirectory. If
you use SimpleFS or NIO and your index is not in FS cache, it will also be
read from physical disk again, so where is the difference? Paging is actually
cheaper as no syscalls are involved.

If you want as much as possible of your index in physical RAM, copy it to
/dev/null regularly and buy more RUM :-)


-
Uwe Schindler
H.-H.-Meier-Allee 63, D-28213 Bremen
http://www.thetaphi.de
eMail: uwe@thetaphi...



Re: RE: How to setup SimpleFSDirectoryFactory

2012-07-21 Thread Lance Norskog
Interesting. Which version of Solr is this? What happens if you do a commit?

On Sat, Jul 21, 2012 at 8:01 AM, geetha anjali anjaliprabh...@gmail.com wrote:
 Hi Uwe,
 Great to know. We have files indexing 1/min. After 30 mins I see all
 my physical memory reported as 100 percent used (Windows). On deep
 investigation I found that mmap is not releasing OS file handles. Do you
 find this behaviour?

 Thanks

 On 20 Jul 2012 14:04, Uwe Schindler u...@thetaphi.de wrote:

 Hi Bill,

 MMapDirectory uses the file system cache of your operating system, which has
 the following consequences: In Linux, top and free should normally report only
 *little* free memory, because the O/S uses all memory not allocated by
 applications to cache disk I/O (and shows it as allocated, so having 0% free
 memory is just normal on Linux and also Windows). If you have other
 applications or Lucene/Solr itself that allocate lots of heap space or
 malloc() a lot, then you are reducing free physical memory, so reducing fs
 cache. This also depends on your swappiness parameter (if swappiness is
 higher, inactive processes are swapped out more easily; the default is 60 on
 Linux), which frees more space for FS cache. The downside is of course that
 in-memory structures of Lucene and other applications may get paged out.

 You will only see no paging at all if all memory allocated by all applications
 plus all mmapped files fit into memory. But paging in/out the mmapped Lucene
 index is much cheaper than using SimpleFSDirectory or NIOFSDirectory. If
 you use SimpleFS or NIO and your index is not in FS cache, it will also be
 read from physical disk again, so where is the difference? Paging is actually
 cheaper as no syscalls are involved.

 If you want as much as possible of your index in physical RAM, copy it to
 /dev/null regularly and buy more RUM :-)


 -
 Uwe Schindler
 H.-H.-Meier-Allee 63, D-28213 Bremen
 http://www.thetaphi.de
 eMail: uwe@thetaphi...




-- 
Lance Norskog
goks...@gmail.com


RE: How to setup SimpleFSDirectoryFactory

2012-07-20 Thread Uwe Schindler
Hi Bill,

MMapDirectory uses the file system cache of your operating system, which has
the following consequences: In Linux, top and free should normally report only
*little* free memory, because the O/S uses all memory not allocated by
applications to cache disk I/O (and shows it as allocated, so having 0% free
memory is just normal on Linux and also Windows). If you have other
applications or Lucene/Solr itself that allocate lots of heap space or
malloc() a lot, then you are reducing free physical memory, so reducing fs
cache. This also depends on your swappiness parameter (if swappiness is
higher, inactive processes are swapped out more easily; the default is 60 on
Linux), which frees more space for FS cache. The downside is of course that
in-memory structures of Lucene and other applications may get paged out.
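
To make the "0% free is normal" point concrete, here is a minimal sketch that
contrasts the naive "used" percentage with the share actually held by
applications. It assumes the Linux /proc/meminfo text format ("Key: value kB");
the class name MemInfoSketch and the sample numbers are hypothetical and are
not taken from this thread. Pages of a mmapped index show up under Cached,
which the kernel can reclaim when applications need memory.

```java
import java.util.HashMap;
import java.util.Map;

final class MemInfoSketch {

    /** Parses lines of the form "Key:   12345 kB" into a map of key -> kilobytes. */
    static Map<String, Long> parse(String meminfo) {
        Map<String, Long> kb = new HashMap<>();
        for (String line : meminfo.split("\n")) {
            String[] parts = line.trim().split("\\s+");
            if (parts.length >= 2 && parts[0].endsWith(":")) {
                String key = parts[0].substring(0, parts[0].length() - 1);
                kb.put(key, Long.parseLong(parts[1]));
            }
        }
        return kb;
    }

    /** What a task manager tends to show: everything that is not MemFree counts as used. */
    static long naiveUsedPercent(Map<String, Long> kb) {
        long total = kb.get("MemTotal");
        return 100 * (total - kb.get("MemFree")) / total;
    }

    /** Treats buffers and page cache as reclaimable, so only application memory counts. */
    static long applicationUsedPercent(Map<String, Long> kb) {
        long total = kb.get("MemTotal");
        long reclaimable = kb.get("MemFree") + kb.get("Buffers") + kb.get("Cached");
        return 100 * (total - reclaimable) / total;
    }

    public static void main(String[] args) {
        // Hypothetical sample values, in kB.
        String sample = "MemTotal:       16000000 kB\n"
                      + "MemFree:          160000 kB\n"
                      + "Buffers:          240000 kB\n"
                      + "Cached:         11600000 kB\n";
        Map<String, Long> kb = parse(sample);
        System.out.println("naive used %:       " + naiveUsedPercent(kb));
        System.out.println("application used %: " + applicationUsedPercent(kb));
    }
}
```

With these sample numbers the machine looks 99 percent full, although only a
quarter of it is held by applications; the rest is cache.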

You will only see no paging at all if all memory allocated by all applications
plus all mmapped files fit into memory. But paging in/out the mmapped Lucene
index is much cheaper than using SimpleFSDirectory or NIOFSDirectory. If
you use SimpleFS or NIO and your index is not in FS cache, it will also be
read from physical disk again, so where is the difference? Paging is actually
cheaper as no syscalls are involved.
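
The difference between the two access paths can be sketched with plain
java.nio, without Lucene. This is only an illustration under assumptions: the
class MmapVsReadSketch and its methods are hypothetical names, and the code is
not how MMapDirectory or NIOFSDirectory are implemented. The mapped variate
reads bytes as ordinary memory accesses once the mapping exists, while the
buffered variant issues a read call on the channel for every chunk it copies
into the heap.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.ByteBuffer;
import java.nio.MappedByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

final class MmapVsReadSketch {

    /** Sums all bytes through a memory mapping; after map(), get() is a plain memory access. */
    static long sumMapped(Path file) throws IOException {
        try (FileChannel ch = FileChannel.open(file, StandardOpenOption.READ)) {
            MappedByteBuffer buf = ch.map(FileChannel.MapMode.READ_ONLY, 0, ch.size());
            long sum = 0;
            while (buf.hasRemaining()) {
                sum += buf.get() & 0xFF;
            }
            return sum;
        }
    }

    /** Sums all bytes by copying chunks into a heap buffer; each ch.read() is a system call. */
    static long sumBuffered(Path file) throws IOException {
        try (FileChannel ch = FileChannel.open(file, StandardOpenOption.READ)) {
            ByteBuffer buf = ByteBuffer.allocate(8192);
            long sum = 0;
            while (ch.read(buf) != -1) {
                buf.flip();
                while (buf.hasRemaining()) {
                    sum += buf.get() & 0xFF;
                }
                buf.clear();
            }
            return sum;
        }
    }

    /** Writes a temporary file of the given size and returns {mapped sum, buffered sum}. */
    static long[] demo(int size) {
        try {
            Path file = Files.createTempFile("mmap-sketch", ".bin");
            // Delete at JVM exit: a file that is still mapped cannot be deleted on Windows.
            file.toFile().deleteOnExit();
            byte[] data = new byte[size];
            for (int i = 0; i < size; i++) {
                data[i] = (byte) i;
            }
            Files.write(file, data);
            return new long[] { sumMapped(file), sumBuffered(file) };
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        long[] sums = demo(1 << 20);
        System.out.println("mapped=" + sums[0] + " buffered=" + sums[1]);
    }
}
```

Both variants end up reading the same pages from the file system cache; the
mapped one just skips the copy into a heap buffer.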

If you want as much as possible of your index in physical RAM, copy it to
/dev/null regularly and buy more RUM :-)

-
Uwe Schindler
H.-H.-Meier-Allee 63, D-28213 Bremen
http://www.thetaphi.de
eMail: u...@thetaphi.de

 -Original Message-
 From: Bill Bell [mailto:billnb...@gmail.com]
 Sent: Friday, July 20, 2012 5:17 AM
 Subject: Re: How to setup SimpleFSDirectoryFactory
 
 Thanks. Are you saying that if we run low on memory, the MMapDirectory
 will stop using it? The least used memory will be removed from the OS
 automatically? I see some paging. Wouldn't paging slow down the querying?
 
 My index is 10gb and every 8 hours we get most of it in shared memory. The
 memory is 99 percent used, and that does not leave any room for other
 apps.
 
 Other implications?
 
 Sent from my mobile device
 720-256-8076
 
 On Jul 19, 2012, at 9:49 AM, Uwe Schindler u...@thetaphi.de wrote:
 
  Read this, then you will see that MMapDirectory will use 0% of your Java
  Heap space or free system RAM:
 
  http://blog.thetaphi.de/2012/07/use-lucenes-mmapdirectory-on-64bit.html
 
  Uwe
 
  -
  Uwe Schindler
  H.-H.-Meier-Allee 63, D-28213 Bremen
  http://www.thetaphi.de
  eMail: u...@thetaphi.de
 
 
  -Original Message-
  From: William Bell [mailto:billnb...@gmail.com]
  Sent: Tuesday, July 17, 2012 6:05 AM
  Subject: How to setup SimpleFSDirectoryFactory
 
  We all know that MMapDirectory is fastest. However we cannot always
  use it since you might run out of memory on large indexes, right?
 
  Here is how I got SimpleFSDirectoryFactory to work. Just set
  -Dsolr.directoryFactory=solr.SimpleFSDirectoryFactory.
 
  Your solrconfig.xml:
 
  <directoryFactory name="DirectoryFactory"
  class="${solr.directoryFactory:solr.StandardDirectoryFactory}"/>
 
  You can check it with http://localhost:8983/solr/admin/stats.jsp
 
  Notice that the default for Windows 64bit is MMapDirectory. Otherwise
  it is NIOFSDirectory, except on Windows, where it is SimpleFSDirectory.
  It would be nicer if we just set it all up with a helper in
  solrconfig.xml...
 
  if (Constants.WINDOWS) {
    // 64-bit Windows with working unmap: memory-map the index
    if (MMapDirectory.UNMAP_SUPPORTED && Constants.JRE_IS_64BIT)
      return new MMapDirectory(path, lockFactory);
    else
      return new SimpleFSDirectory(path, lockFactory);
  } else {
    return new NIOFSDirectory(path, lockFactory);
  }
 
 
 
  --
  Bill Bell
  billnb...@gmail.com
  cell 720-256-8076
 
 




RE: How to setup SimpleFSDirectoryFactory

2012-07-19 Thread Uwe Schindler
Read this, then you will see that MMapDirectory will use 0% of your Java Heap 
space or free system RAM:

http://blog.thetaphi.de/2012/07/use-lucenes-mmapdirectory-on-64bit.html
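
A small sketch of why a mapping does not consume Java heap, using only
java.nio (the class name MappedBufferSketch is hypothetical, and this is not
Lucene code): the buffer returned by FileChannel.map is a direct buffer with no
backing byte[] on the heap, so its contents live in address space served by
the file system cache.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.MappedByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

final class MappedBufferSketch {

    /** Maps a temporary file of the given size and returns {isDirect, hasArray}. */
    static boolean[] inspect(int size) {
        try {
            Path file = Files.createTempFile("mapped-buffer-sketch", ".bin");
            // Delete at JVM exit: a file that is still mapped cannot be deleted on Windows.
            file.toFile().deleteOnExit();
            Files.write(file, new byte[size]);
            try (FileChannel ch = FileChannel.open(file, StandardOpenOption.READ)) {
                MappedByteBuffer buf = ch.map(FileChannel.MapMode.READ_ONLY, 0, ch.size());
                // A direct buffer lives outside the heap; hasArray() reports a heap byte[].
                return new boolean[] { buf.isDirect(), buf.hasArray() };
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        boolean[] r = inspect(4096);
        System.out.println("isDirect=" + r[0] + " hasArray=" + r[1]);
    }
}
```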

Uwe

-
Uwe Schindler
H.-H.-Meier-Allee 63, D-28213 Bremen
http://www.thetaphi.de
eMail: u...@thetaphi.de


 -Original Message-
 From: William Bell [mailto:billnb...@gmail.com]
 Sent: Tuesday, July 17, 2012 6:05 AM
 Subject: How to setup SimpleFSDirectoryFactory
 
 We all know that MMapDirectory is fastest. However we cannot always use it
 since you might run out of memory on large indexes right?
 
 Here is how I got SimpleFSDirectoryFactory to work. Just set
 -Dsolr.directoryFactory=solr.SimpleFSDirectoryFactory.
 
 Your solrconfig.xml:
 
 <directoryFactory name="DirectoryFactory"
 class="${solr.directoryFactory:solr.StandardDirectoryFactory}"/>
 
 You can check it with http://localhost:8983/solr/admin/stats.jsp
 
 Notice that the default for Windows 64bit is MMapDirectory. Otherwise
 it is NIOFSDirectory, except on Windows, where it is SimpleFSDirectory.
 It would be nicer if we just set it all up with a helper in
 solrconfig.xml...
 
 if (Constants.WINDOWS) {
   // 64-bit Windows with working unmap: memory-map the index
   if (MMapDirectory.UNMAP_SUPPORTED && Constants.JRE_IS_64BIT)
     return new MMapDirectory(path, lockFactory);
   else
     return new SimpleFSDirectory(path, lockFactory);
 } else {
   return new NIOFSDirectory(path, lockFactory);
 }
 
 
 
 --
 Bill Bell
 billnb...@gmail.com
 cell 720-256-8076




Re: How to setup SimpleFSDirectoryFactory

2012-07-19 Thread Bill Bell
Thanks. Are you saying that if we run low on memory, the MMapDirectory will 
stop using it? The least used memory will be removed from the OS automatically? 
I see some paging. Wouldn't paging slow down the querying?

My index is 10gb and every 8 hours we get most of it in shared memory. The 
memory is 99 percent used, and that does not leave any room for other apps. 

Other implications?

Sent from my mobile device
720-256-8076

On Jul 19, 2012, at 9:49 AM, Uwe Schindler u...@thetaphi.de wrote:

 Read this, then you will see that MMapDirectory will use 0% of your Java Heap 
 space or free system RAM:
 
 http://blog.thetaphi.de/2012/07/use-lucenes-mmapdirectory-on-64bit.html
 
 Uwe
 
 -
 Uwe Schindler
 H.-H.-Meier-Allee 63, D-28213 Bremen
 http://www.thetaphi.de
 eMail: u...@thetaphi.de
 
 
 -Original Message-
 From: William Bell [mailto:billnb...@gmail.com]
 Sent: Tuesday, July 17, 2012 6:05 AM
 Subject: How to setup SimpleFSDirectoryFactory
 
 We all know that MMapDirectory is fastest. However we cannot always use it
 since you might run out of memory on large indexes right?
 
 Here is how I got SimpleFSDirectoryFactory to work. Just set
 -Dsolr.directoryFactory=solr.SimpleFSDirectoryFactory.
 
 Your solrconfig.xml:
 
 <directoryFactory name="DirectoryFactory"
 class="${solr.directoryFactory:solr.StandardDirectoryFactory}"/>
 
 You can check it with http://localhost:8983/solr/admin/stats.jsp
 
 Notice that the default for Windows 64bit is MMapDirectory. Otherwise
 it is NIOFSDirectory, except on Windows, where it is SimpleFSDirectory.
 It would be nicer if we just set it all up with a helper in
 solrconfig.xml...
 
 if (Constants.WINDOWS) {
   // 64-bit Windows with working unmap: memory-map the index
   if (MMapDirectory.UNMAP_SUPPORTED && Constants.JRE_IS_64BIT)
     return new MMapDirectory(path, lockFactory);
   else
     return new SimpleFSDirectory(path, lockFactory);
 } else {
   return new NIOFSDirectory(path, lockFactory);
 }
 
 
 
 --
 Bill Bell
 billnb...@gmail.com
 cell 720-256-8076
 
 


How to setup SimpleFSDirectoryFactory

2012-07-16 Thread William Bell
We all know that MMapDirectory is fastest. However we cannot always
use it since you might run out of memory on large indexes, right?

Here is how I got SimpleFSDirectoryFactory to work. Just set
-Dsolr.directoryFactory=solr.SimpleFSDirectoryFactory.

Your solrconfig.xml:

<directoryFactory name="DirectoryFactory"
class="${solr.directoryFactory:solr.StandardDirectoryFactory}"/>

You can check it with http://localhost:8983/solr/admin/stats.jsp
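
The ${name:default} placeholder means "use the system property if it is set,
otherwise the default after the colon". A minimal sketch of that substitution
rule follows. It is not Solr's actual implementation: the class
PropertyDefaultSketch and its resolve method are hypothetical, and replacing a
placeholder that has neither a value nor a default with an empty string is an
assumption made for this example.

```java
import java.util.Properties;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

final class PropertyDefaultSketch {

    // Matches ${name} or ${name:default}; group 1 is the name, group 2 the optional default.
    private static final Pattern PLACEHOLDER =
            Pattern.compile("\\$\\{([^:}]+)(?::([^}]*))?\\}");

    /** Replaces each placeholder with the property value, falling back to its default. */
    static String resolve(String text, Properties props) {
        Matcher m = PLACEHOLDER.matcher(text);
        StringBuffer out = new StringBuffer();
        while (m.find()) {
            // Assumption of this sketch: no value and no default resolves to "".
            String fallback = m.group(2) == null ? "" : m.group(2);
            String value = props.getProperty(m.group(1), fallback);
            m.appendReplacement(out, Matcher.quoteReplacement(value));
        }
        m.appendTail(out);
        return out.toString();
    }

    public static void main(String[] args) {
        String attr = "${solr.directoryFactory:solr.StandardDirectoryFactory}";
        // Run with -Dsolr.directoryFactory=solr.SimpleFSDirectoryFactory to see the override.
        System.out.println(resolve(attr, System.getProperties()));
    }
}
```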

Notice that the default for Windows 64bit is MMapDirectory. Otherwise
it is NIOFSDirectory, except on Windows, where it is SimpleFSDirectory.
It would be nicer if we just set it all up with a helper in
solrconfig.xml...

if (Constants.WINDOWS) {
  // 64-bit Windows with working unmap: memory-map the index
  if (MMapDirectory.UNMAP_SUPPORTED && Constants.JRE_IS_64BIT)
    return new MMapDirectory(path, lockFactory);
  else
    return new SimpleFSDirectory(path, lockFactory);
} else {
  return new NIOFSDirectory(path, lockFactory);
}
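
The same decision can be tried out without Lucene on the classpath. The sketch
below is hypothetical (the class DirectoryChoiceSketch is not part of Lucene or
Solr): it returns the implementation name as a string instead of constructing
a directory, and the platform detection in main is an approximation based on
standard system properties, with unmap support simply assumed to be available.

```java
final class DirectoryChoiceSketch {

    /** Mirrors the branch structure of the snippet above, returning only the class name. */
    static String choose(boolean windows, boolean jre64Bit, boolean unmapSupported) {
        if (windows) {
            if (unmapSupported && jre64Bit) {
                return "MMapDirectory";
            }
            return "SimpleFSDirectory";
        }
        return "NIOFSDirectory";
    }

    public static void main(String[] args) {
        boolean windows = System.getProperty("os.name", "").startsWith("Windows");
        // Approximation: prefer sun.arch.data.model, else look for "64" in os.arch.
        String model = System.getProperty("sun.arch.data.model");
        boolean jre64Bit = model != null
                ? model.contains("64")
                : System.getProperty("os.arch", "").contains("64");
        // Assumption: unmapping is supported; this sketch does not probe for it.
        System.out.println(choose(windows, jre64Bit, true));
    }
}
```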



-- 
Bill Bell
billnb...@gmail.com
cell 720-256-8076