Re: how do lucene read large index files?

2016-11-29 Thread Kumaran Ramasubramanian
Thanks, Mike. We are planning to move to MMapDirectory for both indexing and
searching. Regarding the ulimit change and the reads during merging, I was just
trying to understand the impact of MMapDirectory during indexing.

-
Kumaran R


On Nov 30, 2016 4:18 AM, "Michael McCandless" <luc...@mikemccandless.com>
wrote:
>
> It's OK to use NIOFSDirectory for indexing only, in that nothing will
> break.
>
> But MMapDirectory already uses normal IO for writing
> (java.io.FileOutputStream), and indexing does sometimes need to
> read (for merging segments), though that's largely sequential reading,
> so perhaps NIOFSDirectory won't be much slower.
>
> Why not use MMapDirectory for both indexing and searching?
> Mike McCandless
>
> http://blog.mikemccandless.com
>

Re: how do lucene read large index files?

2016-11-29 Thread Michael McCandless
It's OK to use NIOFSDirectory for indexing only, in that nothing will break.

But MMapDirectory already uses normal IO for writing
(java.io.FileOutputStream), and indexing does sometimes need to
read (for merging segments), though that's largely sequential reading,
so perhaps NIOFSDirectory won't be much slower.

Why not use MMapDirectory for both indexing and searching?
Mike McCandless

http://blog.mikemccandless.com


On Mon, Nov 28, 2016 at 7:20 AM, Kumaran Ramasubramanian
<kums@gmail.com> wrote:
> Thanks a lot Uwe!!! Do we get any benefit on using MMapDirectory over
> NIOFSDir during indexing? During merging? Is it ok to change to
> MMapDirectory during search alone?
>
> --
> Kumaran R
>

Re: how do lucene read large index files?

2016-11-28 Thread Kumaran Ramasubramanian
Thanks a lot, Uwe! Do we get any benefit from using MMapDirectory over
NIOFSDir during indexing? During merging? Is it OK to switch to
MMapDirectory for searching alone?

--
Kumaran R


On Nov 24, 2016 11:27 PM, "Erick Erickson" <erickerick...@gmail.com> wrote:
>
> Thanks Uwe!


Re: how do lucene read large index files?

2016-11-24 Thread Erick Erickson
Thanks Uwe!





-
To unsubscribe, e-mail: java-user-unsubscr...@lucene.apache.org
For additional commands, e-mail: java-user-h...@lucene.apache.org



RE: how do lucene read large index files?

2016-11-24 Thread Uwe Schindler
Hi Kumaran, hi Erick,

> Not really, as I don't know that code well, Uwe and company
> are the masters of that realm ;)
> 
> Sorry I can't be more help there

I can help!

> On Thu, Nov 24, 2016 at 7:29 AM, Kumaran Ramasubramanian
> <kums@gmail.com> wrote:
> > Erick, Thanks a lot for sharing an excellent post...
> >
> > Btw, am using NIOFSDirectory, could you please elaborate on below
> > mentioned lines? or any further pointers?
> >
> >> NIOFSDirectory or SimpleFSDirectory, we have to pay another price: Our
> >> code has to do a lot of syscalls to the O/S kernel to copy blocks of data
> >> between the disk or filesystem cache and our buffers residing in Java
> >> heap. This needs to be done on every search request, over and over again.

The blog post puts it simply: you should use MMapDirectory and avoid
SimpleFSDir or NIOFSDir! It also explains why: the index inputs of SimpleFSDir
and NIOFSDir extend BufferedIndexInput. This class uses an on-heap buffer
(16 KiB) for reading index files. For some parts of the index (like doc
values), this is not ideal. E.g. if you sort against a doc values field and
Lucene needs to access a sort value (e.g. a short, integer or byte, which is
very small), it will ask the buffer for something like 4 bytes. In most cases
when sorting, the buffer will not contain those bytes, as sorting requires
random access over a huge file (so it is unlikely that the buffer will help).
BufferedIndexInput will then seek the NIO/Simple file pointer and read 16 KiB
into the buffer. This requires a syscall to the OS kernel, which is expensive.
While sorting search results this can happen millions or billions of times. In
addition, it copies chunks of memory between the operating system cache and
the Java heap over and over.
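
To make the cost described above more concrete, here is a rough toy model in plain Java. It is not Lucene code: the class `BufferRefillDemo` and its methods are made up for illustration, and it only tracks the first byte of each read. It counts how often a single 16 KiB buffer has to be refilled (each refill standing in for a seek plus a read syscall) for a sequential scan versus random access over a 4 GiB file.

```java
import java.util.Random;

/** Toy model (not Lucene code) of a reader with a single on-heap buffer. */
final class BufferRefillDemo {

    /** Counts how many times the buffer must be refilled to serve the given read positions. */
    static long countRefills(long[] positions, int bufferSize) {
        long bufferStart = -1; // file offset where the buffered block begins; -1 means empty
        long refills = 0;
        for (long pos : positions) {
            boolean hit = bufferStart >= 0 && pos >= bufferStart && pos < bufferStart + bufferSize;
            if (!hit) {
                bufferStart = pos; // a real reader would seek and read bufferSize bytes here (a syscall)
                refills++;
            }
        }
        return refills;
    }

    /** Positions 0, stride, 2*stride, ... as a sequential scan would produce them. */
    static long[] sequential(int count, int stride) {
        long[] positions = new long[count];
        for (int i = 0; i < count; i++) {
            positions[i] = (long) i * stride;
        }
        return positions;
    }

    /** Pseudo-random positions in [0, fileSize), standing in for doc values lookups while sorting. */
    static long[] random(int count, long fileSize, long seed) {
        Random rnd = new Random(seed);
        long[] positions = new long[count];
        for (int i = 0; i < count; i++) {
            positions[i] = Math.floorMod(rnd.nextLong(), fileSize);
        }
        return positions;
    }

    public static void main(String[] args) {
        int reads = 1_000_000;
        int bufferSize = 16 * 1024;
        long fileSize = 4L << 30; // 4 GiB, as in the original question
        System.out.println("sequential refills: " + countRefills(sequential(reads, 4), bufferSize));
        System.out.println("random refills:     " + countRefills(random(reads, fileSize, 42L), bufferSize));
    }
}
```

In this model a sequential scan of one million 4-byte values needs only a few hundred refills, while random access needs close to one refill per read. That is also why the buffer is much less of a problem for the mostly sequential reads done while merging.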

With MMapDirectory no buffering is done: the Lucene code accesses the file
system cache directly, which is much more efficient.
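
For comparison, a minimal sketch of the memory-mapped path using only the JDK. Again this is not Lucene's MMapDirectory code, and the class `MmapReadDemo` and its method are hypothetical names for the example. Once the file is mapped, a read at an arbitrary offset is an ordinary memory access into the mapping, with no explicit read call and no copy into an on-heap buffer.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.ByteBuffer;
import java.nio.MappedByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

/** Illustration only: reads one int from a memory-mapped temporary file. */
final class MmapReadDemo {

    /** Writes the ints 0..count-1 to a temp file, maps it, and returns the int stored at the given index. */
    static int readViaMmap(int count, int index) {
        try {
            Path file = Files.createTempFile("mmap-demo", ".bin");
            file.toFile().deleteOnExit();

            // Prepare the file contents with ordinary (non-mapped) IO.
            ByteBuffer data = ByteBuffer.allocate(count * Integer.BYTES);
            for (int i = 0; i < count; i++) {
                data.putInt(i);
            }
            Files.write(file, data.array());

            try (FileChannel channel = FileChannel.open(file, StandardOpenOption.READ)) {
                // Map the whole file; pages are loaded by the OS when they are first touched.
                MappedByteBuffer map = channel.map(FileChannel.MapMode.READ_ONLY, 0, channel.size());
                // Absolute get: no seek, no read() call, no intermediate on-heap buffer.
                return map.getInt(index * Integer.BYTES);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(readViaMmap(1000, 123));
    }
}
```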

So for fast index access:
- avoid SimpleFSDir or NIOFSDir (those are only there for legacy 32-bit
operating systems and JVMs)
- configure your operating system kernel as described in the blog post and use
MMapDirectory
- tell the sysadmin to get familiar with the output of Linux commands like
free/top/... (or their Windows equivalents).

Uwe




Re: how do lucene read large index files?

2016-11-24 Thread Erick Erickson
Not really, as I don't know that code well, Uwe and company
are the masters of that realm ;)

Sorry I can't be more help there

Erick

On Thu, Nov 24, 2016 at 7:29 AM, Kumaran Ramasubramanian
<kums@gmail.com> wrote:
> Erick, Thanks a lot for sharing an excellent post...
>
> Btw, am using NIOFSDirectory, could you please elaborate on below mentioned
> lines? or any further pointers?
>
> NIOFSDirectory or SimpleFSDirectory, we have to pay another price: Our code
>> has to do a lot of syscalls to the O/S kernel to copy blocks of data
>> between the disk or filesystem cache and our buffers residing in Java heap.
>> This needs to be done on every search request, over and over again.
>
>
>
>
> --
> Kumaran R



Re: how do lucene read large index files?

2016-11-24 Thread Kumaran Ramasubramanian
Erick, thanks a lot for sharing an excellent post.

By the way, I am using NIOFSDirectory. Could you please elaborate on the lines
quoted below, or share any further pointers?

> NIOFSDirectory or SimpleFSDirectory, we have to pay another price: Our code
> has to do a lot of syscalls to the O/S kernel to copy blocks of data
> between the disk or filesystem cache and our buffers residing in Java heap.
> This needs to be done on every search request, over and over again.




--
Kumaran R



On Wed, Nov 23, 2016 at 9:17 PM, Erick Erickson <erickerick...@gmail.com>
wrote:

> see Uwe's blog:
> http://blog.thetaphi.de/2012/07/use-lucenes-mmapdirectory-on-64bit.html
>
> Short form: files are read into the OS's memory as needed. the whole
> file isn't read at once.
>
> Best,
> Erick


Re: how do lucene read large index files?

2016-11-23 Thread Erick Erickson
see Uwe's blog:
http://blog.thetaphi.de/2012/07/use-lucenes-mmapdirectory-on-64bit.html

Short form: files are read into the OS's memory as needed. The whole
file isn't read at once.
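
For the 4 GB case specifically, a single Java `MappedByteBuffer` cannot cover the whole file, because buffer offsets are ints. A common workaround is to map the file in fixed-size chunks and translate a global position into a chunk index plus an offset. The sketch below shows that idea with the JDK only. The class `ChunkedMmapDemo` is hypothetical and is not Lucene's implementation, although as far as I know MMapDirectory follows a similar chunked approach. The demo uses tiny 16-byte chunks purely so the chunk arithmetic is visible on a small file.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.ByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

/** Illustration only: random access to a file mapped as several fixed-size chunks. */
final class ChunkedMmapDemo {
    private final ByteBuffer[] chunks;
    private final int chunkBits; // chunk size is 2^chunkBits bytes

    ChunkedMmapDemo(Path file, int chunkBits) {
        if (chunkBits < 1 || chunkBits > 30) {
            throw new IllegalArgumentException("chunkBits must be in [1, 30]");
        }
        this.chunkBits = chunkBits;
        this.chunks = mapChunks(file, chunkBits);
    }

    private static ByteBuffer[] mapChunks(Path file, int chunkBits) {
        long chunkSize = 1L << chunkBits;
        try (FileChannel channel = FileChannel.open(file, StandardOpenOption.READ)) {
            long size = channel.size();
            int count = (int) ((size + chunkSize - 1) >>> chunkBits); // round up
            ByteBuffer[] mapped = new ByteBuffer[count];
            for (int i = 0; i < count; i++) {
                long start = (long) i << chunkBits;
                long length = Math.min(chunkSize, size - start); // last chunk may be shorter
                mapped[i] = channel.map(FileChannel.MapMode.READ_ONLY, start, length);
            }
            return mapped; // mappings remain valid after the channel is closed
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    /** Reads the byte at a global file position by selecting the chunk and the offset within it. */
    byte readByte(long pos) {
        int chunk = (int) (pos >>> chunkBits);
        int offset = (int) (pos & ((1L << chunkBits) - 1));
        return chunks[chunk].get(offset);
    }

    /** Writes bytes 0..99 to a temp file, maps it in 16-byte chunks, and reads one position back. */
    static byte demo(int pos) {
        try {
            Path file = Files.createTempFile("chunked-mmap", ".bin");
            file.toFile().deleteOnExit();
            byte[] data = new byte[100];
            for (int i = 0; i < data.length; i++) {
                data[i] = (byte) i;
            }
            Files.write(file, data);
            return new ChunkedMmapDemo(file, 4).readByte(pos);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(demo(37));
    }
}
```

Mapping a chunk only reserves address space. The OS loads the underlying pages when they are first touched, which matches the "read as needed" behaviour described above.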

Best,
Erick

On Wed, Nov 23, 2016 at 12:04 AM, Kumaran Ramasubramanian
<kums@gmail.com> wrote:
> Hi All,
>
> How does Lucene read large index files?
> For example, if one file (e.g. a .dat file) is 4 GB,
> does Lucene read only part of the file into RAM? Or
> is the approach different for different Lucene file formats?
>
>
> Related Link:
> How do applications (and OS) handle very big files?
> http://superuser.com/a/361201
>
>
> --
> Kumaran R




how do lucene read large index files?

2016-11-23 Thread Kumaran Ramasubramanian
Hi All,

How does Lucene read large index files?
For example, if one file (e.g. a .dat file) is 4 GB,
does Lucene read only part of the file into RAM? Or
is the approach different for different Lucene file formats?


Related Link:
How do applications (and OS) handle very big files?
http://superuser.com/a/361201


--
Kumaran R