[jira] Commented: (LUCENE-2816) MMapDirectory speedups

Robert Muir (JIRA) Fri, 17 Dec 2010 01:54:29 -0800

    [ 
https://issues.apache.org/jira/browse/LUCENE-2816?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12972422#action_12972422
 ]


Robert Muir commented on LUCENE-2816:
-------------------------------------

{quote}
Good grief! What amazing gains, especially w/ PFor codec which of course makes 
super heavy use of .readInt(). Awesome Robert!
This will mean w/ the cutover to FORPFOR codec for 4.0, MMapDir will likely 
have a huge edge over NIOFSDir?
{quote}

This isn't really a 'gain' for the bulkpostings branch.
It just makes DataInput.readInt() faster.
Currently the bulkpostings branch uses readBytes(byte[]), then wraps the result in a 
ByteBuffer and processes an IntBuffer view of that.
I switched to just using readInt() from DataInput directly [FrameOfRefDataInput] 
and found it to be much slower than this IntBuffer method.

This whole benchmark is just benching DataInput.readInt()...

So we shouldn't change anything in bulkpostings: this isn't faster than the 
IntBuffer method in my tests, at best it's equivalent... but we should fix this 
slowdown in our APIs.
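
For readers unfamiliar with the two approaches compared above, here is a minimal sketch in plain java.nio, not the bulkpostings code itself. The class and method names (IntViewSketch, decodeByteAtATime, decodeWithIntBuffer) are hypothetical, and the byte-at-a-time loop only approximates how a generic readInt() might assemble an int from four single-byte reads.

```java
import java.nio.ByteBuffer;
import java.nio.IntBuffer;
import java.util.Arrays;

// Hypothetical illustration; none of these names come from Lucene.
final class IntViewSketch {

    // Assemble each int from four single bytes (big-endian),
    // roughly what a byte-oriented readInt() has to do.
    static int[] decodeByteAtATime(byte[] bytes, int count) {
        int[] out = new int[count];
        int pos = 0;
        for (int i = 0; i < count; i++) {
            out[i] = ((bytes[pos++] & 0xFF) << 24)
                   | ((bytes[pos++] & 0xFF) << 16)
                   | ((bytes[pos++] & 0xFF) << 8)
                   |  (bytes[pos++] & 0xFF);
        }
        return out;
    }

    // Wrap the already-read bytes and bulk-copy through an IntBuffer view.
    // ByteBuffer.wrap() is big-endian by default, matching the loop above.
    static int[] decodeWithIntBuffer(byte[] bytes, int count) {
        IntBuffer view = ByteBuffer.wrap(bytes).asIntBuffer();
        int[] out = new int[count];
        view.get(out, 0, count);
        return out;
    }

    public static void main(String[] args) {
        // Stand-in for a block previously filled by a bulk byte read.
        byte[] block = ByteBuffer.allocate(12).putInt(1).putInt(-2).putInt(300).array();
        System.out.println(Arrays.toString(decodeByteAtATime(block, 3)));
        System.out.println(Arrays.toString(decodeWithIntBuffer(block, 3)));
    }
}
```

Both methods decode the same values; the sketch only shows the shape of the two code paths and makes no claim about their relative speed.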


> MMapDirectory speedups
> ----------------------
>
>                 Key: LUCENE-2816
>                 URL: https://issues.apache.org/jira/browse/LUCENE-2816
>             Project: Lucene - Java
>          Issue Type: Improvement
>          Components: Store
>    Affects Versions: 3.1, 4.0
>            Reporter: Robert Muir
>            Assignee: Robert Muir
>         Attachments: LUCENE-2816.patch
>
>
> MMapDirectory has some performance problems:
> # When the file is larger than Integer.MAX_VALUE, we use MultiMMapIndexInput,
> which does a lot of unnecessary bounds checks for its buffer switching, etc.
> Instead, like MMapIndexInput, it should rely on the contract of these
> operations in ByteBuffer (which always does a bounds check and throws
> BufferUnderflowException).
> Our 'buffer' is so large (Integer.MAX_VALUE) that it's rare for this to happen,
> and doing our own bounds checks just slows things down.
> # readInt()/readLong()/readShort() are slow and should just defer to
> ByteBuffer.getInt(), etc.
> This isn't very important since we don't use these much, but I think there's
> no reason users (e.g. codec writers) should have to readBytes() + wrap as a
> ByteBuffer + get an IntBuffer view when readInt() can be almost as fast...
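
The two points in the issue description can be sketched as follows. This is an illustration under stated assumptions, not the attached patch: the class MultiBufferReaderSketch and its fields are hypothetical, heap buffers stand in for mapped ones, and every chunk is assumed to be non-empty. The reader does no bounds checks of its own on the fast path. It relies on ByteBuffer to throw BufferUnderflowException and switches chunks only in the handler.

```java
import java.nio.BufferUnderflowException;
import java.nio.ByteBuffer;

// Hypothetical reader over several chunks; not Lucene's MultiMMapIndexInput.
final class MultiBufferReaderSketch {
    private final ByteBuffer[] buffers;
    private int curIndex = 0;
    private ByteBuffer cur;

    MultiBufferReaderSketch(ByteBuffer[] buffers) {
        this.buffers = buffers;
        this.cur = buffers[0];
    }

    byte readByte() {
        try {
            return cur.get(); // ByteBuffer performs the bounds check itself
        } catch (BufferUnderflowException e) {
            // Rare path: the current chunk is exhausted, move to the next one.
            curIndex++;
            if (curIndex >= buffers.length) {
                throw new IllegalStateException("read past EOF");
            }
            cur = buffers[curIndex];
            cur.position(0);
            return cur.get();
        }
    }

    int readInt() {
        try {
            return cur.getInt(); // fast path: defer to ByteBuffer
        } catch (BufferUnderflowException e) {
            // Fewer than 4 bytes remain in this chunk. getInt() consumed
            // nothing, so fall back to byte reads that may cross the boundary.
            return ((readByte() & 0xFF) << 24)
                 | ((readByte() & 0xFF) << 16)
                 | ((readByte() & 0xFF) << 8)
                 |  (readByte() & 0xFF);
        }
    }

    public static void main(String[] args) {
        byte[] data = ByteBuffer.allocate(12).putInt(1).putInt(2).putInt(3).array();
        // Two 6-byte chunks, so the second int straddles the chunk boundary.
        ByteBuffer[] chunks = {
            ByteBuffer.wrap(data, 0, 6).slice(),
            ByteBuffer.wrap(data, 6, 6).slice()
        };
        MultiBufferReaderSketch in = new MultiBufferReaderSketch(chunks);
        System.out.println(in.readInt() + " " + in.readInt() + " " + in.readInt());
    }
}
```

With chunks close to Integer.MAX_VALUE in size, the exception path would be taken very rarely, which is the trade-off the description argues for.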

