Re: Benchmarks?

jason marshall Mon, 01 Oct 2007 10:46:21 -0700

In Java, 'buffered I/O' gets converted into block I/O before it hits
the operating system (after which the OS does its own buffering).  The
primary advantage over straight byte-wise I/O is the reduced function
call overhead (user-space calls and, most especially, system calls).


Direct Block I/O is always a little bit faster than streaming I/O, but
you pay for this with reduced abstraction.  I can't speak to the other
uses in the code, but it so happens that in the digest case, there is
enough control over the I/O streams that you don't require the
abstraction to maintain a sane calling convention.  One method does
the bulk of the copying work, and one could easily directly
allocate/retrieve a buffer here.
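
To illustrate what I mean by direct block I/O in the digest case, here
is a minimal sketch.  It is not the library's code: the class name
BlockDigest, the method digest() and its parameters are hypothetical,
and it assumes the input arrives as a plain InputStream.  The point is
only that a single byte array is handed straight to
MessageDigest.update(), with no buffered stream wrapper in between.

```java
import java.io.IOException;
import java.io.InputStream;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

// Hypothetical helper for illustration; not part of the code under discussion.
final class BlockDigest {

    private BlockDigest() {
    }

    // Reads the stream in blocks of up to bufferSize bytes and feeds each
    // block directly to the digest, so there is one update() call per block
    // rather than one per byte.
    static byte[] digest(InputStream in, String algorithm, int bufferSize)
            throws IOException, NoSuchAlgorithmException {
        if (bufferSize <= 0) {
            throw new IllegalArgumentException("bufferSize must be positive");
        }
        MessageDigest md = MessageDigest.getInstance(algorithm);
        byte[] buffer = new byte[bufferSize];
        int n;
        // read() returns the number of bytes placed in the buffer, or -1 at EOF.
        while ((n = in.read(buffer)) != -1) {
            md.update(buffer, 0, n);
        }
        return md.digest();
    }
}
```

A caller would use it as BlockDigest.digest(stream, "SHA-1", 8192); the
8192 is just a placeholder size, not a measured recommendation.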

My experience has been that Digest algorithm implementations seem to
perform a bit better with larger block sizes, because they are
typically implemented with too many method dispatches between the
calling code and the actual update() code in the core of the digest
algorithm.  Once you get onto obscure hardware or implementations like
the ones you mention below, the discrepancies only become more pronounced.

So not only would I encourage you to go to direct byte arrays, I would
encourage you to make it configurable, because folks will find that
different JVM revisions have a different 'sweet spot'.  And if you're
doing any sort of large file digesting, even a few percent can make a
notable difference.
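
One way the configurable part could look, as a rough sketch only: the
property name "example.digest.bufferSize", the class name
DigestBufferConfig and the 8192 default are all made up for
illustration, not existing settings.  It reads the size from a system
property and falls back to the default when the property is missing,
unparseable, or not positive.

```java
// Hypothetical configuration helper; the names and the default are assumptions.
final class DigestBufferConfig {

    // Assumed property name, chosen only for this example.
    static final String PROPERTY = "example.digest.bufferSize";

    // Assumed default; the real sweet spot should come from measurement.
    static final int DEFAULT_SIZE = 8192;

    private DigestBufferConfig() {
    }

    // Returns the configured buffer size in bytes.
    static int bufferSize() {
        // Integer.getInteger returns the default when the property is
        // absent or cannot be parsed as an integer.
        int size = Integer.getInteger(PROPERTY, DEFAULT_SIZE);
        return size > 0 ? size : DEFAULT_SIZE;
    }
}
```

Users could then try different sizes on their own JVM with something
like -Dexample.digest.bufferSize=65536 and time a large digest run.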

-Jason

On 9/27/07, Raul Benito <[EMAIL PROTECTED]> wrote:
> Hi Jason,
> I understand your concerns, and I think you are right in stating that in 1.6
> this optimization is unneeded and can even be a small pessimization (I don't
> think it is so extreme, but I don't have data to back that up). But I have
> numbers that show that on 1.4 it is a nice optimization, and on J2ME it is
> even better. As we still have people on 1.4 JVMs (we still keep 1.3
> compatibility), I think we should keep this small hack.
>
> But anyway, I'm more than open to finding a common path that works well on
> new and old machines. I also think there are more paths to optimize.
>
> Regards,
>
> Raul
>
>
> On 9/27/07, jason marshall <[EMAIL PROTECTED]> wrote:
> > I know this is an old thread, but I thought I would share what I
> > found, since I can't share code.
> >
> >
> > The UnsynchBufferedOutputStream usage in calculateDigest seems
> > superfluous to me.  I think if you look carefully at
> > XMLSignatureInput, you'll find that all but one path is using block
> > I/O already.  Doubling up block and buffered I/O just makes heavy
> > lifting paths slower, not faster, even with the shortcuts for large
> > blocks.
> >
> > More generally, I think the utility of UnsynchBufferedOutputStream is
> > essentially less than zero at this point.  I suspect you'll find that
> > under JDK 1.5 or later, you can't perceive a noteworthy difference
> > > between this class and BufferedOutputStream, when used as it is used
> > here.  I wouldn't be shocked to discover that you're actually getting
> > worse performance than BufferedOutputStream, having hurt your
> > performance by bifurcating the code paths that Hotspot has to
> > investigate.  Only under JDK 1.4 does this code still genuinely
> > accomplish something, and I'm not convinced the added complexity is
> > worth that gain.
> >
> > -Jason
> >
> > On 4/9/07, Raul Benito <[EMAIL PROTECTED]> wrote:
> > > > There is no such thing as an XML security benchmark. I have tried to
> > > > write one myself several times, but lack of time always made me
> > > > postpone it.
> > > > In order to test my changes I use a slightly modified copy of the old
> > > > xmlbench; it only tests inclusive-c14n and enveloping signatures, but it
> > > > works for seeing what can be optimized.
> > > > I also test against the completion time of our test suite, but this is
> > > > becoming more difficult: it used to take 120 seconds to run, and now it
> > > > only takes 2-3, so the speed improvements are harder to see.
> > >
> > > > You can also try a loop of several thousand verifications and
> > > > decodings, so you can try to measure the speedup.
> > >
> > > > If I can help you, don't hesitate to ask.
> > >
> > > Regards,
> > >
> > > Raul
> > >
> > >
> > >
> > > On 4/7/07, jason marshall < [EMAIL PROTECTED]> wrote:
> > > > There's a spot in the MessageDigest calculations where all but one
> > > > path through the code doubles up Buffered I/O and Block I/O.  I've
> > > > been working on a patch to hoist the buffering into the one path that
> > > > needs it.  This gives my path a nice little throughput boost, but I'm
> > > > wondering if I've caused regressions along the other paths.
> > > >
> > > > Raul, do you have a set of benchmarks you were using when doing the
> > > > 1.4 tuning work?  Before I try to push a patch through the complicated
> > > > IP system at work, I'd like to have an idea if it will get accepted or
> > > > not.
> > > >
> > > > --
> > > > - Jason
> > > >
> > >
> > >
> > >
> > > --
> > > http://r-bg.com
> >
> >
> > --
> > - Jason
> >
>
>
>
> --
>  http://r-bg.com


-- 
- Jason
