Re: [gdal-dev] Re: JAVA API - Performance

Ivan Wed, 11 Nov 2009 05:12:09 -0800

Even,

You are right. The point is how to take full advantage of the GDAL Java API by choosing the right approach to deal with the raster buffer on the client side.


Best regards,

Ivan

Even Rouault wrote:

Quoting Ivan <[email protected]>:

Ivan,

I'm not sure what you are really measuring if you compare C++ code with its
translation to Java code. I think it just reflects the known slowdown of Java
when doing intensive computations in comparison to native code. The 0.2 second
difference between the regular array version and the ByteBuffer one is the
interesting result, not the 1.2/1.0 second difference between C++ and Java.

Ciao Simone,

I just downloaded imageio-ext to check how it does that, but it looks like I
don't need to do that now; I can use your report instead. Thank you very
much. I will take a look at array pinning for a start.

I translated the GDAL Proximity [1] code to Java and I timed both of them
with the same input, a 1024x1024 byte image with just one pixel as feature at
the center of the image.

It took 0.3 seconds in C++ and 1.5 seconds in Java!

I then translated the buffers to regular arrays and it went down a little
bit, 1.3 seconds.

It is still a big disadvantage. I believe that the buffer-to-buffer
translation is the guilty time waster in that case.
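The two access patterns compared above can be sketched roughly as follows. This is only a minimal illustration, not the Proximity code: the class name BufferVersusArray and its two methods are made up for this example, and a plain sum over the pixels stands in for the real algorithm. It fills a 1024x1024 direct buffer with one feature pixel at the center, then reads it once through get(i) and once through a single bulk copy into a byte[].

```java
import java.nio.ByteBuffer;

// Hypothetical helper for illustration only; not part of the GDAL Java API.
final class BufferVersusArray {

    // Reads every pixel through an absolute get(i) call on the buffer.
    public static long sumViaBuffer(ByteBuffer buf, int count) {
        long sum = 0;
        for (int i = 0; i < count; i++) {
            sum += buf.get(i) & 0xFF; // mask so the byte is treated as unsigned
        }
        return sum;
    }

    // Copies the pixels into a regular byte[] with one bulk get, then reads the array.
    public static long sumViaArray(ByteBuffer buf, int count) {
        byte[] pixels = new byte[count];
        ByteBuffer view = buf.duplicate(); // shares content, has its own position
        view.clear();                      // position = 0, limit = capacity
        view.get(pixels, 0, count);        // single bulk transfer
        long sum = 0;
        for (byte p : pixels) {
            sum += p & 0xFF;
        }
        return sum;
    }

    public static void main(String[] args) {
        int width = 1024, height = 1024;
        ByteBuffer buf = ByteBuffer.allocateDirect(width * height);
        // One feature pixel at the center of the image, as in the test above.
        buf.put((height / 2) * width + width / 2, (byte) 255);

        long t0 = System.nanoTime();
        long a = sumViaBuffer(buf, width * height);
        long t1 = System.nanoTime();
        long b = sumViaArray(buf, width * height);
        long t2 = System.nanoTime();

        System.out.println("get(i):    sum=" + a + " in " + (t1 - t0) / 1000 + " us");
        System.out.println("bulk copy: sum=" + b + " in " + (t2 - t1) / 1000 + " us");
    }
}
```

A single run like this is not a reliable benchmark (JIT warm-up alone can dominate the numbers), so it is best read as a template for the comparison rather than as a measurement.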

[1] http://trac.osgeo.org/gdal/browser/trunk/gdal/alg/gdalproximity.cpp

My best regards,

Ivan

 -------Original Message-------
 From: Simone Giannecchini <[email protected]>
 Subject: Re: [gdal-dev] Re: JAVA API - Performance
 Sent: Nov 10 '09 12:36

 Ciao Even,
 just wanted to add my 2 cents.

 As you know for the imageio-ext project we have been using the
 GDAL-JNI bindings (actually a modified version of them) for a while in
 order to allow Java users to leverage GDAL through the ImageIO
 framework, which is standard in Java.
 This way we also enabled GeoTools and GeoServer to use GDAL as a
 datasource.

 In the past I have done quite some performance tests to add some
 new/different methods to them and I can summarise our findings as
 follows:

 - DirectByteBuffer vs regular arrays -
 DBB is expensive to allocate but prevents the VM from performing copies
 when moving data between Java and native code, since they
 live in native memory, not on the Java heap. On the other hand,
 regular arrays are fast to allocate but they are "usually" copied when
 moved between Java and native code, since the JVM cannot let
 native code mess with the Java heap space (the garbage
 collector would not be very happy about that). I said "usually" since
 there is a technique called array pinning that we can suggest the JVM
 use to avoid the copy of a regular array; however this mechanism is
 not guaranteed to be implemented and/or to work on each call (same
 reason as above, the GC is not happy about this technique).

 If you can pool the DBBs and/or use a few large DBBs, where the cost of
 the copy would exceed the cost of creation, then DBBs are much
 better than regular arrays. For instance, I noticed that when
 reading striped TIFF files regular arrays were faster, but as the
 tile size increases (and therefore the cost of a copy exceeds the
 cost of a DBB creation) the DBB performs much better.
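Pooling of direct buffers, as mentioned above, could look roughly like the sketch below. It is not the imageio-ext implementation: the class name DirectBufferPool and its acquire/release methods are assumptions made for this example. The idea is simply to pay the allocateDirect cost once per buffer and reuse the buffer across reads.

```java
import java.nio.ByteBuffer;
import java.util.ArrayDeque;

// Hypothetical fixed-size pool of direct buffers; illustration only.
final class DirectBufferPool {
    private final ArrayDeque<ByteBuffer> free = new ArrayDeque<>();
    private final int capacity;

    public DirectBufferPool(int capacity) {
        this.capacity = capacity;
    }

    // Returns a pooled buffer if one is available, otherwise allocates a new one.
    public synchronized ByteBuffer acquire() {
        ByteBuffer buf = free.poll();
        if (buf == null) {
            buf = ByteBuffer.allocateDirect(capacity); // the expensive step
        }
        buf.clear(); // reset position and limit; the old content is not erased
        return buf;
    }

    // Hands a buffer back for reuse; buffers of another size or kind are ignored.
    public synchronized void release(ByteBuffer buf) {
        if (buf.isDirect() && buf.capacity() == capacity) {
            free.push(buf);
        }
    }
}
```

A caller would acquire a buffer sized for one tile, pass it to the native read, consume the data, and release it, so that the number of live direct buffers stays bounded.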

 - DirectByteBuffer and the impact on some JVM -
 Now in the past we decided to stick with DBB and give
 GeoServer/GeoTools users the capability to retile data on the fly.
 However lately, during the WMS performance shootout, we noticed that on some
 Linux machines the JVM crashed hard, which is not nice (it means restarting
 GeoServer!!!).
 We investigated in some depth and the problem was that somehow the
 JVM was failing to allocate some internal images during the rendering
 process and then dying with a NullPointerException (apparently the Sun
 Java2D engineers did not check for out-of-memory errors in the
 Java native space). Well, what happens is that if you use too much of
 the Java native space for your own objects, it is likely that the JVM
 itself will start to malfunction, since it cannot allocate its own objects
 (you can find articles on the web on
 the memory model of a Java process; I don't think I am good enough to
 explain it).

 In the end we decided to leave DBB and go back to regular arrays with
 array pinning. This gave us robustness and we did not see much
 performance degradation (which means that array pinning in the end
 works). This has been implemented by modifying the SWIG bindings for
 GDAL to use a byte array instead of a DBB and then using
 ByteArray utils to convert between different native types (short, int,
 etc.).
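The conversion step between a raw byte array and other native types might be sketched as below. This is not the actual imageio-ext utility code: the class name ByteArrayViews and its methods are hypothetical, and the sketch assumes the caller knows the byte order of the data (ByteOrder.nativeOrder() is the usual choice for bytes filled in by native code). It relies only on the standard java.nio view buffers.

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;

// Hypothetical conversion helpers; illustration only.
final class ByteArrayViews {

    // Interprets the raw bytes as 16-bit values in the given byte order.
    public static short[] toShorts(byte[] raw, ByteOrder order) {
        short[] out = new short[raw.length / 2]; // a trailing odd byte is ignored
        ByteBuffer.wrap(raw).order(order).asShortBuffer().get(out);
        return out;
    }

    // Interprets the raw bytes as 32-bit values in the given byte order.
    public static int[] toInts(byte[] raw, ByteOrder order) {
        int[] out = new int[raw.length / 4]; // trailing bytes that do not fill an int are ignored
        ByteBuffer.wrap(raw).order(order).asIntBuffer().get(out);
        return out;
    }

    // Packs 16-bit values back into a byte array, for example before a write.
    public static byte[] fromShorts(short[] values, ByteOrder order) {
        byte[] raw = new byte[values.length * 2];
        ByteBuffer.wrap(raw).order(order).asShortBuffer().put(values);
        return raw;
    }
}
```

For example, a byte[] read from a 16-bit band could be turned into pixel values with ByteArrayViews.toShorts(raw, ByteOrder.nativeOrder()).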

 - Conclusion -
 We might want to spend some time in the mid term to contribute some of
 this work back (or probably provide funding), but anyway, it would be
 great to have the capability to switch between DBB and regular arrays
 since both have flaws.
 However atm if I were asked I would say to go with regular arrays as
 we do in the imageio-ext project.

 Ciao,
 Simone.
 -------------------------------------------------------
 Ing. Simone Giannecchini
 GeoSolutions S.A.S.
 Founder - Software Engineer
 Via Carignoni 51
 55041  Camaiore (LU)
 Italy

 phone: +39 0584983027
 fax:      +39 0584983027
 mob:    +39 333 8128928


 http://www.geo-solutions.it
 http://geo-solutions.blogspot.com/
 http://simboss.blogspot.com/
 http://www.linkedin.com/in/simonegiannecchini

 -------------------------------------------------------



 On Tue, Nov 10, 2009 at 12:00 PM, Even Rouault
 <[email protected]> wrote:
 > Quoting Ivan <[email protected]>:
 >
 > Ivan,
 >
 > thanks for your testing (CC'ing the list as it is of general interest).
 > Actually, I also read on some sites that using ByteBuffer object versus regular
 > Java arrays is not always a win. Plus the fact that we must use a direct buffer
 > that has an extra allocation cost according to the Javadoc. So ByteBuffer might
 > be interesting if you just want to pass big arrays between native code, for
 > example if you read an array from a dataset and then write it to another one
 > without accessing it from the Java side. When you mention that accessing through
 > the byte[] array was faster, did you get it with the array() method instead ?
 > I'm wondering what the performance overhead of this call is.
 >
 > As ByteBuffer is not at all a requirement for the interface with the native
 > code, it would be technically possible to add an alternative API that would use
 > the regular Java array types.
 >
 > Would you mind opening an enhancement ticket about that ? Thanks
 >
 > Even
 >
 >> Even,
 >>
 >> I did some tests with the GDAL Java API and some simple raster operations
 >> like the GDAL Proximity algorithm and I noticed that the performance while
 >> accessing pixels with <type>Buffer.get(i), <type>Buffer.put(i,value) is not
 >> as good as if you copy them to (or from) a "regular" array, like float[],
 >> double[], int[] and byte[].
 >>
 >> The reason for that is obvious: get() and put() are function calls and
 >> contain a lot of code for range checks.
 >>
 >> If I understand it correctly, ByteBuffer is the ideal or maybe the only
 >> way to get access to buffers from C libraries through a Java wrapper. But
 >> do you think it would be possible to encapsulate the buffer conversion in the
 >> wrapper code so that users would be able to read and write directly to
 >> regular Java arrays?
 >>
 >> Just a suggestion,
 >>
 >> Ivan
 >>
 >
 >
 > _______________________________________________
 > gdal-dev mailing list
 > [email protected]
 > http://lists.osgeo.org/mailman/listinfo/gdal-dev
 >


