I'm using the latest Postgres (or MySQL) driver. To make a test, I put 1M records in a DB table, then do "select * from that_table". No matter which fetch size I set, at the moment of PreparedStatement.execute() roughly 70-100 MB is allocated (I don't remember the exact amount right now). So if the heap size is small, I get OutOfMemory; otherwise everything is fine and the rows can be iterated quickly. Looking at the code on the Postgres site, the only difference is that I didn't bother turning off autocommit mode. I'm not sure Cayenne does either... Maybe that's the reason I didn't get positive results.
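For reference, here is a minimal sketch of the standalone test I'm describing, assuming (per the Postgres cursor documentation) that autocommit has to be off and the statement forward-only for fetchSize to take effect. The connection URL, credentials and table name are placeholders:

import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.PreparedStatement;
import java.sql.ResultSet;

public class FetchSizeTest {

    public static void main(String[] args) throws Exception {
        Connection con = DriverManager.getConnection(
                "jdbc:postgresql://localhost/testdb", "test", "secret");
        try {
            // The Postgres driver only streams rows through a cursor when
            // autocommit is off; otherwise it buffers the whole ResultSet.
            con.setAutoCommit(false);

            PreparedStatement stmt = con.prepareStatement(
                    "select * from that_table",
                    ResultSet.TYPE_FORWARD_ONLY,
                    ResultSet.CONCUR_READ_ONLY);
            stmt.setFetchSize(1000);

            ResultSet rs = stmt.executeQuery();
            int count = 0;
            while (rs.next()) {
                count++;
            }
            System.out.println("read " + count + " rows");

            rs.close();
            stmt.close();
        }
        finally {
            con.close();
        }
    }
}

Running this with and without the setAutoCommit(false) line (same small heap) should show whether autocommit is the variable that explains our different results.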
Andrey

2009/5/25 Andrus Adamchik <[email protected]>

> [Forwarding to the dev list. Stéphane, please forward your replies here as
> well, or subscribe to dev if you feel like it.]
>
> Stéphane, thanks for providing this information. It is encouraging.
>
> The mismatch in the results between you and Andrey is exactly why I wanted
> to do real profiling, as we need to see how the memory allocation changes
> in the process. OutOfMemory may or may not happen depending on many
> factors. What we care about here is the dynamics of ResultSet reading.
>
> I see that Andrey has already committed this code to trunk, so now we have
> something to play with. If you don't mind using a trunk build (doing it
> yourself or grabbing one from Hudson), you can give us feedback on the
> implementation.
>
> Andrus
>
>
> Begin forwarded message:
>
>> From: Stéphane Claret <[email protected]>
>> Date: May 25, 2009 11:58:52 AM GMT+03:00
>> To: Andrus Adamchik <[email protected]>
>> Subject: Re: Big selects on PostGres : Configuring
>> Statement.setFetchSize() of a SelectQuery
>>
>> Hello,
>>
>> To answer: setting fetchSize with Postgres does make a difference.
>> Consider the following code, copy-pasted from my test case:
>>
>> PreparedStatement stmt = sqlMapper.getCurrentConnection().prepareStatement(
>>         "SELECT itemID,sellingprice,margin,popularityrating,name,att1,"
>>         + "att2,att3,att4,att5,longtailpos,timest"
>>         + " FROM itemdetails WHERE profileID = 10 ORDER BY longtailpos");
>>
>> stmt.setFetchSize(1000);
>> ResultSet set = stmt.executeQuery();
>>
>> int i = 0;
>> while (set.next()) {
>>     i++;
>>     System.out.println(set.getString(1));
>> }
>>
>> It can iterate through approx 1'500'000 rows without any memory issues
>> (the Java process stays at approx 20 MB while the loop executes). I can
>> also see that it takes no more than a few seconds before the first row is
>> printed in the output console, meaning that results are still being
>> fetched from the DB at the moment the result set is accessed.
>>
>> Now if I comment out the line "stmt.setFetchSize(1000);" in the above
>> code, execution stops at "ResultSet set = stmt.executeQuery();" for
>> approx 15 seconds and then an OutOfMemoryError is thrown from JDBC code.
>> In Cayenne code, that would happen before the result set is even accessed
>> (making ResultIterator useless).
>>
>> A similar problem is described here, with a sample stack trace:
>>
>> http://www.nabble.com/java.sql.Statement-generates-java.lang.OutOfMemoryError-in-big-tabe-td21568763.html
>>
>> And the reason is detailed here:
>> http://jdbc.postgresql.org/documentation/83/query.html#query-with-cursor
>>
>> Perhaps other drivers that automatically use server-side cursors aren't
>> affected or just ignore this setting, but I can say for sure that this
>> setting is necessary with Postgres.
>>
>> Andrey, you say you keep getting OutOfMemory exceptions no matter which
>> value you set. Do you get these exceptions while iterating through the
>> results or during the call to executeQuery?
>> If you get them while looping through the result set, it could mean that
>> somehow the row objects aren't garbage collected. That can happen with
>> Cayenne 2.0 because the DataContext holds strong references to the
>> DataObjects. I didn't want to write a fix for v3 at first, but I realized
>> that in v2 I had to manually unregister the DataObjects from the context
>> in order to keep memory usage in safe areas.
>>
>> If I can be of any help, please ask.
>> Best regards...
>>
>> Stéphane
>>
>>
>> 2009/5/25 Andrus Adamchik <[email protected]>
>>
>> Somehow I am not surprised. I may also try that on Oracle when the code
>> becomes available on trunk.
>>
>> (BTW, not sure that Stephane is subscribed to the dev list, so cc'ing this
>> message.)
>>
>> Andrus
>>
>> On May 24, 2009, at 12:52 PM, Andrey Razumovsky wrote:
>>
>> I should say my tests on Postgres and MySQL didn't show any difference. If
>> I use a small heap size, I get OutOfMemory no matter which fetch size was
>> set. Fetch speed and memory usage are the same. It seems the drivers just
>> ignore this parameter. Stephane, did your workaround help you?
>>
>> The hsql, Postgres and MySQL drivers didn't throw any exceptions. Maybe
>> other DBMS drivers work well with this param. I'm going to commit today or
>> tomorrow. Adding this ability won't break anything.
>>
>> Andrey
>>
>> 2009/5/21 Andrus Adamchik <[email protected]>
>>
>> Cool :-)
>>
>> On May 21, 2009, at 3:59 PM, Andrey Razumovsky wrote:
>>
>> Now that I know of this JDBC feature, I'd prefer to have it in Cayenne
>> sooner rather than later. We would also want to have it for other queries,
>> not just SelectQuery.
>>
>>
>> Absolutely. This has to be a part of the QueryMetadata on the backend. On
>> the frontend, any query that can potentially select data should have a
>> corresponding setter.
>>
>> Could someone help me test it against different types of database if I
>> commit?
>>
>>
>> I can test it on almost all DBs that we support. Of course we should have
>> Cayenne unit tests that provide regression coverage (i.e. driver XYZ
>> doesn't throw UnsupportedOperationException when we call the corresponding
>> JDBC method).
>>
>> In addition to that, I'd like to see if there are really memory/speed
>> savings when using it (i.e. is it really worth it). For that I suggest
>> writing a JDBC test outside of Cayenne that can be run in a profiler
>> against different DBs.
>>
>> Andrus
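P.S. Once the trunk changes settle, I imagine using this from the Cayenne side would look roughly like the sketch below. The setter name (setStatementFetchSize) is my guess at the API, and ItemDetail is a made-up mapped class, so treat this as an illustration rather than the committed code:

import java.util.List;
import org.apache.cayenne.access.DataContext;
import org.apache.cayenne.query.SelectQuery;

public class FetchSizeQueryExample {

    @SuppressWarnings("unchecked")
    static List selectWithFetchSize(DataContext context) {
        SelectQuery query = new SelectQuery(ItemDetail.class);

        // hint for the JDBC driver: fetch rows from the server in batches of 1000
        query.setStatementFetchSize(1000);

        return context.performQuery(query);
    }
}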

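P.P.S. Regarding Stéphane's note about having to unregister DataObjects from a 2.0 DataContext: for anyone hitting this, here is roughly the kind of chunked processing I understand that workaround to be. ItemDetail and its itemID attribute (used as a monotonic cursor) are hypothetical, and I haven't tested this; it's only meant to illustrate the idea:

import java.util.List;
import org.apache.cayenne.access.DataContext;
import org.apache.cayenne.exp.Expression;
import org.apache.cayenne.exp.ExpressionFactory;
import org.apache.cayenne.query.SelectQuery;

public class ChunkedProcessing {

    static final int CHUNK = 1000;

    @SuppressWarnings("unchecked")
    static void processAll(DataContext context) {
        long lastId = 0;

        while (true) {
            // fetch the next chunk of objects ordered by a monotonic key
            Expression qualifier = ExpressionFactory.greaterExp("itemID", lastId);
            SelectQuery query = new SelectQuery(ItemDetail.class, qualifier);
            query.addOrdering("itemID", true);
            query.setFetchLimit(CHUNK);

            List chunk = context.performQuery(query);
            if (chunk.isEmpty()) {
                break;
            }

            for (Object o : chunk) {
                ItemDetail item = (ItemDetail) o;
                // ... process the object ...
                lastId = item.getItemID();
            }

            // let the context forget the processed objects so they can be
            // garbage collected before the next chunk is fetched
            context.unregisterObjects(chunk);
        }
    }
}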