Hi - We have an ETL application that reads all rows from Cassandra (2.1.2), filters them and stores a small subset in an RDBMS. Our application is using DataStax's Java driver (2.1.4) to fetch data from the C* nodes. Since the Java driver supports automatic paging, I was under the impression that SELECT queries should not cause OOM errors on the C* nodes. However, even with just 16GB of data on each node, the C* nodes start throwing OOM errors as soon as the application starts iterating through the rows of a table.
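To spell out what I expect paging to do, here is a small self-contained model in plain Java. It does not use the driver at all: PagingModel, scan and Result are names made up for illustration, and the row counter only stands in for the paging state. It models the client side only, under the assumption that a new page is requested once the current one is drained, so it says nothing about what the C* nodes do internally.

```java
import java.util.ArrayDeque;
import java.util.Deque;

// Hypothetical model of paged iteration; not the driver's implementation.
class PagingModel {

    /** Summary of one full scan. */
    static final class Result {
        final long rowsSeen;    // rows handed to the application
        final int pagesFetched; // number of page requests issued
        final int maxBuffered;  // largest number of rows held at once

        Result(long rowsSeen, int pagesFetched, int maxBuffered) {
            this.rowsSeen = rowsSeen;
            this.pagesFetched = pagesFetched;
            this.maxBuffered = maxBuffered;
        }
    }

    /** Iterates over totalRows rows, pulling at most fetchSize rows per page. */
    static Result scan(long totalRows, int fetchSize) {
        if (fetchSize <= 0) {
            throw new IllegalArgumentException("fetchSize must be positive");
        }
        Deque<Long> buffer = new ArrayDeque<>(); // rows of the current page
        long nextRowId = 0;                      // stands in for the paging state
        long rowsSeen = 0;
        int pagesFetched = 0;
        int maxBuffered = 0;

        while (true) {
            if (buffer.isEmpty()) {
                if (nextRowId >= totalRows) {
                    break; // no more pages: the result set is exhausted
                }
                // Fetch the next page, never more than fetchSize rows.
                long end = Math.min(nextRowId + fetchSize, totalRows);
                while (nextRowId < end) {
                    buffer.addLast(nextRowId++);
                }
                pagesFetched++;
                maxBuffered = Math.max(maxBuffered, buffer.size());
            }
            buffer.removeFirst(); // the application consumes one row
            rowsSeen++;
        }
        return new Result(rowsSeen, pagesFetched, maxBuffered);
    }

    public static void main(String[] args) {
        Result r = scan(12_345, 5000);
        // prints "12345 rows, 3 pages, max 5000 buffered"
        System.out.println(r.rowsSeen + " rows, " + r.pagesFetched
                + " pages, max " + r.maxBuffered + " buffered");
    }
}
```

In this model the number of rows held at any moment never exceeds the fetch size, no matter how large the table is. That is the behaviour I assumed we would get, which is why the heap growth on the C* nodes surprises me.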
The application code looks something like this:

    Statement stmt = new SimpleStatement("SELECT x,y,z FROM cf").setFetchSize(5000);
    ResultSet rs = session.execute(stmt);
    while (!rs.isExhausted()) {
        Row row = rs.one();
        process(row);
    }

Even after we reduced the page size to 1000, the C* nodes still crash. C* is running on M3.xlarge machines (4 cores, 15GB). We manually increased the heap size to 8GB just to see how much heap C* consumes. Within 10-15 minutes, the heap usage climbs to 7.6GB. That does not make sense. Either automatic paging is not working or we are missing something. Does anybody have insights as to what could be happening?

Thanks.
Mohammed