a few questions and then a suggestion:
How much physical RAM does the machine have?
Which version of the Jena software is this?
Is this running on MS Windows?
If you are on 64-bit hardware, then TDB uses out-of-heap memory
(memory-mapped files) as well as heap memory.
But what I am most suspicious of is
Dataset dataset = getDataset();
...
dataset.close();
which seems to be opening the database on every call; that may be the
cause of your problems. You may end up with many copies of the in-RAM
data structures (especially on 64-bit Windows, which does not release
memory-mapped segments during the lifetime of the JVM - an (in)famous
Java bug).
You should open the database once at start-up and not close it when a
request finishes.
With transactions, you can get away with not closing it at all but, to
be neat, close it at shutdown if you like.
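Something like this would do it - an untested sketch of your
DatasetFactoryBean that caches one Dataset for the lifetime of the
webapp (the DisposableBean part is the optional close-at-shutdown):

import org.springframework.beans.factory.DisposableBean;
import org.springframework.beans.factory.FactoryBean;

import com.hp.hpl.jena.query.Dataset;
import com.hp.hpl.jena.tdb.TDBFactory;

public class DatasetFactoryBean implements FactoryBean<Dataset>, DisposableBean {

    private String location;
    private Dataset dataset;

    public void setLocation(String location) { this.location = location; }

    // Opened once; isSingleton() == true means Spring hands the same
    // Dataset to every request instead of creating a new one each time.
    public synchronized Dataset getObject() {
        if (dataset == null) {
            dataset = TDBFactory.createDataset(location);
        }
        return dataset;
    }

    public Class<Dataset> getObjectType() { return Dataset.class; }

    public boolean isSingleton() { return true; }

    // Optional tidy-up: close once, at webapp shutdown, not per request.
    public void destroy() {
        if (dataset != null) {
            dataset.close();
        }
    }
}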
Otherwise, could you turn this into a standalone test case that
simulates your setup but runs outside Spring, so we can debug it?
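For example, a plain main() of roughly this shape (skeleton only - fill
in the dataset location and the operations your service performs):

import com.hp.hpl.jena.query.Dataset;
import com.hp.hpl.jena.query.ReadWrite;
import com.hp.hpl.jena.tdb.TDBFactory;

public class TDBStandaloneTest {
    public static void main(String[] args) {
        // open once, outside the loop
        Dataset dataset = TDBFactory.createDataset("/tmp/tdb-test");
        for (int i = 0; i < 5000; i++) {  // ~the request count where you see failure
            dataset.begin(ReadWrite.READ);
            try {
                // run the same ASK query your DAO runs
            } finally {
                dataset.end();
            }
        }
        dataset.close();
    }
}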
Andy
On 25/06/12 21:55, Stephan Zednik wrote:
I have been having memory issues using TDB from a Java servlet. Memory usage
by Tomcat increases until the service becomes unresponsive and must be
restarted. The service operations appear to complete successfully right up
until the service becomes unresponsive.
Memory usage rises rapidly until it reaches my heap max size
(CATALINA_OPTS="-Xms512m -Xmx4096m") or whatever my available RAM can hold
before the service becomes unresponsive. During testing that has generally
been 1.5-1.6 GB before my RAM is full.
I have a fairly simple set of unit tests; it does not have full coverage, but
the tests I do have all pass.
I am using Spring web.
Below is my Spring controller class. It asks the application context for a
Dataset, which causes Spring to invoke DatasetFactoryBean.getObject(). The
DatasetFactoryBean is a singleton that has been initialized with the location
of my TDB dataset.
The controller method is fairly simple. A POST request contains an XML
payload. The payload is passed to a service method that parses the XML and
generates an RDF representation of the input data, stored in an in-memory
Jena model. AnalysisSettings is a class that acts as a proxy to the Jena
Model, with methods for manipulating/accessing the encoded RDF.
I have commented out the TDB-related code and tested both the XML parsing
alone and the XML parsing + in-memory RDF. Service memory usage slowly grows
to a level I am unhappy with (~1 GB according to ActivityMonitor.app and
VisualVM), but it does stabilize. Since it stabilizes and grows slowly, I do
not think it is the main culprit of my current memory problem.
If I test the TDB Dataset creation code, but leave all queries against the
TDB dataset commented out, memory usage grows much more quickly, reaching the
1.5 GB range before my RAM is full and the service becomes unresponsive.
My tests against the deployed servlet make 1000 requests against the service.
I check the response of each request to ensure it succeeded and wait 10 ms
before sending the next request. The wait between runs of the test suite is
around 6 seconds. When TDB Dataset connections are made (but no queries are
run), the service becomes unresponsive within the 3rd or 4th run of the test
suite, so somewhere in the 4k-5k request range.
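The driver is roughly this shape (an illustrative sketch - the URL and
payload here are placeholders, not my real values):

import java.io.OutputStream;
import java.net.HttpURLConnection;
import java.net.URL;

public class ServiceLoadTest {
    public static void main(String[] args) throws Exception {
        byte[] payload = "<settings/>".getBytes("UTF-8"); // placeholder payload
        for (int i = 0; i < 1000; i++) {
            HttpURLConnection conn = (HttpURLConnection)
                    new URL("http://localhost:8080/app/test").openConnection();
            conn.setRequestMethod("POST");
            conn.setRequestProperty("Content-Type", "application/xml");
            conn.setDoOutput(true);
            OutputStream out = conn.getOutputStream();
            out.write(payload);
            out.close();
            int status = conn.getResponseCode(); // check each request succeeded
            if (status != 201) {
                throw new RuntimeException("request " + i + " failed: " + status);
            }
            conn.disconnect();
            Thread.sleep(10); // 10 ms between requests
        }
    }
}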
Is this an unreasonable test suite?
Perhaps I need to adjust my Tomcat configuration? I am using the defaults
except for -Xms and -Xmx.
Here are the relevant methods from my controller class:
public class AnalysisSettingsController implements ApplicationContextAware {

    // private vars ...

    private Dataset getDataset() {
        return (Dataset) context.getBean("dataset");
    }

    @RequestMapping(value = "/test", method = RequestMethod.POST,
                    consumes = {"application/xml", "text/xml"})
    public void test(HttpServletRequest request, HttpServletResponse response)
            throws IOException {
        logger.info("in create(...)");
        OntModel m = ModelFactory.createOntologyModel(OntModelSpec.OWL_DL_MEM);
        try {
            // creates RDF representation of input, stores it in the in-memory model (m)
            AnalysisSettings settings = service.load(m, request.getInputStream());
            try {
                String id = settings.getIdentifier();
                String location = request.getRequestURL().toString() + "/report/" + id;
                response.setHeader("Location", location);
                Dataset dataset = getDataset();
                logger.info("dataset connection opened");
                try {
                    /* commented out during testing
                    if (service.has(dataset, id)) {
                        response.setStatus(HttpServletResponse.SC_FOUND);
                        return;
                    }
                    service.save(dataset, settings);
                    */
                    response.setStatus(HttpServletResponse.SC_CREATED);
                } finally {
                    dataset.close();
                    dataset = null;
                    logger.info("dataset connection closed");
                }
            } finally {
                settings.cleanUp();
                settings = null;
            }
        } finally {
            m.close();
            m = null;
        }
    }
}
Here is DatasetFactoryBean; it is a singleton that generates Datasets when
the getObject method is called. The caller has the responsibility to close
the generated Datasets.
public class DatasetFactoryBean implements FactoryBean<Dataset> {

    // private vars, getter and setter methods

    public Dataset getObject() throws Exception {
        if (this.assemblerFile != null) {
            Dataset ds = TDBFactory.assembleDataset(assemblerFile);
            logger.debug("created dataset from {}", assemblerFile);
            return ds;
        } else if (this.location != null) {
            Dataset ds = TDBFactory.createDataset(location);
            logger.debug("created dataset from {}", location);
            return ds;
        } else {
            throw new RuntimeException(
                    "neither TDB directory location nor assembler file specified");
        }
    }
}
Here is a portion of the AnalysisSettingsTdbDao; it runs queries against a
TDB Dataset. The query below is used to check whether the TDB dataset already
contains an AnalysisSettings resource with the provided id.
If this method gets called, memory usage skyrockets and the service quickly
becomes unresponsive.
public class AnalysisSettingsTdbDao implements AnalysisSettingsDao {

    // private variables, other access methods

    public boolean existsById(Dataset dataset, String id) {
        if (id == null || id.isEmpty()) {
            throw new IllegalArgumentException("id is null or empty");
        }
        final String query =
                "ASK { ?obj a ?type . ?obj <http://purl.org/dc/terms/identifier> ?id . }";
        dataset.begin(ReadWrite.READ);
        try {
            QuerySolutionMap params = new QuerySolutionMap();
            params.add("id", ResourceFactory.createTypedLiteral(id, XSDDatatype.XSDstring));
            params.add("type", MDSA.AnalysisSettings);
            QueryExecution qExec = QueryExecutionFactory.create(query, dataset, params);
            try {
                return qExec.execAsk();
            } finally {
                qExec.close();
            }
        } finally {
            dataset.end();
        }
    }
}
Any guidance on where to make corrections to my code, or on how to adjust my
Tomcat or TDB settings, is appreciated.
--Stephan