On Jun 26, 2012, at 4:24 AM, Andy Seaborne wrote:
> a few questions and then a suggestion:
>
> How much physical RAM does the machine have?
4 GB
> Which version of the Jena software is this?
0.9.0-incubating (set via Maven)
> Is this running on MS Windows?
Mac OSX 10.7.4
>
> If you are on 64 bit hardware, then TDB uses out-of-heap memory as well as
> heap memory.
I am on 64-bit software.
>
> But what I am most suspicious of is
>
> Dataset dataset = getDataset();
> ...
> dataset.close();
Ah. I thought opening a Dataset was like opening a JDBC connection, and I could
consequently open and close Datasets as needed.
>
> which seems to be opening the database on every call, which may be the cause
> of your problems. You may have many copies of the in-RAM data structures
> (especially on 64-bit Windows, which does not release memory-mapped segments
> during the lifetime of the JVM: an (in)famous Java bug).
>
> You should open the database once at startup and not close it when a request
> is finished.
>
> With transactions, you can get away with not closing it at all but to be
> neat, close at shutdown if you like.
OK, I will modify DatasetFactoryBean to return only one instance of Dataset and
add logic to shut down the Dataset at servlet close.
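
A minimal sketch of the shape I have in mind, written with plain Java stand-ins
rather than the real Jena and Spring types so it runs on its own. FakeDataset,
SharedDatasetHolder and SharedDatasetDemo are hypothetical names; the supplier
passed to the holder stands where a call such as
TDBFactory.createDataset(location) would go, and shutdown() is what I would
expect to call from the servlet shutdown hook.

```java
import java.util.concurrent.atomic.AtomicInteger;
import java.util.function.Supplier;

/** Stand-in for a TDB-backed Dataset (hypothetical, not a Jena type). */
final class FakeDataset implements AutoCloseable {
    static final AtomicInteger OPENED = new AtomicInteger();
    static final AtomicInteger CLOSED = new AtomicInteger();

    FakeDataset() { OPENED.incrementAndGet(); }

    @Override
    public void close() { CLOSED.incrementAndGet(); }
}

/** Opens the underlying resource once and hands the same instance to every caller. */
final class SharedDatasetHolder<T extends AutoCloseable> {
    private final Supplier<T> opener;
    private T instance; // guarded by "this"

    SharedDatasetHolder(Supplier<T> opener) { this.opener = opener; }

    /** Lazily opens the resource on first use; later calls reuse it. */
    synchronized T get() {
        if (instance == null) {
            instance = opener.get();
        }
        return instance;
    }

    /** Intended to be called once, at application shutdown. */
    synchronized void shutdown() {
        if (instance == null) {
            return;
        }
        try {
            instance.close();
        } catch (Exception e) {
            throw new IllegalStateException("failed to close shared resource", e);
        } finally {
            instance = null;
        }
    }
}

final class SharedDatasetDemo {
    public static void main(String[] args) {
        SharedDatasetHolder<FakeDataset> holder =
                new SharedDatasetHolder<>(FakeDataset::new);

        // Simulate many requests: each one borrows the dataset but never closes it.
        for (int request = 0; request < 1000; request++) {
            FakeDataset ds = holder.get();
        }

        holder.shutdown();
        System.out.println("opened=" + FakeDataset.OPENED.get()
                + " closed=" + FakeDataset.CLOSED.get());
    }
}
```

In the real bean, getObject() would delegate to something like holder.get(), and
the per-request dataset.close() in the controller would go away.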
>
> Otherwise, could you turn this into a standalone test case that simulates
> your set up but runs outside Spring so we can debug it?
Taking it outside of Spring would require a great deal of refactoring; it would
be easier to send my full project (built via Maven).
First though, I will make the change suggested above and report back to the
list.
--Stephan
>
> Andy
>
> On 25/06/12 21:55, Stephan Zednik wrote:
>> I have been having memory issues using TDB from a java servlet. Memory
>> usage by tomcat increases until the service becomes unresponsive and must be
>> restarted. The service operations appear to be completing successfully
>> until the service becomes unresponsive.
>>
>> The memory usage will rapidly rise to whatever my max heap size is
>> (CATALINA_OPTS="-Xms512m -Xmx4096m") or whatever my available RAM can hold
>> before the service becomes unresponsive. Generally during testing that has
>> been 1.5-1.6 GB before my RAM is full.
>>
>> I have a fairly simple set of unit tests; they do not have full coverage,
>> but the tests I do have all pass.
>>
>> I am using Spring web.
>>
>> Below is my Spring Controller class. It asks the application context for a
>> Dataset, which causes Spring to invoke DatasetFactoryBean.getObject(). The
>> DatasetFactoryBean is a singleton that has been initialized with the
>> location of my TDB dataset.
>>
>> The controller method is fairly simple. A POST request contains an XML
>> payload. The payload is passed to a service method that parses the XML and
>> generates an RDF representation of the input data, stored in an in-memory
>> Jena model. AnalysisSettings is a class that acts as a proxy to the Jena
>> Model with methods for manipulating/accessing the encoded RDF.
>>
>> I have commented out the TDB-related code and tested both the xml parsing
>> and xml parsing + in-memory rdf. Service memory usage slowly grows to a
>> level I am unhappy with (~1GB according to ActivityMonitor.app and
>> VisualVM), but does stabilize. Since it stabilizes and grows slowly I do
>> not think it is the main culprit of my current memory problem.
>>
>> If I test the TDB Dataset creation code, but leave all queries run against
>> the TDB dataset commented out, memory usage grows much more quickly, to the
>> 1.5 GB range, before my RAM is full and the service becomes unresponsive.
>>
>> My tests against the deployed servlet are to make 1000 requests against the
>> service. I check the response of each request to ensure it succeeded and
>> wait 10 ms before sending the next request. Wait between runs of the test
>> suite is around 6 seconds. When TDB Dataset connections are made (but no
>> queries are run), the service will become unresponsive within the 3rd or 4th
>> run of the test suite, so somewhere in the 4k-5k request range.
>>
>> Is this an unreasonable test suite?
>>
>> Perhaps I need to adjust my tomcat configuration? I am using the default
>> except for -Xms and -Xmx.
>>
>> Here are the relevant methods from my controller class
>>
>> public class AnalysisSettingsController implements ApplicationContextAware {
>>
>>     // private vars ...
>>
>>     private Dataset getDataset() {
>>         return (Dataset) context.getBean("dataset");
>>     }
>>
>>     @RequestMapping(value="/test", method = RequestMethod.POST,
>>             consumes = {"application/xml", "text/xml"})
>>     public void test(HttpServletRequest request, HttpServletResponse response)
>>             throws IOException {
>>         logger.info("in create(...)");
>>
>>         OntModel m = ModelFactory.createOntologyModel(OntModelSpec.OWL_DL_MEM);
>>         try {
>>             // creates rdf representation of input, stores in in-memory model (m)
>>             AnalysisSettings settings = service.load(m, request.getInputStream());
>>
>>             try {
>>                 String id = settings.getIdentifier();
>>                 String location =
>>                         request.getRequestURL().toString() + "/report/" + id;
>>                 response.setHeader("Location", location);
>>
>>                 Dataset dataset = getDataset();
>>                 logger.info("dataset connection opened");
>>                 try {
>>                     /* commented out during testing
>>                     if(service.has(dataset, id)) {
>>                         response.setStatus(HttpServletResponse.SC_FOUND);
>>                         return;
>>                     }
>>
>>                     service.save(dataset, settings);
>>                     */
>>
>>                     response.setStatus(HttpServletResponse.SC_CREATED);
>>                 } finally {
>>                     dataset.close();
>>                     dataset = null;
>>                     logger.info("dataset connection closed");
>>                 }
>>             } finally {
>>                 settings.cleanUp();
>>                 settings = null;
>>             }
>>         } finally {
>>             m.close();
>>             m = null;
>>         }
>>     }
>> }
>>
>> Here is DatasetFactoryBean, it is a singleton that generates Datasets when
>> the getObject method is called. The caller has the responsibility to close
>> generated Datasets.
>>
>> public class DatasetFactoryBean implements FactoryBean<Dataset> {
>>
>>     // private vars, getter and setter methods
>>
>>     public Dataset getObject() throws Exception {
>>         if(this.assemblerFile != null) {
>>             Dataset ds = TDBFactory.assembleDataset(assemblerFile);
>>             logger.debug("created dataset from {}", assemblerFile);
>>             return ds;
>>         } else if(this.location != null) {
>>             Dataset ds = TDBFactory.createDataset(location);
>>             logger.debug("created dataset from {}", location);
>>             return ds;
>>         } else {
>>             throw new RuntimeException(
>>                     "neither TDB directory location nor assembler file specified");
>>         }
>>     }
>> }
>>
>> Here is a portion of the AnalysisSettingsTdbDao, it runs queries against a
>> TDB Dataset. The query below is used to see if the TDB dataset already
>> contains an AnalysisSettings resource with the provided id.
>>
>> If this method gets called, memory usage skyrockets and the service quickly
>> becomes unresponsive.
>>
>> public class AnalysisSettingsTdbDao implements AnalysisSettingsDao {
>>
>>     // private variables, other access methods
>>
>>     public boolean existsById(Dataset dataset, String id) {
>>         if(id == null || id.isEmpty()) {
>>             throw new IllegalArgumentException("id is null or empty");
>>         }
>>         final String query = "ASK { ?obj a ?type . "
>>                 + "?obj <http://purl.org/dc/terms/identifier> ?id . }";
>>         dataset.begin(ReadWrite.READ);
>>
>>         try {
>>             QuerySolutionMap params = new QuerySolutionMap();
>>             params.add("id", ResourceFactory.createTypedLiteral(id,
>>                     XSDDatatype.XSDstring));
>>             params.add("type", MDSA.AnalysisSettings);
>>
>>             QueryExecution qExec =
>>                     QueryExecutionFactory.create(query, dataset, params);
>>             try {
>>                 return qExec.execAsk();
>>             } finally {
>>                 qExec.close();
>>             }
>>         } finally {
>>             dataset.end();
>>         }
>>     }
>> }
>>
>> Any guidance on where to make corrections to my code or how to adjust my
>> tomcat or TDB settings is appreciated.
>>
>> --Stephan
>>
>
>
>