a few questions and then a suggestion:
How much physical RAM does the machine have?
Which version of the Jena software is this?
Is this running on MS Windows?
If you are on 64-bit hardware, then TDB uses out-of-heap memory
(memory-mapped files) as well as heap memory.
But what I am most suspicious of is
Dataset dataset = getDataset();
...
dataset.close();
which seems to be opening the database on every call; that may be the
cause of your problems. You may end up with many copies of the in-RAM
data structures (especially on 64-bit Windows, which does not release
memory-mapped segments during the lifetime of the JVM - an (in)famous
Java bug).
You should open the database once at start-up and not close it when a
request finishes.
With transactions, you can get away with not closing it at all but, to
be neat, close it at shutdown if you like.
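Something like this would do it - an untested sketch of your
DatasetFactoryBean that caches one Dataset for the lifetime of the
webapp (the DisposableBean part is the optional close-at-shutdown):

import org.springframework.beans.factory.DisposableBean;
import org.springframework.beans.factory.FactoryBean;

import com.hp.hpl.jena.query.Dataset;
import com.hp.hpl.jena.tdb.TDBFactory;

public class DatasetFactoryBean implements FactoryBean<Dataset>, DisposableBean {

    private String location;
    private Dataset dataset;

    public void setLocation(String location) { this.location = location; }

    // Opened once; isSingleton() == true means Spring hands the same
    // Dataset to every request instead of creating a new one each time.
    public synchronized Dataset getObject() {
        if (dataset == null) {
            dataset = TDBFactory.createDataset(location);
        }
        return dataset;
    }

    public Class<Dataset> getObjectType() { return Dataset.class; }

    public boolean isSingleton() { return true; }

    // Optional tidy-up: close once, at webapp shutdown, not per request.
    public void destroy() {
        if (dataset != null) {
            dataset.close();
        }
    }
}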
Otherwise, could you turn this into a standalone test case that
simulates your setup but runs outside Spring, so we can debug it?
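For example, a plain main() of roughly this shape (skeleton only - fill
in the dataset location and the operations your service performs):

import com.hp.hpl.jena.query.Dataset;
import com.hp.hpl.jena.query.ReadWrite;
import com.hp.hpl.jena.tdb.TDBFactory;

public class TDBStandaloneTest {
    public static void main(String[] args) {
        // open once, outside the loop
        Dataset dataset = TDBFactory.createDataset("/tmp/tdb-test");
        for (int i = 0; i < 5000; i++) {  // ~the request count where you see failure
            dataset.begin(ReadWrite.READ);
            try {
                // run the same ASK query your DAO runs
            } finally {
                dataset.end();
            }
        }
        dataset.close();
    }
}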
Andy
On 25/06/12 21:55, Stephan Zednik wrote:
I have been having memory issues using TDB from a Java servlet. Memory usage
by Tomcat increases until the service becomes unresponsive and must be
restarted. The service operations appear to complete successfully right up
until the service becomes unresponsive.
Memory usage rises rapidly until it reaches my heap max size
(CATALINA_OPTS="-Xms512m -Xmx4096m") or whatever my available RAM can hold
before the service becomes unresponsive. During testing that has generally
been 1.5-1.6 GB before my RAM is full.
I have a fairly simple set of unit tests; it does not have full coverage, but
the tests I do have all pass.
I am using Spring web.
Below is my Spring controller class. It asks the application context for a
Dataset, which causes Spring to invoke DatasetFactoryBean.getObject(). The
DatasetFactoryBean is a singleton that has been initialized with the location
of my TDB dataset.
The controller method is fairly simple. A POST request contains an XML
payload. The payload is passed to a service method that parses the XML and
generates an RDF representation of the input data, stored in an in-memory
Jena model. AnalysisSettings is a class that acts as a proxy to the Jena
Model, with methods for manipulating/accessing the encoded RDF.
I have commented out the TDB-related code and tested both the XML parsing
alone and the XML parsing + in-memory RDF. Service memory usage slowly grows
to a level I am unhappy with (~1 GB according to ActivityMonitor.app and
VisualVM), but it does stabilize. Since it stabilizes and grows slowly, I do
not think it is the main culprit of my current memory problem.
If I test the TDB Dataset creation code, but leave all queries against the
TDB dataset commented out, memory usage grows much more quickly, reaching the
1.5 GB range before my RAM is full and the service becomes unresponsive.
My tests against the deployed servlet make 1000 requests against the service.
I check the response of each request to ensure it succeeded and wait 10 ms
before sending the next request. The wait between runs of the test suite is
around 6 seconds. When TDB Dataset connections are made (but no queries are
run), the service becomes unresponsive within the 3rd or 4th run of the test
suite, so somewhere in the 4k-5k request range.
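The driver is roughly this shape (an illustrative sketch - the URL and
payload here are placeholders, not my real values):

import java.io.OutputStream;
import java.net.HttpURLConnection;
import java.net.URL;

public class ServiceLoadTest {
    public static void main(String[] args) throws Exception {
        byte[] payload = "<settings/>".getBytes("UTF-8"); // placeholder payload
        for (int i = 0; i < 1000; i++) {
            HttpURLConnection conn = (HttpURLConnection)
                    new URL("http://localhost:8080/app/test").openConnection();
            conn.setRequestMethod("POST");
            conn.setRequestProperty("Content-Type", "application/xml");
            conn.setDoOutput(true);
            OutputStream out = conn.getOutputStream();
            out.write(payload);
            out.close();
            int status = conn.getResponseCode(); // check each request succeeded
            if (status != 201) {
                throw new RuntimeException("request " + i + " failed: " + status);
            }
            conn.disconnect();
            Thread.sleep(10); // 10 ms between requests
        }
    }
}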
Is this an unreasonable test suite?
Perhaps I need to adjust my Tomcat configuration? I am using the defaults
except for -Xms and -Xmx.
Here are the relevant methods from my controller class:
public class AnalysisSettingsController implements ApplicationContextAware {

    // private vars ...

    private Dataset getDataset() {
        return (Dataset) context.getBean("dataset");
    }

    @RequestMapping(value = "/test", method = RequestMethod.POST,
                    consumes = {"application/xml", "text/xml"})
    public void test(HttpServletRequest request, HttpServletResponse response)
            throws IOException {
        logger.info("in create(...)");
        OntModel m = ModelFactory.createOntologyModel(OntModelSpec.OWL_DL_MEM);
        try {
            // creates RDF representation of input, stores it in the in-memory model (m)
            AnalysisSettings settings = service.load(m, request.getInputStream());
            try {
                String id = settings.getIdentifier();
                String location = request.getRequestURL().toString() + "/report/" + id;
                response.setHeader("Location", location);
                Dataset dataset = getDataset();
                logger.info("dataset connection opened");
                try {
                    /* commented out during testing
                    if (service.has(dataset, id)) {
                        response.setStatus(HttpServletResponse.SC_FOUND);
                        return;
                    }
                    service.save(dataset, settings);
                    */
                    response.setStatus(HttpServletResponse.SC_CREATED);
                } finally {
                    dataset.close();
                    dataset = null;
                    logger.info("dataset connection closed");
                }
            } finally {
                settings.cleanUp();
                settings = null;
            }
        } finally {
            m.close();
            m = null;
        }
    }
}
Here is DatasetFactoryBean; it is a singleton that generates Datasets when
the getObject method is called. The caller has the responsibility to close
the generated Datasets.
public class DatasetFactoryBean implements FactoryBean<Dataset> {

    // private vars, getter and setter methods

    public Dataset getObject() throws Exception {
        if (this.assemblerFile != null) {
            Dataset ds = TDBFactory.assembleDataset(assemblerFile);
            logger.debug("created dataset from {}", assemblerFile);
            return ds;
        } else if (this.location != null) {
            Dataset ds = TDBFactory.createDataset(location);
            logger.debug("created dataset from {}", location);
            return ds;
        } else {
            throw new RuntimeException(
                    "neither TDB directory location nor assembler file specified");
        }
    }
}
Here is a portion of the AnalysisSettingsTdbDao; it runs queries against a
TDB Dataset. The query below is used to check whether the TDB dataset already
contains an AnalysisSettings resource with the provided id.
If this method gets called, memory usage skyrockets and the service quickly
becomes unresponsive.
public class AnalysisSettingsTdbDao implements AnalysisSettingsDao {

    // private variables, other access methods

    public boolean existsById(Dataset dataset, String id) {
        if (id == null || id.isEmpty()) {
            throw new IllegalArgumentException("id is null or empty");
        }
        final String query =
                "ASK { ?obj a ?type . ?obj <http://purl.org/dc/terms/identifier> ?id . }";
        dataset.begin(ReadWrite.READ);
        try {
            QuerySolutionMap params = new QuerySolutionMap();
            params.add("id", ResourceFactory.createTypedLiteral(id, XSDDatatype.XSDstring));
            params.add("type", MDSA.AnalysisSettings);
            QueryExecution qExec = QueryExecutionFactory.create(query, dataset, params);
            try {
                return qExec.execAsk();
            } finally {
                qExec.close();
            }
        } finally {
            dataset.end();
        }
    }
}
Any guidance on where to make corrections to my code, or on how to adjust my
Tomcat or TDB settings, is appreciated.
--Stephan