On Jun 26, 2012, at 4:24 AM, Andy Seaborne wrote:

> a few questions and then a suggestion:
> 
> How much physical RAM does the machine have?

4 GB

> Which version of the Jena software is this?

0.9.0-incubating (set via Maven)

> Is this running on MS Windows?

Mac OSX 10.7.4

> 
> If you are on 64 bit hardware, then TDB uses out-of-heap memory as well as 
> heap memory.

I am on 64-bit software.
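
As I understand it, the out-of-heap part comes from memory-mapped files, which 
-Xmx does not limit.  A minimal sketch of that mechanism using plain java.nio 
(this is not TDB code; MappedFileDemo and mapScratchFile are made-up names for 
illustration only):

```java
import java.io.IOException;
import java.nio.MappedByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

// Illustration only: plain java.nio, no Jena classes involved.
class MappedFileDemo {

    /** Maps the first `size` bytes of `file` and returns the mapped buffer. */
    static MappedByteBuffer mapScratchFile(Path file, int size) throws IOException {
        try (FileChannel ch = FileChannel.open(file,
                StandardOpenOption.CREATE,
                StandardOpenOption.READ,
                StandardOpenOption.WRITE)) {
            // The mapped region lives outside the Java heap. It stays valid
            // after the channel is closed and is only released once the
            // buffer object itself has been garbage collected.
            return ch.map(FileChannel.MapMode.READ_WRITE, 0, size);
        }
    }
}
```

Each call creates a new mapping, so mapping the same files again on every 
request would pile up native memory that the heap limit never sees.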

> 
> But what I am most suspicious of is
> 
> Dataset dataset = getDataset();
> ...
> dataset.close();

Ah. I thought opening a Dataset was like opening a JDBC connection, and I could 
consequently open and close Datasets as needed.

> 
> which seems to be opening the database on every call, which may be the cause of 
> your problems.  You may have many copies of the in-RAM data structures 
> (especially on 64-bit Windows, which does not release memory-mapped segments 
> during the lifetime of the JVM - (in)famous Java bug).
> 
> You should open the database once at start up, do not close it when a request 
> is finished.
> 
> With transactions, you can get away with not closing it at all but to be 
> neat, close at shutdown if you like.

OK, I will modify DatasetFactoryBean to return only one instance of Dataset and 
add logic to shut down the Dataset at servlet close.
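
Roughly what I have in mind is the sketch below.  SharedResource is a 
hypothetical helper name, not a Jena or Spring class.  It is written against a 
generic type so it can be tried without TDB on the classpath, and it assumes 
Java 8 or later for the functional interfaces.

```java
import java.util.function.Consumer;
import java.util.function.Supplier;

// Hypothetical helper, not part of Jena or Spring: opens a resource once and
// hands the same instance to every caller until shutdown() is called.
final class SharedResource<T> {
    private final Supplier<T> opener;   // how to open the resource
    private final Consumer<T> closer;   // how to release it at shutdown
    private T instance;                 // guarded by "this"
    private boolean shutDown;           // guarded by "this"

    SharedResource(Supplier<T> opener, Consumer<T> closer) {
        this.opener = opener;
        this.closer = closer;
    }

    /** Returns the single shared instance, opening it on first use. */
    synchronized T get() {
        if (shutDown) {
            throw new IllegalStateException("already shut down");
        }
        if (instance == null) {
            instance = opener.get();
        }
        return instance;
    }

    /** Releases the instance if one was opened; safe to call repeatedly. */
    synchronized void shutdown() {
        shutDown = true;
        if (instance != null) {
            closer.accept(instance);
            instance = null;
        }
    }
}
```

With TDB I would expect to construct it as 
new SharedResource<>(() -> TDBFactory.createDataset(location), Dataset::close), 
drop the per-request dataset.close() from the controller, and call shutdown() 
once when the servlet context is destroyed.  That wiring is untested.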

> 
> Otherwise, could you turn this into a standalone test case that simulates 
> your set up but runs outside Spring so we can debug it?

Taking it outside of Spring would require a great deal of refactoring; it would 
be easier to send my full project (built via Maven).

First though, I will make the change suggested above and report back to the 
list.

--Stephan

> 
>       Andy
> 
> On 25/06/12 21:55, Stephan Zednik wrote:
>> I have been having memory issues using TDB from a Java servlet.  Memory 
>> usage by Tomcat increases until the service becomes unresponsive and must be 
>> restarted.  The service operations appear to be completing successfully 
>> until the service becomes unresponsive.
>> 
>> The memory usage will rapidly rise to whatever my max heap size is 
>> (CATALINA_OPTS="-Xms512m -Xmx4096m") or to what my available RAM can hold 
>> before the service becomes unresponsive.  Generally during testing that has 
>> been 1.5-1.6 GB before my RAM is full.
>> 
>> I have a fairly simple set of unit tests.  It does not have full coverage, but 
>> the tests I do have all pass.
>> 
>> I am using Spring web.
>> 
>> Below is my Spring Controller class.  It asks the application context for a 
>> Dataset, which causes Spring to invoke DatasetFactoryBean.getObject().  The 
>> DatasetFactoryBean is a singleton that has been initialized with the location 
>> of my TDB dataset.
>> 
>> The controller method is fairly simple.  A POST request contains an XML 
>> payload.  The payload is passed to a service method that parses the XML and 
>> generates an RDF representation of the input data, stored in an in-memory 
>> Jena model.  AnalysisSettings is a class that acts as a 
>> proxy to the Jena Model with methods for manipulating/accessing the encoded 
>> RDF.
>> 
>> I have commented out the TDB-related code and tested both the XML parsing 
>> and XML parsing + in-memory RDF.  Service memory usage slowly grows to a 
>> level I am unhappy with (~1GB according to ActivityMonitor.app and 
>> VisualVM), but does stabilize.  Since it stabilizes and grows slowly, I do 
>> not think it is the main culprit of my current memory problem.
>> 
>> If I test the TDB Dataset creation code, but leave all queries against 
>> the TDB dataset commented out, memory usage grows much more quickly to the 
>> 1.5 GB range before my RAM is full and the service becomes unresponsive.
>> 
>> My test suite against the deployed servlet makes 1000 requests against the 
>> service.  I check the response of each request to ensure it succeeded and 
>> wait 10 ms before sending the next request.  The wait between runs of the test 
>> suite is around 6 seconds.  When TDB Dataset connections are made (but no 
>> queries are run), the service will become unresponsive within the 3rd or 4th 
>> run of the test suite, so somewhere in the 4k-5k request range.
>> 
>> Is this an unreasonable test suite?
>> 
>> Perhaps I need to adjust my Tomcat configuration?  I am using the defaults 
>> except for -Xms and -Xmx.
>> 
>> Here are the relevant methods from my controller class:
>> 
>> public class AnalysisSettingsController implements ApplicationContextAware {
>> 
>>     // private vars ...
>> 
>>     private Dataset getDataset() {
>>         return (Dataset) context.getBean("dataset");
>>     }
>> 
>>     @RequestMapping(value = "/test", method = RequestMethod.POST,
>>                     consumes = {"application/xml", "text/xml"})
>>     public void test(HttpServletRequest request, HttpServletResponse response)
>>             throws IOException {
>>         logger.info("in create(...)");
>> 
>>         OntModel m = ModelFactory.createOntologyModel(OntModelSpec.OWL_DL_MEM);
>>         try {
>>             // creates rdf representation of input, stores in in-memory model (m)
>>             AnalysisSettings settings = service.load(m, request.getInputStream());
>> 
>>             try {
>>                 String id = settings.getIdentifier();
>>                 String location = request.getRequestURL().toString() + "/report/" + id;
>>                 response.setHeader("Location", location);
>> 
>>                 Dataset dataset = getDataset();
>>                 logger.info("dataset connection opened");
>>                 try {
>>                     /* commented out during testing
>>                     if(service.has(dataset, id)) {
>>                         response.setStatus(HttpServletResponse.SC_FOUND);
>>                         return;
>>                     }
>> 
>>                     service.save(dataset, settings);
>>                     */
>>                     response.setStatus(HttpServletResponse.SC_CREATED);
>>                 } finally {
>>                     dataset.close();
>>                     dataset = null;
>>                     logger.info("dataset connection closed");
>>                 }
>>             } finally {
>>                 settings.cleanUp();
>>                 settings = null;
>>             }
>>         } finally {
>>             m.close();
>>             m = null;
>>         }
>>     }
>> }
>> 
>> Here is DatasetFactoryBean.  It is a singleton that generates Datasets when 
>> the getObject method is called.  The caller has the responsibility to close 
>> generated Datasets.
>> 
>> public class DatasetFactoryBean implements FactoryBean<Dataset> {
>> 
>>     // private vars, getter and setter methods
>> 
>>     public Dataset getObject() throws Exception {
>>         if(this.assemblerFile != null) {
>>             Dataset ds = TDBFactory.assembleDataset(assemblerFile);
>>             logger.debug("created dataset from {}", assemblerFile);
>>             return ds;
>>         } else if(this.location != null) {
>>             Dataset ds = TDBFactory.createDataset(location);
>>             logger.debug("created dataset from {}", location);
>>             return ds;
>>         } else {
>>             throw new RuntimeException(
>>                 "neither TDB directory location nor assembler file specified");
>>         }
>>     }
>> }
>> 
>> Here is a portion of the AnalysisSettingsTdbDao, which runs queries against a 
>> TDB Dataset.  The query below is used to see if the TDB dataset already 
>> contains an AnalysisSettings resource with the provided id.
>> 
>> If this method gets called, memory usage skyrockets and the service quickly 
>> becomes unresponsive.
>> 
>> public class AnalysisSettingsTdbDao implements AnalysisSettingsDao {
>> 
>>     // private variables, other access methods
>> 
>>     public boolean existsById(Dataset dataset, String id) {
>>         if(id == null || id.isEmpty()) {
>>             throw new IllegalArgumentException("id is null or empty");
>>         }
>>         final String query = "ASK { ?obj a ?type . "
>>                 + "?obj <http://purl.org/dc/terms/identifier> ?id . }";
>>         dataset.begin(ReadWrite.READ);
>> 
>>         try {
>>             QuerySolutionMap params = new QuerySolutionMap();
>>             params.add("id", ResourceFactory.createTypedLiteral(id, XSDDatatype.XSDstring));
>>             params.add("type", MDSA.AnalysisSettings);
>> 
>>             QueryExecution qExec = QueryExecutionFactory.create(query, dataset, params);
>>             try {
>>                 return qExec.execAsk();
>>             } finally {
>>                 qExec.close();
>>             }
>>         } finally {
>>             dataset.end();
>>         }
>>     }
>> }
>> 
>> Any guidance on where to make corrections to my code or how to adjust my 
>> tomcat or TDB settings is appreciated.
>> 
>> --Stephan
>> 
> 
> 
> 
