Hi Otis,
thank you for your tips so far. I will use a profiler to find out
where the problem is. All your hints were helpful. I think the
problem is the connection to MySQL, not the indexer, so I will look
for a newsgroup on the connector mysql-connector-java-3.1.10-bin.jar.
Or do you have any experience with memory leaks with this connector,
and how to avoid them?
Bye bye,
Jan
> However, I see a few problems in your code.
> 1) You should take the JDBC code for getting the connection and
> creating the SQL statement out of that method, so it is not called
> repeatedly - you can reuse the same connection!
>
> 2) I don't see the code to close your Statement, Connection, and
> ResultSet. Those typically go in a finally block.
I have a finalize method that releases the MySQL resources in reverse
order of their creation, but I left it out to keep my example compact.
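The cleanup looks roughly like this (a compacted sketch of the idea;
in my code it sits in finalize() rather than a finally block, and rs,
stmt and conn are fields):

// close JDBC resources in reverse order of creation; each close is
// guarded so one failure does not skip the remaining ones
try {
    if (rs != null) rs.close();
} catch (SQLException e) { e.printStackTrace(); }
try {
    if (stmt != null) stmt.close();
} catch (SQLException e) { e.printStackTrace(); }
try {
    if (conn != null) conn.close();
} catch (SQLException e) { e.printStackTrace(); }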
>
> 3) mergeFactor looks suspiciously high.
I have measured the indexing speed with different values for
mergeFactor.
writer.mergeFactor = 250;
writer.minMergeDocs = 250;
produces the highest speed for me. I have enough memory for this,
but I need that memory back on the next index run ;-)
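For completeness, this is how I timed each run (a minimal sketch;
PATHNAME_INDEX_NEW and TTAnalyser are from my real code, the
addDocument() loop is elided):

long start = System.currentTimeMillis();
IndexWriter writer = new IndexWriter(PATHNAME_INDEX_NEW,
    new TTAnalyser(), true);
writer.mergeFactor = 250;   // candidate value under test
writer.minMergeDocs = 250;  // buffer this many docs in RAM before merging
// ... addDocument() loop ...
writer.optimize();
writer.close();
System.out.println("Indexing took "
    + (System.currentTimeMillis() - start) + " ms");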
>
> 4) You're opening the IndexWriter, optimizing the index and closing
> it a lot. Do you really need to optimize it that often?
I do this once a day. I don't want to leave the connection open for
about 23:45 hours (indexing takes only 15 minutes for 150000 docs).
Do you think it is a problem to load the driver over and over again?
conn = DriverManager.getConnection("jdbc:mysql://...");
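If it is, I could register the driver once and only open the
connection per run, roughly like this (a sketch; host, database and
credentials are placeholders):

// Class.forName() registers the driver with DriverManager on first
// use; repeated calls after that are effectively no-ops
static {
    try {
        Class.forName("com.mysql.jdbc.Driver");
    } catch (ClassNotFoundException e) {
        throw new RuntimeException("MySQL driver not on classpath", e);
    }
}

// each daily run then opens its own connection, as long as
// close() is always reached afterwards
Connection conn = DriverManager.getConnection(
    "jdbc:mysql://host/database", "user", "password");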
>
> The gc() call you are making is just a suggestion for the JVM - "now
> may be a good time to consider running GC". The JVM may ignore this
> suggestion.
>
> Here is a handy method:
>
> private static long gc()
> {
>     long freeMemBefore = Runtime.getRuntime().freeMemory();
>     System.out.println("Free Memory Before: " + freeMemBefore);
>     System.gc();
>     try {
>         Thread.sleep(1000);
>         System.runFinalization();
>         Thread.sleep(1000);
>     } catch (InterruptedException e) {
>         e.printStackTrace();
>     }
>     System.gc();
>     long freeMemAfter = Runtime.getRuntime().freeMemory();
>     System.out.println("Total Memory : " + Runtime.getRuntime().totalMemory());
>     System.out.println("Max Memory   : " + Runtime.getRuntime().maxMemory());
>     System.out.println("Free Memory After : " + freeMemAfter);
>     return freeMemBefore - freeMemAfter;
> }
>
>
> Otis
>
>
> --- Jan Philipp Seng <[EMAIL PROTECTED]> wrote:
>
> > I am using the Lucene 1.4.3 API. After building the index over 150000
> > documents (~250 MB of data), Lucene does not free the memory used
> > during indexing. The searcher runs as a servlet under Tomcat. Every
> > time the index is rebuilt, the indexing process consumes free memory,
> > so after ten runs the memory is completely full. I tried to call the
> > garbage collector explicitly, but that does not help.
> > I build the index on hard disk and load it into a RAMDirectory after
> > building. There is no reference to my indexer left after indexing,
> > and I cannot find a reason why the garbage collector does not free
> > the memory.
> > Here are some important parts of my code, reduced to the central
> > functionality. Do you have any idea what kind of problem this could
> > be? An answer would help me a lot.
> >
> >
> > IndexTablesDaemon starts the indexing process:
> > IndexTablesDaemon.run():
> > -----------------------------
> > while (true) {
> >     IndexTables indexer = new IndexTables();
> >     indexer.indexTables();     // index if necessary, write the new index to hard disk
> >     indexer = null;            // for freeing the memory; does not help
> >     FTS.initNewIndex();        // switch the old index with the new one
> >     Runtime.getRuntime().gc(); // calling the garbage collector; does not help
> >     Thread.sleep(lWait);       // wait a fixed span of time
> > }
> >
> >
> >
> > Building the index: sending queries to a MySQL database, building a
> > Lucene document with the MySQL data and indexing it:
> > IndexTables.indexTables():
> > --------------------------
> > IndexWriter writer = new IndexWriter(PATHNAME_INDEX_NEW,
> >     new TTAnalyser(), true);
> >
> > writer.mergeFactor = 250;
> > writer.minMergeDocs = 250;
> >
> > Document doc = null;
> >
> > String sQuery = "SELECT columns FROM table";
> >
> > Connection conn = DriverManager.getConnection("jdbc:mysql: ...");
> > Statement stmt = conn.createStatement();
> > ResultSet rs = stmt.executeQuery(sQuery);
> >
> > while (rs.next()) {
> >     doc = getTTDocument(); // fill doc with fields from the database query
> >     writer.addDocument(doc);
> > }
> >
> > writer.optimize();
> > writer.close();
> > writer = null; // for freeing memory; does not help
> >
> > Runtime.getRuntime().gc(); // explicitly running the garbage collector; does not help
> > }
> >
> >
> >
> > Exchanging the old and the new index for queries to the Lucene index.
> > FTS.initNewIndex():
> > -------------------
> > File oldIndex = new File(sPathIndex);
> > File newIndex = new File(sPathIndexNew);
> >
> > // delete the old index from hard disk and
> > // let the new index become the operating index
> > oldIndex.delete();
> > newIndex.renameTo(oldIndex);
> >
> > // load the new index from hard disk into the RAMDirectory
> > SearchTables.reloadIndex();
> >
> >
> > Thanks for your help,
> >
> > Jan Philipp Seng, Germany, Aachen
> >