On 14.8.2012 16:48, Gandhi, Shailey wrote:

Hi

We are running opengrok-0.11-rc1 on a Linux box.

I am running the indexing without the project option, i.e. with Enable_Projects off.

However for the last 2 nights the indexer failed with the following error:

===

2012-08-13 15:53:58.338-0400 INFO t11 DefaultIndexChangedListener.fileAdd: Add: /testdatamg~1.0.0/testdatamg/TDMS ETL/Mainframe Migration seq 3 22 Oct, 2010/ADDRESS (PlainAnalyzer)

2012-08-13 15:54:03.506-0400 SEVERE t11 IndexDatabase$1.run: Problem updating lucene index database:

java.lang.OutOfMemoryError: Java heap space


This is about how much memory you gave to the indexer. Most probably your index (since it is not split per project) is so big that it won't fit. Check the OpenGrok script for how to increase it; by default 2048 MB is assigned to the JVM running the indexer.
Also, please use 0.11.1 if possible.


===

The ADDRESS file reported in the errors above is a text file, but it is 1 GB in size.

I have the following questions:

1. Given that it's a text file and not one that contains any data structures, the OpenGrok indexer is expected to read it line by line and should not hit a heap out-of-memory error. So why is this problem coming up?


Because of how Lucene works: it optimizes/merges the index at the end. OpenGrok has an option to turn off these merges (see the indexer options), but I have a bad feeling Lucene does this by default since 3.x. A newer OpenGrok will mitigate this, since it will bring a new Lucene that has a new index structure and doesn't need optimizations anymore.

2. Is there a limit on the file size?


No, we turned it off, so there is no limit. It can be limited, though; see the OpenGrok script for how to set the limit.
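
Before deciding on a limit or an exclusion, it can help to list the largest files under the source root. A minimal sketch in Python, independent of OpenGrok; the function name `find_large_files` and the 100 MB default threshold are assumptions chosen for illustration:

```python
import os


def find_large_files(root, min_bytes=100 * 1024 * 1024):
    """Return (size, path) pairs for files of at least min_bytes under root, largest first."""
    hits = []
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            try:
                size = os.path.getsize(path)
            except OSError:
                # Broken symlink or unreadable entry: skip it.
                continue
            if size >= min_bytes:
                hits.append((size, path))
    return sorted(hits, reverse=True)
```

Run against the source root, this would surface files such as the 1 GB ADDRESS file, which can then be excluded or covered by a size limit.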

3. Is there a way to exclude a particular file path, a directory, or a specific file from indexing? I am aware of ignoring files with certain extensions, but this one has no extension.

So, how can I get the indexer to ignore it?


AFAIK -I ignores per pattern, not per extension, so your understanding isn't quite correct. Honestly, I'd like this ignoring to work better, so that it would at least distinguish between directories and files. IMHO it's far from perfect, but it's usable for the time being.
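
To illustrate the difference between matching by pattern and matching by extension, here is a minimal Python sketch using glob-style patterns from the standard `fnmatch` module. It is not OpenGrok's implementation, and OpenGrok's exact pattern syntax may differ; the helper `is_ignored` and the sample patterns are hypothetical:

```python
from fnmatch import fnmatchcase


def is_ignored(file_name, patterns):
    """Return True if the bare file name matches any glob-style pattern."""
    return any(fnmatchcase(file_name, p) for p in patterns)


# Hypothetical ignore list: one extension-style pattern, one exact name.
patterns = ["*.so", "ADDRESS"]

print(is_ignored("libfoo.so", patterns))  # matched by the wildcard pattern
print(is_ignored("ADDRESS", patterns))    # matched by exact name, no extension needed
print(is_ignored("Main.java", patterns))  # matched by neither pattern
```

A name-only match like this cannot tell a file named ADDRESS from a directory named ADDRESS, which is the limitation mentioned above.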

--
L

Thanks

Shailey



_______________________________________________
opengrok-discuss mailing list
opengrok-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/opengrok-discuss
