Hi Jose,

On 7/1/2014 11:46 AM, Jose Blanco wrote:
> Tim,
>
> Thanks for helping me out with this.  I am using 4.1 code, it's just
> that I have some changes in there particular to our instance.  The
> changes I have I don't think should affect this.  I put in the check
> you suggested, and now I don't get the null exception but get a whole
> bunch of these sorts of errors:
>
> 2014-07-01 12:35:34,764 ERROR org.dspace.search.DSIndexer @
> java.io.FileNotFoundException:
> /dspace/repository/dev/search/_6o8q_Lucene41_0.tip (Too many open
> files)

You are running into a "Too many open files" error. This may actually be 
*unrelated* to the indexing method. It's telling you that your DSpace 
process has too many open file descriptors...so, somewhere, the code is 
opening files without closing them (or simply opening too many files at 
once -- the default limit is 1024 on most operating systems).

More info on "Too many open files" errors:
* http://stackoverflow.com/a/4289528
* 
http://stackoverflow.com/questions/2272908/java-io-filenotfoundexception-too-many-open-files
(Or google the error; there are other resources out there.)

The place where this error shows up is somewhat random: it will simply 
appear wherever the code next tries to access a file after the available 
file descriptors have run out.
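On Linux you can get a rough idea of descriptor usage via /proc. A quick sketch (the demo below uses the current shell's PID, $$, just so it runs anywhere -- in practice substitute the PID of your Tomcat/DSpace JVM, e.g. from `ps aux | grep tomcat`):

```shell
# Count open file descriptors for a given process via /proc/<pid>/fd.
# $$ (this shell) is only a stand-in; use your Tomcat PID in practice.
pid=$$
open_fds=$(ls /proc/"$pid"/fd 2>/dev/null | wc -l)
echo "PID $pid has $open_fds open file descriptors"
```

Watching that count climb steadily while DSpace runs is a good sign something is leaking descriptors.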

You can increase the maximum number of open files on your system 
(ulimit -n), but chances are there's an issue in some custom code 
you've written (or possibly in the DSpace API itself -- though that is 
less likely unless others are hitting this error as well).
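For reference, here's how to inspect and raise that limit (the "tomcat" username and the 4096/8192 values below are just example assumptions -- adjust to your setup):

```shell
# Show the current soft limit on open files for this session:
ulimit -n

# Raise it for this session only (cannot exceed the hard limit, `ulimit -Hn`):
# ulimit -n 4096

# To raise it permanently for the user running Tomcat, add lines like
# these to /etc/security/limits.conf and log in again:
#   tomcat  soft  nofile  4096
#   tomcat  hard  nofile  8192
```

Raising the limit only buys headroom, though -- if something is leaking descriptors, the error will eventually come back.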

You might want to try stopping Tomcat, restarting it, and then running 
the full index-lucene-init & index-lucene-update. If it errors again, 
it's possible that index-lucene-init is somehow keeping files open. 
E.g., does your search/browse *work* from the UI after an "init"? Or 
does searching from the UI also throw an error?

> I was getting these sort of errors when I was running
> index-lucene-init and I changed these config parameters:
>
> search.index.delay = -1
> search.batch.documents = -1
>
> and index-lucene-init completed successfully.
>
> I'm a bit confused as to whether I need to run index-lucene-update as
> a cron job or not.  When I load an item into my instance, the lucene
> index metadata is updated  because, I have this set ( the search in
> the list ):

index-lucene-update need not be run via a cron job. The indexes should 
be updated automatically for each newly submitted item. That said, I've 
heard some institutions schedule it via cron "just in case", but it 
shouldn't be needed.
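If you do decide to schedule it "just in case", the crontab entry would look something like this ([dspace] stands for your installation directory, and the nightly 2 AM time is just an example -- adjust both to your install):

```shell
# crontab entry for the user that runs DSpace:
# run index-lucene-update nightly at 2:00 AM
0 2 * * * [dspace]/bin/dspace index-lucene-update
```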

index-lucene-update is mostly there to *update* your index if you change 
any settings (e.g. add new search fields, etc.). It's not necessary to 
run it on any ongoing basis. But it also shouldn't fail -- a failure is a 
sign that something is wrong in your index or in DSpace.

- Tim

_______________________________________________
DSpace-tech mailing list
[email protected]
https://lists.sourceforge.net/lists/listinfo/dspace-tech
List Etiquette: https://wiki.duraspace.org/display/DSPACE/Mailing+List+Etiquette
