RE: Segment file not found error - after replicating

2009-11-17 Thread Maduranga Kannangara
The permanent solution we found was to:

1. Call flush() before closing the segments.gen file write (in Lucene).
2. Remove the slave's segments.gen before replication.
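
Step 2 could be scripted on the slave along the following lines. This is only a minimal sketch in plain Java, not part of the Solr 1.3 replication scripts; the class name StaleGenCleaner, the method name and the default path "data/index" are hypothetical.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;

/** Hypothetical pre-replication step for a slave; not part of Solr's scripts. */
class StaleGenCleaner {

    /**
     * Deletes segments.gen from the given index directory if it exists.
     * Returns true if a file was actually removed.
     */
    static boolean removeSegmentsGen(Path indexDir) throws IOException {
        // Only the generation hint file is touched; the segments_N files
        // and all index data files are left alone.
        return Files.deleteIfExists(indexDir.resolve("segments.gen"));
    }

    public static void main(String[] args) throws IOException {
        // "data/index" is an assumed default, pass the real index path instead.
        Path indexDir = Paths.get(args.length > 0 ? args[0] : "data/index");
        System.out.println("removed segments.gen: " + removeSegmentsGen(indexDir));
    }
}
```

Running it before each pull means the slave never keeps a segments.gen that is newer than the segments_N files it actually holds.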


Point 1 elaborated:

In Lucene 2.4, in the 
org.apache.lucene.index.SegmentInfos.finishCommit(Directory dir) method, the 
writing of the segments.gen file was changed to:

  public final void finishCommit(Directory dir) throws IOException {
    // ...

    try {
      IndexOutput genOutput = dir.createOutput(IndexFileNames.SEGMENTS_GEN);
      try {
        genOutput.writeInt(FORMAT_LOCKLESS);
        genOutput.writeLong(generation);
        genOutput.writeLong(generation);
      } finally {
        genOutput.flush();   // this is the simple change!
        genOutput.close();
      }
    } catch (Throwable t) {
      // It's OK if we fail to write this file since it's
      // used only as one of the retry fallbacks.
    }

  }


I believe that, if this makes sense, we should add this simple line to Lucene! :-)
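
For comparison, here is a rough sketch of what forcing such a file onto the disk looks like in plain Java NIO. It does not use the Lucene API: DurableGenWriter and writeGen are hypothetical names, and the value -2 for FORMAT_LOCKLESS is an assumption about Lucene's constant. The layout (one int followed by the generation written twice) mirrors the three write calls in the patch above.

```java
import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

/** Hypothetical stand-alone writer; not Lucene code. */
class DurableGenWriter {
    static final int FORMAT_LOCKLESS = -2; // assumed value of Lucene's constant

    static void writeGen(Path file, long generation) throws IOException {
        ByteBuffer buf = ByteBuffer.allocate(4 + 8 + 8); // big-endian by default
        buf.putInt(FORMAT_LOCKLESS);
        buf.putLong(generation);
        buf.putLong(generation); // same value twice, as in the patch above
        buf.flip();
        try (FileChannel ch = FileChannel.open(file,
                StandardOpenOption.CREATE,
                StandardOpenOption.WRITE,
                StandardOpenOption.TRUNCATE_EXISTING)) {
            while (buf.hasRemaining()) {
                ch.write(buf);
            }
            // Ask the OS to push file content and metadata to the storage device.
            ch.force(true);
        }
    }
}
```

The point of the sketch is the force(true) call: flushing a buffered stream only hands the bytes to the operating system, while force asks for them to reach the device.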


However, the Java replication in Solr 1.4, being an application-level process, 
should already have solved this issue in another way.
We have yet to test it.


Thanks
Madu


Re: Segment file not found error - after replicating

2009-11-17 Thread Mark Miller
Maduranga Kannangara wrote:
> Permanent solution we found was to add:
>
> 1. flush() before closing the segments.gen file write (On Lucene).

Hmm ... but close does flush?

> 2. Remove the slave's segments.gen before replication
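
The question about close() can be checked with ordinary java.io streams. The sketch below is not Lucene's IndexOutput (whether its buffered implementation behaves the same way is an assumption here); CloseFlushesDemo and sinkSizesAroundClose are hypothetical names. It writes the 20-byte layout without ever calling flush() and records how many bytes reached the underlying sink before and after close().

```java
import java.io.BufferedOutputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataOutputStream;
import java.io.IOException;

/** Hypothetical demo using java.io only; not Lucene code. */
class CloseFlushesDemo {

    /** Returns {bytes in sink before close(), bytes in sink after close()}. */
    static int[] sinkSizesAroundClose(long generation) throws IOException {
        ByteArrayOutputStream sink = new ByteArrayOutputStream();
        DataOutputStream out =
                new DataOutputStream(new BufferedOutputStream(sink, 8192));
        out.writeInt(-2);
        out.writeLong(generation);
        out.writeLong(generation);
        // The 20 bytes still sit in the 8 KB buffer at this point.
        int before = sink.size();
        // No explicit flush(): FilterOutputStream.close() flushes first.
        out.close();
        return new int[] { before, sink.size() };
    }

    public static void main(String[] args) throws IOException {
        int[] sizes = sinkSizesAroundClose(5L);
        System.out.println("before close: " + sizes[0] + ", after close: " + sizes[1]);
    }
}
```

With these streams an extra flush() before close() changes nothing, which is why the added line alone looks surprising as a fix.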



RE: Segment file not found error - after replicating

2009-11-15 Thread Maduranga Kannangara
Yes. We have tried Solr 1.4 and so far it's been a great success.

I am still investigating why Solr 1.3 gave us this issue.

Currently it seems to me that 
org.apache.lucene.index.SegmentInfos.FindSegmentsFile.run() is not able to 
figure out the correct segments file name. (It may be an index replication 
issue that leaves the index not fully replicated, but that is hard to believe 
as both master and slave have 100% the same data now!)

Anyway, I will keep trying until I find something useful, and will let you 
know.


Thanks
Madu


-----Original Message-----
From: Otis Gospodnetic [mailto:otis_gospodne...@yahoo.com]
Sent: Wednesday, 11 November 2009 10:03 AM
To: solr-user@lucene.apache.org
Subject: Re: Segment file not found error - after replicating

It sounds like your index is not being fully replicated.  I can't tell why, but 
I can suggest you try the new Solr 1.4 replication.

Otis
--
Sematext is hiring -- http://sematext.com/about/jobs.html?mls
Lucene, Solr, Nutch, Katta, Hadoop, HBase, UIMA, NLP, NER, IR



RE: Segment file not found error - after replicating

2009-11-15 Thread Maduranga Kannangara
Just found out the root cause:

* The segments.gen file does not always get replicated to the slave.

For some reason, this small (20 bytes) file stays in memory and is not 
written to the master's hard disk. Therefore, obviously, it is not transferred 
to the slaves.
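
The 20 bytes are consistent with one 4-byte int plus two 8-byte longs. A small reader like the sketch below can show what a given copy of segments.gen actually contains on the master or on a slave. It is plain Java, not the Lucene API: GenFileReader and readGeneration are hypothetical names, the value -2 for FORMAT_LOCKLESS is an assumption, and so is the rule of trusting the file only when both longs agree.

```java
import java.io.DataInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.nio.file.Files;
import java.nio.file.Path;

/** Hypothetical diagnostic reader; not Lucene code. */
class GenFileReader {
    static final int FORMAT_LOCKLESS = -2; // assumed value of Lucene's constant

    /** Returns the generation stored in the file, or -1 if it looks unusable. */
    static long readGeneration(Path genFile) throws IOException {
        try (InputStream raw = Files.newInputStream(genFile);
             DataInputStream in = new DataInputStream(raw)) {
            int format = in.readInt();   // 4 bytes, big-endian
            long gen0 = in.readLong();   // 8 bytes
            long gen1 = in.readLong();   // 8 bytes, expected to equal gen0
            if (format == FORMAT_LOCKLESS && gen0 == gen1) {
                return gen0;
            }
            return -1;
        }
    }
}
```

Comparing the value returned on the master with the one on a failing slave shows directly whether the slave is holding a stale or a too-new generation hint.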

The solution was to shut down the master web app (it must be a clean shutdown, 
not a kill of Tomcat) and then do the replication.

Also, if the timestamp and size are unchanged (the size never changes anyway), 
rsync does not seem to copy this file either. Forcing the copy in the 
replication scripts solved the problem.
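
Because the file always has the same size, a check based on size and modification time can consider two copies identical when their contents differ. A byte-for-byte comparison avoids that. The sketch below is a plain Java illustration, not something taken from the replication scripts; GenFileCompare and sameContent are hypothetical names.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Arrays;

/** Hypothetical check; reads both files fully, which is fine for 20 bytes. */
class GenFileCompare {

    /** True only if both files hold exactly the same bytes. */
    static boolean sameContent(Path a, Path b) throws IOException {
        return Arrays.equals(Files.readAllBytes(a), Files.readAllBytes(b));
    }
}
```

For a file this small the cost of reading both copies is negligible, so the check could run after every replication.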

Thanks Otis and everyone for all your support!

Madu



Re: Segment file not found error - after replicating

2009-11-15 Thread Mark Miller
That's odd - that file is normally not used - it's a backup method to
figure out the current generation in case it cannot be determined with a
directory listing - it's basically for NFS.


RE: Segment file not found error - after replicating

2009-11-15 Thread Maduranga Kannangara
Yes, I believed so too.

The logic in the method mentioned earlier calculates the generation number in 
two ways: from the segments files available in the directory (genA) and from 
the content of the segments.gen file (genB). Whichever is larger is the 
generation number used to look up the segments file.

When the file is not properly replicated (because it was not written to the 
hard disk, or not rsynced) and the generation number in the segments.gen file 
(genB) is larger than the file-based calculation (genA), we hit the issue 
described earlier.
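
The failure mode described above can be reproduced with a simplified model. The sketch below is not the Lucene code (the real lookup also has retry logic that is left out here); GenerationPick and its methods are hypothetical names. It assumes segments file names of the form segments_N with N written in base 36.

```java
import java.util.Arrays;
import java.util.List;

/** Simplified, hypothetical model of the generation lookup; not Lucene code. */
class GenerationPick {
    private static final String PREFIX = "segments_";

    /** genA: highest N among names of the form segments_N, or -1 if none. */
    static long genFromListing(List<String> files) {
        long max = -1;
        for (String name : files) {
            if (name.startsWith(PREFIX)) {
                long gen = Long.parseLong(
                        name.substring(PREFIX.length()), Character.MAX_RADIX);
                max = Math.max(max, gen);
            }
        }
        return max;
    }

    /** Name of the segments file that would be opened, given genB from segments.gen. */
    static String segmentsFileToOpen(List<String> files, long genB) {
        long genA = genFromListing(files);
        long gen = Math.max(genA, genB); // the larger of the two wins
        return PREFIX + Long.toString(gen, Character.MAX_RADIX);
    }

    public static void main(String[] args) {
        // Slave holds generation 4, but its segments.gen claims generation 6.
        List<String> slaveFiles = Arrays.asList("_0.cfs", "segments_4", "segments.gen");
        String wanted = segmentsFileToOpen(slaveFiles, 6);
        System.out.println(wanted + " present: " + slaveFiles.contains(wanted));
        // prints "segments_6 present: false"
    }
}
```

With genB taken from a segments.gen that does not match the replicated files, the chosen name points at a file the slave does not have, which matches the "segment file not found" symptom.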

Cheers
Madu



RE: Segment file not found error - after replicating

2009-11-10 Thread Maduranga Kannangara
Thanks Otis,

I ran du -s on all three index directories as you suggested, both right after 
replicating and when I found errors.

All three gave me exactly the same value. This time I found the error in a 
rather small index too (31 MB).

BTW, if I copy the segments_x file to the name Solr is looking for and restart 
the Solr web app from the Tomcat manager, the error goes away. But that is just 
a workaround, never good enough for production deployments.

My next plan is to do a remote debug to see what exactly is happening in the 
code.

Is there anything else I should be looking at?
Any help on this matter is really appreciated.

Thanks
Madu


-----Original Message-----
From: Otis Gospodnetic [mailto:otis_gospodne...@yahoo.com]
Sent: Tuesday, 10 November 2009 1:14 PM
To: solr-user@lucene.apache.org
Subject: Re: Segment file not found error - after replicating

Madu,

So are you saying that all slaves have the exact same index, and that index is 
exactly the same as the one on the master, yet only some of those slaves 
exhibit this error, while others do not?  Mind listing index directories of 1) 
master 2) slave without errors, 3) slave with errors and doing:
du -s /path/to/index/on/master
du -s /path/to/index/on/slave/without/errors
du -s /path/to/index/on/slave/with/errors
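
A per-file comparison gives more detail than a single du -s total. The following is a minimal Java sketch of such a check, with hypothetical names (IndexDirDiff, listing, differences). It compares file names and sizes only, so it would still miss a file whose content differs while its size stays the same.

```java
import java.io.IOException;
import java.nio.file.DirectoryStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Map;
import java.util.Set;
import java.util.TreeMap;
import java.util.TreeSet;

/** Hypothetical helper for comparing two index directories by name and size. */
class IndexDirDiff {

    /** Maps each regular file name in the directory to its size in bytes. */
    static Map<String, Long> listing(Path dir) throws IOException {
        Map<String, Long> sizes = new TreeMap<>();
        try (DirectoryStream<Path> entries = Files.newDirectoryStream(dir)) {
            for (Path p : entries) {
                if (Files.isRegularFile(p)) {
                    sizes.put(p.getFileName().toString(), Files.size(p));
                }
            }
        }
        return sizes;
    }

    /** Names that exist on only one side or whose sizes differ. */
    static Set<String> differences(Path master, Path slave) throws IOException {
        Map<String, Long> m = listing(master);
        Map<String, Long> s = listing(slave);
        Set<String> diff = new TreeSet<>();
        for (Map.Entry<String, Long> e : m.entrySet()) {
            if (!e.getValue().equals(s.get(e.getKey()))) {
                diff.add(e.getKey()); // missing on slave or different size
            }
        }
        for (String name : s.keySet()) {
            if (!m.containsKey(name)) {
                diff.add(name); // extra file on slave
            }
        }
        return diff;
    }
}
```

An empty result means the two directories agree on names and sizes, which is the same level of evidence as matching du -s output but broken down per file.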


Otis
--
Sematext is hiring -- http://sematext.com/about/jobs.html?mls
Lucene, Solr, Nutch, Katta, Hadoop, HBase, UIMA, NLP, NER, IR



----- Original Message ----
 From: Maduranga Kannangara mkannang...@infomedia.com.au
 To: solr-user@lucene.apache.org solr-user@lucene.apache.org
 Sent: Mon, November 9, 2009 7:47:04 PM
 Subject: RE: Segment file not found error - after replicating

 Thanks Otis!

 Yes, I checked the index directories and they are 100% the same, both 
 timestamp- and size-wise.

 Not all the slaves face this issue. I would say roughly 50% have this trouble.

 The logs do not have any errors either :-(

 Is there anything else I should do or look at?

 Cheers
 Madu


 -Original Message-
 From: Otis Gospodnetic [mailto:otis_gospodne...@yahoo.com]
 Sent: Tuesday, 10 November 2009 9:26 AM
 To: solr-user@lucene.apache.org
 Subject: Re: Segment file not found error - after replicating

 It's hard to troubleshoot blindly like this, but have you tried manually
 comparing the contents of the index dir on the master and on the slave(s)?
 If they are out of sync, have you tried forcing of replication to see if one 
 of
 the subsequent replication attempts gets the dirs in sync?
 Do you have more than 1 slave and do they all start having this problem at the
 same time?
 Any errors in the logs for any of the scripts involved in replication in 1.3?

 Otis
 --
 Sematext is hiring -- http://sematext.com/about/jobs.html?mls
 Lucene, Solr, Nutch, Katta, Hadoop, HBase, UIMA, NLP, NER, IR



 - Original Message 
  From: Maduranga Kannangara
  To: solr-user@lucene.apache.org
  Sent: Sun, November 8, 2009 10:30:44 PM
  Subject: Segment file not found error - after replicating
 
  Hi guys,
 
  We use Solr 1.3 for indexing large amounts of data (50G avg) on Linux
  environment and use the replication scripts to make replicas those live in
 load
  balancing slaves.
 
  The issue we face quite often (only in Linux servers) is that they tend to 
  not

  been able to find the segment file (segment_x etc) after the replicating
  completed. As this has become quite common, we started hitting a serious
 issue.
 
  Below is a stack trace, if that helps and any help on this matter is greatly
  appreciated.
 
  
 
  Nov 5, 2009 11:34:46 PM org.apache.solr.util.plugin.AbstractPluginLoader 
  load
  INFO: created /admin/: org.apache.solr.handler.admin.AdminHandlers
  Nov 5, 2009 11:34:46 PM org.apache.solr.util.plugin.AbstractPluginLoader 
  load
  INFO: created /admin/ping: org.apache.solr.handler.PingRequestHandler
  Nov 5, 2009 11:34:46 PM org.apache.solr.util.plugin.AbstractPluginLoader 
  load
  INFO: created /debug/dump: org.apache.solr.handler.DumpRequestHandler
  Nov 5, 2009 11:34:46 PM org.apache.solr.util.plugin.AbstractPluginLoader 
  load
  INFO: created gap: org.apache.solr.highlight.GapFragmenter
  Nov 5, 2009 11:34:46 PM org.apache.solr.util.plugin.AbstractPluginLoader 
  load
  INFO: created regex: org.apache.solr.highlight.RegexFragmenter
  Nov 5, 2009 11:34:46 PM org.apache.solr.util.plugin.AbstractPluginLoader 
  load
  INFO: created html: org.apache.solr.highlight.HtmlFormatter
  Nov 5, 2009 11:34:46 PM org.apache.solr.servlet.SolrDispatchFilter init
  SEVERE: Could not start SOLR. Check solr/home property
  java.lang.RuntimeException: java.io.FileNotFoundException:
  /solrinstances/solrhome01/data/index/segments_v (No such file or directory)
  at org.apache.solr.core.SolrCore.getSearcher(SolrCore.java:960)
  at org.apache.solr.core.SolrCore.(SolrCore.java:470)
  at
 
 org.apache.solr.core.CoreContainer$Initializer.initialize(CoreContainer.java:119)
  at
  org.apache.solr.servlet.SolrDispatchFilter.init

Re: Segment file not found error - after replicating

2009-11-10 Thread Otis Gospodnetic
It sounds like your index is not being fully replicated.  I can't tell why, but 
I can suggest you try the new Solr 1.4 replication.

Otis
--
Sematext is hiring -- http://sematext.com/about/jobs.html?mls
Lucene, Solr, Nutch, Katta, Hadoop, HBase, UIMA, NLP, NER, IR




Re: Segment file not found error - after replicating

2009-11-09 Thread Otis Gospodnetic
It's hard to troubleshoot blindly like this, but have you tried manually 
comparing the contents of the index dir on the master and on the slave(s)?
If they are out of sync, have you tried forcing replication to see if one of 
the subsequent replication attempts gets the dirs in sync?
Do you have more than 1 slave and do they all start having this problem at the 
same time?
Any errors in the logs for any of the scripts involved in replication in 1.3?
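
One way to do the directory comparison mentioned above is a per-file listing of names and sizes. The following is a minimal sketch, assuming both index directories are reachable from the same machine (for example via a mount or a copied listing); the class name, method names and message texts are hypothetical. It compares sizes only, so a file with the same size but different content would go unnoticed.

```java
import java.io.File;
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical helper: compares two index directories by file name and size.
class IndexDirDiff {

    /** Maps each regular file name in the directory to its size in bytes. */
    static Map<String, Long> listing(File dir) {
        Map<String, Long> sizes = new TreeMap<>();
        File[] files = dir.listFiles();
        if (files != null) {
            for (File f : files) {
                if (f.isFile()) {
                    sizes.put(f.getName(), f.length());
                }
            }
        }
        return sizes;
    }

    /** Describes files that are missing, extra, or of a different size on the slave. */
    static List<String> diff(Map<String, Long> master, Map<String, Long> slave) {
        List<String> out = new ArrayList<>();
        for (Map.Entry<String, Long> e : master.entrySet()) {
            Long other = slave.get(e.getKey());
            if (other == null) {
                out.add("missing on slave: " + e.getKey());
            } else if (!other.equals(e.getValue())) {
                out.add("size differs: " + e.getKey());
            }
        }
        for (String name : slave.keySet()) {
            if (!master.containsKey(name)) {
                out.add("extra on slave: " + name);
            }
        }
        return out;
    }

    public static void main(String[] args) {
        if (args.length != 2) {
            System.err.println("usage: IndexDirDiff <masterIndexDir> <slaveIndexDir>");
            return;
        }
        List<String> differences = diff(listing(new File(args[0])), listing(new File(args[1])));
        if (differences.isEmpty()) {
            System.out.println("Directories match by file name and size.");
        } else {
            for (String line : differences) {
                System.out.println(line);
            }
        }
    }
}
```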

Otis




RE: Segment file not found error - after replicating

2009-11-09 Thread Maduranga Kannangara
Thanks Otis!

Yes, I checked the index directories and they are 100% the same, both timestamp- 
and size-wise.

Not all the slaves face this issue. I would say roughly 50% have this trouble.

The logs do not show any errors either :-(

Is there anything else I should do or look at?

Cheers
Madu



Re: Segment file not found error - after replicating

2009-11-09 Thread Otis Gospodnetic
Madu,

So are you saying that all slaves have the exact same index, and that index is 
exactly the same as the one on the master, yet only some of those slaves 
exhibit this error, while others do not?  Mind listing index directories of 1) 
master 2) slave without errors, 3) slave with errors and doing:
du -s /path/to/index/on/master
du -s /path/to/index/on/slave/without/errors
du -s /path/to/index/on/slave/with/errors


Otis 




Segment file not found error - after replicating

2009-11-08 Thread Maduranga Kannangara
Hi guys,

We use Solr 1.3 to index large amounts of data (50 GB on average) on Linux 
and use the replication scripts to create the replicas that live on the 
load-balanced slaves.

The issue we face quite often (only on Linux servers) is that the slaves cannot 
find the segments file (segments_x etc.) after replication has completed. 
This has become common enough to be a serious problem for us.

Below is a stack trace, in case it helps. Any help on this matter is greatly 
appreciated.



Nov 5, 2009 11:34:46 PM org.apache.solr.util.plugin.AbstractPluginLoader load
INFO: created /admin/: org.apache.solr.handler.admin.AdminHandlers
Nov 5, 2009 11:34:46 PM org.apache.solr.util.plugin.AbstractPluginLoader load
INFO: created /admin/ping: org.apache.solr.handler.PingRequestHandler
Nov 5, 2009 11:34:46 PM org.apache.solr.util.plugin.AbstractPluginLoader load
INFO: created /debug/dump: org.apache.solr.handler.DumpRequestHandler
Nov 5, 2009 11:34:46 PM org.apache.solr.util.plugin.AbstractPluginLoader load
INFO: created gap: org.apache.solr.highlight.GapFragmenter
Nov 5, 2009 11:34:46 PM org.apache.solr.util.plugin.AbstractPluginLoader load
INFO: created regex: org.apache.solr.highlight.RegexFragmenter
Nov 5, 2009 11:34:46 PM org.apache.solr.util.plugin.AbstractPluginLoader load
INFO: created html: org.apache.solr.highlight.HtmlFormatter
Nov 5, 2009 11:34:46 PM org.apache.solr.servlet.SolrDispatchFilter init
SEVERE: Could not start SOLR. Check solr/home property
java.lang.RuntimeException: java.io.FileNotFoundException: /solrinstances/solrhome01/data/index/segments_v (No such file or directory)
at org.apache.solr.core.SolrCore.getSearcher(SolrCore.java:960)
at org.apache.solr.core.SolrCore.<init>(SolrCore.java:470)
at org.apache.solr.core.CoreContainer$Initializer.initialize(CoreContainer.java:119)
at org.apache.solr.servlet.SolrDispatchFilter.init(SolrDispatchFilter.java:69)
at org.apache.catalina.core.ApplicationFilterConfig.getFilter(ApplicationFilterConfig.java:275)
at org.apache.catalina.core.ApplicationFilterConfig.setFilterDef(ApplicationFilterConfig.java:397)
at org.apache.catalina.core.ApplicationFilterConfig.<init>(ApplicationFilterConfig.java:108)
at org.apache.catalina.core.StandardContext.filterStart(StandardContext.java:3709)
at org.apache.catalina.core.StandardContext.start(StandardContext.java:4363)
at org.apache.catalina.core.StandardContext.reload(StandardContext.java:3099)
at org.apache.catalina.manager.ManagerServlet.reload(ManagerServlet.java:916)
at org.apache.catalina.manager.HTMLManagerServlet.reload(HTMLManagerServlet.java:536)
at org.apache.catalina.manager.HTMLManagerServlet.doGet(HTMLManagerServlet.java:114)
at javax.servlet.http.HttpServlet.service(HttpServlet.java:617)
at javax.servlet.http.HttpServlet.service(HttpServlet.java:717)
at org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:290)
at org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:206)
at com.jamonapi.JAMonFilter.doFilter(JAMonFilter.java:57)
at org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:235)
at org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:206)
at org.apache.catalina.core.StandardWrapperValve.invoke(StandardWrapperValve.java:233)
at org.apache.catalina.core.StandardContextValve.invoke(StandardContextValve.java:191)
at org.apache.catalina.authenticator.AuthenticatorBase.invoke(AuthenticatorBase.java:525)
at org.apache.catalina.core.StandardHostValve.invoke(StandardHostValve.java:128)
at org.apache.catalina.valves.ErrorReportValve.invoke(ErrorReportValve.java:102)
at org.apache.catalina.core.StandardEngineValve.invoke(StandardEngineValve.java:109)
at org.apache.catalina.connector.CoyoteAdapter.service(CoyoteAdapter.java:286)
at org.apache.coyote.http11.Http11Processor.process(Http11Processor.java:845)
at org.apache.coyote.http11.Http11Protocol$Http11ConnectionHandler.process(Http11Protocol.java:583)
at org.apache.tomcat.util.net.JIoEndpoint$Worker.run(JIoEndpoint.java:447)
at java.lang.Thread.run(Thread.java:619)
Caused by: java.io.FileNotFoundException: /solrinstances/solrhome01/data/index/segments_v (No such file or directory)
at java.io.RandomAccessFile.open(Native Method)
at java.io.RandomAccessFile.<init>(RandomAccessFile.java:212)
at org.apache.lucene.store.FSDirectory$FSIndexInput$Descriptor.<init>(FSDirectory.java:552)
at org.apache.lucene.store.FSDirectory$FSIndexInput.<init>(FSDirectory.java:582)
at org.apache.lucene.store.FSDirectory.openInput(FSDirectory.java:488)
at