Re: SolrCloud on HDFS empty tlog hence doesn't replay after Solr process crash and restart
Tom: i don't know enough about the HDFS code to fully understand what's going on here, but based on your description of the problem it definitely smells like a bug, so i've opened an issue to make sure we don't lose track of it...

https://issues.apache.org/jira/browse/SOLR-6367

: Date: Fri, 1 Aug 2014 10:45:36 -0400
: From: Tom Chen tomchen1...@gmail.com
: Reply-To: dev@lucene.apache.org
: To: dev@lucene.apache.org
: Subject: Re: SolrCloud on HDFS empty tlog hence doesn't replay after Solr
:     process crash and restart
:
: I wonder if there's any update on this. Should we create a JIRA to track
: this?
:
: Thanks,
: Tom
:
: On Mon, Jul 21, 2014 at 12:18 PM, Mark Miller markrmil...@gmail.com wrote:
:
: It’s on my list to investigate.
:
: --
: Mark Miller
: about.me/markrmiller
:
: On July 21, 2014 at 10:26:09 AM, Tom Chen (tomchen1...@gmail.com) wrote:
:
: Any thoughts on this issue? Solr on HDFS generates an empty tlog when
: documents are added without a commit.
:
: Thanks,
: Tom

-Hoss
http://www.lucidworks.com/

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org
Re: SolrCloud on HDFS empty tlog hence doesn't replay after Solr process crash and restart
I wonder if there's any update on this. Should we create a JIRA to track this?

Thanks,
Tom

On Mon, Jul 21, 2014 at 12:18 PM, Mark Miller markrmil...@gmail.com wrote:

It’s on my list to investigate.

--
Mark Miller
about.me/markrmiller

On July 21, 2014 at 10:26:09 AM, Tom Chen (tomchen1...@gmail.com) wrote:

Any thoughts on this issue? Solr on HDFS generates an empty tlog when documents are added without a commit.

Thanks,
Tom
Re: SolrCloud on HDFS empty tlog hence doesn't replay after Solr process crash and restart
Any thoughts on this issue? Solr on HDFS generates an empty tlog when documents are added without a commit.

Thanks,
Tom

On Fri, Jul 18, 2014 at 12:21 PM, Tom Chen tomchen1...@gmail.com wrote:

Hi,

This seems to be a bug in Solr running on HDFS. Steps to reproduce:

1) Set up Solr to run on HDFS like this:

    java -Dsolr.directoryFactory=HdfsDirectoryFactory -Dsolr.lock.type=hdfs -Dsolr.hdfs.home=hdfs://host:port/path

For the purpose of this test, turn off the default auto commit in solrconfig.xml, i.e. comment out autoCommit like this:

    <!--
    <autoCommit>
      <maxTime>${solr.autoCommit.maxTime:15000}</maxTime>
      <openSearcher>false</openSearcher>
    </autoCommit>
    -->

2) Add a document without a commit:

    curl "http://localhost:8983/solr/collection1/update?commit=false" -H "Content-type:text/xml; charset=utf-8" --data-binary @solr.xml

3) Solr generates an empty tlog file (file size 0; the last one ends with 6):

    [hadoop@hdtest042 exampledocs]$ hadoop fs -ls /path/collection1/core_node1/data/tlog
    Found 5 items
    -rw-r--r--   1 hadoop hadoop        667 2014-07-18 08:47 /path/collection1/core_node1/data/tlog/tlog.001
    -rw-r--r--   1 hadoop hadoop         67 2014-07-18 08:47 /path/collection1/core_node1/data/tlog/tlog.003
    -rw-r--r--   1 hadoop hadoop        667 2014-07-18 08:47 /path/collection1/core_node1/data/tlog/tlog.004
    -rw-r--r--   1 hadoop hadoop          0 2014-07-18 09:02 /path/collection1/core_node1/data/tlog/tlog.005
    -rw-r--r--   1 hadoop hadoop          0 2014-07-18 09:02 /path/collection1/core_node1/data/tlog/tlog.006

4) Simulate a Solr crash by killing the process with the -9 option.

5) Restart the Solr process. The uncommitted documents are not replayed, and the files in the tlog directory are cleaned up. Hence the uncommitted document(s) are lost.

Am I missing anything, or is this a bug?

BTW, additional observations:

a) If in step 4) Solr is stopped gracefully (i.e. without the -9 option), a non-empty tlog file is generated, and after restarting Solr the uncommitted document is replayed as expected.

b) If Solr doesn't run on HDFS (i.e. it runs on the local file system), this issue is not observed either.

Thanks,
Tom
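
For anyone re-running these steps, the check in step 3 can be automated. The following is only a rough sketch, not part of Solr or Hadoop: the names `parse_hdfs_ls`, `empty_tlogs`, and `run_hdfs_ls` are made up for illustration. It assumes the usual 8-column `hadoop fs -ls` output shown in step 3 and paths without spaces, and it simply flags tlog files whose reported size is 0.

```python
import posixpath
import subprocess
from typing import List, NamedTuple


class LsEntry(NamedTuple):
    """One file row from `hadoop fs -ls` (hypothetical helper type)."""
    path: str
    size: int


def parse_hdfs_ls(listing: str) -> List[LsEntry]:
    """Parse `hadoop fs -ls` text into file entries.

    Assumes 8 whitespace-separated columns per row:
    permissions, replication, owner, group, size, date, time, path.
    Summary rows ("Found 5 items") and directory rows are skipped.
    """
    entries = []
    for line in listing.splitlines():
        fields = line.split()
        if len(fields) != 8:
            continue  # not a file row, e.g. "Found 5 items"
        if fields[0].startswith("d") or not fields[4].isdigit():
            continue  # directory row or unexpected size column
        entries.append(LsEntry(path=fields[7], size=int(fields[4])))
    return entries


def empty_tlogs(listing: str) -> List[str]:
    """Return paths of tlog.* files whose listed size is 0 bytes."""
    return [
        entry.path
        for entry in parse_hdfs_ls(listing)
        if entry.size == 0
        and posixpath.basename(entry.path).startswith("tlog.")
    ]


def run_hdfs_ls(tlog_dir: str) -> str:
    """Run `hadoop fs -ls <tlog_dir>` and return its stdout.

    Requires the `hadoop` command on PATH; raises CalledProcessError
    if the command fails.
    """
    result = subprocess.run(
        ["hadoop", "fs", "-ls", tlog_dir],
        capture_output=True,
        text=True,
        check=True,
    )
    return result.stdout
```

With the listing from step 3, `empty_tlogs` would report tlog.005 and tlog.006. On a live cluster, something like `empty_tlogs(run_hdfs_ls("/path/collection1/core_node1/data/tlog"))` could be run after step 2 and again after step 5 to compare the two states.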
Re: SolrCloud on HDFS empty tlog hence doesn't replay after Solr process crash and restart
It’s on my list to investigate.

--
Mark Miller
about.me/markrmiller

On July 21, 2014 at 10:26:09 AM, Tom Chen (tomchen1...@gmail.com) wrote:

Any thoughts on this issue? Solr on HDFS generates an empty tlog when documents are added without a commit.

Thanks,
Tom