Re: Solr JVM performance issue after 2 days
Hi, thanks for the suggestion. I made the following changes in solrconfig.xml:

    <ramBufferSizeMB>256</ramBufferSizeMB>
    <useColdSearcher>false</useColdSearcher>
    <maxWarmingSearchers>1</maxWarmingSearchers>
    <autoCommit>
      <maxDocs>2000</maxDocs>
      <maxTime>30</maxTime>
    </autoCommit>
    <lockType>simple</lockType>
    <documentCache class="solr.LRUCache" size="512" initialSize="512" autowarmCount="0"/>
    <filterCache class="solr.FastLRUCache" size="512" initialSize="512" autowarmCount="0"/>
    <queryResultCache class="solr.LRUCache" size="512" initialSize="512" autowarmCount="0"/>

After that, one server (which hosts 3 cores for 3 languages) works fine, but the other server (3 cores for 3 other languages) still shows the problem after 52 hours. I plan to try your suggestion and hope it helps; any better idea would be appreciated.

Kind Regards,
Hamid

From: Peter Karich peat...@yahoo.de
To: solr-user@lucene.apache.org
Sent: Tue, December 7, 2010 8:26:01 PM
Subject: Re: Solr JVM performance issue after 2 days

Am 07.12.2010 13:01, schrieb Hamid Vahedi:

Hi Peter, thanks a lot for the reply. Actually I need real-time indexing and querying at the same time. The wiki says: "You can run multiple Solr instances in separate JVMs, with both having their solr.xml configured to use the same index folder." Q1: I'm using Tomcat now; could you please tell me how to run separate JVMs with Tomcat?

Are you sure you don't want two servers, and do you really need real time? Slowing down indexing plus using less cache should do the trick, I think. I wouldn't recommend indexing AND querying on the same machine unless you have a lot of RAM and CPU. You could even deploy two indices into one Tomcat: the read-only index refers to the data dir via <dataDir>/path/to/index/data</dataDir>, then issue an empty (!!) commit to the read-only index every minute so that the read-only index sees the changes from the feeding index (again: see the wiki page!).
Setting up two Tomcats on one server I wouldn't recommend either, but it's possible by copying the Tomcat install into, say, tomcat2 and changing the shutdown port and the 8080 port in tomcat2/conf/server.xml.

Q2: What should I set for lockType? -- I'm using simple, but native should also be OK.

Thanks in advance.

From: Peter Karich peat...@yahoo.de
To: solr-user@lucene.apache.org
Sent: Tue, December 7, 2010 2:06:49 PM
Subject: Re: Solr JVM performance issue after 2 days

Hi Hamid, try to avoid autowarming when indexing (see solrconfig.xml: cache autowarmCount + newSearcher + maxWarmingSearchers). If you need to query and index at the same time, you'll probably need one read-only core and one core for writing, with no autowarming configured. See http://wiki.apache.org/solr/NearRealtimeSearchTuning. Or replicate from the indexing core to a different core with different settings. Regards, Peter.

Hi, I am using multi-core Tomcat on 2 servers, 3 languages per server. I am adding documents to Solr at up to 200 docs/sec. When the update process starts, everything is fine (update latency is at most 200 ms/doc, with about 800 MB of memory used and minimal CPU usage). After 15-17 hours it becomes very slow (more than 900 sec per update), the used heap is about 15 GB, and GC time grows to more than one hour. I don't know what's wrong. Can anyone tell me what the problem is? Does it come from Solr or the JVM? Note: when I stop updating, the CPU stays busy for 15-20 minutes, and when I start updating again I get the same issue; but when I stop the Tomcat service and start it again, everything is OK. I am using Tomcat 6 with 18 GB of memory on Windows 2008 Server x64, Solr 1.4.1. Thanks in advance, Hamid

-- http://jetwick.com twitter search prototype
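Peter's read-only/feeding two-core setup can be sketched as config fragments. This is a minimal sketch, assuming the wiki-page approach; the paths, host, and core names are placeholders, not values from this thread:

```xml
<!-- In the READ-ONLY core's solrconfig.xml: point dataDir at the
     feeding core's index directory (path is a placeholder). -->
<dataDir>/srv/solr/feeding-core/data</dataDir>

<!-- Then, e.g. from a cron job every minute, POST this empty commit
     message to the read-only core's update handler
     (http://localhost:8080/solr/readonly-core/update) so it reopens
     its searcher and sees the feeding core's changes: -->
<commit/>
```

The read-only core never writes, so the two cores don't contend for the index lock; only the commit frequency bounds how stale the search side can be.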
Re: Solr JVM performance issue after 2 days
Several things:

1. Your ramBufferSizeMB is probably too large; 128 MB is often the point of diminishing returns. Your situation may be different...
2. Your logs will show you what is happening with your autocommit properties. If you're really sending 200 docs/second to your index, your commits are happening every 10 seconds. Still too fast...
3. I'd really, really, really recommend that you use a master/slave configuration where the slaves are your searchers and your master is the indexer. Really. You're hammering your machine; if you separate the machines, you can turn off all of the autowarming etc. on the indexer and control the frequency of slave updates. Really consider this.
4. You haven't given us any idea of the total index size.
5. I doubt separate JVMs are useful here; you're still operating on the same underlying hardware. Multiple cores are almost always preferable to multiple JVMs.

Best,
Erick

On Sun, Dec 12, 2010 at 8:26 AM, Hamid Vahedi hvb...@yahoo.com wrote: [...]
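To make Erick's point 2 concrete: at 200 docs/sec, maxDocs=2000 triggers a commit every 10 seconds. A much less aggressive autoCommit might look like the sketch below; the numbers are illustrative placeholders to tune, not a recommendation from this thread. Note that maxTime is in milliseconds:

```xml
<!-- solrconfig.xml: commit at most once per 5 minutes or per 50,000
     docs, whichever comes first (illustrative values).
     maxTime is in MILLISECONDS. -->
<autoCommit>
  <maxDocs>50000</maxDocs>
  <maxTime>300000</maxTime>
</autoCommit>
```

Fewer commits mean fewer searcher reopens and warmups per hour, at the cost of documents becoming searchable later.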
Re: Solr JVM performance issue after 2 days
Dear Erick, thanks for the advice. The index size across all cores is 35 GB for 35 million docs (3 weeks of indexed data).

Kind Regards,
Hamid

From: Erick Erickson erickerick...@gmail.com
To: solr-user@lucene.apache.org
Sent: Sun, December 12, 2010 5:24:18 PM
Subject: Re: Solr JVM performance issue after 2 days
[...]
Re: SOLR geospatial
I am particularly interested in storing and querying polygons. That sort of thing looks like it's on their roadmap, so does anyone know what the status is? Also, integration with JTS would make this a core component of any GIS; again, does anyone know the status of that?

*What's on the roadmap of future features?* Here are some of the features and enhancements we're planning for SSP:
- Performance improvements for larger data sets
- Fixing of known bugs
- Distance facets: allowing Solr users to filter their results based on the calculated distances
- Search with regular polygons, and groups of shapes
- Integration with JTS
- Highly optimized distance calculation algorithms
- Ranking results by distance
- 3D dimension search

Adam

On Sun, Dec 12, 2010 at 12:01 AM, Markus Jelsma markus.jel...@openindex.io wrote:

That smells like: http://www.jteam.nl/news/spatialsolr.html

My partner is using a publicly available plugin for GeoSpatial. It is used both during indexing and during search. It forms some kind of gridding system and puts 10 fields per row related to that. Doing a radius search (vs. a bounding-box search, which is faster in almost all cases in all geospatial query systems) seems pretty fast. GeoSpatial was our project's constraint; we've moved past that now. Did I mention that it returns distance from the center of the radius based on units supplied in the query? I would tell you what the plugin is, but in our division of labor I have kept that out of my short-term memory. You can contact him at: Danilo Unite danilo.un...@gmail.com

Dennis Gearon

Signature Warning: It is always a good idea to learn from your own mistakes. It is usually a better idea to learn from others' mistakes, so you do not have to make them yourself. from 'http://blogs.techrepublic.com.com/security/?p=4501&tag=nl.e036' EARTH has a Right To Life, otherwise we all die.
- Original Message -
From: George Anthony pa...@rogers.com
To: solr-user@lucene.apache.org
Sent: Fri, December 10, 2010 9:23:18 AM
Subject: SOLR geospatial

I'm looking at the docs on support for geospatial search. I see this functionality is mostly scheduled for the upcoming 4.0 release (with some playing around with backported code). I note the support for the bounding-box filter, but will bounding box be one of the supported *data* types for use with this filter? For example, if my lat/long data describes the footprint of a map, I'm curious whether that type of coordinate data can be used by the bounding-box filter (or in any other way for similar limiting/filtering capability). I see it can work with point-type data, but I'm curious about functionality with bounding-box-type data (in contrast to simple point lat/long data). Thanks, George
Re: SOLR geospatial
By and large, spatial solr is being replaced by geospatial; see http://wiki.apache.org/solr/SpatialSearch. I don't think the old spatial contrib is still included in the trunk or 3.x code bases, but I could be wrong. That said, I don't know whether what you want is on the roadmap there either. Here's a place to start if you want to see the JIRA discussions: https://issues.apache.org/jira/browse/SOLR-1568

Best,
Erick

On Sun, Dec 12, 2010 at 11:23 AM, Adam Estrada estrada.a...@gmail.com wrote: [...]
Re: SOLR geospatial
We're in Alpha, heading to Alpha 2. Our requirements are simple: radius searching, and distance from center. Solr Spatial works and is current. GeoSpatial is almost there, but we're going to wait until it's released to spend time with it. We have other tasks to work on and don't want to be part of the debugging process of any project right now.

Dennis Gearon

- Original Message -
From: Erick Erickson erickerick...@gmail.com
To: solr-user@lucene.apache.org
Sent: Sun, December 12, 2010 11:18:03 AM
Subject: Re: SOLR geospatial
[...]
boosting, both query time and other
So, our main search results have some very common fields: 'title', 'tags', 'description'. What kind of boosting has everybody been using that makes them and their customers happy with these kinds of fields? What are the pros and cons of query-time boosting versus configured boosting?

Dennis Gearon
Very high load after replicating
After replicating an index of around 20 GB, my slaves experience very high load (50+!!). Is there anything I can do to alleviate this problem? Would SolrCloud be of any help? Thanks
Re: Very high load after replicating
There can be numerous explanations, such as your configuration (cache warming queries, merge factor, replication events, etc.), but also I/O having trouble flushing everything to disk. It could also be a memory problem: the OS might start swapping if you allocate too much RAM to the JVM, leaving little for the OS to work with. You need to provide more details.
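One knob that controls how often replication events hit a slave is the poll interval in the slave side of the ReplicationHandler config. A minimal sketch; the masterUrl, core name, and interval are placeholders:

```xml
<!-- Slave-side solrconfig.xml: poll the master for a new index version
     every 5 minutes rather than very frequently (values are placeholders). -->
<requestHandler name="/replication" class="solr.ReplicationHandler">
  <lst name="slave">
    <str name="masterUrl">http://master-host:8080/solr/core0/replication</str>
    <str name="pollInterval">00:05:00</str>
  </lst>
</requestHandler>
```

Spacing out replication doesn't reduce the cost of any single sync, but it keeps the post-replication warmup storms from overlapping.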
Re: Using synonyms in combination with facets
Thanks, this is exactly the type of solution I need. -- View this message in context: http://lucene.472066.n3.nabble.com/Using-synonyms-in-combination-with-facets-tp1968584p2074692.html Sent from the Solr - User mailing list archive at Nabble.com.
Re: full text search in multiple fields
I went for the * operator, and it works now! Thanks! -- View this message in context: http://lucene.472066.n3.nabble.com/full-text-search-in-multiple-fields-tp1888328p2075140.html Sent from the Solr - User mailing list archive at Nabble.com.
Re: boosting, both query time and other
Basically that's unanswerable; you have to try various choices against your corpus. Take a look at the defaults in the dismax request handler in the example schema for a place to start... And do be aware that the correct values may change as your corpus acquires more data. I'm not sure what you're really asking when you say query-time boosting versus configured boosting; could you give an example?

Best,
Erick

On Sun, Dec 12, 2010 at 3:51 PM, Dennis Gearon gear...@sbcglobal.net wrote: [...]
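As a concrete starting point for the dismax defaults Erick mentions, the title/tags/description fields from the question could be weighted in solrconfig.xml like this. The handler name and boost values are illustrative placeholders to tune, not recommendations from this thread:

```xml
<!-- solrconfig.xml: a dismax handler that weights matches in title
     highest, then tags, then description (boost values illustrative). -->
<requestHandler name="/search" class="solr.SearchHandler">
  <lst name="defaults">
    <str name="defType">dismax</str>
    <str name="qf">title^2.0 tags^1.4 description^1.0</str>
  </lst>
</requestHandler>
```

The query-time alternative is attaching `^boost` to individual clauses in the Lucene query syntax (e.g. `q=title:(solr)^2 description:(solr)`), which lets you vary boosts per request without a config change, at the cost of pushing that logic into every client.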
Re: SOLR geospatial
I would be more than happy to help with any of the spatial testing you are working on.

Adam

On Sun, Dec 12, 2010 at 3:08 PM, Dennis Gearon gear...@sbcglobal.net wrote: [...]
Re: [Multiple] RSS Feeds at a time...
Hi Ahmet, This is a great idea but still does not appear to be working correctly. The idea is that I want to be able to add an RSS feed and then index that feed on a schedule. My C# method looks something like this:

    public ActionResult Index()
    {
        try
        {
            HTTPGet req = new HTTPGet();
            string solrStr = System.Configuration.ConfigurationManager.AppSettings["solrUrl"].ToString();
            req.Request(solrStr + "/select?clean=true&commit=true&qt=/dataimport&command=reload-config");
            req.Request(solrStr + "/select?clean=false&commit=true&qt=/dataimport&command=full-import");
            Response.Write(req.StatusLine);
            Response.Write(req.ResponseTime);
            Response.Write(req.StatusCode);
            return RedirectToAction("../Import/Feeds");
            //return View();
        }
        catch (SolrConnectionException)
        {
            throw new Exception(string.Format("Couldn't Import RSS Feeds"));
        }
    }

My XML configuration file looks something like this:

    <dataConfig>
      <dataSource type="HttpDataSource" />
      <document>
        <entity name="filedatasource" processor="FileListEntityProcessor"
                baseDir="./solr/conf/dataimporthandler" fileName="^.*xml$"
                recursive="true" rootEntity="false" dataSource="null">
          <entity name="cnn" pk="link" datasource="filedatasource"
                  url="http://rss.cnn.com/rss/cnn_topstories.rss"
                  processor="XPathEntityProcessor"
                  forEach="/rss/channel | /rss/channel/item"
                  transformer="DateFormatTransformer,HTMLStripTransformer">
            <field column="source" xpath="/rss/channel/title" commonField="true" />
            <field column="source-link" xpath="/rss/channel/link" commonField="true" />
            <field column="subject" xpath="/rss/channel/description" commonField="true" />
            <field column="title" xpath="/rss/channel/item/title" />
            <field column="link" xpath="/rss/channel/item/link" />
            <field column="description" xpath="/rss/channel/item/description" stripHTML="true" />
            <field column="creator" xpath="/rss/channel/item/creator" />
            <field column="item-subject" xpath="/rss/channel/item/subject" />
            <field column="author" xpath="/rss/channel/item/author" />
            <field column="comments" xpath="/rss/channel/item/comments" />
            <field column="pubdate" xpath="/rss/channel/item/pubDate" dateTimeFormat="yyyy-MM-dd'T'hh:mm:ss'Z'" />
          </entity>
          <entity name="newsweek" pk="link" datasource="filedatasource"
                  url="http://feeds.newsweek.com/newsweek/nation"
                  processor="XPathEntityProcessor"
                  forEach="/rss/channel | /rss/channel/item"
                  transformer="DateFormatTransformer,HTMLStripTransformer">
            <field column="source" xpath="/rss/channel/title" commonField="true" />
            <field column="source-link" xpath="/rss/channel/link" commonField="true" />
            <field column="subject" xpath="/rss/channel/description" commonField="true" />
            <field column="title" xpath="/rss/channel/item/title" />
            <field column="link" xpath="/rss/channel/item/link" />
            <field column="description" xpath="/rss/channel/item/description" stripHTML="true" />
            <field column="creator" xpath="/rss/channel/item/creator" />
            <field column="item-subject" xpath="/rss/channel/item/subject" />
            <field column="author" xpath="/rss/channel/item/author" />
            <field column="comments" xpath="/rss/channel/item/comments" />
            <field column="pubdate" xpath="/rss/channel/item/pubDate" dateTimeFormat="yyyy-MM-dd'T'hh:mm:ss'Z'" />
          </entity>
        </entity>
      </document>
    </dataConfig>

As you can see, I can add as many sub-entities as I want. The idea was to reload the xml file after each entity is added. What else am I missing here, because the reload-config command does not seem to be working? Any ideas would be great! Thanks, Adam Estrada On Sat, Dec 11, 2010 at 4:48 PM, Ahmet Arslan iori...@yahoo.com wrote: I found that you can have a single config file that can have several entities in it. My question now is how can I add entities without restarting the Solr service? You mean changing and re-loading the xml config file? dataimport?command=reload-config http://wiki.apache.org/solr/DataImportHandler#Commands
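The reload-then-import sequence in the C# method above can be sketched language-independently as two HTTP GETs against the DataImportHandler. A minimal Python sketch, assuming the handler is registered under qt=/dataimport as in the post (base URL and parameter values are illustrative):

```python
from urllib.parse import urlencode

def dih_url(solr_base, command, **params):
    """Build a DataImportHandler request URL.

    Assumes the handler is reachable via /select with qt=/dataimport,
    as in the C# snippet above; adjust to match your solrconfig.xml.
    """
    query = {"qt": "/dataimport", "command": command}
    query.update(params)
    return solr_base.rstrip("/") + "/select?" + urlencode(query)

# Mirror the two calls from the C# method: reload the config, then import.
reload_url = dih_url("http://localhost:8983/solr", "reload-config")
import_url = dih_url("http://localhost:8983/solr", "full-import",
                     clean="false", commit="true")
```

Issuing `reload_url` before `import_url` (e.g. with `urllib.request.urlopen`) reproduces the intended flow; checking the response body of the reload call is how you confirm the config was actually re-read.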
[pubDate] is not converting correctly
All, I am having some difficulties parsing the pubDate field that is part of the RSS spec (I believe). I get the warning that states, Dec 12, 2010 6:45:26 PM org.apache.solr.handler.dataimport.DateFormatTransformer transformRow WARNING: Could not parse a Date field java.text.ParseException: Unparseable date: Thu, 30 Jul 2009 14:41:43 + at java.text.DateFormat.parse(Unknown Source) Does anyone know how to fix this? I would eventually like to do a date query but without the ability to properly parse them I don't know if it's going to work. Thanks, Adam
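The date in the warning is the RFC 822 form that RSS uses for pubDate, while the config's dateTimeFormat expects ISO 8601, which is why DateFormatTransformer throws. A small Python sketch of the conversion (outside Solr, just to show the two formats; inside DIH the equivalent fix is a dateTimeFormat along the lines of "EEE, dd MMM yyyy HH:mm:ss Z"):

```python
from email.utils import parsedate_to_datetime  # stdlib RFC 822/2822 date parser
from datetime import timezone

def rss_date_to_solr(pubdate):
    """Convert an RFC 822 pubDate (as used by RSS feeds) to the
    ISO 8601 UTC form that Solr date fields expect."""
    dt = parsedate_to_datetime(pubdate).astimezone(timezone.utc)
    return dt.strftime("%Y-%m-%dT%H:%M:%SZ")

# The exact string from the warning above:
solr_date = rss_date_to_solr("Thu, 30 Jul 2009 14:41:43 +0000")
```

`solr_date` is then safe to send to a Solr date field and to range-query against.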
Re: [pubDate] is not converting correctly
(10/12/13 8:49), Adam Estrada wrote: All, I am having some difficulties parsing the pubDate field that is part of the RSS spec (I believe). I get the warning that states, Dec 12, 2010 6:45:26 PM org.apache.solr.handler.dataimport.DateFormatTransformer transformRow WARNING: Could not parse a Date field java.text.ParseException: Unparseable date: Thu, 30 Jul 2009 14:41:43 + at java.text.DateFormat.parse(Unknown Source) Does anyone know how to fix this? I would eventually like to do a date query but without the ability to properly parse them I don't know if it's going to work. Thanks, Adam Adam, What does your data-config.xml look like for that field? Have you looked at the rss-data-config.xml file under the example/example-DIH/solr/rss/conf directory? Koji -- http://www.rondhuit.com/en/
Which query parser and how to do full text on multiple fields
Which query parser did my partner set up below, and how do I parse three fields in the index for scoring and returning results? /solr/select?wt=json&indent=true&start=0&rows=20&q={!spatial%20lat=37.326375%20long=-121.892639%20radius=3%20unit=km%20threadCount=3}title:Art%20Loft Dennis Gearon Signature Warning It is always a good idea to learn from your own mistakes. It is usually a better idea to learn from others’ mistakes, so you do not have to make them yourself. from 'http://blogs.techrepublic.com.com/security/?p=4501&tag=nl.e036' EARTH has a Right To Life, otherwise we all die.
Re: full text search in multiple fields
For those of us who come late to a thread, having at least the last post that you're replying to would help. Me at least ;-) Dennis Gearon Signature Warning It is always a good idea to learn from your own mistakes. It is usually a better idea to learn from others’ mistakes, so you do not have to make them yourself. from 'http://blogs.techrepublic.com.com/security/?p=4501&tag=nl.e036' EARTH has a Right To Life, otherwise we all die. - Original Message From: PeterKerk vettepa...@hotmail.com To: solr-user@lucene.apache.org Sent: Sun, December 12, 2010 1:47:35 PM Subject: Re: full text search in multiple fields I went for the * operator, and it works now! Thanks! -- View this message in context: http://lucene.472066.n3.nabble.com/full-text-search-in-multiple-fields-tp1888328p2075140.html Sent from the Solr - User mailing list archive at Nabble.com.
Re: Which query parser and how to do full text on multiple fields
You said you were using a third party plugin. What do you expect people here to know? Solr plugins don't have parameters lat, long, radius and threadCount (they have pt and dist). On Sun, Dec 12, 2010 at 4:47 PM, Dennis Gearon gear...@sbcglobal.net wrote: Which query parser did my partner set up below, and how do I parse three fields in the index for scoring and returning results? /solr/select?wt=json&indent=true&start=0&rows=20&q={!spatial%20lat=37.326375%20long=-121.892639%20radius=3%20unit=km%20threadCount=3}title:Art%20Loft Dennis Gearon Signature Warning It is always a good idea to learn from your own mistakes. It is usually a better idea to learn from others’ mistakes, so you do not have to make them yourself. from 'http://blogs.techrepublic.com.com/security/?p=4501&tag=nl.e036' EARTH has a Right To Life, otherwise we all die.
Rebuild Spellchecker based on cron expression
Hi, the spellchecker component already provides a buildOnCommit and buildOnOptimize option. Since we have several spellchecker indices building on each commit is not really what we want to do. Building on optimize is not possible as index optimization is done on the master and the slaves don't even run an optimize but only fetch the optimized index. Therefore I'm thinking about an extension of the spellchecker that allows you to rebuild the spellchecker based on a cron-expression (e.g. rebuild each night at 1 am). What do you think about this, is there anybody else interested in this? Regarding the lifecycle, is there already some executor framework or any regularly running process in place, or would I have to pull up my own thread? If so, how can I stop my thread when solr/tomcat is shutdown (I couldn't see any shutdown or destroy method in SearchComponent)? Thanx for your feedback, cheers, Martin
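Until an integrated cron-style rebuild exists, one stopgap is an external scheduler that asks the SpellCheckComponent to rebuild by sending spellcheck.build=true to the handler the component is attached to. A rough Python sketch of building those per-core requests (core names and the /select handler path are assumptions; this is exactly the per-slave cron job the thread would rather avoid):

```python
from urllib.parse import urlencode

def spellcheck_build_url(solr_base, core, handler="/select"):
    """URL that asks the SpellCheckComponent on `core` to rebuild its
    dictionary; spellcheck.build=true must reach the request handler
    that has the spellcheck component configured."""
    params = urlencode({"q": "*:*", "rows": "0",
                        "spellcheck": "true", "spellcheck.build": "true"})
    return f"{solr_base.rstrip('/')}/{core}{handler}?{params}"

# One request per core, fired nightly from cron, e.g.:
#   0 1 * * *  curl -s "<url>" > /dev/null
urls = [spellcheck_build_url("http://localhost:8983/solr", core)
        for core in ("core-en", "core-de", "core-fr")]
```

An in-Solr variant would instead schedule these same requests from a background thread, which is where the lifecycle/shutdown question above comes in.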
Re: Which query parser and how to do full text on multiple fields
Pradeep is right, but check the solrconfig, the query parser is defined there. Look for the basedOn attribute in the queryParser element. You said you were using a third party plugin. What do you expect people here to know? Solr plugins don't have parameters lat, long, radius and threadCount (they have pt and dist). On Sun, Dec 12, 2010 at 4:47 PM, Dennis Gearon gear...@sbcglobal.net wrote: Which query parser did my partner set up below, and how do I parse three fields in the index for scoring and returning results? /solr/select?wt=json&indent=true&start=0&rows=20&q={!spatial%20lat=37.326375%20long=-121.892639%20radius=3%20unit=km%20threadCount=3}title:Art%20Loft Dennis Gearon Signature Warning It is always a good idea to learn from your own mistakes. It is usually a better idea to learn from others’ mistakes, so you do not have to make them yourself. from 'http://blogs.techrepublic.com.com/security/?p=4501&tag=nl.e036' EARTH has a Right To Life, otherwise we all die.
Re: Which query parser and how to do full text on multiple fields
Well, I didn't think the plugin would be an issue. I thought the rest of the query was from the main query parser, and the plugin processes after that. So I thought the rest of the query AFTER the plugin/filter part was like normal, without the filter/plugin. Is that so? Does using the plugin make me do everything according to its requirements, or just what's in the braces {}? I believe the plugin is Spatial Solr, anyway. I'm really new to using this, guys. Dennis Gearon Signature Warning It is always a good idea to learn from your own mistakes. It is usually a better idea to learn from others’ mistakes, so you do not have to make them yourself. from 'http://blogs.techrepublic.com.com/security/?p=4501&tag=nl.e036' EARTH has a Right To Life, otherwise we all die. - Original Message From: Pradeep Singh pksing...@gmail.com To: solr-user@lucene.apache.org Sent: Sun, December 12, 2010 5:02:54 PM Subject: Re: Which query parser and how to do full text on multiple fields You said you were using a third party plugin. What do you expect people here to know? Solr plugins don't have parameters lat, long, radius and threadCount (they have pt and dist). On Sun, Dec 12, 2010 at 4:47 PM, Dennis Gearon gear...@sbcglobal.net wrote: Which query parser did my partner set up below, and how do I parse three fields in the index for scoring and returning results? /solr/select?wt=json&indent=true&start=0&rows=20&q={!spatial%20lat=37.326375%20long=-121.892639%20radius=3%20unit=km%20threadCount=3}title:Art%20Loft Dennis Gearon Signature Warning It is always a good idea to learn from your own mistakes. It is usually a better idea to learn from others’ mistakes, so you do not have to make them yourself. from 'http://blogs.techrepublic.com.com/security/?p=4501&tag=nl.e036' EARTH has a Right To Life, otherwise we all die.
Re: Which query parser and how to do full text on multiple fields
And to be more specific, the fields I want to combine for *full text* are just three text fields, they're not geospatial. Dennis Gearon Signature Warning It is always a good idea to learn from your own mistakes. It is usually a better idea to learn from others’ mistakes, so you do not have to make them yourself. from 'http://blogs.techrepublic.com.com/security/?p=4501&tag=nl.e036' EARTH has a Right To Life, otherwise we all die. - Original Message From: Pradeep Singh pksing...@gmail.com To: solr-user@lucene.apache.org Sent: Sun, December 12, 2010 5:02:54 PM Subject: Re: Which query parser and how to do full text on multiple fields You said you were using a third party plugin. What do you expect people here to know? Solr plugins don't have parameters lat, long, radius and threadCount (they have pt and dist). On Sun, Dec 12, 2010 at 4:47 PM, Dennis Gearon gear...@sbcglobal.net wrote: Which query parser did my partner set up below, and how do I parse three fields in the index for scoring and returning results? /solr/select?wt=json&indent=true&start=0&rows=20&q={!spatial%20lat=37.326375%20long=-121.892639%20radius=3%20unit=km%20threadCount=3}title:Art%20Loft Dennis Gearon Signature Warning It is always a good idea to learn from your own mistakes. It is usually a better idea to learn from others’ mistakes, so you do not have to make them yourself. from 'http://blogs.techrepublic.com.com/security/?p=4501&tag=nl.e036' EARTH has a Right To Life, otherwise we all die.
Re: Rebuild Spellchecker based on cron expression
Maybe you've overlooked the build parameter? http://wiki.apache.org/solr/SpellCheckComponent#spellcheck.build Hi, the spellchecker component already provides a buildOnCommit and buildOnOptimize option. Since we have several spellchecker indices building on each commit is not really what we want to do. Building on optimize is not possible as index optimization is done on the master and the slaves don't even run an optimize but only fetch the optimized index. Therefore I'm thinking about an extension of the spellchecker that allows you to rebuild the spellchecker based on a cron-expression (e.g. rebuild each night at 1 am). What do you think about this, is there anybody else interested in this? Regarding the lifecycle, is there already some executor framework or any regularly running process in place, or would I have to pull up my own thread? If so, how can I stop my thread when solr/tomcat is shutdown (I couldn't see any shutdown or destroy method in SearchComponent)? Thanx for your feedback, cheers, Martin
Re: Which query parser and how to do full text on multiple fields
Oh, I didn't know that the syntax didn't show the parser used, that it was set in the config file. I'll talk to my partner, thanks. Dennis Gearon Signature Warning It is always a good idea to learn from your own mistakes. It is usually a better idea to learn from others’ mistakes, so you do not have to make them yourself. from 'http://blogs.techrepublic.com.com/security/?p=4501&tag=nl.e036' EARTH has a Right To Life, otherwise we all die. - Original Message From: Markus Jelsma markus.jel...@openindex.io To: solr-user@lucene.apache.org Cc: Pradeep Singh pksing...@gmail.com Sent: Sun, December 12, 2010 5:08:11 PM Subject: Re: Which query parser and how to do full text on multiple fields Pradeep is right, but check the solrconfig, the query parser is defined there. Look for the basedOn attribute in the queryParser element. You said you were using a third party plugin. What do you expect people here to know? Solr plugins don't have parameters lat, long, radius and threadCount (they have pt and dist). On Sun, Dec 12, 2010 at 4:47 PM, Dennis Gearon gear...@sbcglobal.net wrote: Which query parser did my partner set up below, and how do I parse three fields in the index for scoring and returning results? /solr/select?wt=json&indent=true&start=0&rows=20&q={!spatial%20lat=37.326375%20long=-121.892639%20radius=3%20unit=km%20threadCount=3}title:Art%20Loft Dennis Gearon Signature Warning It is always a good idea to learn from your own mistakes. It is usually a better idea to learn from others’ mistakes, so you do not have to make them yourself. from 'http://blogs.techrepublic.com.com/security/?p=4501&tag=nl.e036' EARTH has a Right To Life, otherwise we all die.
Re: Which query parser and how to do full text on multiple fields
The manual answers most questions. Oh, I didn't know that the syntax didn't show the parser used, that it was set in the config file. I'll talk to my partner, thanks. Dennis Gearon Signature Warning It is always a good idea to learn from your own mistakes. It is usually a better idea to learn from others’ mistakes, so you do not have to make them yourself. from 'http://blogs.techrepublic.com.com/security/?p=4501&tag=nl.e036' EARTH has a Right To Life, otherwise we all die. - Original Message From: Markus Jelsma markus.jel...@openindex.io To: solr-user@lucene.apache.org Cc: Pradeep Singh pksing...@gmail.com Sent: Sun, December 12, 2010 5:08:11 PM Subject: Re: Which query parser and how to do full text on multiple fields Pradeep is right, but check the solrconfig, the query parser is defined there. Look for the basedOn attribute in the queryParser element. You said you were using a third party plugin. What do you expect people here to know? Solr plugins don't have parameters lat, long, radius and threadCount (they have pt and dist). On Sun, Dec 12, 2010 at 4:47 PM, Dennis Gearon gear...@sbcglobal.net wrote: Which query parser did my partner set up below, and how do I parse three fields in the index for scoring and returning results? /solr/select?wt=json&indent=true&start=0&rows=20&q={!spatial%20lat=37.326375%20long=-121.892639%20radius=3%20unit=km%20threadCount=3}title:Art%20Loft Dennis Gearon Signature Warning It is always a good idea to learn from your own mistakes. It is usually a better idea to learn from others’ mistakes, so you do not have to make them yourself. from 'http://blogs.techrepublic.com.com/security/?p=4501&tag=nl.e036' EARTH has a Right To Life, otherwise we all die.
Re: Rebuild Spellchecker based on cron expression
On Mon, Dec 13, 2010 at 2:12 AM, Markus Jelsma markus.jel...@openindex.io wrote: Maybe you've overlooked the build parameter? http://wiki.apache.org/solr/SpellCheckComponent#spellcheck.build I'm aware of this, but we don't want to maintain cron-jobs on all slaves for all spellcheckers for all cores. That's why I'm thinking about a more integrated solution. Or did I really overlook s.th.? Cheers, Martin Hi, the spellchecker component already provides a buildOnCommit and buildOnOptimize option. Since we have several spellchecker indices building on each commit is not really what we want to do. Building on optimize is not possible as index optimization is done on the master and the slaves don't even run an optimize but only fetch the optimized index. Therefore I'm thinking about an extension of the spellchecker that allows you to rebuild the spellchecker based on a cron-expression (e.g. rebuild each night at 1 am). What do you think about this, is there anybody else interested in this? Regarding the lifecycle, is there already some executor framework or any regularly running process in place, or would I have to pull up my own thread? If so, how can I stop my thread when solr/tomcat is shutdown (I couldn't see any shutdown or destroy method in SearchComponent)? Thanx for your feedback, cheers, Martin -- Martin Grotzke http://twitter.com/martin_grotzke
Re: [pubDate] is not converting correctly
Thanks for the feedback! There are quite a few formats that can be used. I am experiencing at least 5 of them. Would something like this work? Note that there are 2 different formats separated by a comma. <field column="pubdate" xpath="/rss/channel/item/pubDate" dateTimeFormat="EEE, dd MMM yyyy HH:mm:ss zzz, yyyy-MM-dd'T'HH:mm:ss'Z'" /> I don't suppose it will, because there is already a comma in the first pattern. I guess I am really looking for an all-purpose date-time parser, but even if I have that, would I still be able to query *all* fields in the index? Good article: http://www.java2s.com/Open-Source/Java-Document/RSS-RDF/Rome/com/sun/syndication/io/impl/DateParser.java.htm Adam On Sun, Dec 12, 2010 at 7:31 PM, Koji Sekiguchi k...@r.email.ne.jp wrote: (10/12/13 8:49), Adam Estrada wrote: All, I am having some difficulties parsing the pubDate field that is part of the RSS spec (I believe). I get the warning that states, Dec 12, 2010 6:45:26 PM org.apache.solr.handler.dataimport.DateFormatTransformer transformRow WARNING: Could not parse a Date field java.text.ParseException: Unparseable date: Thu, 30 Jul 2009 14:41:43 + at java.text.DateFormat.parse(Unknown Source) Does anyone know how to fix this? I would eventually like to do a date query but without the ability to properly parse them I don't know if it's going to work. Thanks, Adam Adam, What does your data-config.xml look like for that field? Have you looked at the rss-data-config.xml file under the example/example-DIH/solr/rss/conf directory? Koji -- http://www.rondhuit.com/en/
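DateFormatTransformer takes a single SimpleDateFormat pattern, so a comma-separated list of patterns won't work, as suspected above. The try-each-format idea behind libraries like Rome's DateParser can be sketched in a few lines of Python (the pattern list here is illustrative; extend it as feeds misbehave):

```python
from datetime import datetime

# Candidate formats, most common first.
FORMATS = [
    "%a, %d %b %Y %H:%M:%S %z",   # RFC 822: Thu, 30 Jul 2009 14:41:43 +0000
    "%Y-%m-%dT%H:%M:%SZ",         # ISO 8601, the form Solr itself uses
    "%d %b %Y %H:%M:%S",          # RFC 822 variant without weekday/zone
]

def parse_any(text):
    """Try each known format in turn; raise if none matches."""
    for fmt in FORMATS:
        try:
            return datetime.strptime(text.strip(), fmt)
        except ValueError:
            continue
    raise ValueError(f"Unparseable date: {text!r}")
```

In Solr terms the equivalent would be a custom transformer that loops over patterns the same way, normalizing every hit to one canonical date field so date queries work across all feeds.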
Re: [Multiple] RSS Feeds at a time...
What else am I missing here, because the reload-config command does not seem to be working? Any ideas would be great! solr/dataimport?command=reload-config should return the message <str name="importResponse">Configuration Re-loaded sucessfully</str> if everything went well. Maybe you can check that after each reload. Maybe it is not valid xml? By the way, can't you use a variable resolver in your case? http://wiki.apache.org/solr/DataImportHandler#A_VariableResolver Passing different rss URLs using a custom parameter from the request like ${dataimporter.request.myrssurl}: /dataimport?command=full-import&clean=false&myrssurl=http://rss.cnn.com/rss/cnn_topstories.rss Similar discussion: http://search-lucene.com/m/xILqvbY6h91/
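The variable-resolver suggestion above replaces the per-feed entities with one generic entity whose URL comes from the request. A sketch of what that data-config fragment might look like (entity name and the myrssurl parameter are illustrative, following Ahmet's example; the field list is trimmed from the config earlier in the thread):

```xml
<!-- Sketch: one generic entity whose feed URL is supplied at import time
     via ${dataimporter.request.myrssurl}. Trigger it with:
     /dataimport?command=full-import&clean=false&myrssurl=http://rss.cnn.com/rss/cnn_topstories.rss -->
<entity name="feed" pk="link"
        url="${dataimporter.request.myrssurl}"
        processor="XPathEntityProcessor"
        forEach="/rss/channel | /rss/channel/item"
        transformer="DateFormatTransformer,HTMLStripTransformer">
  <field column="title" xpath="/rss/channel/item/title" />
  <field column="link"  xpath="/rss/channel/item/link" />
</entity>
```

With this in place, adding a new feed means issuing another full-import request with a different myrssurl, with no config edit and no reload-config at all.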
Re: Rebuild Spellchecker based on cron expression
I'm shooting in the dark here, but according to this: http://wiki.apache.org/solr/SolrReplication after the slave pulls the index down, it issues a commit. So if your slave is configured to generate the dictionary on commit, will it just happen? But according to this: https://issues.apache.org/jira/browse/SOLR-866 this is an open issue. Best Erick On Sun, Dec 12, 2010 at 8:30 PM, Martin Grotzke martin.grot...@googlemail.com wrote: On Mon, Dec 13, 2010 at 2:12 AM, Markus Jelsma markus.jel...@openindex.io wrote: Maybe you've overlooked the build parameter? http://wiki.apache.org/solr/SpellCheckComponent#spellcheck.build I'm aware of this, but we don't want to maintain cron-jobs on all slaves for all spellcheckers for all cores. That's why I'm thinking about a more integrated solution. Or did I really overlook s.th.? Cheers, Martin Hi, the spellchecker component already provides a buildOnCommit and buildOnOptimize option. Since we have several spellchecker indices building on each commit is not really what we want to do. Building on optimize is not possible as index optimization is done on the master and the slaves don't even run an optimize but only fetch the optimized index. Therefore I'm thinking about an extension of the spellchecker that allows you to rebuild the spellchecker based on a cron-expression (e.g. rebuild each night at 1 am). What do you think about this, is there anybody else interested in this? Regarding the lifecycle, is there already some executor framework or any regularly running process in place, or would I have to pull up my own thread? If so, how can I stop my thread when solr/tomcat is shutdown (I couldn't see any shutdown or destroy method in SearchComponent)? Thanx for your feedback, cheers, Martin -- Martin Grotzke http://twitter.com/martin_grotzke
PDFBOX 1.3.1 Parsing Error
hi All, While using PDFBOX 1.3.1 in APACHE TIKA 1.7 I am getting the following error while parsing a PDF document: Error: Expected an integer type, actual='' at org.apache.pdfbox.pdfparser.BaseParser.readInt This error occurs because of the SHA-256 encryption used by Adobe Acrobat 9. Is there any solution to this problem? I am stuck because of this. In Jira, issue PDFBOX-697 has been created against this: https://issues.apache.org/jira/browse/PDFBOX-697 Please help!! / Pankaj Bhatt.
Re: PDFBOX 1.3.1 Parsing Error
If the document is encrypted, maybe it isn't meant to be indexed and publicly visible after all? On Sun, Dec 12, 2010 at 10:22 PM, pankaj bhatt panbh...@gmail.com wrote: hi All, While using PDFBOX 1.3.1 in APACHE TIKA 1.7 I am getting the following error while parsing a PDF document: Error: Expected an integer type, actual='' at org.apache.pdfbox.pdfparser.BaseParser.readInt This error occurs because of the SHA-256 encryption used by Adobe Acrobat 9. Is there any solution to this problem? I am stuck because of this. In Jira, issue PDFBOX-697 has been created against this: https://issues.apache.org/jira/browse/PDFBOX-697 Please help!! / Pankaj Bhatt.
Re: Rebuild Spellchecker based on cron expression
Hi, when thinking further about it, it's clear that https://issues.apache.org/jira/browse/SOLR-433 would be even better - we could generate the spellchecker indices on commit/optimize on the master and replicate them to all slaves. Just wondering why this patch has received so little interest. Anything wrong with it? Cheers, Martin On Mon, Dec 13, 2010 at 2:04 AM, Martin Grotzke martin.grot...@googlemail.com wrote: Hi, the spellchecker component already provides a buildOnCommit and buildOnOptimize option. Since we have several spellchecker indices building on each commit is not really what we want to do. Building on optimize is not possible as index optimization is done on the master and the slaves don't even run an optimize but only fetch the optimized index. Therefore I'm thinking about an extension of the spellchecker that allows you to rebuild the spellchecker based on a cron-expression (e.g. rebuild each night at 1 am). What do you think about this, is there anybody else interested in this? Regarding the lifecycle, is there already some executor framework or any regularly running process in place, or would I have to pull up my own thread? If so, how can I stop my thread when solr/tomcat is shutdown (I couldn't see any shutdown or destroy method in SearchComponent)? Thanx for your feedback, cheers, Martin -- Martin Grotzke http://www.javakaffee.de/blog/