Re: Commit Issue in Solr 3.4
Yes, it is Amazon EC2 indeed. To expand on that: this Solr deployment was working fine, handling the same load, on a 34 GB instance with EBS storage for quite some time. To reduce the time taken by a commit, I shifted it to a 30 GB SSD instance. It certainly performed better on writes and commits. But since last week I have been facing this problem of infinite back-to-back commits. Unable to resolve it, I have finally switched back to a 34 GB machine with EBS storage, and now the commits are working fine, though slowly. Any thoughts?

On 6 Feb 2014 23:00, Shawn Heisey s...@elyograg.org wrote:

On 2/6/2014 9:56 AM, samarth s wrote:

Size of index = 260 GB
Total docs = 100 mn
Usual writing speed = 50K per hour
autoCommit maxDocs = 400,000
autoCommit maxTime = 1,500,000 ms (25 mins)
merge factor = 10
Machine memory = 30 GB, Xmx = 20 GB
Server: Jetty
OS: CentOS 6

With 30GB of RAM (is it Amazon EC2, by chance?) and a 20GB heap, you have about 10GB of RAM left for caching your Solr index. If that server has all 260GB of index, I am really surprised that you have only been having problems for a short time. I would have expected problems from day one. Even if it only has half or one quarter of the index, there is still a major discrepancy between RAM and index size. You either need more memory or you need to reduce the size of your index.

The size of the indexed portion generally has more of an impact on performance than the size of the stored portion, but both have an impact, especially on indexing and committing. With regular disks, it's best to have at least 50% of your index size available to the OS disk cache, but 100% is better.

http://wiki.apache.org/solr/SolrPerformanceProblems#OS_Disk_Cache

If you are already using SSD, you might think there can't be memory-related performance problems ... but you still need a pretty significant chunk of disk cache.

https://wiki.apache.org/solr/SolrPerformanceProblems#SSD

Thanks,
Shawn
Commit Issue in Solr 3.4
Hi, I have been using Solr 3.4 in a project for a little over a year. Only now have I started facing a weird problem of never-ending back-to-back commit cycles. I can see from the InfoStream logs that as soon as one commit cycle finishes, another one spawns almost immediately. My writer processes, which use SolrJ as the client, do not get a chance to write even a single document between these commits. I have waited for hours to let these commits run their course and finish naturally, but they don't. Finally, I had to restart the Solr server. After that, my writers could get away with writing a few thousand docs, after which the same infinite commit cycles started again. I could not find any related JIRA on this.

Size of index = 260 GB
Total docs = 100 mn
Usual writing speed = 50K per hour
autoCommit maxDocs = 400,000
autoCommit maxTime = 1,500,000 ms (25 mins)
merge factor = 10
Machine memory = 30 GB, Xmx = 20 GB
Server: Jetty
OS: CentOS 6

Please let me know if any other details are needed on the setup. Any help is highly appreciated. Thanks. -- Regards, Samarth
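[Editor's note: for reference, the autoCommit settings described above would look roughly like this in a Solr 3.x solrconfig.xml; a sketch reconstructed from the values in the post, inside the updateHandler section.]

<updateHandler class="solr.DirectUpdateHandler2">
  <autoCommit>
    <maxDocs>400000</maxDocs>  <!-- commit after 400,000 queued docs -->
    <maxTime>1500000</maxTime> <!-- or after 25 minutes, whichever comes first -->
  </autoCommit>
</updateHandler>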
Sum as a Projection for Facet Queries
Hi, we need to find the sum of a field for each facet.query. We have looked at StatsComponent (http://wiki.apache.org/solr/StatsComponent), but that supports only facet.field. Has anyone written a patch over StatsComponent that supports the same for facet.query, along with some performance measurements? Alternatively, is there any way we can do this using the sum function query (http://wiki.apache.org/solr/FunctionQuery#sum)? -- Regards, Samarth
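[Editor's note: for context, what StatsComponent of that era does support is a per-field-value breakdown via stats.facet; a sketch, assuming a hypothetical numeric field 'price' and facet field 'category':]

http://localhost:8983/solr/select?q=*:*&rows=0&stats=true&stats.field=price&stats.facet=category

[This returns sum, min, max, etc. of 'price' for each value of 'category', but there is no equivalent keyed on arbitrary facet.query clauses.]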
Re: updateLog in Solr 4.2
I have a similar problem with this one. The reason is that my application performs back-to-back updates, and, as came out of my performance tests, the update immediately after the first one seems to be a lot slower than when no update log is configured. Is this a genuine case, or did I miss something in my performance tests? Any pointers on this one are highly appreciated.

On Fri, Apr 12, 2013 at 6:47 PM, vicky desai vicky.de...@germinait.com wrote:

If I disable the update log in Solr 4.2 then I get the following exception:

SEVERE: :java.lang.NullPointerException
        at org.apache.solr.cloud.ShardLeaderElectionContext.runLeaderProcess(ElectionContext.java:190)
        at org.apache.solr.cloud.LeaderElector.runIamLeaderProcess(LeaderElector.java:156)
        at org.apache.solr.cloud.LeaderElector.checkIfIamLeader(LeaderElector.java:100)
        at org.apache.solr.cloud.LeaderElector.joinElection(LeaderElector.java:266)
        at org.apache.solr.cloud.ZkController.joinElection(ZkController.java:935)
        at org.apache.solr.cloud.ZkController.register(ZkController.java:761)
        at org.apache.solr.cloud.ZkController.register(ZkController.java:727)
        at org.apache.solr.core.CoreContainer.registerInZk(CoreContainer.java:908)
        at org.apache.solr.core.CoreContainer.registerCore(CoreContainer.java:892)
        at org.apache.solr.core.CoreContainer.register(CoreContainer.java:841)
        at org.apache.solr.core.CoreContainer$3.call(CoreContainer.java:638)
        at org.apache.solr.core.CoreContainer$3.call(CoreContainer.java:629)
        at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:303)
        at java.util.concurrent.FutureTask.run(FutureTask.java:138)
        at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:441)
        at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:303)
        at java.util.concurrent.FutureTask.run(FutureTask.java:138)
        at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
        at java.lang.Thread.run(Thread.java:619)

Apr 12, 2013 6:39:56 PM org.apache.solr.common.SolrException log
SEVERE: null:org.apache.solr.common.cloud.ZooKeeperException:
        at org.apache.solr.core.CoreContainer.registerInZk(CoreContainer.java:931)
        at org.apache.solr.core.CoreContainer.registerCore(CoreContainer.java:892)
        at org.apache.solr.core.CoreContainer.register(CoreContainer.java:841)
        at org.apache.solr.core.CoreContainer$3.call(CoreContainer.java:638)
        at org.apache.solr.core.CoreContainer$3.call(CoreContainer.java:629)
        at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:303)
        at java.util.concurrent.FutureTask.run(FutureTask.java:138)
        at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:441)
        at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:303)
        at java.util.concurrent.FutureTask.run(FutureTask.java:138)
        at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
        at java.lang.Thread.run(Thread.java:619)
Caused by: java.lang.NullPointerException
        at org.apache.solr.cloud.ShardLeaderElectionContext.runLeaderProcess(ElectionContext.java:190)
        at org.apache.solr.cloud.LeaderElector.runIamLeaderProcess(LeaderElector.java:156)
        at org.apache.solr.cloud.LeaderElector.checkIfIamLeader(LeaderElector.java:100)
        at org.apache.solr.cloud.LeaderElector.joinElection(LeaderElector.java:266)
        at org.apache.solr.cloud.ZkController.joinElection(ZkController.java:935)
        at org.apache.solr.cloud.ZkController.register(ZkController.java:761)
        at org.apache.solr.cloud.ZkController.register(ZkController.java:727)
        at org.apache.solr.core.CoreContainer.registerInZk(CoreContainer.java:908)
        ... 12 more

and Solr fails to start. However, if I add the updateLog in my solrconfig.xml it starts. Is the update log parameter mandatory for Solr 4.2?

--
View this message in context: http://lucene.472066.n3.nabble.com/updateLog-in-Solr-4-2-tp4055548.html
Sent from the Solr - User mailing list archive at Nabble.com.

-- Regards, Samarth
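[Editor's note: for reference, enabling the update log in a Solr 4.x solrconfig.xml looks like this; the stock configuration, where the dir property falls back to the data directory if unset.]

<updateHandler class="solr.DirectUpdateHandler2">
  <updateLog>
    <str name="dir">${solr.ulog.dir:}</str> <!-- transaction log directory -->
  </updateLog>
</updateHandler>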
Re: Error on using the projection parameter - fl - in Solr 4
Thanks Erick. I will take your advice as a long-term solution. For now I am working around it by using the regex/glob capability added to the 'fl' parameter, i.e. 'fl=E_*'.

On Wed, Jan 9, 2013 at 6:07 AM, Erick Erickson erickerick...@gmail.com wrote:

You really have a field name with '@' symbols in it? If it worked in 3.6, it was probably not intentional; classic undocumented behavior. The first thing I'd try is replacing the @ with __ in my schema...

Best
Erick

On Tue, Jan 8, 2013 at 6:58 AM, samarth s samarth.s.seksa...@gmail.com wrote:

q=*:*&fl=E_abc@@xyz

-- Regards, Samarth
Error on using the projection parameter - fl - in Solr 4
Hi all, I am in the process of migrating my application from Solr 3.6 to Solr 4. A query that used to work gives an error with Solr 4. The query looks like:

q=*:*&fl=E_abc@@xyz

The error displayed on the admin page is:

can not use FieldCache on multivalued field: E_abc

The field printed in the error has dropped the part after the character '@'. I could not find any useful pointers on the forums, except one with a similar issue while using the 'qt' parameter; the reference is the thread "multivalued filed question (FieldCache error)" on the solr-user forums. Thanks for any pointers. -- Regards, Samarth
Re: Atomicity of commits (soft OR hard) across replicas - Solr Cloud
Thanks Tomás! This was useful.

On Mon, Dec 31, 2012 at 6:03 PM, Tomás Fernández Löbbe tomasflo...@gmail.com wrote:

If by cronned commit you mean auto-commit: auto-commits are local to each node and are not distributed, so there is no such thing as cluster-wide atomicity there. The commit may be performed on one node now, and on other nodes in 5 minutes (depending on the maxTime you have configured). If you mean that you are issuing commits from outside Solr, those are by default distributed to all the nodes. The operation will succeed only if all nodes succeed; if one of the nodes fails, the operation will fail. However, the nodes that did succeed WILL have a new view of the index at this point. (I'm not sure if anything is done in this situation with the failing node.) The local commit operation on one node *is* atomic.

Tomás

On Mon, Dec 31, 2012 at 7:04 AM, samarth s samarth.s.seksa...@gmail.com wrote:

I tried reading articles online, but could not find one that confirmed this 100% :). Does a cronned soft commit complete its commit cycle only after all the replicas have the newest data visible? -- Regards, Samarth

-- Regards, Samarth
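[Editor's note: to illustrate the second case Tomás describes, an explicit commit issued from a client is distributed cluster-wide. A minimal SolrJ 4.x sketch; the ZooKeeper host and collection name are illustrative.]

import org.apache.solr.client.solrj.impl.CloudSolrServer;

public class DistributedCommit {
    public static void main(String[] args) throws Exception {
        // Connects via ZooKeeper; an explicit commit is forwarded to every
        // node hosting the collection, unlike node-local auto-commits.
        CloudSolrServer server = new CloudSolrServer("zkhost1:2181");
        server.setDefaultCollection("collection1");
        server.commit(); // fails if any node fails, per the explanation above
        server.shutdown();
    }
}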
Solr cloud in 4.0 with NRT performance
Hi, I am currently using features like faceting and grouping/collapsing on Solr 3.6. The frequency of writing is user driven, so updates are expected to be visible in real time, or at least near real time. These updates should be consistent in facet and group results as well. Also, to handle the query load, I may have to use replication/sharding, with or without SolrCloud. I am planning to migrate to Solr 4.0 and use its powerful features of NRT (soft commit) and SolrCloud (using ZooKeeper) to achieve the above requirements. Is a SolrCloud setup with a replication factor greater than 1 capable of giving NRT results? If yes, do these NRT results work with all kinds of querying, like faceting and grouping? It would be great if someone could share their insights and numbers on these questions. -- Regards, Samarth
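[Editor's note: for reference, near-real-time visibility in Solr 4.x is typically driven by the soft auto-commit setting in solrconfig.xml; a sketch, where the intervals are illustrative.]

<updateHandler class="solr.DirectUpdateHandler2">
  <autoSoftCommit>
    <maxTime>1000</maxTime> <!-- make new docs visible every second, no durable flush -->
  </autoSoftCommit>
  <autoCommit>
    <maxTime>600000</maxTime> <!-- durable hard commit every 10 minutes -->
    <openSearcher>false</openSearcher>
  </autoCommit>
</updateHandler>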
Re: Solr 4.0 Beta Release
Thanks Jack.

On Wed, Sep 12, 2012 at 8:08 PM, Jack Krupansky j...@basetechnology.com wrote:

Yes, it has been released. Read the details here (including download instructions/links): http://lucene.apache.org/solr/solrnews.html

-- Jack Krupansky

-----Original Message----- From: samarth s Sent: Wednesday, September 12, 2012 9:54 AM To: solr-user@lucene.apache.org Subject: Solr 4.0 Beta Release

Hi All, Would just like to verify if Solr 4.0 Beta has been released. Does the following url give the official beta release: http://www.apache.org/dyn/closer.cgi/lucene/solr/4.0.0-BETA -- Regards, Samarth

-- Regards, Samarth
Solr 4.0 Beta Release
Hi All, Would just like to verify if Solr 4.0 Beta has been released. Does the following url give the official beta release: http://www.apache.org/dyn/closer.cgi/lucene/solr/4.0.0-BETA -- Regards, Samarth
Re: Too many connections in CLOSE_WAIT state on master solr server
Hi Ranveer, You can try '-Dhttp.maxConnections' out; it may resolve the issue. But the root cause, I figured, may lie with some queries made to Solr that are too heavy to have decent turnaround times. As a result the client may close the connection abruptly, resulting in half-closed connections. You can also try adding a search timeout to Solr queries: https://issues.apache.org/jira/browse/SOLR-502

On Tue, Jan 10, 2012 at 8:06 AM, Ranveer ranveer.s...@gmail.com wrote:

Hi, I am facing the same problem. Did -Dhttp.maxConnections resolve the problem? Please let us know!

regards
Ranveer

On Thursday 15 December 2011 11:30 AM, samarth s wrote:

Thanks Erick and Mikhail. I'll try this out.

On Wed, Dec 14, 2011 at 7:11 PM, Erick Erickson erickerick...@gmail.com wrote:

I'm guessing (and it's just a guess) that what's happening is that the container is queueing up your requests while waiting for the other connections to close, so Mikhail's suggestion seems like a good idea.

Best
Erick

On Wed, Dec 14, 2011 at 12:28 AM, samarth s samarth.s.seksa...@gmail.com wrote:

The updates to the master are user driven and need to be visible quickly, hence the high frequency of replication. It may be that too many replication requests are being handled at a time, but why should that result in half-closed connections?

On Wed, Dec 14, 2011 at 2:47 AM, Erick Erickson erickerick...@gmail.com wrote:

Replicating 40 cores every 20 seconds is just *asking* for trouble. How often do your cores change on the master? How big are they? Is there any chance you just have too many cores replicating at once?

Best
Erick

On Tue, Dec 13, 2011 at 3:52 PM, Mikhail Khludnev mkhlud...@griddynamics.com wrote:

You can try to reuse your connections (prevent them from closing) by specifying -Dhttp.maxConnections=N (see http://download.oracle.com/javase/1.4.2/docs/guide/net/properties.html) in the JVM startup params. At the client JVM! The number should be chosen considering the number of connections you'd like to keep alive. Let me know if it works for you.

On Tue, Dec 13, 2011 at 2:57 PM, samarth s samarth.s.seksa...@gmail.com wrote:

Hi, I am using Solr replication and am experiencing a lot of connections in the CLOSE_WAIT state at the master Solr server. These disappear after a while, but till then the master Solr stops responding. There are about 130 open connections on the master server with the client as the slave m/c, all in the CLOSE_WAIT state. Also, the client port reported in netstat results on the master Solr server is not visible in the netstat results on the client (slave Solr) m/c. Following is my environment:

- 40 cores in the master Solr on m/c 1
- 40 cores in the slave Solr on m/c 2
- The replication poll interval is 20 seconds.
- Replication part of solrconfig.xml in the slave Solr:

<requestHandler name="/replication" class="solr.ReplicationHandler">
  <lst name="slave">
    <!-- fully qualified url for the replication handler of master -->
    <str name="masterUrl">$mastercorename/replication</str>
    <!-- Interval in which the slave should poll master. Format is HH:mm:ss.
         If this is absent slave does not poll automatically. But a fetchindex
         can be triggered from the admin or the http API -->
    <str name="pollInterval">00:00:20</str>
    <!-- The following values are used when the slave connects to the master
         to download the index files. Default values implicitly set as 5000ms
         and 10000ms respectively. The user DOES NOT need to specify these
         unless the bandwidth is extremely low or if there is an extremely
         high latency -->
    <str name="httpConnTimeout">5000</str>
    <str name="httpReadTimeout">10000</str>
  </lst>
</requestHandler>

Thanks for any pointers. -- Regards, Samarth

-- Sincerely yours, Mikhail Khludnev Developer Grid Dynamics tel. 1-415-738-8644 Skype: mkhludnev http://www.griddynamics.com mkhlud...@griddynamics.com

-- Regards, Samarth

-- Regards, Samarth
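[Editor's note: to apply Mikhail's suggestion concretely, the property goes on the client (slave) JVM's startup command; a sketch, assuming a Jetty start.jar launch and an illustrative value of 50.]

java -Dhttp.maxConnections=50 -jar start.jar

[http.maxConnections controls how many persistent (keep-alive) HTTP connections the JDK's HttpURLConnection keeps per destination; the default is 5.]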
Request Timeout Parameter in update queries
Hi, does an update query to Solr work well when sent with a timeout parameter? https://issues.apache.org/jira/browse/SOLR-502 For example, consider an update query fired with a timeout of 30 seconds, whose request got aborted halfway due to the timeout. Can this corrupt the index in any way? -- Regards, Samarth
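[Editor's note: for the client-side flavor of this question, SolrJ's HTTP server wrapper lets you set timeouts that abort the request from the client end. A minimal sketch, assuming SolrJ 3.x's CommonsHttpSolrServer and an illustrative core URL; the SOLR-502 timeAllowed parameter, by contrast, is a server-side search timeout.]

import org.apache.solr.client.solrj.impl.CommonsHttpSolrServer;

public class TimeoutClient {
    public static void main(String[] args) throws Exception {
        CommonsHttpSolrServer server =
                new CommonsHttpSolrServer("http://localhost:8983/solr/core1");
        server.setConnectionTimeout(5000); // ms allowed to establish the connection
        server.setSoTimeout(30000);        // ms of socket inactivity before aborting
    }
}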
Re: Too many connections in CLOSE_WAIT state on master solr server
Thanks Erick and Mikhail. I'll try this out.

On Wed, Dec 14, 2011 at 7:11 PM, Erick Erickson erickerick...@gmail.com wrote:

I'm guessing (and it's just a guess) that what's happening is that the container is queueing up your requests while waiting for the other connections to close, so Mikhail's suggestion seems like a good idea.

Best
Erick

On Wed, Dec 14, 2011 at 12:28 AM, samarth s samarth.s.seksa...@gmail.com wrote:

The updates to the master are user driven and need to be visible quickly, hence the high frequency of replication. It may be that too many replication requests are being handled at a time, but why should that result in half-closed connections?

On Wed, Dec 14, 2011 at 2:47 AM, Erick Erickson erickerick...@gmail.com wrote:

Replicating 40 cores every 20 seconds is just *asking* for trouble. How often do your cores change on the master? How big are they? Is there any chance you just have too many cores replicating at once?

Best
Erick

On Tue, Dec 13, 2011 at 3:52 PM, Mikhail Khludnev mkhlud...@griddynamics.com wrote:

You can try to reuse your connections (prevent them from closing) by specifying -Dhttp.maxConnections=N (see http://download.oracle.com/javase/1.4.2/docs/guide/net/properties.html) in the JVM startup params. At the client JVM! The number should be chosen considering the number of connections you'd like to keep alive. Let me know if it works for you.

On Tue, Dec 13, 2011 at 2:57 PM, samarth s samarth.s.seksa...@gmail.com wrote:

Hi, I am using Solr replication and am experiencing a lot of connections in the CLOSE_WAIT state at the master Solr server. These disappear after a while, but till then the master Solr stops responding. There are about 130 open connections on the master server with the client as the slave m/c, all in the CLOSE_WAIT state. Also, the client port reported in netstat results on the master Solr server is not visible in the netstat results on the client (slave Solr) m/c. Following is my environment:

- 40 cores in the master Solr on m/c 1
- 40 cores in the slave Solr on m/c 2
- The replication poll interval is 20 seconds.
- Replication part of solrconfig.xml in the slave Solr:

<requestHandler name="/replication" class="solr.ReplicationHandler">
  <lst name="slave">
    <!-- fully qualified url for the replication handler of master -->
    <str name="masterUrl">$mastercorename/replication</str>
    <!-- Interval in which the slave should poll master. Format is HH:mm:ss.
         If this is absent slave does not poll automatically. But a fetchindex
         can be triggered from the admin or the http API -->
    <str name="pollInterval">00:00:20</str>
    <!-- The following values are used when the slave connects to the master
         to download the index files. Default values implicitly set as 5000ms
         and 10000ms respectively. The user DOES NOT need to specify these
         unless the bandwidth is extremely low or if there is an extremely
         high latency -->
    <str name="httpConnTimeout">5000</str>
    <str name="httpReadTimeout">10000</str>
  </lst>
</requestHandler>

Thanks for any pointers. -- Regards, Samarth

-- Sincerely yours, Mikhail Khludnev Developer Grid Dynamics tel. 1-415-738-8644 Skype: mkhludnev http://www.griddynamics.com mkhlud...@griddynamics.com

-- Regards, Samarth

-- Regards, Samarth
Too many connections in CLOSE_WAIT state on master solr server
Hi, I am using Solr replication and am experiencing a lot of connections in the CLOSE_WAIT state at the master Solr server. These disappear after a while, but till then the master Solr stops responding. There are about 130 open connections on the master server with the client as the slave m/c, all in the CLOSE_WAIT state. Also, the client port reported in netstat results on the master Solr server is not visible in the netstat results on the client (slave Solr) m/c. Following is my environment:

- 40 cores in the master Solr on m/c 1
- 40 cores in the slave Solr on m/c 2
- The replication poll interval is 20 seconds.
- Replication part of solrconfig.xml in the slave Solr:

<requestHandler name="/replication" class="solr.ReplicationHandler">
  <lst name="slave">
    <!-- fully qualified url for the replication handler of master -->
    <str name="masterUrl">$mastercorename/replication</str>
    <!-- Interval in which the slave should poll master. Format is HH:mm:ss.
         If this is absent slave does not poll automatically. But a fetchindex
         can be triggered from the admin or the http API -->
    <str name="pollInterval">00:00:20</str>
    <!-- The following values are used when the slave connects to the master
         to download the index files. Default values implicitly set as 5000ms
         and 10000ms respectively. The user DOES NOT need to specify these
         unless the bandwidth is extremely low or if there is an extremely
         high latency -->
    <str name="httpConnTimeout">5000</str>
    <str name="httpReadTimeout">10000</str>
  </lst>
</requestHandler>

Thanks for any pointers. -- Regards, Samarth
Re: Too many connections in CLOSE_WAIT state on master solr server
The updates to the master are user driven and need to be visible quickly, hence the high frequency of replication. It may be that too many replication requests are being handled at a time, but why should that result in half-closed connections?

On Wed, Dec 14, 2011 at 2:47 AM, Erick Erickson erickerick...@gmail.com wrote:

Replicating 40 cores every 20 seconds is just *asking* for trouble. How often do your cores change on the master? How big are they? Is there any chance you just have too many cores replicating at once?

Best
Erick

On Tue, Dec 13, 2011 at 3:52 PM, Mikhail Khludnev mkhlud...@griddynamics.com wrote:

You can try to reuse your connections (prevent them from closing) by specifying -Dhttp.maxConnections=N (see http://download.oracle.com/javase/1.4.2/docs/guide/net/properties.html) in the JVM startup params. At the client JVM! The number should be chosen considering the number of connections you'd like to keep alive. Let me know if it works for you.

On Tue, Dec 13, 2011 at 2:57 PM, samarth s samarth.s.seksa...@gmail.com wrote:

Hi, I am using Solr replication and am experiencing a lot of connections in the CLOSE_WAIT state at the master Solr server. These disappear after a while, but till then the master Solr stops responding. There are about 130 open connections on the master server with the client as the slave m/c, all in the CLOSE_WAIT state. Also, the client port reported in netstat results on the master Solr server is not visible in the netstat results on the client (slave Solr) m/c. Following is my environment:

- 40 cores in the master Solr on m/c 1
- 40 cores in the slave Solr on m/c 2
- The replication poll interval is 20 seconds.
- Replication part of solrconfig.xml in the slave Solr:

<requestHandler name="/replication" class="solr.ReplicationHandler">
  <lst name="slave">
    <!-- fully qualified url for the replication handler of master -->
    <str name="masterUrl">$mastercorename/replication</str>
    <!-- Interval in which the slave should poll master. Format is HH:mm:ss.
         If this is absent slave does not poll automatically. But a fetchindex
         can be triggered from the admin or the http API -->
    <str name="pollInterval">00:00:20</str>
    <!-- The following values are used when the slave connects to the master
         to download the index files. Default values implicitly set as 5000ms
         and 10000ms respectively. The user DOES NOT need to specify these
         unless the bandwidth is extremely low or if there is an extremely
         high latency -->
    <str name="httpConnTimeout">5000</str>
    <str name="httpReadTimeout">10000</str>
  </lst>
</requestHandler>

Thanks for any pointers. -- Regards, Samarth

-- Sincerely yours, Mikhail Khludnev Developer Grid Dynamics tel. 1-415-738-8644 Skype: mkhludnev http://www.griddynamics.com mkhlud...@griddynamics.com

-- Regards, Samarth
Re: Solr Open File Descriptors
Thanks for sharing your insights, Shawn.

On Mon, Oct 17, 2011 at 1:27 AM, Shawn Heisey s...@elyograg.org wrote:

On 10/16/2011 12:01 PM, samarth s wrote:

Hi, is it safe to assume that with a mergeFactor of 10 the open file descriptors required by Solr would be around (1 + 10) * 10 = 110? ref: http://onjava.com/pub/a/onjava/2003/03/05/lucene.html#indexing_speed The Solr wiki (http://wiki.apache.org/solr/SolrPerformanceFactors#Optimization_Considerations) states that the FDs required per segment are around 7. Are these estimates appropriate? Do they in any way depend on the size of the index and the number of docs (assuming the same number of segments in any case) as well?

My index has 10 files per normal segment (the usual 7 plus three more for termvectors). Some of the segments also have a .del file, and there is a segments_* file and a segments.gen file. Your servlet container and other parts of the OS will also have to open files.

I have personally seen three levels of segment merging taking place at the same time on a slow filesystem during a full-import, along with new content coming in at the same time. With a mergeFactor of 10, each merge is 11 segments: the ten that are being merged and the merged segment. If you have three going on at the same time, that's 33 segments, and you can have up to 10 more that are actively being built by ongoing index activity, so that's 43 potential segments. If your filesystem is REALLY slow, you might end up with even more segments as existing merges are paused for new ones to start, but if you run into that, you'll want to update your hardware, so I won't consider it.

Multiplying 43 segments by 11 files per segment yields a working theoretical maximum of 473 files. Add in the segments files, and you're up to 475. Most operating systems have a default FD limit of at least 1024. If you only have one index (core) on your Solr server, Solr is the only thing running on that server, and it's using the default mergeFactor of 10, you should be fine with the default. If you are going to have more than one index on your Solr server (such as a build core and a live core), you plan to run other things on the server, or you want to increase your mergeFactor significantly, you might need to adjust the OS configuration to allow more file descriptors.

Thanks,
Shawn

-- Regards, Samarth
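[Editor's note: for checking and raising these limits on a typical Linux box, a sketch using standard commands; the 65536 value and the solr user name are illustrative.]

ulimit -n                      # show the current per-process FD limit
lsof -p <solr_pid> | wc -l     # rough count of files the Solr process has open

# raise the limit persistently, e.g. in /etc/security/limits.conf:
solr  soft  nofile  65536
solr  hard  nofile  65536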
Solr Open File Descriptors
Hi, is it safe to assume that with a mergeFactor of 10 the open file descriptors required by Solr would be around (1 + 10) * 10 = 110? ref: http://onjava.com/pub/a/onjava/2003/03/05/lucene.html#indexing_speed The Solr wiki (http://wiki.apache.org/solr/SolrPerformanceFactors#Optimization_Considerations) states that the FDs required per segment are around 7. Are these estimates appropriate? Do they in any way depend on the size of the index and the number of docs (assuming the same number of segments in any case) as well? -- Regards, Samarth
Field Cache
Hi, I have read that the Lucene FieldCache is used in faceting and sorting. Is it also populated/used when only selected fields are retrieved, via the 'fl' parameter or the 'included fields in collapse' parameters? Is it also used for collapsing? -- Regards, Samarth
Re: exception obtaining write lock on startup
In that case, why is there a separate SingleInstanceLockFactory?

On Fri, Dec 31, 2010 at 6:25 AM, Lance Norskog goks...@gmail.com wrote:

This will not work. At all. You can only have one Solr core instance changing an index.

On Thu, Dec 30, 2010 at 4:38 PM, Tri Nguyen tringuye...@yahoo.com wrote:

Hi, I'm getting this exception when I have 2 cores as masters. It seems like one of the cores obtains a lock (file) and then the other tries to obtain the same one; however, the first one is not deleted. How do I fix this?

Dec 30, 2010 4:34:48 PM org.apache.solr.handler.ReplicationHandler inform
WARNING: Unable to get IndexCommit on startup
org.apache.lucene.store.LockObtainFailedException: Lock obtain timed out: NativeFSLock@..\webapps\solr\tnsolr\data\index\lucene-fe3fc928a4bbfeb55082e49b32a70c10-write.lock
        at org.apache.lucene.store.Lock.obtain(Lock.java:85)
        at org.apache.lucene.index.IndexWriter.<init>(IndexWriter.java:1565)
        at org.apache.lucene.index.IndexWriter.<init>(IndexWriter.java:1421)
        at org.apache.solr.update.SolrIndexWriter.<init>(SolrIndexWriter.java:191)
        at org.apache.solr.update.UpdateHandler.createMainIndexWriter(UpdateHandler.java:98)
        at org.apache.solr.update.DirectUpdateHandler2.openWriter(DirectUpdateHandler2.java:173)
        at org.apache.solr.update.DirectUpdateHandler2.forceOpenWriter(DirectUpdateHandler2.java:376)
        at org.apache.solr.handler.ReplicationHandler.inform(ReplicationHandler.

Tri

-- Lance Norskog goks...@gmail.com
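[Editor's note: for context, the lock factory is selectable in solrconfig.xml. SingleInstanceLockFactory is an in-JVM lock for the case where exactly one IndexWriter in one process ever touches the index, while the 'native' and 'simple' types lock through the filesystem so separate processes or cores contend safely. A sketch of the Solr 1.4/3.x setting:]

<indexDefaults>
  <!-- one of: single | native | simple -->
  <lockType>native</lockType>
</indexDefaults>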
Collapsing with start, rows parameters
Hi, I am using collapsing with the start and rows parameters. For start=0 and rows=10 my query looks like:

q=f1:v1+AND+f2:v2&date:[*+TO+*]&rows=10&start=0&fl=rootId&collapse.field=rootId&collapse.threshold=1&collapse.type=normal&collapse.includeCollapsedDocs.fl=id

The same query with start=10 gives me an overlapping result, i.e. the last two of the first query's collapse groups appear in the second query's results as the first two groups. As the value of the start parameter increases, the number of overlapping groups changes somewhat arbitrarily; sometimes it is 3, 5, etc. I am working on a patch for SOLR-236 dated 2010-06-17 03:08 PM. Is this an issue that has been fixed?

Thanks for any pointers, Samarth
Total number of groups after collapsing
Hi, I have been using collapsing in my application. I have a requirement of finding the number of groups matching some filter criteria, something like a COUNT(DISTINCT columnName). The only solution I can currently think of is the query:

q=*:*&rows=Integer.MAX_VALUE&start=0&fl=score&collapse.field=abc&collapse.threshold=1&collapse.type=normal

I get the number of groups from 'numFound', but this seems like a bad solution in terms of performance. Is there a cleaner way?

Thanks, Samarth
Re: Total number of groups after collapsing
Hi, I figured out a better way of doing it. The following query is a better option:

q=*:*&start=2147483647&rows=0&collapse=true&collapse.field=abc&collapse.threshold=1

Thanks, Samarth

On Thu, Dec 23, 2010 at 8:57 PM, samarth s samarth.s.seksa...@gmail.com wrote:

Hi, I have been using collapsing in my application. I have a requirement of finding the number of groups matching some filter criteria, something like a COUNT(DISTINCT columnName). The only solution I can currently think of is the query:

q=*:*&rows=Integer.MAX_VALUE&start=0&fl=score&collapse.field=abc&collapse.threshold=1&collapse.type=normal

I get the number of groups from 'numFound', but this seems like a bad solution in terms of performance. Is there a cleaner way?

Thanks, Samarth
Glob in fl parameter
Hi, is there any support for globs in the 'fl' param? This would be very useful for retrieving dynamic fields. I have read the wiki page FieldAliasesAndGlobsInParams. Is there any related patch?

Thanks for any pointers, Samarth
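[Editor's note: glob support in 'fl' did land in Solr 4.x, as used in the "Error on using the projection parameter" thread above; a sketch, with illustrative field names:]

q=*:*&fl=id,E_*

[Every stored field matching the E_* pattern is returned alongside id.]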
Re: solr dynamic core creation
Hi nizan, I have the same requirement of creating cores on the fly. I was looking for some API provided by the HTTP Solr server. Currently I am working around it by running my own shell script on the server (the Solr server :)). Any better leads on the same?

Thanks, Samarth

On Thu, Nov 11, 2010 at 9:27 PM, Robert Sandiford bob.sandif...@sirsidynix.com wrote:

No - in reading what you just wrote, and what you originally wrote, I think the misunderstanding was mine, based on the architecture of my code. In my code, it is our 'server' level that does the SolrJ indexing calls, but you meant 'server' to be the Solr instance, and what you mean by 'client' is what I was thinking of (without thinking) as the 'server'... Sorry about that. Hopefully someone else can chime in on your specific issue...

--
View this message in context: http://lucene.472066.n3.nabble.com/solr-dynamic-core-creation-tp1867705p1883354.html
Sent from the Solr - User mailing list archive at Nabble.com.
Dynamically create new core
Hi, I have a requirement of dynamically creating new (master) cores. Each core should have a replicated slave core. I am working in Java and using SolrJ as my Solr client. I came across the CoreAdminRequest class, and it looks like the way to go:

CoreAdminRequest.createCore("NewCore1", "NewCore1", solrServer);

creates a new core programmatically. Also, for the newly created core, I want to use an existing solrconfig.xml and modify certain parameters. Can I achieve this using SolrJ? Are there any better approaches for the requirement? Thanks for any pointers,
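[Editor's note: putting the pieces together, a minimal SolrJ sketch of the createCore call; the core name, instance directory, and URL are illustrative. It assumes the instance directory, with a conf/ holding solrconfig.xml and schema.xml, already exists on the server, since CoreAdminRequest only registers the core and does not upload configuration.]

import org.apache.solr.client.solrj.SolrServer;
import org.apache.solr.client.solrj.impl.CommonsHttpSolrServer;
import org.apache.solr.client.solrj.request.CoreAdminRequest;

public class DynamicCoreCreation {
    public static void main(String[] args) throws Exception {
        // Point at the container-level URL, not at an individual core.
        SolrServer admin = new CommonsHttpSolrServer("http://localhost:8983/solr");
        // createCore(coreName, instanceDir, server)
        CoreAdminRequest.createCore("NewCore1", "NewCore1", admin);
    }
}

[Per-core variation of solrconfig.xml is not something SolrJ edits for you; the usual workaround is to template the file on disk, or to use ${...} property substitution driven by per-core properties, before issuing the create call.]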