Companies Using Solr
Hey Folks,

Reminder: http://wiki.apache.org/solr/PublicServers lists the sites using Solr. The listing is a bit thin. I know many people don't know about the list or don't have the time to add themselves to it. I'd like to be able to promote open sourcing more systems (like Solr), and this information would help show that it is helping a large community.

Feel free to reply directly to me and I can add you.

Thanks.

--cw

Clay Webster
Associate VP, Platform Infrastructure
CNET, Inc. (Nasdaq:CNET)
Re: Request for graphics
'k. see SOLR-368.

--cw

On 9/28/07, Yonik Seeley [EMAIL PROTECTED] wrote:
> On 9/28/07, Clay Webster [EMAIL PROTECTED] wrote:
> > i'm late for dinner out, so i'm just attaching it here.
>
> Most attachments are stripped :-)
>
> -Yonik
Re: Any clever ideas to inject into solr? Without http?
Condensing the loader into a single executable sounds right if you have performance problems. ;-)

You could also try adding multiple docs in a single post if you notice your problems are with tcp setup time, though if you're doing localhost connections that should be minimal.

If you're already local to the solr server, you might check out the CSV slurper. http://wiki.apache.org/solr/UpdateCSV It's a little specialized.

And then there's of course the question of are you doing full re-indexing or incremental indexing of changes?

--cw

On 8/9/07, Kevin Holmes [EMAIL PROTECTED] wrote:
> I inherited an existing (working) solr indexing script that runs like this:
>
> Python script queries the mysql DB then calls bash script
> Bash script performs a curl POST submit to solr
>
> We're injecting about 1000 records / minute (constantly), frequently pushing the edge of our CPU / RAM limitations.
>
> I'm in the process of building a Perl script to use DBI and lwp::simple::post that will perform this all from a single script (instead of 3).
>
> Two specific questions
> 1: Does anyone have a clever (or better) way to perform this process efficiently?
> 2: Is there a way to inject into solr without using POST / curl / http?
>
> Admittedly, I'm no solr expert - I'm starting from someone else's setup, trying to reverse-engineer my way out.
>
> Any input would be greatly appreciated.
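The suggestion above of adding multiple docs in a single post could look roughly like the sketch below. It only builds the Solr XML `<add>` messages; sending each one to the same update URL the existing curl script posts to is left out. The field names (`id`, `title`), the batch size, and the sample rows are assumptions for illustration, not part of the original setup.

```python
import xml.etree.ElementTree as ET


def build_add_message(docs):
    """Build one Solr XML <add> message holding every document in `docs`."""
    add = ET.Element("add")
    for doc in docs:
        doc_el = ET.SubElement(add, "doc")
        for name, value in doc.items():
            field = ET.SubElement(doc_el, "field", name=name)
            field.text = str(value)
    return ET.tostring(add, encoding="unicode")


def batches(records, size):
    """Yield successive lists of at most `size` records."""
    for start in range(0, len(records), size):
        yield records[start:start + size]


# Hypothetical rows, standing in for the result of the MySQL query.
records = [{"id": i, "title": f"record {i}"} for i in range(5)]

# One payload (and so one HTTP POST) per batch instead of one per record.
payloads = [build_add_message(batch) for batch in batches(records, 2)]
print(len(payloads))
print(payloads[0])
```

With a batch size of a few hundred, 1000 records a minute would need only a handful of POSTs per minute, which is where any saving on connection setup would come from.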
Re: Any clever ideas to inject into solr? Without http?
If it's a contention between search and indexing, separate them via a query-slave and an index-master.

--cw

On 8/9/07, David Whalen [EMAIL PROTECTED] wrote:
> What we're looking for is a way to inject *without* using curl, or wget, or any other http-based communication. We'd like for the HTTP daemon to only handle search requests, not indexing requests on top of them.
>
> Plus, I have to believe there's a faster way to get documents into solr/lucene than using curl
>
> _
> david whalen
> senior applications developer
> eNR Services, Inc.
> [EMAIL PROTECTED]
> 203-849-7240
>
> -----Original Message-----
> From: Clay Webster [mailto:[EMAIL PROTECTED]
> Sent: Thursday, August 09, 2007 11:43 AM
> To: solr-user@lucene.apache.org
> Subject: Re: Any clever ideas to inject into solr? Without http?
>
> Condensing the loader into a single executable sounds right if you have performance problems. ;-)
>
> You could also try adding multiple docs in a single post if you notice your problems are with tcp setup time, though if you're doing localhost connections that should be minimal.
>
> If you're already local to the solr server, you might check out the CSV slurper. http://wiki.apache.org/solr/UpdateCSV It's a little specialized.
>
> And then there's of course the question of are you doing full re-indexing or incremental indexing of changes?
>
> --cw
>
> On 8/9/07, Kevin Holmes [EMAIL PROTECTED] wrote:
> > I inherited an existing (working) solr indexing script that runs like this:
> >
> > Python script queries the mysql DB then calls bash script
> > Bash script performs a curl POST submit to solr
> >
> > We're injecting about 1000 records / minute (constantly), frequently pushing the edge of our CPU / RAM limitations.
> >
> > I'm in the process of building a Perl script to use DBI and lwp::simple::post that will perform this all from a single script (instead of 3).
> >
> > Two specific questions
> > 1: Does anyone have a clever (or better) way to perform this process efficiently?
> > 2: Is there a way to inject into solr without using POST / curl / http?
> >
> > Admittedly, I'm no solr expert - I'm starting from someone else's setup, trying to reverse-engineer my way out.
> >
> > Any input would be greatly appreciated.
Re: To cluster, or not to cluster...
On 3/24/06, Robert Haycock [EMAIL PROTECTED] wrote:
> Is it/will it be possible to cluster solr? We have a distributed system and it would be nice if we could replicate the index to improve performance.

Solr does not have replication. But it does have a very nice index distribution system.

Solr can be run in a master/slave setup. The master receives all the changes. On each commit, snapshooter can take a snapshot of the index. The slaves can run the snappuller with whatever polling frequency they like. Each snapshot is then snapinstalled in the slave and can have its cache warmed (while the slave keeps serving queries from the older index).

Slaves can bring new indexes online out of sync with one another. But if your slave hardware is the same, your pulling and shooting are well understood, and you make warming time-based, it probably will not be a problem.

Each slave notes this distribution on the master. That's as tied together as they get (not much). So, if you have a requirement that they must all be in index-version-sync, you could tie them closer and extend Solr.

--cw
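To make the distribution flow above concrete, here is a toy in-memory model of it. It is not Solr's scripts or API: the `Master` and `Slave` classes, the integer snapshot versions, and the `slave_status` map are all hypothetical stand-ins for snapshooter, snappuller/snapinstaller, and the status each slave leaves on the master.

```python
class Master:
    """Toy index master: every commit produces a new snapshot."""

    def __init__(self):
        self.version = 0
        self.snapshots = []     # snapshot versions, oldest first
        self.slave_status = {}  # slave name -> last version it installed

    def commit(self):
        self.version += 1
        self.snapshots.append(self.version)  # the "snapshooter" step


class Slave:
    """Toy query slave that polls the master on its own schedule."""

    def __init__(self, name, master):
        self.name = name
        self.master = master
        self.installed = 0

    def poll(self):
        """Pull and install the newest snapshot if it is newer than ours."""
        if not self.master.snapshots:
            return False
        latest = self.master.snapshots[-1]
        if latest <= self.installed:
            return False
        self.installed = latest                       # "snapinstall"
        self.master.slave_status[self.name] = latest  # noted on the master
        return True


master = Master()
fast = Slave("fast", master)
slow = Slave("slow", master)

master.commit()
fast.poll()
slow.poll()

master.commit()
fast.poll()  # "slow" has not polled again yet
out_of_sync = (fast.installed, slow.installed)
print(out_of_sync)

slow.poll()  # the next poll brings "slow" up to date
print(fast.installed, slow.installed, master.slave_status)
```

The window between the two polls is the out-of-sync period described above; how long it lasts depends only on each slave's polling frequency and warming time.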