Re: solr performance

2007-02-20 Thread Erik Hatcher
You could build your index using Lucene directly and then point a Solr instance at it once it's built. My suspicion is that the overhead of forming a document as an XML string and posting to Solr via HTTP won't be that much different than indexing with Lucene directly. My largest Solr

Re: Re[2]: solr performance

2007-02-20 Thread Erik Hatcher
On Feb 20, 2007, at 1:46 PM, Jack L wrote: The numbers vary quite a bit though, from 13 docs/s (Burkamp) to 250 docs/s (Walter) to 1000 docs/s I understand the results also depend on the doc size and hardware. my number 1000 was per minute, not second! however, i've done a few runs

Re: Starting an index...

2007-02-21 Thread Erik Hatcher
On Feb 21, 2007, at 4:37 PM, Jack L wrote: 2. For each index, do I need to copy this directory and start a solr instance? Is it possible to run one solr instance for multiple indices? Further on this than Hoss mentioned... you can share a common configuration among multiple Solr instances

Re: Re[4]: solr performance

2007-02-21 Thread Erik Hatcher
On Feb 21, 2007, at 4:25 PM, Jack L wrote: couple of times today at around 158 documents / sec. This is not bad at all. How about search performance? How many concurrent queries have people been having? What does the response time look like? I'm the only user :) What I've done is a

Re: Re[2]: Starting an index...

2007-02-21 Thread Erik Hatcher
On Feb 21, 2007, at 9:29 PM, Jack L wrote: Thanks Chris and Eric for the replies. Very helpful. no, each instance manages a single schema and a single data index -- but that schema can allow for various different types of documents that don't need to have anything in common. Does this

Re: Tagging

2007-02-23 Thread Erik Hatcher
On Feb 22, 2007, at 11:30 PM, Gmail Account wrote: I use solr for searching and facets and love it.. The performance is awesome. However I am about to add tagging to my application and I'm having a hard time deciding if I should just database my tags for now until a better solr solution

Re: WordNet Ontologies

2007-02-23 Thread Erik Hatcher
On Feb 23, 2007, at 5:33 PM, rubdabadub wrote: Does Solr supports ontology somehow? Has it been tried? Any tips on how should I go about doing so? What are you wanting to do exactly? Erik

Re: index browsing with solr

2007-02-24 Thread Erik Hatcher
On Feb 24, 2007, at 3:36 AM, Pierre-Yves LANDRON wrote: it will be easy to add. take a look at a simple SolrRequestHandler: http://svn.apache.org/repos/asf/lucene/solr/trunk/src/java/org/apache/solr/handler/IndexInfoRequestHandler.java this gets the IndexReader and writes out some stuff.

Re: index browsing with solr

2007-02-24 Thread Erik Hatcher
On Feb 24, 2007, at 6:26 AM, Erik Hatcher wrote: On Feb 24, 2007, at 3:36 AM, Pierre-Yves LANDRON wrote: it will be easy to add. take a look at a simple SolrRequestHandler: http://svn.apache.org/repos/asf/lucene/solr/trunk/src/java/org/apache/solr/handler/IndexInfoRequestHandler.java

Re: Re[2]: Solr and Multiple Index Partitions

2007-03-08 Thread Erik Hatcher
On Mar 7, 2007, at 9:20 PM, Jack L wrote: Selecting by type will do the job. But I suppose it sacrifice performance because having multiple document types in the same index will render a larger index. Is it bad? How many documents are we talking here? My hunch is you'll be fine :) Erik

Re: Hierarchical Facets

2007-03-09 Thread Erik Hatcher
On Mar 8, 2007, at 10:52 PM, Chris Hostetter wrote: or something like... <level1>Dir1</level1> <level2>Subdir1</level2> <level3>SubSubDir1</level3> ...but this is why Hierarchical facets are hard. I've not yet tackled hierarchical facets myself despite the demand being there. It seems there are

Re: production solr - app server choice ?

2007-03-10 Thread Erik Hatcher
On Mar 9, 2007, at 6:46 AM, rubdabadub wrote: On 3/9/07, Erik Hatcher [EMAIL PROTECTED] wrote: We use jetty on a few applications with no problem. I recommend it unless and until you outgrow it (but I doubt you will). Resin, in my past experience with it, is fantastic. But no need to even

Re: Restrict Servlet Access

2007-03-14 Thread Erik Hatcher
On Mar 14, 2007, at 11:09 AM, Brian Whitman wrote: The recommendation is to firewall off Solr so only your application server can access it. Solr is not at all designed for direct client (browser, etc) access. Assuming you lock down update properly, what's the problem? We are

Re: About field-specific analyzer

2007-03-16 Thread Erik Hatcher
On Mar 16, 2007, at 5:17 AM, shjiang wrote: I don't understand how solr make field-specific analysis possible. In the source code, they didn't use the PerFieldAnalyzerWrapper class. Can any one tell me something about that? It's configured through schema.xml. Solr has a fairly

Re: Bug ? unique id

2007-03-16 Thread Erik Hatcher
Why in the world would you want to analyze your unique id? Erik On Mar 16, 2007, at 6:07 AM, [EMAIL PROTECTED] wrote: Hello, we have been using Solr for a month now and we are running into a lot of trouble . one of the issues is a problem with the unique id field. can this

Re: cache sizes

2007-03-16 Thread Erik Hatcher
On Mar 16, 2007, at 2:21 PM, Andrew Nagy wrote: Is there a science to choosing cache sizes? I have about 500,000 records and am seeing a lot of evictions, about 50% of lookups. What factors can i look at to determine what my cache sizes should be? Roughly you could start with getting a

Re: faceted result

2007-03-16 Thread Erik Hatcher
You've got your field set to be analyzed, and it's using a stemmer. Chances are you don't intend to analyze the fields you're faceting on (and if you are doing that intentionally, performance caveats apply). Check that the field type is string and re-index. Erik On Mar 16, 2007,

Re: Simple Web Interface

2007-03-20 Thread Erik Hatcher
On Mar 20, 2007, at 9:36 AM, thomas arni wrote: Thanks for your hint. I looked at the features of Flare. I'm wondering if there is only a user interface for Rails. It looks like Flare mostly focuses on faceted browsing. Faceted browsing is not my first priority. I'm developing a full-text

Re: SV: Re: Simple Web Interface

2007-03-20 Thread Erik Hatcher
On Mar 20, 2007, at 5:18 PM, Antonio Eggberg wrote: Erik Hatcher [EMAIL PROTECTED] skrev: Faceting only appears in Flare when there are *_facet fields in your index. Flare is going to undergo another spurt of evolution over the next couple of weeks as I tease it apart into a Rails plugin

Re: Wildcards

2007-03-21 Thread Erik Hatcher
Lucene now supports *456* type queries, however it requires setting an attribute to allow leading wildcards on the QueryParser. Solr does not set this flag (that I can tell in my quick search) so I don't believe you can do this with Solr currently, until/unless an option is made to set

Re: return matched terms / fuzzy or wildcard searches

2007-03-23 Thread Erik Hatcher
On Mar 23, 2007, at 3:26 PM, Yonik Seeley wrote: On 3/23/07, Mike Klaas [EMAIL PROTECTED] wrote: On 3/23/07, Chris Hostetter [EMAIL PROTECTED] wrote: : But the response isn't highlighted using fuzzy or wildcard searches... Hmmm... this seems like a bug in the highlighting, using the

Re: Perform a search with query containing

2007-03-26 Thread Erik Hatcher
On Mar 26, 2007, at 7:52 AM, Thierry Collogne wrote: Hello, I have a field sitename that can contain a word with character, HR O. Problem is when I do the following query : sitename:HR O, I get search results that don't have HR O in the sitename field. Is it possible that there is

Re: C# API for Solr

2007-03-31 Thread Erik Hatcher
It would be great to have solr-ruby (the library formerly known as solrb) included with Solr distributions, as well as Flare too. It would give these libraries visibility and usability they'd not see if they required additional downloads or svn co. I can certainly say that solr-ruby does

embedding solr

2007-04-02 Thread Erik Hatcher
I have a client need to embed Solr behind an already built custom TCP/IP interface (currently for Lucene, but want to swap in Solr to benefit from its additional goodness of course). Have folks already done this? Experiences? Or perhaps there are some thoughts on why this may or may

Re: embedding solr

2007-04-02 Thread Erik Hatcher
with or without HTTP -- I know that's not what it was intended for, but solr makes lucene so much more manageable even without a server! On 4/2/07, Erik Hatcher [EMAIL PROTECTED] wrote: I have a client need to embed Solr behind an already built custom TCP/IP interface (currently for Lucene

Re: Access filterCache/queryResultCache/documentCache

2007-04-04 Thread Erik Hatcher
On Apr 4, 2007, at 7:28 PM, Ryan McKinley wrote: Is there / should there be a way to access the three core caches? there should. +1 I want to be able to programmatically check the cache sizes and make sure they are big enough for faceting. i could use the same thing! Erik

Re: Solr logo poll

2007-04-06 Thread Erik Hatcher
A On Apr 6, 2007, at 1:51 PM, Yonik Seeley wrote: Quick poll... Solr 2.1 release planning is underway, and a new logo may be a part of that. What form of logo do you prefer, A or B? There may be further tweaks to these pictures, but I'd like to get a sense of what the user community likes.

Re: Solr web service available?

2007-04-11 Thread Erik Hatcher
On Apr 11, 2007, at 2:25 AM, alartin wrote: I wonder is there a solr web service available? or I have to use tools like Apache httpClient to send requests and get responses? Many thanks. There currently is no SOAP interface to Solr, if that is what you mean. However, many consider data

Re: Python utilities for solr?

2007-04-15 Thread Erik Hatcher
There is a solr.py in the Solr clients directory: http://svn.apache.org/repos/asf/lucene/solr/trunk/client/python/solr.py It's got some utility methods for generating field's. Erik On Apr 15, 2007, at 6:47 PM, Jack L wrote: Doing queries is so easy with Python, thanks to

Re: Leading wildcards

2007-04-19 Thread Erik Hatcher
On Apr 19, 2007, at 6:56 AM, Michael Kimsal wrote: It's bugged us a little bit, because it's something that we need (to be able to emulate the previous foo LIKE '%bar%' SQL behaviour we're replacing), but can't offer our users yet. I have also run into this issue and have intended to fix

Re: Facet Browsing

2007-04-19 Thread Erik Hatcher
On Apr 19, 2007, at 9:32 AM, Jennifer Seaman wrote: Can anyone provide a quick tutorial on how to setup facet browsing? After a keyword search I just want to allow the user to narrow the results by category, then by state, then by city and then by company. Some sample code would be

Re: Leading wildcards

2007-04-19 Thread Erik Hatcher
On Apr 19, 2007, at 10:39 AM, Yonik Seeley wrote: On 4/19/07, Erik Hatcher [EMAIL PROTECTED] wrote: parser.setAllowLeadingWildcards(true); I have also run into this issue and have intended to fix up Solr to allow configuring that switch on QueryParser. Any reason

Re: Leading wildcards

2007-04-19 Thread Erik Hatcher
On Apr 19, 2007, at 11:04 AM, Michael Kimsal wrote: Perhaps I'm simplifying it a bit. It would certainly help out our comfort level to have it either be on or configurable by default, rather than having to maintain a 'patched' version (yes, the patch is only one line, but it's the

Re: Leading wildcards

2007-04-19 Thread Erik Hatcher
On Apr 19, 2007, at 11:37 AM, Michael Kimsal wrote: It's not that I don't *want* to contribute, but hardly have enough time to get the basics done some days. You can rest assured that all of us here are in that same boat. :) And you can also rest assured that the switch you're asking for

Re: Multiple indexes?

2007-04-19 Thread Erik Hatcher
Matthew, All that is meant by object_types is an additional stored/indexed field in the Solr schema that gets added to every document providing context of which type it is (shoes or brands). Then you can limit searches to a particular area by just filtering on type:shoes, for example.
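
A minimal sketch of the idea above, assuming a field literally named `type` and a hypothetical Solr instance at `http://localhost:8983/solr` (neither detail is confirmed beyond what the thread says). It applies the restriction as a filter query (`fq`), which is one way to do the filtering, and only builds the request URL without contacting a server:

```ruby
require 'uri'

# Build a select URL that limits a user query to one document type.
# The base URL and the "type" field name are illustrative assumptions.
def typed_search_url(user_query, type, base = 'http://localhost:8983/solr/select')
  params = { 'q' => user_query, 'fq' => "type:#{type}" }
  "#{base}?#{URI.encode_www_form(params)}"
end

puts typed_search_url('running', 'shoes')
```

Every document would need the same `type` field populated at index time (for example `shoes` or `brands`) for the filter to be meaningful.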

Re: Solr performance warnings

2007-04-19 Thread Erik Hatcher
On Apr 19, 2007, at 7:47 PM, Michael Thessel wrote: in my logs I get from time to time this message: INFO: PERFORMANCE WARNING: Overlapping onDeckSearchers=2 What does this mean? What can I do to avoid this? I think you have issued multiple commits (or optimizes) that hadn't fully

Re: [acts_as_solr] Few question on usage

2007-04-19 Thread Erik Hatcher
Sorry, I missed the original mail. Hoss has got it right. Personally I'd love to see acts_as_solr definitively come into the solr-ruby fold. Regarding your questions: : 1. What are other alternatives are available for ruby integration with solr : other than acts-as_solr plugin.

Re: Facet.query

2007-04-20 Thread Erik Hatcher
'=>{ 'status'=>0, 'QTime'=>105, 'params'=>{ 'wt'=>'ruby', 'rows'=>'0', 'facet.query'=>['ant', 'lucene'], 'facet'=>'on', 'indent'=>'on', 'q'=>'erik hatcher'}}, 'response'=>{'numFound'=>3,'start'=>0,'docs'=>[] }, 'facet_counts'=>{ 'facet_queries'=>{ 'ant

Re: Avoiding caching of special filter queries

2007-04-20 Thread Erik Hatcher
On Apr 20, 2007, at 7:11 AM, Burkamp, Christian wrote: I'm using filter queries to implement document level security with solr. The caching mechanism for filters separate from queries comes in handy and the system performs well once all the filters for the users of the system are stored in

Re: [acts_as_solr] Few question on usage

2007-04-21 Thread Erik Hatcher
On Apr 20, 2007, at 2:30 PM, solruser wrote: For pure Ruby access to Solr without a database, use solr-ruby. The 0.01 gem is available as gem install solr-ruby, but if you can I'd recommend you tinker with the trunk codebase too. Well I say, considering use of solr with rails application.

Re: [acts_as_solr] Few question on usage

2007-04-21 Thread Erik Hatcher
On Apr 21, 2007, at 9:42 PM, Erik Hatcher wrote: source = DataSource.new mapping = { :id => :isbn, :name => :author, :source => "BOOKS", :year => Proc.new {|record| record.date[0,4] }, } Solr::Indexer.index(source, mapping) do |orig_data, solr_document| solr_document[:timestamp] = Time.now

Re: expressing this logic

2007-04-25 Thread Erik Hatcher
What it probably boils down to is how you analyzed (or didn't) those fields. What is your schema for those fields? Erik On Apr 25, 2007, at 4:40 PM, Michael Kimsal wrote: leading and trailing at the same time don't work. :( This is supposedly fixed in a lucene nightly, but I

Re: case sensitivity

2007-04-26 Thread Erik Hatcher
On Apr 26, 2007, at 5:43 PM, Michael Kimsal wrote: I've looked through the mailing lists and can't find much of anything regarding case sensitivity. It seems SOLR is case sensitive by default - I'm using the default settings with a very basic schema - just text fields. All depends on the

Re: case sensitivity

2007-04-26 Thread Erik Hatcher
On Apr 26, 2007, at 6:03 PM, Michael Kimsal wrote: My colleague, after some digging, found in SolrQueryParser (around line 62) setLowercaseExpandedTerms(false); The default for Lucene is true. Was this intentional? Or an oversight? I was just about to respond that this is likely the

Re: numFound for facet results

2007-04-30 Thread Erik Hatcher
On Apr 30, 2007, at 11:16 AM, Yonik Seeley wrote: On 4/30/07, [EMAIL PROTECTED] [EMAIL PROTECTED] wrote: 2. I would like to be able to tell how many facet values are there total. (This would be a value like numFound for the results). Is there such a thing or a workaround like for 1.

Re: Delete from Solr index...

2007-05-01 Thread Erik Hatcher
If you want to do this as a single delete-by-query, you could OR all the clauses together: <delete><query>load_id:(20070424150841 OR 20070425145301)</query></delete> Erik On May 1, 2007, at 2:14 AM, Ryan McKinley wrote: escher2k wrote: I am trying to remove documents from my index
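
The OR'd delete-by-query described here can be sketched as a small helper. This is an illustration only: the helper name is made up, it assumes the values are simple tokens that need no query-parser escaping, and it just builds the XML message rather than posting it anywhere:

```ruby
require 'cgi'

# Build one delete-by-query message that ORs several values of a field.
# Assumes values are plain tokens (no spaces or query-parser specials).
def delete_by_values_xml(field, values)
  raise ArgumentError, 'values must not be empty' if values.empty?

  clause = "#{field}:(#{values.join(' OR ')})"
  # Escape XML-significant characters so the message stays well-formed.
  "<delete><query>#{CGI.escapeHTML(clause)}</query></delete>"
end

puts delete_by_values_xml('load_id', %w[20070424150841 20070425145301])
# prints <delete><query>load_id:(20070424150841 OR 20070425145301)</query></delete>
```

The same shape works for any single field whose values you want to remove in one request.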

Re: Ranking ApacheCon proposals

2007-05-01 Thread Erik Hatcher
On May 1, 2007, at 7:42 PM, ericp wrote: Cool, I noticed a ruby-Flare-Solr presentation too who is giving that? I proposed that one. Erik

Re: Re[3]: Multiple fq fields in URL

2007-05-13 Thread Erik Hatcher
Jack, On May 13, 2007, at 6:45 PM, Jack L wrote: 1. I didn't understand the part above in your reply. If I search for samsung camera, the query should be like this in the select URL: q=samsung+camera And if samsung is mandatory, the query will be like this: (or not:) q=+samsung+camera
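
One detail relevant to the URLs in this question: in a URL query string a raw `+` is decoded as a space, so a `+` meant as the mandatory-clause operator has to be percent-encoded as `%2B`. The sketch below only demonstrates that encoding with Ruby's standard library; the helper name is made up, and it does not reproduce the rest of the reply:

```ruby
require 'uri'

# Encode a query string for the q parameter of a select URL.
def encode_q(query)
  URI.encode_www_form({ 'q' => query })
end

puts encode_q('samsung camera')   # prints q=samsung+camera
puts encode_q('+samsung camera')  # prints q=%2Bsamsung+camera
```

In the first URL the `+` is just an encoded space; only the `%2B` in the second reaches the query parser as a literal plus sign.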

Re: TEI indexing

2007-05-22 Thread Erik Hatcher
On May 21, 2007, at 10:52 PM, Gary Browne wrote: I'm wondering if anyone has any hints on how to prepare TEI documents for indexing - I was about to write some XSLT but didn't want to reinvent the wheel (unless it's punctured)? I'm using Ruby to index TEI files, and leveraging the XPathMapper

Re: Interesting Practical Solr Question

2007-05-22 Thread Erik Hatcher
On May 22, 2007, at 9:58 AM, [EMAIL PROTECTED] wrote: I use Solr to search through a set of about 200,000 documents. Each document has a numeric ID. How to do the following: 1) I use facets and want to return the facets for all documents as the starting point of the user interface. In

Re: Interesting Practical Solr Question

2007-05-22 Thread Erik Hatcher
On May 22, 2007, at 10:07 AM, Will Johnson wrote: 2) Each document will be shown to the user with a check box next to it. I want the user to be able to select certain documents and save their ids some where else. This is not a problem. However, I also want to give the user an ability to say

Re: Interesting Practical Solr Question

2007-05-22 Thread Erik Hatcher
On May 22, 2007, at 11:31 AM, Martin Grotzke wrote: You need to specify the constraints (facet.query or facet.field params) Too bad, so we would have either to know the schema in the application or provide queries for index metadata / the schema / faceting info. However, the

Re: Interesting Practical Solr Question

2007-05-22 Thread Erik Hatcher
On May 22, 2007, at 1:36 PM, Martin Grotzke wrote: For sure, perhaps the schema field element could be extended by an attribute isfacet There is no effective difference between a facet field and any other indexed field. What fields are facets is application specific and not really

Re: compile error with SOLR 69 MoreLikeThis patch

2007-05-24 Thread Erik Hatcher
Andrew, Nightlies are available here: http://people.apache.org/builds/lucene/solr/nightly/ (a link exists on the wiki main page, for future reference). Erik On May 24, 2007, at 2:28 PM, Andrew Nagy wrote: While I am on this topic, I think it might be nice to have a nightly

Re: add and delete docs at same time

2007-05-24 Thread Erik Hatcher
On May 24, 2007, at 3:47 PM, Ryan McKinley wrote: currently no. Right now you even need a new request for each delete... Unless you used delete-by-query with the id's OR'd: <delete><query>id:1 OR id:2 OR id:3</query></delete> Patrick Givisiez wrote: can I add and delete docs at same

Re: AW: Re[2]: add and delete docs at same time

2007-05-25 Thread Erik Hatcher
Just to be clear, [* TO *] does not necessarily return all documents. It returns all documents that have a value in the specified (or default) field. Be careful with that! *:*, however, does match all documents. Erik On May 25, 2007, at 5:49 AM, Burkamp, Christian wrote:
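
The distinction can be illustrated with a tiny in-memory stand-in for an index. This is not Solr code; the documents and the `title` field are invented purely to show which documents each kind of query would select:

```ruby
# Invented sample documents; document 2 has no value in the title field.
def sample_docs
  [
    { id: 1, title: 'Lucene in Action' },
    { id: 2, title: nil },
    { id: 3, title: 'Ant' }
  ]
end

# Analogue of title:[* TO *]: only documents with a value in that field.
def with_value(docs, field)
  docs.reject { |doc| doc[field].nil? }
end

# Analogue of *:*: every document, whatever its fields contain.
def match_all(docs)
  docs
end

p with_value(sample_docs, :title).map { |doc| doc[:id] }  # prints [1, 3]
p match_all(sample_docs).map { |doc| doc[:id] }           # prints [1, 2, 3]
```

Document 2 is the case to watch for: an open-ended range on a field silently skips it.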

Re: Indexing a lot of documents?

2007-06-01 Thread Erik Hatcher
On Jun 1, 2007, at 10:47 PM, Mike Klaas wrote: Am I just doing something wrong? No. Lucene sometimes just requires many file descriptors (this will be somewhat alleviated with Solr 1.2). I suggest upping the open file limit (I upped mine from 1024 to 45000 to handle huge indices).

Re: SOLVED Re: custom writer, working but... a strange exception in logs

2007-06-06 Thread Erik Hatcher
On Jun 6, 2007, at 5:32 PM, Chris Hostetter wrote: : It's the favicon.ico effect. : Nothing in logs when the class is requested from curl, but with a : browser (here Opera), begin a response with html, and it requests for : favicon.ico. HA HA HA HA that's freaking hilarious. One

Re: storing the document URI in the index

2007-06-12 Thread Erik Hatcher
On Jun 12, 2007, at 8:51 AM, Ard Schrijvers wrote: is it possible to configure solr to store the document URI in the lucene index (the URI is not an xml field, but just the document's location)? Yes. Set the field to be stored and non-indexed, field type string is what I use. Or is

Re: Can we change structure of add command?

2007-06-13 Thread Erik Hatcher
Vika - no, Solr's add-document XML syntax is not flexible in the way you've described. Solr fronts a Lucene index. A Lucene index is made up of Documents which have Fields. Fields are a flat structure, not hierarchical. The trick to leveraging Solr and Lucene successfully is in the

Fwd: Call for Papers Opens for OS Summit Asia 2007

2007-06-15 Thread Erik Hatcher
Begin forwarded message: From: J Aaron Farr [EMAIL PROTECTED] Call for Papers Opens for OS Summit Asia 2007 The call for papers is now open for OS Summit Asia, to be held November 26-30 at the Cyberport in Hong Kong. This joint conference between the Apache Software Foundation and the

Re: How to do a fuzzy query

2007-06-24 Thread Erik Hatcher
On Jun 23, 2007, at 11:24 PM, Jack L wrote: I have some documents, each has a number of tags. I'd like to have a query to return similar documents which share largest number of tags with a given document. For example, if I have doc that has 4 tags, and I'd like to return docs that also have

Re: Re[2]: How to do a fuzzy query

2007-06-25 Thread Erik Hatcher
On Jun 25, 2007, at 3:43 PM, Jack L wrote: MoreLikeThis is interesting. So in order to use it through the MoreLikeThisHandler, I should use the unique field in the q param to uniquely identify the this document? Or, does it also support a more common query and works as More Like These just like

Re: delete by query multiple Ids

2007-06-26 Thread Erik Hatcher
On Jun 26, 2007, at 6:46 AM, michael ravits wrote: hello solrs is it possible to query multiple specific ids? something like this: <delete><query>mediaId:6720,6721,6722,8762,9754</query></delete> sure, but you need to use proper query parser syntax: mediaId:(6720 OR 6721 OR )

Re: differen locations for config files and Data files if using Java System Properties

2007-07-05 Thread Erik Hatcher
On Jul 4, 2007, at 12:00 PM, Ryan McKinley wrote: in solrconf.xml I found this entry, which is now uncommented <dataDir>${solr.data.dir:./solr/data}</dataDir> before it was <!-- <dataDir>./solr/data</dataDir> --> Don't know if this is the desired behaviour. How should I change the entry

Re: Same record belonging to multiple facets

2007-07-05 Thread Erik Hatcher
Yup, it's that simple! :) Erik On Jul 5, 2007, at 5:42 PM, Thiago Jackiw wrote: Is it that simple? Cool, I'll give it a try. -- Thiago Jackiw On 7/5/07, Martin Grotzke [EMAIL PROTECTED] wrote: On Thu, 2007-07-05 at 12:39 -0700, Thiago Jackiw wrote: Is there a way for a record

Re: Solr and Chines/Japenese

2007-07-27 Thread Erik Hatcher
On Jul 27, 2007, at 6:17 AM, Erik Hatcher wrote: On Jul 26, 2007, at 10:26 PM, Sundling, Paul wrote: Are there any known Solr sites that are in Chinese or Japenese? This might be the first mention of this project in the Solr community, and I'm certainly not confident our server can

Re: Solr and Chines/Japenese

2007-07-27 Thread Erik Hatcher
On Jul 26, 2007, at 10:26 PM, Sundling, Paul wrote: Are there any known Solr sites that are in Chinese or Japenese? This might be the first mention of this project in the Solr community, and I'm certainly not confident our server can handle the load but here goes anyway :)

embedded solr: write lock issue

2007-08-06 Thread Erik Hatcher
I'm working on a project that embeds Solr, much like the EmbeddedSolr example posted here http://wiki.apache.org/solr/EmbeddedSolr. The application generally runs fine, with very rapid handling of indexing and search requests, however at heavy load we're experiencing Lock obtain timed out:

Re: embedded solr: write lock issue

2007-08-06 Thread Erik Hatcher
Thanks Mike and Yonik! I've upgraded the project to trunk Solr, added in the SingleInstanceLockFactory setting and bumped the write lock timeout. I personally haven't duplicated the issue (all works fine on my development box) but the client will give it a try in their test environment

Re: Best use of wildcard searches

2007-08-10 Thread Erik Hatcher
On Aug 9, 2007, at 4:49 PM, Yonik Seeley wrote: lo - these things can happen when you get too many levels of escaping needed. Hopefully we can improve the situation in the future to get rid of the query parser escaping for certain queries such as prefix and term. +1 :) this is

Re: how to retrieve all the documents in an index?

2007-08-20 Thread Erik Hatcher
Yes - they come back in the order indexed. Erik On Aug 19, 2007, at 7:20 PM, Yu-Hui Jin wrote: BTW, Hoss, is there a default order for the documents returned by running this query? thanks, -Hui On 8/16/07, Chris Hostetter [EMAIL PROTECTED] wrote: : Any of you know whether

Re: Embedded solr - reload searcher

2007-08-21 Thread Erik Hatcher
For other Solr instances (whether embedded or not) to refresh their index searchers, send a <commit/> message to them. Erik On Aug 21, 2007, at 7:33 AM, sinking wrote: Hello, I have tried to use the EmbeddedSolr (http://wiki.apache.org/solr/EmbeddedSolr) because i want to work

Re: Replacing existing documents

2007-08-22 Thread Erik Hatcher
On Aug 21, 2007, at 9:25 PM, Lance Norskog wrote: Recently someone mentioned that it would be possible to have a 'replace existing document' feature rather than just dropping and adding documents with the same unique id. There is such a patch:

Re: Embedded about 50% faster for indexing

2007-08-26 Thread Erik Hatcher
On Aug 24, 2007, at 5:29 PM, Wu, Daniel wrote: Theoretically and practically, embedded solution will be faster than going through http/xml. I would like to see solr has some sort of document source adapter architecture which will iterate through all the documents available in the document

Re: range index

2007-08-27 Thread Erik Hatcher
On Aug 27, 2007, at 9:32 AM, Jae Joo wrote: Is there any way to catagorize by price range? I would like to do facet by price range. (ex. 100-200, 201-500, 501-1000, ...) Yes, look at using facet queries using range queries. There is an example of this very thing here:
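
As a rough sketch of the approach, one `facet.query` range query can be issued per price bucket. The field name `price` is an assumption taken from the question, the helper name is made up, and the code only builds the query string (it is not the example the message links to):

```ruby
require 'uri'

# Build select parameters with one facet.query range per price bucket.
# The "price" field name is assumed; adjust to the real schema.
def price_facet_params(query, buckets, field = 'price')
  params = [['q', query], ['rows', '0'], ['facet', 'on']]
  buckets.each do |low, high|
    params << ['facet.query', "#{field}:[#{low} TO #{high}]"]
  end
  URI.encode_www_form(params)
end

puts price_facet_params('*:*', [[100, 200], [201, 500], [501, 1000]])
```

Each `facet.query` comes back with its own count, so the application labels the buckets itself. With `rows=0` only the counts are returned, not the documents.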

Re: range index

2007-08-27 Thread Erik Hatcher
that, but no real reason to with Solr's caching making the range buckets fast at query time. Could you elaborate on what you are trying to do? Erik Thanks, Jae On 8/27/07, Erik Hatcher [EMAIL PROTECTED] wrote: On Aug 27, 2007, at 9:32 AM, Jae Joo wrote: Is there any way

Re: Heap size error during indexing

2007-09-01 Thread Erik Hatcher
On Sep 1, 2007, at 9:28 AM, Jae Joo wrote: I have a Java Heap size problem during indexing for 13 millions doc. under linux using post.sh (optimized). each document size is about 2k. Is there any way to set java heap size in post.sh under tomcat? post.sh is a Solr *client*. Your heap size

Re: performance questions

2007-09-01 Thread Erik Hatcher
or the javax extension methods? What about the new release of jython? Erik On 8/30/07 6:57 PM, Erik Hatcher [EMAIL PROTECTED] wrote: On Aug 30, 2007, at 6:31 PM, Mike Klaas wrote: Another reason why people use stored procs is to prevent multiple round-trips in a multi-stage query

Re: updates on the server

2007-09-03 Thread Erik Hatcher
On Sep 3, 2007, at 12:22 AM, James O'Rourke wrote: Is there a way to pass the solr server a set of documents without all the fields present and only update the fields that are provided leaving the remaining document fields intact or do I need to pull those documents over the wire myself

Re: Multiple Values -Structured?

2007-09-04 Thread Erik Hatcher
multiValued fields retain their order, for the record. Erik On Sep 4, 2007, at 12:37 AM, Jed Reynolds wrote: One of the difficulties that you're going to find with multi-valued fields is that they are an unordered collection without relation. If you have a document with a list of

Re: Embedded SOLR using the SOLR collection distribution

2007-09-05 Thread Erik Hatcher
On Sep 5, 2007, at 3:30 AM, Dilip.TS wrote: I would like to know if I can implement the Embedded SOLR using the SOLR collection distribution? Partly... the rsync method of getting a master index to the slaves would work, but you'd need a way to <commit/> to the slaves so that they reload

Re: Can't get 1.2 running under Tomcat 5.5

2007-09-05 Thread Erik Hatcher
I guess my warning is more because I play on the edge and have several times ended up tweaking various apps solrconfig.xml's as I upgraded them to keep things working. Anyway, we'll all agree that diff'ing your config files with the example app can be useful. Erik On Sep 5,

Re: Tagging using SOLR

2007-09-06 Thread Erik Hatcher
On Sep 6, 2007, at 3:29 AM, Doss wrote: We are running an application built using SOLR, now we are trying to build a tagging system using the existing SOLR indexed field called tag_keywords, this field has different keywords separated by comma, please give suggestions on how can we build

Re: updates on the server

2007-09-06 Thread Erik Hatcher
On Sep 6, 2007, at 2:56 PM, Matthew Runo wrote: On a related note, it'd be great if we could set up a series of transformations to be done on data when it comes into the index, before being indexed. I guess a custom tokenizer might be the best way to do this though..? ie: -Post -Data is

Re: Tagging using SOLR

2007-09-07 Thread Erik Hatcher
, Mohandoss On 9/6/07, Erik Hatcher [EMAIL PROTECTED] wrote: On Sep 6, 2007, at 3:29 AM, Doss wrote: We are running an application built using SOLR, now we are trying to build a tagging system using the existing SOLR indexed field called tag_keywords, this field has different keywords

Lucene/Solr OnTheRoad

2007-09-07 Thread Erik Hatcher
I just added brief mentions of some upcoming Lucene/Solr-related events to this page: http://wiki.apache.org/lucene-java/OnTheRoad Below is some self-promotion of an upcoming class I have agreed to teach. It's uncomfortable to send this sort of thing out, but if I don't then you might

Re: Space costs of dynamic fields?

2007-09-09 Thread Erik Hatcher
On Sep 7, 2007, at 4:32 PM, Lance Norskog wrote: Otherwise, the only downside of dynamic fields is that you can't say, give me fields a*_t but not b*_t in a query. I haven't found others in the mail archives or the wiki. We're gonna fix this one. It's part of this issue for now:

Re: New user question: How to show all stored fields in a result

2007-09-10 Thread Erik Hatcher
On Sep 10, 2007, at 3:07 PM, Mike Klaas wrote: On 10-Sep-07, at 11:54 AM, melkink wrote: The other change I made (which may or may not have contributed to the solution) was to remove all line breaks from the text being submitted to the doctext field. The line breaks were causing solr to

Re: Using Ruby to POST to Solr

2007-09-11 Thread Erik Hatcher
Matt, Try this instead: gem install solr-ruby # ;) Then in irb or wherever: solr = Solr::Connection.new("http://localhost:8983/solr") solr.add(:id => 123, :title => "insert title here") solr.commit solr.query("title") Visit us over on the [EMAIL PROTECTED] e-mail list for more on

Re: New user question: How to show all stored fields in a result

2007-09-11 Thread Erik Hatcher
On Sep 11, 2007, at 5:13 PM, melkink wrote: Erik Hatcher wrote: melkink - are you using solr-ruby? If so, that bug has been fixed in later versions ;) Erik Erik, Indeed I am! Thanks for letting me know that there's a new version available that fixes this bug. Like I

Re: Facets Fields not Indexed?

2007-09-13 Thread Erik Hatcher
On Sep 13, 2007, at 12:50 PM, Matthew Runo wrote: Can you not facet on fields which are not indexed? Am I missing something here? No. Faceting works off of terms, which are either the exact field value for unanalyzed fields, or the tokens that result from the configured analyzer.
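
To make the point about terms concrete, here is a toy facet counter. It is an illustration only and not Solr internals: the sample values and the whitespace-and-lowercase "analyzer" are invented, and it simply counts how many documents contain each term:

```ruby
# Count documents per term. With no block, the exact field value is the
# single term (an unanalyzed field); with a block, the block plays the
# role of the analyzer and returns the tokens for a value.
def facet_counts(values, &analyzer)
  analyzer ||= ->(value) { [value] }
  counts = Hash.new(0)
  values.each do |value|
    analyzer.call(value).uniq.each { |term| counts[term] += 1 }
  end
  counts
end

categories = ['Running Shoes', 'Running Shoes', 'Dress Shoes']

p facet_counts(categories)
# prints {"Running Shoes"=>2, "Dress Shoes"=>1}
p facet_counts(categories) { |value| value.downcase.split }
# prints {"running"=>2, "shoes"=>3, "dress"=>1}
```

The second call shows why faceting on an analyzed field tends to surprise people: the facet values are the tokens, not the original field values.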

Re: Synchronize large number of records with Solr

2007-09-14 Thread Erik Hatcher
Cuong, I accomplished this (in Collex) by attaching a batch number to each document. When indexing a batch (or source), a GUID is generated and every document from that batch/source gets that same identifier attached to it. At the end of the indexing run, I delete everything with that

Re: hl.snippets per field overide

2007-09-14 Thread Erik Hatcher
On Sep 14, 2007, at 12:33 PM, Nathaniel E. Powell wrote: http://wiki.apache.org/solr/HighlightingParameters#head-23ecd5061bc2c86a561f85dc1303979fe614b956 where it talks about the hl.snippets parameter, it says that it can be overridden on a per-field basis. I haven't been able to find any

Re: Batch indexing a large number of records

2007-09-14 Thread Erik Hatcher
On Sep 14, 2007, at 8:19 AM, Thompson,Roger wrote: I am embarking on re-engineering an application using Solr/Lucene (If you'd like to see the current manifestation go to: fictionfinder.oclc.org). The database for this application consists of approximately 1.4 million records of varying size

Re: Searching items with in the search results with SOLR

2007-09-18 Thread Erik Hatcher
On Sep 18, 2007, at 2:45 AM, Dilip.TS wrote: Is it possible to Search items with in the search results using SOLR. If so how? Simply AND the previous query to the new query, or use the previous query as a filter query (fq=...) parameter. Erik
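
Both options from this reply can be sketched as query-string builders. The helper names are made up and nothing is sent to a server; the sketch only shows how the two requests differ:

```ruby
require 'uri'

# Option 1: AND the previous query to the new query in q.
def refine_with_and(previous_q, new_q)
  URI.encode_www_form({ 'q' => "(#{previous_q}) AND (#{new_q})" })
end

# Option 2: search for the new query, filtered by the previous one (fq).
def refine_with_fq(previous_q, new_q)
  URI.encode_www_form({ 'q' => new_q, 'fq' => previous_q })
end

puts refine_with_and('java', 'lucene')
puts refine_with_fq('java', 'lucene')
```

The two are not interchangeable for ranking: with `fq` the earlier query only restricts the result set, whereas the AND form lets both queries contribute to the score.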

Re: Search for Java Programming vs "Java Programming"

2007-09-18 Thread Erik Hatcher
On Sep 18, 2007, at 7:14 AM, Dilip.TS wrote: Hi, I have the following requirement: When the user searches for the keyword say Java Programming , the user should be shown the results satisfying the condition Java AND Programming. But when he types "Java Programming" (i.e. within double

Re: What is facet?

2007-09-27 Thread Erik Hatcher
On Sep 26, 2007, at 7:28 PM, Chris Hostetter wrote: cool = (popularity:[100 TO *] (+numFeatures:[10 TO *] +price:[0 TO 10])) lame = (+popularity:[* TO 99] +numFeatures:[* TO 9] +price:[11 TO *]) That example is definitely in the cool category. I couldn't resist creating a

Re: custom sorting

2007-09-27 Thread Erik Hatcher
On Sep 27, 2007, at 2:50 PM, Chris Hostetter wrote: to answer the broader question of using customized Lucene SortComparatorSource objects in solr -- it is in fact possible. In Solr, all decisions about how to sort are driven by FieldTypes. You can subclass any of the FieldTypes that come

Re: Does Solr Have?

2007-10-04 Thread Erik Hatcher
On Oct 4, 2007, at 4:38 AM, Robert Young wrote: 1. Is there a REST interface for getting index stats? I would particularly like access to terms and their document frequencies, preferably filtered by a query. Yes, the Luke request handler provides deeper views into the index information:
