positionIncrementGap - what is its value meaning?

2008-03-26 Thread Vinci
Hi all, While I was changing the default schema.xml, I found this attribute where the analyzer is defined. It seems to add some space when multiple fields appear in a document, but what effect does it have on queries, and what do its values mean? Thank you, Vinci

Re: Highlighting Quoted Phrases

2008-03-26 Thread Vinci
Hi, Would it be easier to turn off highlighting while viewing the full document (keeping summary highlighting available) and use JavaScript to do the matching? (As long as we need highlighting only when looking at a specific document at runtime.) Thank you, Vinci Brian Whitman wrote:

RE: Update a field without reindexing the entire document?

2008-03-26 Thread Ard Schrijvers
Hello Otis, I have been looking for something similar for Jackrabbit's lucene index, but I still have some uncertainty about whether I understand correctly what the patches in SOLR-139 supply: Do they just retrieve formerly stored fields of a lucene Document, change some field, and then analyze

Re: Update a field without reindexing the entire document?

2008-03-26 Thread Vinci
Hi Otis, One question: If the target field is a multi-value field, what will be the consequence of the update for SOLR-139: overriding or appending? Thank you, Vinci Otis Gospodnetic wrote: Hi Galen, See SOLR-139 (this is from memory) issue in JIRA. Doable, but not in Solr nightlies

Re: Update schema.xml without restarting Solr?

2008-03-26 Thread solr
Quoting Ryan McKinley [EMAIL PROTECTED]: In general, you need to be very careful when you change the schema without reindexing. Many changes will break all search, some may be just fine. for example, if you change sint to slong anything already indexed as an sint will be incompatible with the

Re: positionIncrementGap - what is its value meaning?

2008-03-26 Thread Erik Hatcher
On Mar 26, 2008, at 3:11 AM, Vinci wrote: While I was changing the default schema.xml, I found this attribute where the analyzer is defined. It seems to add some space when multiple fields appear in a document, but what effect does it have on queries, and what do its values mean? Suppose

Re: Update schema.xml without restarting Solr?

2008-03-26 Thread Jeryl Cook
Top often requested feature: 1. Make the option on using the RAMDirectory to hook in Terracotta( billion(s) of items in an index anyone?..it would be possible using this.) 2. Make the schema.xml configurable at runtime, not really sure the best way to address this, because changing the schema

Re: Update schema.xml without restarting Solr?

2008-03-26 Thread solr
Quoting Jeryl Cook [EMAIL PROTECTED]: 2. Make the schema.xml configurable at runtime, not really sure the best way to address this, because changing the schema would require re-indexing the documents. Isn't the best way to address this just to leave it to the persons that integrate solr

Re: Update schema.xml without restarting Solr?

2008-03-26 Thread solr
Quoting Daniel Papasian [EMAIL PROTECTED]: [EMAIL PROTECTED] wrote: Quoting Jeryl Cook [EMAIL PROTECTED]: 2. Make the schema.xml configurable at runtime, not really sure the best way to address this, because changing the schema would require re-indexing the documents. Isn't the best way to

Solr commits automatically on appserver shutdown

2008-03-26 Thread Noble Paul നോബിള്‍ नोब्ळ्
Hi, If my appserver fails during an update, or if I do a planned shutdown without wanting to commit my changes, Solr does not seem to allow it: it commits whatever unfinished changes there are. Is this by design? Can I change this behavior? --Noble

Index corruption makes it return a different result

2008-03-26 Thread Lucas F. A. Teixeira
Hello all! I had a problem this week, and I'd like to share it with you all. My WebLogic server that generates my index throws its logs onto a shared storage. During my indexing process (SOLR+Lucene), this shared storage became 100% full, and everything collapsed (all servers that use this shared

Re: Update a field without reindexing the entire document?

2008-03-26 Thread Erik Hatcher
On Mar 26, 2008, at 4:28 AM, Vinci wrote: One question: If the target field is a multi-value field, what will be the consequence of the update for SOLR-139: overriding or appending? You can specify when you update a field how that works. SOLR-139, though, seems a long way from being

Re: How to index multiple sites with option of combining results in search

2008-03-26 Thread Dietrich
I understand that, and that makes sense. But, coming back to the original question: When performing searches, I need to be able to search against any combination of sites. Does anybody have suggestions what the best practice for a scenario like that would be, considering both

Term frequency

2008-03-26 Thread Tim Mahy
Hi All, is there a way to get the term frequency per found result back from Solr? Greetings, Tim Info Support - http://www.infosupport.com

Re: Update schema.xml without restarting Solr?

2008-03-26 Thread Ryan McKinley
Jeryl Cook wrote: Top often requested feature: 1. Make the option on using the RAMDirectory to hook in Terracotta( billion(s) of items in an index anyone?..it would be possible using this.) This is noted in: https://issues.apache.org/jira/browse/SOLR-465 Out of curiosity, any sense of

Replication of Segmented indexes

2008-03-26 Thread oleg_gnatovskiy
Hello, this is actually a repost of a question posed by Swarag. I don't think he made the question quite clear, so let me give it a shot. It is known that Solr has support for index replication, and it has support for index segmentation. The question is, how would you use the replication tools

Making stop-words optional with DisMax?

2008-03-26 Thread Ronald K. Braun
I've followed the stop-word discussion with some interest, but I've yet to find a solution that completely satisfies our needs. I was wondering if anyone could suggest some other options to try short of a custom handler or building our own queries (DisMax does such a fine job generally!). We are

Re: Update schema.xml without restarting Solr?

2008-03-26 Thread Daniel Papasian
[EMAIL PROTECTED] wrote: Quoting Daniel Papasian [EMAIL PROTECTED]: Or if you're adding a new field to the schema (perhaps the most common need for editing schema.xml), you don't need to reindex any documents at all, right? Unless I'm missing something? Well, it all depends on if that field

Re: Solr commits automatically on appserver shutdown

2008-03-26 Thread Yonik Seeley
On Wed, Mar 26, 2008 at 10:18 AM, Noble Paul നോബിള്‍ नोब्ळ् [EMAIL PROTECTED] wrote: If my appserver fails during an update, or if I do a planned shutdown without wanting to commit my changes, Solr does not seem to allow it: it commits whatever unfinished changes there are. Is this by design? Can I change

Re: Replication of Segmented indexes

2008-03-26 Thread Yonik Seeley
On Wed, Mar 26, 2008 at 11:34 AM, oleg_gnatovskiy [EMAIL PROTECTED] wrote: Hello, this is actually a repost of a question posed by Swarag. I don't think he made the question quite clear, so let me give it a shot. It is known that Solr has support for index replication, and it has support for

Search fail if copyField absent?(+ Jetty Question)

2008-03-26 Thread Vinci
Hi, While testing the Solr schema (1.3 nightly) with the example mySolr on Jetty, for the exampledocs and the default schema, I see the declaration: <field name="features" type="text" indexed="true" stored="true" multiValued="true"/> It should be indexed, so I commented out this <copyField source="features"

Re: How to index multiple sites with option of combining results in search

2008-03-26 Thread Otis Gospodnetic
Dietrich, I don't think there are established practices in the open (yet). You could design your application with a site(s)-shard mapping and then, knowing which sites are involved in the query, search only the relevant shards. This will be efficient, but it would require careful management
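The site-to-shard mapping idea can be sketched roughly as follows. This is only an illustration under stated assumptions: the site names and shard addresses are made up, the `site` field is a hypothetical field in the schema, and it assumes a Solr version where distributed search accepts a comma-separated `shards` request parameter.

```python
# Hypothetical mapping from site name to the shard that holds its documents.
# Several sites may share one shard.
SITE_TO_SHARD = {
    "siteA": "host1:8983/solr",
    "siteB": "host1:8983/solr",
    "siteC": "host2:8983/solr",
}


def shards_for(sites):
    """Return the comma-separated list of shards needed for these sites."""
    shards = sorted({SITE_TO_SHARD[site] for site in sites})
    return ",".join(shards)


def site_filter(sites, field="site"):
    """Build a filter restricting results to the requested sites.

    Needed because a shard may also hold documents from other sites.
    The field name "site" is an assumption, not part of any default schema.
    """
    return f"{field}:(" + " OR ".join(sorted(sites)) + ")"


def search_params(user_query, sites):
    """Assemble request parameters for a search over a combination of sites."""
    return {
        "q": user_query,
        "shards": shards_for(sites),
        "fq": site_filter(sites),
    }
```

With this mapping, a search over siteA and siteB touches only one shard, while adding siteC brings in the second one. The bookkeeping that keeps `SITE_TO_SHARD` accurate is the careful management the message refers to.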

Re: Update schema.xml without restarting Solr?

2008-03-26 Thread Otis Gospodnetic
Hey Ryan, why do you say a Lucene/Solr index served via Terracotta would be substantially slower? I often wanted to try Terracotta + Lucene, but... time. Thanks, Otis -- Sematext -- http://sematext.com/ -- Lucene - Solr - Nutch - Original Message From: Ryan McKinley [EMAIL PROTECTED]

Re: How to index multiple sites with option of combining results in search

2008-03-26 Thread Dietrich
Makes sense, but probably overkill for my requirements. I wasn't really talking 275*20, more likely the total would be something like four million documents. I was under the assumption that a single machine, or a simple distributed index, should be able to handle that, is that wrong? -ds On

Re: How to index multiple sites with option of combining results in search

2008-03-26 Thread Otis Gospodnetic
Ah, that's a very different number. Yes, assuming your docs are web pages, a single reasonably equipped machine should be able to handle that and a few dozen QPS. Otis -- Sematext -- http://sematext.com/ -- Lucene - Solr - Nutch - Original Message From: Dietrich [EMAIL PROTECTED] To:

Are 1.2 and 1.3/trunk indexes compatible?

2008-03-26 Thread Chris Harris
What are the odds that I can plop an index created in Solr 1.2 into a Solr 1.3 and/or Solr trunk install and have things work correctly? This would be more convenient than reindexing, but I'm wondering how dangerous it is, and hence how much testing is required.

Re: Are 1.2 and 1.3/trunk indexes compatible?

2008-03-26 Thread Yonik Seeley
On Wed, Mar 26, 2008 at 3:05 PM, Chris Harris [EMAIL PROTECTED] wrote: What are the odds that I can plop an index created in Solr 1.2 into a Solr 1.3 and/or Solr trunk install and have things work correctly? Should be relatively high. I'd never do it on a live index, regardless of what is

Re: Are 1.2 and 1.3/trunk indexes compatible?

2008-03-26 Thread Ryan McKinley
It *should* work as a drop-in replacement. Check: http://svn.apache.org/repos/asf/lucene/solr/trunk/CHANGES.txt So you should be good. Note that trunk has a newer version of Lucene, so the index will be automatically upgraded and you can't go back from there, so make sure to back up before

Re: Update schema.xml without restarting Solr?

2008-03-26 Thread Ryan McKinley
just intuition - haven't tried it, so i'd love to be proved wrong. Instrumenting Objects and magically passing them around seems like it would be slower than a tuned approach used in SOLR-303. It looks like they have a lucene example:

Re: Are 1.2 and 1.3/trunk indexes compatible?

2008-03-26 Thread Ryan McKinley
good point: http://svn.apache.org/viewvc/lucene/solr/trunk/CHANGES.txt?r1=641573&r2=641572&pathrev=641573 ryan Chris Harris wrote: Looks like that can't-go-back bit hasn't made it into CHANGES.txt yet. Might want to eventually add that somewhere particularly obvious, to help out people who

Re: Update schema.xml without restarting Solr?

2008-03-26 Thread Yonik Seeley
On Wed, Mar 26, 2008 at 4:41 PM, Ryan McKinley [EMAIL PROTECTED] wrote: just intuition - haven't tried it, so i'd love to be proved wrong. Instrumenting Objects and magically passing them around seems like it would be slower than a tuned approach used in SOLR-303. Yep, that's my sense too.

Re: Update schema.xml without restarting Solr?

2008-03-26 Thread Jeryl Cook
I wouldn't call the Terracotta approach magic (smile)... it's being used quite a bit in many scalable, high-performing projects. I personally used Terracotta and Lucene, and it worked, but I did not try to cluster it with multiple Terracotta (workers) across nodes and the Terracotta (master)..just a

Re: Search fail if copyField absent?(+ Jetty Question)

2008-03-26 Thread Chris Hostetter
: It should be indexed, so I commented out this : <copyField source="features" dest="text"/> : : However, the search fails. After I clear the index, uncomment the : copyField and commit the document again, the search works again. : : I find that very confusing, as the wiki and the schema.xml said this

Using Field Collapsing and Filter Query to implement JOIN

2008-03-26 Thread Lester Scofield
Hello solr people, I'm very new to solr so please forgive any misunderstanding on my part. I am hoping to do a JOIN across documents. Let me start with the 4 documents: <doc> <field name="type">part1</field> <field name="key">ABC</field> <field name="foo">this is a test</field> </doc> <doc> <field

Re: Update schema.xml without restarting Solr?

2008-03-26 Thread Chris Hostetter
: Top often requested feature: : 1. Make the option on using the RAMDirectory to hook in Terracotta( : billion(s) of items in an index anyone?..it would be possible using : this.) : : This is noted in: https://issues.apache.org/jira/browse/SOLR-465 ...and if people posted comments in the

Re: Document Path issue and change the layout in the example

2008-03-26 Thread Chris Hostetter
: I started the indexing with jetty and then I come with some question... : 1. If I use the example start.jar, what should be my document system layout? : What is the essential folder? : solr_jar : |_start.jar : |_solrhome : |_etc : |_lib : |_logs i'm not sure what solr_jar is ... but most of

Re: Highlighting Quoted Phrases

2008-03-26 Thread Chris Harris
On Tue, Mar 25, 2008 at 4:25 PM, Brian Whitman [EMAIL PROTECTED] wrote: On Mar 25, 2008, at 6:31 PM, Chris Harris wrote: working pretty well, but my testers have discovered something they find borderline unacceptable. If they search for "stock market" (with quotes), then

Re: Beginner questions: Jetty and solr with utf-8 + cached page + dedup

2008-03-26 Thread Thorsten Scherler
On Tue, 2008-03-25 at 10:56 -0700, Vinci wrote: Hi, Thanks for your reply. A question about applying XSLT: if I use Saxon, where should saxon.jar be located if I am using the example Jetty server? In lib/ inside example/, or outside example/? http://wiki.apache.org/solr/mySolr ... Typically it's not

Re: Making stop-words optional with DisMax?

2008-03-26 Thread Ronald K. Braun
Hi Otis, I skimmed your email. You are indexing book and music titles. Those tend to be short. Do you really benefit from removing stop words in the first place? I'd try keeping all the stop words and seeing if that has any negative side-effects in your context. Thanks for your skim

Re: Adding custom field for sorting?

2008-03-26 Thread Vinci
Hi hossman, Thank you for your reply. Some questions on sorting: 1. Does Solr have a limit, e.g. a percentage or a count, on the number of documents involved in sorting, or does it just sort all documents? 2. Does the order in the 'sort' parameter determine the sorting order? (sort the first argument first, then

Re: Search fail if copyField absent?(+ Jetty Question)

2008-03-26 Thread Vinci
Hi hossman, Thank you for your reply, it helps a lot. Just a few more questions here: hossman wrote: : It should be indexed, so I commented out this : <copyField source="features" dest="text"/> : : However, the search fails. After I clear the index, uncomment the : copyField and commit

Re: positionIncrementGap - what is its value meaning?

2008-03-26 Thread Vinci
Hi Erik, Thank you for your help. This is useful. Some follow up questions, Erik Hatcher wrote: ..The value you set that gap to depends on whether you'll be using sloppy phrase queries, and how sloppy they'll be and whether you desire matching across field instances. 1. If I

Re: Facet searching and facet hierarchies.

2008-03-26 Thread Erik Hatcher
On Mar 26, 2008, at 3:34 PM, A.Z wrote: As I understand, after passing facets to Solr, one must manually add facet results to search to narrow the search. ex. i search for foo bar and click some facet. must i now search for 'foo bar facet:value' ? Must I include + signs? I'm using
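The reply is cut off in this excerpt, so the following is not necessarily what it recommends. One common way to narrow a search after a facet click is to leave `q` unchanged and add the clicked value as a filter query (`fq`). A minimal Python sketch, where the `category` field and the helper name are hypothetical:

```python
from urllib.parse import urlencode


def narrow_by_facet(params, field, value):
    """Return a copy of the request parameters with one more fq filter.

    The value is wrapped in double quotes so it is treated as a single
    term or phrase; embedded backslashes and quotes are escaped.
    """
    escaped = value.replace("\\", "\\\\").replace('"', '\\"')
    narrowed = list(params)
    narrowed.append(("fq", f'{field}:"{escaped}"'))
    return narrowed


# Original search with faceting on a hypothetical "category" field.
base = [("q", "foo bar"), ("facet", "true"), ("facet.field", "category")]

# The user clicks the facet value "books": q stays the same, fq is added.
clicked = narrow_by_facet(base, "category", "books")
query_string = urlencode(clicked)
```

Keeping the facet constraint out of `q` means the user's own query text is not rewritten, and each further click simply appends another `fq` entry.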

Re: positionIncrementGap - what is its value meaning?

2008-03-26 Thread Erik Hatcher
On Mar 26, 2008, at 10:15 PM, Vinci wrote: Erik Hatcher wrote: ..The value you set that gap to depends on whether you'll be using sloppy phrase queries, and how sloppy they'll be and whether you desire matching across field instances. 1. If I don't care about sloppy queries, I just set a
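To make the gap-versus-slop relationship concrete, here is a minimal Python sketch of a simplified position model. It is not Lucene's actual implementation: it assumes whitespace tokenization, that each additional value of a multi-valued field starts `gap` positions after where the previous value ended, and that a two-term phrase matches in order when at most `slop` positions lie between the terms. The function names are hypothetical.

```python
def assign_positions(values, gap):
    """Assign token positions across the values of a multi-valued field.

    Simplified model: tokens within a value get consecutive positions,
    and every value after the first is shifted by `gap` extra positions.
    """
    positions = []
    next_pos = 0
    for index, value in enumerate(values):
        if index > 0:
            next_pos += gap
        for token in value.lower().split():
            positions.append((token, next_pos))
            next_pos += 1
    return positions


def phrase_matches(positions, first, second, slop=0):
    """Check an in-order two-term phrase with the given slop.

    Simplification: real sloppy phrase matching also handles reordering
    and phrases with more than two terms.
    """
    firsts = [pos for token, pos in positions if token == first]
    seconds = [pos for token, pos in positions if token == second]
    return any(0 <= s - f - 1 <= slop for f in firsts for s in seconds)


values = ["quick brown", "fox jumps"]

no_gap = assign_positions(values, gap=0)
big_gap = assign_positions(values, gap=100)

# With no gap, "brown fox" matches across the two values.
print(phrase_matches(no_gap, "brown", "fox"))            # True
# With a gap of 100, it only matches once the slop reaches the gap.
print(phrase_matches(big_gap, "brown", "fox", slop=99))   # False
print(phrase_matches(big_gap, "brown", "fox", slop=100))  # True
```

In this model, a gap larger than any slop you expect to use keeps phrase matches from crossing from one field instance into the next, while a gap of 0 lets them cross freely.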

RE: How to index multiple sites with option of combining results in search

2008-03-26 Thread Lance Norskog
In fact, 55m records work fine in Solr, assuming they are small records. The problem is that the index files wind up in the tens of gigabytes. The logistics of doing backups, snapping to query servers, etc. is what makes this index unwieldy, and why multiple shards are useful. Lance

Re: Making stop-words optional with DisMax?

2008-03-26 Thread Walter Underwood
We use two fields, one with and one without stopwords. The exact field has a higher boost than the other. That works pretty well. It helps to have an automated relevance test when tuning the boost (and other things). I extracted queries and clicks from the logs for a couple of months. Not
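The two-field setup could be queried through DisMax's `qf` parameter roughly as below. This is a sketch under assumptions: the field names `title_exact` (stop words kept) and `title` (stop words removed) and the boost values are invented for illustration, and how the DisMax handler is selected depends on the Solr version and configuration.

```python
from urllib.parse import urlencode


def dismax_params(user_query,
                  exact_field="title_exact",   # hypothetical: stop words kept
                  stopped_field="title",       # hypothetical: stop words removed
                  exact_boost=4.0,
                  stopped_boost=1.0):
    """Build q and qf so the exact field outweighs the stop-worded one."""
    qf = f"{exact_field}^{exact_boost} {stopped_field}^{stopped_boost}"
    return {"q": user_query, "qf": qf}


params = dismax_params("the who")
query_string = urlencode(params)
```

The boost ratio is the value to tune against a relevance test set, for example one built from logged queries and clicks as the message describes.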

Re: Solr commits automatically on appserver shutdown

2008-03-26 Thread Noble Paul നോബിള്‍ नोब्ळ्
Can I make an API call to remove the stale indexsearcher so that the documents do not get committed? Basically what I need is a 'rollback' feature --Noble On Wed, Mar 26, 2008 at 9:08 PM, Yonik Seeley [EMAIL PROTECTED] wrote: On Wed, Mar 26, 2008 at 10:18 AM, Noble Paul നോബിള്‍ नोब्ळ्

Re: Solr commits automatically on appserver shutdown

2008-03-26 Thread Yonik Seeley
On Thu, Mar 27, 2008 at 12:11 AM, Noble Paul നോബിള്‍ नोब्ळ् [EMAIL PROTECTED] wrote: Can I make an API call to remove the stale indexsearcher so that the documents do not get committed? Basically what I need is a 'rollback' feature This should be possible when Solr starts using Lucene's