What is the best way to index xml data preserving the mark up?

2007-11-07 Thread David Neubert
I am sure this is a 101 question, but I am a bit confused about indexing XML data using SOLR. I have rich XML content (books) that needs to be searched at granular levels (specifically paragraph and sentence levels, very accurately, no approximations). My source text has exact <p></p> and <s></s> tags for

Re: What is the best way to index xml data preserving the mark up?

2007-11-07 Thread David Neubert
). wunder On 11/7/07 8:18 PM, David Neubert [EMAIL PROTECTED] wrote: I am sure this is a 101 question, but I am a bit confused about indexing XML data using SOLR. I have rich XML content (books) that needs to be searched at granular levels (specifically paragraph and sentence levels very accurately

Re: AW: What is the best way to index xml data preserving the mark up?

2007-11-08 Thread David Neubert
Thanks -- CDATA might be useful -- and I was looking into dynamic fields as a solution as well -- I think a combination of the two might work. - Original Message From: Hausherr, Jens [EMAIL PROTECTED] To: solr-user@lucene.apache.org Sent: Thursday, November 8, 2007 4:03:02 AM Subject:

Boolean matches in a unique instance of a multi-value field?

2007-11-08 Thread David Neubert
Is it possible to find boolean matches (foo AND bar) in a single unique instance of a multi-value field? So if foo is found in one instance of a multi-value field, and is also found in another instance of the multi-value field -- this WOULD NOT be a match, but only if both words are found in
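The distinction the question draws can be illustrated outside Solr with a minimal plain-Python sketch. It is not Solr or Lucene code, and the function names `matches_any_value` and `matches_single_value` are hypothetical, introduced only to contrast the two behaviors: an AND satisfied across different values of a multi-value field versus an AND satisfied inside one value.

```python
def tokenize(value):
    # Naive whitespace tokenizer with lowercasing; an assumption for this sketch only.
    return value.lower().split()


def matches_any_value(values, terms):
    """Document-level AND: each term may come from a different value."""
    all_tokens = {token for value in values for token in tokenize(value)}
    return all(term.lower() in all_tokens for term in terms)


def matches_single_value(values, terms):
    """Per-value AND: every term must occur within the same value."""
    return any(
        all(term.lower() in tokenize(value) for term in terms)
        for value in values
    )


if __name__ == "__main__":
    field_values = ["foo is here", "bar is there"]
    print(matches_any_value(field_values, ["foo", "bar"]))
    print(matches_single_value(field_values, ["foo", "bar"]))
```

The second function is the behavior the post asks for: foo in one value and bar in another does not count as a match.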

Re: AW: What is the best way to index xml data preserving the mark up?

2007-11-08 Thread David Neubert
Chris, I'll try to track down your Jira issue. (2) sounds very helpful -- I am only 2 days old in SOLR/Lucene experience, but know what I need -- and basically it's to search by the main granules in an XML document, which usually turn out to be, for books: book (rarely), chapter (more often),

Re: What is the best way to index xml data preserving the mark up?

2007-11-08 Thread David Neubert
(at least, storing Xpath information is one of the proposed uses: http://lucene.grantingersoll.com/2007/03/18/payloads/ ), as Erik Hatcher suggested in relation to https://issues.apache.org/jira/browse/SOLR-380 . Peter -Original Message- From: David Neubert [mailto:[EMAIL PROTECTED] Sent

Delte all docs in a SOLR index?

2007-11-09 Thread David Neubert
Sorry for another basic question -- but what is the best safe way to delete all docs in a SOLR index? I tried <delete>id:*</delete> -- and that didn't work, plus wasn't sure if it was safe -- when I put a real id in it works, but that is too tedious. I am in my first few days using SOLR and

Re: Delte all docs in a SOLR index?

2007-11-09 Thread David Neubert
Thanks! - Original Message From: Ryan McKinley [EMAIL PROTECTED] To: solr-user@lucene.apache.org Sent: Friday, November 9, 2007 1:48:45 PM Subject: Re: Delte all docs in a SOLR index? I tried <delete>id:*</delete> try: <delete><query>*:*</query></delete> ryan
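As a rough illustration of the delete-by-query message suggested above, here is a minimal Python sketch that posts it to a Solr XML update handler and then commits so the deletion becomes visible. The update URL in the usage note is an assumption (it depends on the deployment), and the helper names `build_update_request` and `delete_all_documents` are hypothetical.

```python
import urllib.request

# Delete-by-query message matching every document, as suggested in the thread.
DELETE_ALL_XML = "<delete><query>*:*</query></delete>"
# Deletes only become visible to searchers after a commit.
COMMIT_XML = "<commit/>"


def build_update_request(update_url, xml_body):
    """Build (but do not send) a POST request carrying an XML update message."""
    return urllib.request.Request(
        update_url,
        data=xml_body.encode("utf-8"),
        headers={"Content-Type": "text/xml; charset=utf-8"},
        method="POST",
    )


def delete_all_documents(update_url):
    """Send the delete-all message followed by a commit to the given handler URL."""
    for body in (DELETE_ALL_XML, COMMIT_XML):
        request = build_update_request(update_url, body)
        with urllib.request.urlopen(request) as response:
            response.read()  # drain the response; urlopen raises on HTTP errors
```

Usage, assuming a local instance whose update handler lives at this address: `delete_all_documents("http://localhost:8983/solr/update")`. This removes every document, so it is worth trying against a test index first.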

Re: Delte all docs in a SOLR index?

2007-11-09 Thread David Neubert
Thanks! - Original Message From: Chris Hostetter [EMAIL PROTECTED] To: solr-user@lucene.apache.org Sent: Friday, November 9, 2007 1:51:03 PM Subject: Re: Delte all docs in a SOLR index? : Sorry for another basic question -- but what is the best safe way to : delete all docs in a SOLR

Re: Delte all docs in a SOLR index?

2007-11-10 Thread David Neubert
is to stop Solr and remove the index directory. There is less chance of corruption, and it will be faster. -Original Message- From: David Neubert [mailto:[EMAIL PROTECTED] Sent: Friday, November 09, 2007 10:56 AM To: solr-user@lucene.apache.org Subject: Re: Delte all docs in a SOLR index? Thanks

Re: Delte all docs in a SOLR index?

2007-11-10 Thread David Neubert
(like power loss). -Mike -Original Message- From: David Neubert [mailto:[EMAIL PROTECTED] Sent: Friday, November 09, 2007 10:56 AM To: solr-user@lucene.apache.org Subject: Re: Delte all docs in a SOLR index? Thanks! - Original Message From: Chris Hostetter [EMAIL PROTECTED

Redundant indexing * 4 only solution (for par/sen and case sensitivity)

2007-11-10 Thread David Neubert
Hi all, Using SOLR, I believe I have to index the same content 4 times (not desirable) into 2 indexes -- and I don't know how you can practically do multiple indexes in SOLR (if indeed there is no better solution than 4 indexing runs into two indexes)? My need is case-sensitive and case

Re: Redundant indexing * 4 only solution (for par/sen and case sensitivity)

2007-11-10 Thread David Neubert
Ryan, Thanks for your response. I infer from your response that you can have a different analyzer for each field -- I guess I should have figured that out -- but because I had not thought of that, I concluded that I needed multiple indices (sorry, I am still very new to Solr/Lucene). Does

Re: Redundant indexing * 4 only solution (for par/sen and case sensitivity)

2007-11-10 Thread David Neubert
Subject: Re: Redundant indexing * 4 only solution (for par/sen and case sensitivity) David Neubert wrote: Ryan, Thanks for your response. I infer from your response that you can have a different analyzer for each field yes! each field can have its own indexing strategy. I believe
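The idea that each field can have its own indexing strategy can be sketched in plain Python. This is only a conceptual model, not Solr code: in Solr the analyzer is attached to a field type in the schema, whereas here the field names `text_exact` and `text_folded` and all function names are hypothetical. The same text is indexed once into a case-preserving field and once into a lowercased field, and a query is analyzed with the analyzer of the field it targets.

```python
import re


def analyze_case_sensitive(text):
    # Split on word characters and keep the original case.
    return re.findall(r"\w+", text)


def analyze_case_insensitive(text):
    # Same tokenization, but fold every token to lowercase.
    return [token.lower() for token in re.findall(r"\w+", text)]


# One analyzer per field, mirroring the per-field configuration idea.
FIELD_ANALYZERS = {
    "text_exact": analyze_case_sensitive,
    "text_folded": analyze_case_insensitive,
}


def index_document(text):
    """Index the same source text into every field with that field's analyzer."""
    return {field: analyzer(text) for field, analyzer in FIELD_ANALYZERS.items()}


def matches(indexed, field, query):
    """Analyze the query with the target field's analyzer, then require all tokens."""
    query_tokens = FIELD_ANALYZERS[field](query)
    return all(token in indexed[field] for token in query_tokens)


if __name__ == "__main__":
    doc = index_document("The Lord spoke")
    print(matches(doc, "text_exact", "lord"))
    print(matches(doc, "text_folded", "lord"))
```

With this layout one index holds both variants, so case-sensitive and case-insensitive searches differ only in which field the query names.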

Re: Redundant indexing * 4 only solution (for par/sen and case sensitivity)

2007-11-10 Thread David Neubert
indexing * 4 only solution (for par/sen and case sensitivity) On Nov 10, 2007 4:24 PM, David Neubert [EMAIL PROTECTED] wrote: So if I am hitting multiple fields (in the same search request) that invoke different Analyzers -- am I at a dead end, and have to resort to consecutive multiple queries

Re: Redundant indexing * 4 only solution (for par/sen and case sensitivity)

2007-11-12 Thread David Neubert
own case field with the per-book setting attached at index time. Erik On Nov 11, 2007, at 12:55 AM, David Neubert wrote: Yonik (or anyone else) Do you know where on-line documentation on the +case: syntax is located? I can't seem to find it. Dave - Original Message

Re: Associating pronouns instances to proper nouns?

2007-11-12 Thread David Neubert
to render I, he, him, instead of the indexed acronym. - Original Message From: David Neubert [EMAIL PROTECTED] To: solr-user@lucene.apache.org Sent: Monday, November 12, 2007 2:54:11 PM Subject: Associating pronouns instances to proper nouns? All, I am working with very exact text and search

Re: Redundant indexing * 4 only solution (for par/sen and case sensitivity)

2007-11-12 Thread David Neubert
-of-lucene perspective, so don't even *think* of asking me how to really make this work in SOLR <G>. Best Erick On Nov 11, 2007 12:44 AM, David Neubert [EMAIL PROTECTED] wrote: Ryan (and others who need something to put them to sleep :) ) Wow -- the light-bulb finally went off -- the Analyzer admin page

Re: Associating pronouns instances to proper nouns?

2007-11-12 Thread David Neubert
references including pronouns? Anybody see any holes in this? (sounds alarmingly easy so far)? Dave - Original Message From: David Neubert [EMAIL PROTECTED] To: solr-user@lucene.apache.org Sent: Monday, November 12, 2007 3:04:20 PM Subject: Re: Associating pronouns instances to proper

LuceneInAction.zip?

2007-11-13 Thread David Neubert
I purchased Lucene In Action (really great book by the way, one of the best technical books (if not the best) that I have ever read). It's making me embarrassed about some of the questions I have already posted :) That said, here is another one -- I found LuceneInAction.zip on