Re: Solr - rudimentary problems

2007-09-17 Thread Chris Hostetter
:  The corresponding entry for this field in schema.xml is :
:  <field name="id" type="text" indexed="true"
: stored="true" multiValued="false"  required="true"/>

i'm guessing "text" is from the example schema.xml ... this is not a good 
type to use for a uniqueId field ... that alone might be causing some of 
your problems with replacing docs ...  try "string"
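
For illustration, a minimal sketch of what the corrected schema.xml entries might look like, assuming the "string" type from the example schema (backed by solr.StrField, which indexes the value verbatim with no analysis):

```xml
<!-- Sketch only: assumes the example schema's "string" type, declared in the types section -->
<fieldType name="string" class="solr.StrField" sortMissingLast="true" omitNorms="true"/>

<!-- In the fields section: the id is indexed verbatim, so adds and deletes match the exact value -->
<field name="id" type="string" indexed="true" stored="true" multiValued="false" required="true"/>

<!-- Top-level declaration naming the field used to detect and replace duplicates -->
<uniqueKey>id</uniqueKey>
```

Documents indexed while the field was still "text" would presumably need to be re-indexed after the change, since their ids were indexed in analyzed form.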

: 2) Also, at the time of deleting a document, by providing its ID(exactly
: similar to the deleteById proc in the Embedded Solr example) , we find that
: the document is not getting deleted(and we also do not get any errors).

sounds like the same problem ... i'm guessing you are using a method that 
assumes the id has already been transformed into the internal 
representation ... with "text" that might be lowercased, or stemmed, 
etc.





-Hoss



Re: Solr - rudimentary problems

2007-09-17 Thread Venkatraman S
That's perfect! Yes, that was the problem.
Thanks a lot.

I am compiling a complete list of FAQs and will add it to the wiki soon.

-vEnKAt





--


Solr - rudimentary problems

2007-09-16 Thread Venkatraman S
We are using Lucene and are migrating to Solr 1.2 (we are using Embedded
Solr). During this process we have run into a few problems:

1) If the same document is added again, it gets added to the index again
(duplicated), in spite of the fact that the IDs are unique across
documents. The existing document should be updated in the index instead.
The corresponding entry for this field in schema.xml is:
<field name="id" type="text" indexed="true"
stored="true" multiValued="false" required="true"/>

2) Also, when deleting a document by providing its ID (exactly as in the
deleteById procedure in the Embedded Solr example), we find that
the document is not deleted (and we do not get any errors either).

3) While using facets, we get the stemmed versions of the
corresponding words in the faceted fields. How do we get the 'original'
word? For example, we get 'intenti' for 'intentional'.

As I am new to Solr and did not find any documentation on this, or
anything in JIRA, I have posted these here. Any help would be highly
appreciated.

-Venkat

--


RE: Solr - rudimentary problems

2007-09-16 Thread Stu Hood
With regard to #3, for faceting it is recommended that you use a separate 
copy of the field with stemming/tokenizing disabled. See: 
http://wiki.apache.org/solr/SolrFacetingOverview#head-fc68926c8421055de872acc694a6a966fab705d6

Thanks,
Stu



--


Re: Solr - rudimentary problems

2007-09-16 Thread Ryan McKinley

Venkatraman S wrote:

> We are using Lucene and are migrating to Solr 1.2 (we are using Embedded
> Solr). During this process we are stumbling on certain problems :
>
> 1) If the same document is added again, then it is getting added in the
> index again (duplicated), in spite of the fact that the IDs are unique across
> documents. This document should be updated in the Index.
>  The corresponding entry for this field in schema.xml is :
>  <field name="id" type="text" indexed="true"
> stored="true" multiValued="false"  required="true"/>



Do you have:
<uniqueKey>id</uniqueKey>



> 2) Also, at the time of deleting a document, by providing its ID (exactly
> similar to the deleteById proc in the Embedded Solr example), we find that
> the document is not getting deleted (and we also do not get any errors).



Are you calling <commit/>?
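
As a rough illustration in the XML update format (the id value is a placeholder; with Embedded Solr the equivalent steps are made through Java calls rather than XML messages), a delete only becomes visible to searches after a commit:

```xml
<!-- First update message: delete by unique key; "doc-42" is a hypothetical id -->
<delete><id>doc-42</id></delete>

<!-- Second, separate update message: make the pending delete visible to searchers -->
<commit/>
```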



> 3) While using facets, we are getting the stemmed versions of the
> corresponding words in the faceted fields - how do we get the 'original'
> word?
> As in, 'intenti' for 'intentional' etc



Faceting works on the indexed terms - if the field has stemming applied, 
the facets will be stemmed.


If you need to have stemming in some cases and the direct string in 
other cases, you can use copyField ...
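
A minimal schema.xml sketch of that approach; the field names "category" and "category_facet" are hypothetical, and "text" and "string" are assumed to be the types from the example schema:

```xml
<!-- Stemmed/tokenized field used for searching -->
<field name="category" type="text" indexed="true" stored="true"/>

<!-- Unanalyzed copy used only for faceting, so facet values keep the original words -->
<field name="category_facet" type="string" indexed="true" stored="false"/>

<!-- Copy the raw input of "category" into "category_facet" at index time -->
<copyField source="category" dest="category_facet"/>
```

Faceting on the unanalyzed copy (facet.field=category_facet) would then return the whole original value, e.g. 'intentional' rather than 'intenti'.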




Re: Solr - rudimentary problems

2007-09-16 Thread Venkatraman S
Kindly note again: we are using Embedded Solr.

On 9/17/07, Ryan McKinley [EMAIL PROTECTED] wrote:

> Venkatraman S wrote:
> > We are using Lucene and are migrating to Solr 1.2 (we are using Embedded
> > Solr). During this process we are stumbling on certain problems :
> >
> > 1) If the same document is added again, then it is getting added in the
> > index again (duplicated), in spite of the fact that the IDs are unique
> > across documents. This document should be updated in the Index.
> >  The corresponding entry for this field in schema.xml is :
> >  <field name="id" type="text" indexed="true"
> > stored="true" multiValued="false"  required="true"/>
>
> Do you have:
> <uniqueKey>id</uniqueKey>


Yes, I am using it.

> > 2) Also, at the time of deleting a document, by providing its ID (exactly
> > similar to the deleteById proc in the Embedded Solr example), we find
> > that the document is not getting deleted (and we also do not get any
> > errors).
>
> are you calling <commit/>?


Yes, exactly as in the code in the Embedded Solr example on the wiki:
http://wiki.apache.org/solr/EmbeddedSolr

> > 3) While using facets, we are getting the stemmed versions of the
> > corresponding words in the faceted fields - how do we get the 'original'
> > word?
> > As in, 'intenti' for 'intentional' etc
>
> Faceting works on the indexed terms - if the field has stemming applied,
> the facets will be stemmed.
>
> If you need to have stemming in some cases and the direct string in
> other cases, you can use copyField ...


Yes, got this. Instead, I commented out the
<filter class="solr.EnglishPorterFilterFactory"/> in a

--