Removing some keywords from an existing index and merging in a new index

2011-02-18 Thread Ryan Chan
Hello,

I am not sure if it is possible.

1. I have a document of 100MB and want to remove keywords starting with
a specific pattern, e.g. abc*, so that all keywords in the index
starting with abc are removed without reindexing the document.

2. I have another document of 100KB and want to append it to the
existing one, without the need to reindex the existing document.


I believe (2) is possible, but not sure about (1).

Thanks.
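
To make the two requests concrete, here is a minimal sketch on a toy in-memory inverted index. It is not Lucene's or Solr's index format or API; the names build_index, remove_prefix and merge are hypothetical, and (2) is read here as adding a second document to an existing index.

```python
from collections import defaultdict

def build_index(doc_id, text):
    """Map each lowercased, whitespace-separated token to {doc_id: [positions]}."""
    index = defaultdict(dict)
    for pos, token in enumerate(text.lower().split()):
        index[token].setdefault(doc_id, []).append(pos)
    return index

def remove_prefix(index, prefix):
    """Request (1): drop every term starting with `prefix`; other postings stay as they are."""
    for term in [t for t in index if t.startswith(prefix)]:
        del index[term]

def merge(index, other):
    """Request (2): fold the postings of `other` into `index` without rebuilding `index`."""
    for term, postings in other.items():
        for doc_id, positions in postings.items():
            index[term].setdefault(doc_id, []).extend(positions)

big = build_index("doc1", "abcdef widget abc123 gadget")
remove_prefix(big, "abc")
merge(big, build_index("doc2", "gadget manual"))
print(sorted(big))    # ['gadget', 'manual', 'widget']
print(big["gadget"])  # {'doc1': [3], 'doc2': [0]}
```

In this toy structure both operations touch only the affected terms. Whether a real index allows the same is exactly the open question.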


Re: TermVector query using Solr Tutorial

2011-02-09 Thread Ryan Chan
Hello,

On Tue, Feb 8, 2011 at 11:12 PM, Grant Ingersoll gsing...@apache.org wrote:

> It's a little hard to read due to the indentation, but AFAICT you have two
> terms, usb and cabl.  USB appears at position 0 and cabl at position 1.
> Those are the relative positions to each other.  Perhaps you can explain a
> bit more what you are trying to do?

I am searching for the keyword 25 in the field

<field name="features">30" TFT active matrix LCD, 2560 x 1600, .25mm
dot pitch, 700:1 contrast</field>

I want to know the character position of the matched keyword in the
corresponding field.

usb or cabl is not what I want.
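
To show what "character position" means here, a minimal sketch using plain Python string matching rather than any Solr API. It assumes the features value reads 30" TFT ... with the inch mark intact; char_offsets is a hypothetical helper name.

```python
import re

# Field value from monitor.xml (assumed to include the inch mark after 30).
features = '30" TFT active matrix LCD, 2560 x 1600, .25mm dot pitch, 700:1 contrast'

def char_offsets(text, keyword):
    """Return (start, end) character offsets of every occurrence of keyword."""
    return [(m.start(), m.end()) for m in re.finditer(re.escape(keyword), text)]

print(char_offsets(features, "25"))  # [(27, 29), (41, 43)]
```

The two hits are the 25 inside 2560 and the 25 inside .25mm. Offsets of this kind are what I would like the search result to report.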


TermVector query using Solr Tutorial

2011-02-05 Thread Ryan Chan
Hello all,

I am following this tutorial:
http://lucene.apache.org/solr/tutorial.html and playing with the
TermVector component. Here are my steps:


1. Launch the example server, java -jar start.jar

2. Index the monitor.xml, java -jar post.jar monitor.xml, which
contains the following

<add><doc>
  <field name="id">3007WFP</field>
  <field name="name">Dell Widescreen UltraSharp 3007WFP</field>
  <field name="manu">Dell, Inc.</field>
  <field name="cat">electronics</field>
  <field name="cat">monitor</field>
  <field name="features">30" TFT active matrix LCD, 2560 x 1600, .25mm
dot pitch, 700:1 contrast</field>
  <field name="includes">USB cable</field>
  <field name="weight">401.6</field>
  <field name="price">2199</field>
  <field name="popularity">6</field>
  <field name="inStock">true</field>
</doc></add>


3. Execute a query that searches for 25 (as you can see, `25` occurs
twice in the field features), i.e.
http://localhost/solr/select/?q=25&version=2.2&start=0&rows=10&indent=on&qt=tvrh&tv.all=true

4. The term vector in the result does not make sense to me


<lst name="termVectors">
  <lst name="doc-2">
    <str name="uniqueKey">3007WFP</str>
    <lst name="includes">
      <lst name="cabl">
        <int name="tf">1</int>
        <lst name="offsets">
          <int name="start">4</int>
          <int name="end">9</int>
        </lst>
        <lst name="positions">
          <int name="position">1</int>
        </lst>
        <int name="df">1</int>
        <double name="tf-idf">1.0</double>
      </lst>
      <lst name="usb">
        <int name="tf">1</int>
        <lst name="offsets">
          <int name="start">0</int>
          <int name="end">3</int>
        </lst>
        <lst name="positions">
          <int name="position">0</int>
        </lst>
        <int name="df">1</int>
        <double name="tf-idf">1.0</double>
      </lst>
    </lst>
  </lst>
  <str name="uniqueKeyFieldName">id</str>
</lst>
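
For reference, the start/end values in the response above can be checked against the indexed text. This is a minimal sketch, assuming the XML tags are intact and that the includes field holds the string USB cable from monitor.xml; the fragment is trimmed to the offsets, and term_offsets is a hypothetical helper name.

```python
import xml.etree.ElementTree as ET

# Trimmed copy of the "includes" part of the response (offsets only).
fragment = """
<lst name="includes">
  <lst name="cabl">
    <lst name="offsets"><int name="start">4</int><int name="end">9</int></lst>
  </lst>
  <lst name="usb">
    <lst name="offsets"><int name="start">0</int><int name="end">3</int></lst>
  </lst>
</lst>
"""

def term_offsets(xml_text):
    """Return {term: (start, end)} read from a termVectors field fragment."""
    result = {}
    for term in ET.fromstring(xml_text).findall("lst"):
        offsets = term.find("lst[@name='offsets']")
        start = int(offsets.find("int[@name='start']").text)
        end = int(offsets.find("int[@name='end']").text)
        result[term.get("name")] = (start, end)
    return result

includes = "USB cable"
for term, (start, end) in term_offsets(fragment).items():
    print(term, start, end, repr(includes[start:end]))
# cabl 4 9 'cable'
# usb 0 3 'USB'
```

So the offsets are character offsets into the includes field, with cabl being the stemmed form of cable. None of them refer to the features field, where the two 25 matches are.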

What I want to know is the relative position of the keywords within a field.

Can anyone explain the above result to me?

Thanks.


Is it possible to get keyword/match's position?

2010-07-27 Thread Ryan Chan
According to SO:
http://stackoverflow.com/questions/1557616/retrieving-per-keyword-field-match-position-in-lucene-solr-possible

this is not possible, but that answer is a year old. Is it still true now?

Thanks.


Getting the offset of search keyword in a document

2010-07-24 Thread Ryan Chan
Hello,

I am new to Solr/Lucene and I am evaluating whether they suit my needs
and can replace our in-house system.


Our requirements:

1. I have multiple documents (1M)
2. Each document contains text ranging from a few KB to a few MB
3. I want to search for a keyword through all these documents and get
back the matched document(s), AND ALSO the offset of that 'keyword'
inside each document.

Is requirement 3 possible?
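
To illustrate the result shape requirement 3 asks for, here is a minimal sketch that scans the documents linearly. It is not an index and not Solr/Lucene code; find_keyword and the sample documents are hypothetical.

```python
def find_keyword(documents, keyword):
    """Return {doc_id: [character offsets]} for every document containing keyword."""
    hits = {}
    needle = keyword.lower()
    for doc_id, text in documents.items():
        haystack = text.lower()
        offsets = []
        start = haystack.find(needle)
        while start != -1:
            offsets.append(start)
            start = haystack.find(needle, start + 1)
        if offsets:
            hits[doc_id] = offsets
    return hits

docs = {
    "a.txt": "Solr builds on Lucene. Lucene is a library.",
    "b.txt": "No match here.",
}
print(find_keyword(docs, "lucene"))  # {'a.txt': [15, 23]}
```

A linear scan like this would be far too slow for 1M documents of up to a few MB each, which is why I am asking whether the search engine can return these offsets itself.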