date:20050310

Re: Obtaining the contexts of hits

2005-03-10 Thread Miles Barr

The highlighter contrib package does what you're looking for:

http://svn.apache.org/repos/asf/lucene/java/trunk/contrib/highlighter/

By default it breaks the document into chunks roughly 100 characters
long. You can alter it to get ten words either side of the matched
term.
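
To make the "ten words either side" idea concrete, here is a minimal plain-Java sketch. It is not the contrib Highlighter API; the class name ContextWindow and the method around are made up for illustration, and it splits on whitespace instead of using an Analyzer.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Locale;

/** Plain-Java illustration only; this is NOT the contrib Highlighter API. */
final class ContextWindow {

    /** Returns up to `width` words either side of each occurrence of `term`. */
    static List<String> around(String text, String term, int width) {
        String[] words = text.trim().split("\\s+");
        String needle = term.toLowerCase(Locale.ROOT);
        List<String> contexts = new ArrayList<>();
        for (int i = 0; i < words.length; i++) {
            // Strip surrounding punctuation so "camera," still matches "camera".
            String bare = words[i].replaceAll("^\\W+|\\W+$", "").toLowerCase(Locale.ROOT);
            if (!bare.equals(needle)) {
                continue;
            }
            int from = Math.max(0, i - width);
            int to = Math.min(words.length, i + width + 1);
            contexts.add(String.join(" ", Arrays.copyOfRange(words, from, to)));
        }
        return contexts;
    }
}
```

Calling ContextWindow.around(text, "camera", 10) would give one snippet per occurrence of the term, each with at most ten words of context on either side.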



-- 
Miles Barr <[EMAIL PROTECTED]>
Runtime Collective Ltd.


-
To unsubscribe, e-mail: [EMAIL PROTECTED]
For additional commands, e-mail: [EMAIL PROTECTED]

RE: SPAN QUERY [HOW TO]

2005-03-10 Thread Miles Barr

On Thu, 2005-03-10 at 12:02 +0530, Karthik N S wrote:
>  U got it bingo,Am trying to do something similar as u replied.
>  But there is a glitch in the  process
> 
>  If the search is done on the 'leaf_category'  as u said
> 
>  with word such as  'CAMERA DIGITAL'  instead of  'DIGITAL CAMERA'  the
> resultant
> 
>  return hits will be  ZERO '0'. Usage of SpanQuery  for such conditions
> applied should return still
> 
>  the 1st document of 3.
> 
>  A permutation combination of words entered should result in the specific
> document being returned.

It depends on the type of leaf_category. If you made it Keyword as
I suggested then it won't be tokenized, i.e. there's one token 'DIGITAL
CAMERA' instead of the two tokens you normally get, 'digital' and
'camera'.

If you change the field type to Text you should be able to use a
SpanNearQuery to do your search.
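
A rough plain-Java sketch of why the field type matters. It only imitates tokenization with a lower-casing whitespace split (an assumption standing in for an analyzer), ignores positions and slop, and uses made-up names (FieldMatchSketch, keywordTokens, textTokens, matchesAnyOrder); it is not Lucene code.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Locale;
import java.util.Set;

/** Illustration only: imitates Keyword vs. Text tokenization without Lucene. */
final class FieldMatchSketch {

    /** Keyword-style field: the whole value is one unmodified token. */
    static List<String> keywordTokens(String value) {
        return Arrays.asList(value);
    }

    /** Text-style field: lower-cased, whitespace-separated tokens. */
    static List<String> textTokens(String value) {
        return Arrays.asList(value.toLowerCase(Locale.ROOT).trim().split("\\s+"));
    }

    /** True when every query word occurs among the field's tokens, in any order. */
    static boolean matchesAnyOrder(List<String> fieldTokens, String query) {
        Set<String> present = new HashSet<>(fieldTokens);
        return present.containsAll(textTokens(query));
    }
}
```

With the Text-style tokens, 'CAMERA DIGITAL' matches a field holding 'DIGITAL CAMERA'; with the single Keyword-style token it cannot, because neither 'camera' nor 'digital' exists as a token on its own.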

-- 
Miles Barr <[EMAIL PROTECTED]>
Runtime Collective Ltd.


Re: identifier field as keyword or unindexed

2005-03-10 Thread javier muguruza

Thanks Erik,

I will investigate Filters and I'll see then.


On Wed, 9 Mar 2005 14:43:58 -0500, Erik Hatcher
<[EMAIL PROTECTED]> wrote:
> 
> On Mar 9, 2005, at 10:09 AM, javier muguruza wrote:
> > (I sent this to the old list, I don't know whether it reached the
> > list...just in case I repost it)
> >
> > Hi all,
> >
> > We index our documents in the following way:
> >
> > doc = new Document();
> > // mailid
> > doc.add(Field.UnIndexed("mid",mid));
> > //body
> > doc.add(Field.UnStored("body", textb));
> >
> > mid is a unique identifier, and body contains long pieces of text to
> > be indexed.
> >
> > And later make searches on the body field, the mid allows us to find a
> > file on the filesystem with a compressed (and digitally signed)
> > version of the original body indexed.
> > Our way to work in a query in our app is this:
> > 1. first we make a search in a db (for many different reasons) that
> > returns a number (from 0 to thousands) of mid
> > 2. we use lucene to search for some text in many indexes, this returns
> > a second list of mid
> > 3. we return the result as the intersection of both lists.
> >
> > This is working fine right now, but I wonder whether we are using
> > lucene to the fullest, because we could also store mid as a keyword
> > (instead of unindexed), and add the condition (AND mid==[any mid from
> > our step 1]) to the lucene query we run. My questions are:
> >
> > 1. Is there a limit on the number of conditions I can add to a query?
> > Sometimes we have 10 mids, other times we have thousands of them, so we
> > would have to add: AND (mid:mid1 OR mid:mid2 ... OR mid:midN).
> > Probably there is a limit, and we could only apply the mid conditions
> > when the number of mids returned by step 1 is smaller than that limit?
> 
> BooleanQuery has a built-in limit of 1,024 clauses so it would only be
> useful when there is a small number of mids.  Consider using a Filter
> though.  There are some built-in ones, but maybe a custom one is best.
> 
> > 2. As the mid is a unique identifier (I guess lucene does not care
> > about that, right?)
> 
> Right, Lucene doesn't care about field/term uniqueness.
> 
> > , and the condition on the mid would be ANDed to the
> > text query conditions, will it be faster for lucene to look first in
> > the mid field and skip the text lookup if the mid condition is not
> > fulfilled? I don't know whether I am clear enough... Will I get some
> > benefit on the queries by adding some additional conditions, or will the
> > cost of adding another field to index not pay off? Maybe it
> > depends on the number of documents? Maybe it would be best to set mid
> > as a keyword just in case, and add it as conditions later if the
> > searches take too long?
> 
> I doubt you'd even notice the difference.  There is little cost to
> adding the additional field, and looks like you'd benefit from having
> mid as a Keyword.
> 
> Also, with a Filter,  you could use it to bounce to your relational
> database to constrain results based on a set of mids.  Filters are
> designed to be used for multiple queries and cached - keep that in mind
> and maybe it'll work out well in your scenario.
> 
> Erik
> 
> 
> >
> > thanks for any thoughts on that
> >
> 
>
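
For reference, the intersection step from the workflow quoted above (database mids, Lucene mids, intersection) can be sketched in plain Java. The class and method names are hypothetical, the 1,024 figure is simply the BooleanQuery limit Erik cites, and treating the OR group of mids as one BooleanQuery of its own is an assumption of this sketch.

```java
import java.util.Collection;
import java.util.HashSet;
import java.util.LinkedHashSet;
import java.util.Set;

/** Illustration only: the intersection step done outside Lucene. */
final class MidIntersection {

    /** BooleanQuery clause limit as quoted in the thread. */
    static final int MAX_CLAUSES = 1024;

    /** Keeps the Lucene result order and drops every mid the database did not return. */
    static Set<String> intersect(Collection<String> dbMids, Collection<String> luceneMids) {
        Set<String> result = new LinkedHashSet<>(luceneMids);
        result.retainAll(new HashSet<>(dbMids));
        return result;
    }

    /** True when an OR group of this many mid clauses would stay within the limit. */
    static boolean fitsInOneBooleanQuery(int midCount) {
        return midCount <= MAX_CLAUSES;
    }
}
```

When the database returns more mids than fit, the intersection (or a Filter, as suggested) remains the fallback.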


Re: Find version of Lucene library

2005-03-10 Thread Bernhard Messer

+1
  
   Bernhard

Doug Cutting wrote:
Andrzej Bialecki wrote:
Hmmm... would not java.lang.Package various methods do the job?

I'm not sure... I just tried to do 
Package.getPackage("org.apache.lucene") and got null, even though the 
manifest is present in the JAR.

I looked into this.  The package name in the manifest is 
"org/apache/lucene".  But in order for this to work the package name 
in the manifest must: (a) end with a slash; and (b) name a package 
with classes in it, and lucene's classes are all in sub-packages--no 
classes are in "org.apache.lucene".

I've attached a patch that fixes (b) by adding a class in the 
top-level package.  Comments?

Doug

Index: build.xml
===
--- build.xml   (revision 156658)
+++ build.xml   (working copy)
@@ -189,7 +189,7 @@
  excludes="**/*.java">
  

-
+
  
  
  
Index: src/java/org/apache/lucene/package.html
===
--- src/java/org/apache/lucene/package.html (revision 0)
+++ src/java/org/apache/lucene/package.html (revision 0)
@@ -0,0 +1 @@
+Top-level package.
Index: src/java/org/apache/lucene/LucenePackage.java
===
--- src/java/org/apache/lucene/LucenePackage.java   (revision 0)
+++ src/java/org/apache/lucene/LucenePackage.java   (revision 0)
@@ -0,0 +1,28 @@
+package org.apache.lucene;
+
+/**
+ * Copyright 2005 The Apache Software Foundation
+ *
+ * Licensed under the Apache License, Version 2.0 (the "License");
+ * you may not use this file except in compliance with the License.
+ * You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+/** Lucene's package information, including version. **/
+public final class LucenePackage {
+
+  private LucenePackage() {}  // can't construct
+
+  /** Return Lucene's package, including version information. */
+  public static Package get() {
+    return LucenePackage.class.getPackage();
+  }
+}
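
A small sketch of how a caller might read the version once a class in the package is available. It uses only java.lang.Package; the helper name PackageVersion is made up, and whether a version string comes back depends on the JAR manifest and class loader, so the sketch falls back to "unknown".

```java
/** Illustration only: reads the manifest's Implementation-Version for a class's package. */
final class PackageVersion {

    private PackageVersion() {}  // static helper, not meant to be constructed

    /** Returns the implementation version of the package containing cls, or "unknown". */
    static String of(Class<?> cls) {
        Package pkg = cls.getPackage();  // may be null for some class loaders
        String version = (pkg == null) ? null : pkg.getImplementationVersion();
        return (version == null) ? "unknown" : version;
    }
}
```

With the patch applied, PackageVersion.of(LucenePackage.class) would be the intended call; LucenePackage.get().getImplementationVersion() is the direct equivalent without the fallback.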
 



RE: SPAN QUERY [HOW TO]

2005-03-10 Thread Karthik N S

Hi Guys

 Apologies...


I did as you said, but the SpanNearQuery is returning all 3 documents
for permutations of the words 'DIGITAL CAMERAS', instead of returning
only the 1st doc, or none at all when I change the slop factor.

Any more ideas, please? B(

with regards
karthik


-Original Message-
From: Miles Barr [mailto:[EMAIL PROTECTED]
Sent: Thursday, March 10, 2005 2:53 PM
To: java-user@lucene.apache.org
Subject: RE: SPAN QUERY [HOW TO]


On Thu, 2005-03-10 at 12:02 +0530, Karthik N S wrote:
>  U got it bingo,Am trying to do something similar as u replied.
>  But there is a glitch in the  process
>
>  If the search is done on the 'leaf_category'  as u said
>
>  with word such as  'CAMERA DIGITAL'  instead of  'DIGITAL CAMERA'  the
> resultant
>
>  return hits will be  ZERO '0'. Usage of SpanQuery  for such conditions
> applied should return still
>
>  the 1st document of 3.
>
>  A permutation combination of words entered should result in the specific
> document being returned.

It depends on the type of leaf_category. If you made it Keyword as
I suggested then it won't be tokenized, i.e. there's one token 'DIGITAL
CAMERA' instead of the two tokens you normally get, 'digital' and
'camera'.

If you change the field type to Text you should be able to use a
SpanNearQuery to do your search.

--
Miles Barr <[EMAIL PROTECTED]>
Runtime Collective Ltd.



RE: SPAN QUERY [HOW TO]

2005-03-10 Thread Miles Barr

What fields do you have and what are you putting in them?



On Thu, 2005-03-10 at 17:56 +0530, Karthik N S wrote:
> Hi Guys
> 
>  Apologies...
> 
> 
> I did as you said, but the SpanNearQuery is returning all 3 documents
> for permutations of the words 'DIGITAL CAMERAS', instead of returning
> only the 1st doc, or none at all when I change the slop factor.
> 
> Any more ideas, please? B(
> 
> with regards
> karthik
> 
> 
> -Original Message-
> From: Miles Barr [mailto:[EMAIL PROTECTED]
> Sent: Thursday, March 10, 2005 2:53 PM
> To: java-user@lucene.apache.org
> Subject: RE: SPAN QUERY [HOW TO]
> 
> 
> On Thu, 2005-03-10 at 12:02 +0530, Karthik N S wrote:
> >  U got it bingo,Am trying to do something similar as u replied.
> >  But there is a glitch in the  process
> >
> >  If the search is done on the 'leaf_category'  as u said
> >
> >  with word such as  'CAMERA DIGITAL'  instead of  'DIGITAL CAMERA'  the
> > resultant
> >
> >  return hits will be  ZERO '0'. Usage of SpanQuery  for such conditions
> > applied should return still
> >
> >  the 1st document of 3.
> >
> >  A permutation combination of words entered should result in the specific
> > document being returned.
> 
> It depends on the type of leaf_category. If you made it Keyword as
> I suggested then it won't be tokenized, i.e. there's one token 'DIGITAL
> CAMERA' instead of the two tokens you normally get, 'digital' and
> 'camera'.
> 
> If you change the field type to Text you should be able to use a
> SpanNearQuery to do your search.
> 
> --
> Miles Barr <[EMAIL PROTECTED]>
> Runtime Collective Ltd.
> 
> 
> 
-- 
Miles Barr <[EMAIL PROTECTED]>
Runtime Collective Ltd.



search performace

2005-03-10 Thread Michael Celona

I have a large index that needs to yield very fast query times.  I am
sorting by date by default since I am interested in the most recent
documents.  I was wondering whether boosting the score of my documents in
proportion to their date, instead of sorting, would increase search
performance. Thoughts?

 

Thanks,

Michael

problem for the adataption of a xml ranking model

2005-03-10 Thread Nicolas Maisonneuve

Hi,

I'm trying to adapt an XML ranking model for Lucene.

For the moment I'm just playing with the leaf node, i.e. a node
containing data. For Lucene, this node is a search field and the idf
is replaced by ief (inverse element frequency)
ief= log (NumDoc_e)/(NumDoc_e_t+1) +1

NumDoc_e_t = number of doc with element e containing the term t.
NumDoc_e_t=docFreq(new Term(e, t))

NumDoc_e = number of documents with the element e ( = number of
documents with some value in the search field e )
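
As a worked sketch of the formula, assuming it is meant to be parenthesized like Lucene's default idf, i.e. ief = log(NumDoc_e / (NumDoc_e_t + 1)) + 1. That reading is an assumption, and the class and parameter names are illustrative only.

```java
/** Illustration only: ief under the assumed reading log(N_e / (N_e_t + 1)) + 1. */
final class Ief {

    /**
     * @param numDocE  number of documents containing element e
     * @param numDocET number of documents whose element e contains term t
     */
    static double ief(int numDocE, int numDocET) {
        return Math.log(numDocE / (double) (numDocET + 1)) + 1.0;
    }
}
```

Under this reading a term present in nearly every instance of the element scores close to 1, and rarer terms score higher.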

How do you compute NumDoc_e with the Lucene API?

thanks,

nicolas maisonneuve


Re: Score Question

2005-03-10 Thread Luke Shannon

A couple of times.

Luke

- Original Message - 
From: "Erik Hatcher" <[EMAIL PROTECTED]>
To: 
Sent: Wednesday, March 09, 2005 8:03 PM
Subject: Re: Score Question


> Did you reindex after upgrading?
> 
> Erik
> 
> On Mar 9, 2005, at 5:55 PM, Luke Shannon wrote:
> 
> > Hi;
> >
> > Has the scoring changed recently? I just upgraded all the jars in our
> > application (Lucene included).
> >
> > I'm getting scores like this from documents in hits:
> >
> > 6.9699495E-4
> >
> > The XSL that creates the user interface converts the score to an int
> > and then displays it. This currently results in a zero for all scores.
> >
> > I'm being told that the XSL has always done this and everything worked
> > fine before I did this upgrade.
> >
> > I made a few changes to the query parsing and searching logic.
> >
> > I'm thinking I have either:
> > 1. Done something to screw up the scores being returned (is this
> > possible?)
> > 2. The XSL did do something with this value before converting it to an
> > int and somehow that code has been misplaced.
> > 3. Scoring has changed in the last version of Lucene and I need to
> > multiply the score by some factor to make it more int friendly.
> >
> > Can someone shed some light on this please?
> >
> > Thanks,
> >
> > Luke
> >
> >
> >
> >

Re: Score Question

2005-03-10 Thread Luke Shannon

I think I've found my problem. In the example I'm having the problem with I
do a multiple field query.  I think I need to play with my boosting factors.

This is the section of the book that I think will lead to a resolution to my
problem:

In addition to the explicit factors in this equation, other factors can be
computed on a per-query basis as part of the queryNorm factor. Queries
themselves can have an impact on the document score. Boosting a Query
instance is sensible only in a multiple-clause query; if only a single term
is used for searching, boosting it would boost all matched documents
equally. In a multiple-clause boolean query, some documents may match one
clause but not another, enabling the boost factor to discriminate between
queries. Queries also default to a 1.0 boost factor.

Luke




help with query format design

2005-03-10 Thread Omar Didi

Hi folks,

I have a task where I need to read the query entered by the user, add many
other terms to it in a boolean expression, and get the count of each clause
in a different field.
For example: if the user enters: red, I need to take red and generate the
following query: (red AND blue) OR (red AND green) OR (red AND yellow). I
need to get a count of how many returned documents contain (red AND blue)
in the "url" field, and the count in "content", and so on (I have 5
different fields). I will have to do this for every clause in the query that
I generate and populate the results in an XML document.
If I just have one long query, is there a way to get the counts from the Hits
the way I want, or shall I have multiple queries?
Will the termDocs and docFreq methods be helpful?
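
One way to picture the "multiple queries" option is to generate a separate query string per clause and per field, then count the hits of each. The sketch below only builds the strings; the class name, the companion terms and the field names are placeholders, and running each string through a query parser and reading the hit count is left to the caller.

```java
import java.util.ArrayList;
import java.util.List;

/** Illustration only: builds one query string per (companion term, field) pair. */
final class ClauseQueries {

    /** For "red", ["blue"], ["url"] this yields "url:red AND url:blue". */
    static List<String> build(String userTerm, List<String> companions, List<String> fields) {
        List<String> queries = new ArrayList<>();
        for (String companion : companions) {
            for (String field : fields) {
                queries.add(field + ":" + userTerm + " AND " + field + ":" + companion);
            }
        }
        return queries;
    }
}
```

Three companion terms and five fields give fifteen small queries, each of which yields exactly one of the per-clause, per-field counts.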

thanks



what is the index compression technique in lucene ?

2005-03-10 Thread Nicolas Maisonneuve

Hi,

I would just like to know which index compression technique is used in
Lucene. Where can I find some information about this?

thanks in advance,
nicolas


RE: what is the index compression technique in lucene ?

2005-03-10 Thread Tate Avery

This might be what you are looking for...
http://lucene.apache.org/java/docs/fileformats.html

-Original Message-
From: Nicolas Maisonneuve [mailto:[EMAIL PROTECTED]
Sent: Thursday, March 10, 2005 12:21 PM
To: Lucene Users List
Subject: what is the index compression technique in lucene ?

Hi,

I would just like to know which index compression technique is used in
Lucene. Where can I find some information about this?

thanks in advance,
nicolas


highlighter and phrase search

2005-03-10 Thread Yura Smolsky

Hello, java-user.

I have two documents:
1. content:A V A B
2. content:A B C D
When I search for content:"A B" (exact phrase search) with
StandardAnalyzer() and use the Highlighter, I receive the following
highlighted results:

1. _A_ V _A B_
2. _A B_ C D

Actually the first "A" in the first result does not need to be highlighted
b/c the requested phrase was "A B".

Is this a bug, or am I misunderstanding something?

Yura Smolsky.



Re: highlighter and phrase search

2005-03-10 Thread markharw00d

The short answer is "no", there is no support for this currently.
Implementing this support is possible but fiddly; there is a related
discussion here which outlines some of the challenges:
   http://www.mail-archive.com/lucene-user@jakarta.apache.org/msg12435.html

Cheers,
Mark
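
To illustrate the difference between term-level and phrase-level highlighting on the example from the question, here is a plain-Java sketch over pre-split tokens. It is not how the contrib Highlighter is implemented; the class and method names are hypothetical, and it handles only exact consecutive phrases (no slop).

```java
import java.util.Arrays;
import java.util.List;

/** Illustration only: term-level vs. phrase-level marking of tokens. */
final class PhraseHighlightSketch {

    /** Term-level: marks every token that equals any word of the phrase. */
    static String highlightTerms(List<String> tokens, List<String> phrase) {
        StringBuilder sb = new StringBuilder();
        for (String token : tokens) {
            if (sb.length() > 0) {
                sb.append(' ');
            }
            sb.append(phrase.contains(token) ? "_" + token + "_" : token);
        }
        return sb.toString();
    }

    /** Phrase-level: marks tokens only where the whole phrase occurs consecutively. */
    static String highlightPhrase(List<String> tokens, List<String> phrase) {
        boolean[] hit = new boolean[tokens.size()];
        for (int i = 0; i + phrase.size() <= tokens.size(); i++) {
            if (tokens.subList(i, i + phrase.size()).equals(phrase)) {
                Arrays.fill(hit, i, i + phrase.size(), true);
            }
        }
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < tokens.size(); i++) {
            if (i > 0) {
                sb.append(' ');
            }
            sb.append(hit[i] ? "_" + tokens.get(i) + "_" : tokens.get(i));
        }
        return sb.toString();
    }
}
```

The term-level variant reproduces the behaviour reported in the question (the stray first "A" is marked), while the phrase-level variant leaves it alone.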


Boost/Scoring Question

2005-03-10 Thread Luke Shannon

Hello;

This may be a trivial question, but it has me stuck.

I'm getting some really small scores:

8.799379E-4

I need to figure out why they are so small.

I think it is a problem which can be resolved using boosting.

I'm not sure how to boost given the system I have. The fields I query
against are not known at the time I create the query, nor are they known
when I index.

Here is how I create a query (the names of the fields I query against are
returned from the ContentFields object):

private static Query parseKeywordsQuery(ArrayList queryData) throws Exception {
    // get all the content fields
    TreeSet sFields = ContentFields.getInstance().getFields();
    String[] fields = (String[]) sFields.toArray(new String[sFields.size()]);
    // get the argument
    String arg = (String) queryData.get(1);
    // create a query that looks for the argument in every content field
    return MultiFieldQueryParser.parse(arg, fields, new StandardAnalyzer());
}

Here is the query created for sub-brand:

TEXT:"sub brand" TEXT2:"sub brand" active:"sub brand" anouncement:"sub
brand" businessphone:"sub brand" cellphone:"sub brand" cname:"sub brand"
contents:"sub brand" definition:"sub brand" desc:"sub brand" email:"sub
brand" fax:"sub brand" help_image:"sub brand" homephone:"sub brand"
hotwords:"sub brand" id:"sub brand" jobtitle:"sub brand" kcfileupload:"sub
brand" kcpreviewupload:"sub brand" keywords:"sub brand" level:"sub brand"
level_association:"sub brand" mac_file:"sub brand" name:"sub brand"
note:"sub brand" pc_file:"sub brand" question:"sub brand" sort:"sub brand"
stylesheet:"sub brand" thumbnail:"sub brand" uncomp_ext:"sub brand"
urgent:"sub brand" weblink:"sub brand"

I get 22 results, but their scores are all on the order of 1E-4.

Is there anything I can do to resolve this?
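
Separately from boosting, if the scores only need to survive a conversion to int for display (as with the XSL mentioned in the earlier Score Question thread), one option is to scale them relative to the top score first. This is a hedged plain-Java sketch of that idea, not something proposed in the thread; ScoreScaling and toPercent are made-up names.

```java
/** Illustration only: scales raw scores to 0..100 relative to the best hit. */
final class ScoreScaling {

    /** Returns each score as a rounded percentage of the highest score. */
    static int[] toPercent(float[] scores) {
        float max = 0f;
        for (float score : scores) {
            max = Math.max(max, score);
        }
        int[] percent = new int[scores.length];
        if (max <= 0f) {
            return percent;  // nothing to scale against; leave everything at 0
        }
        for (int i = 0; i < scores.length; i++) {
            percent[i] = Math.round(100f * scores[i] / max);
        }
        return percent;
    }
}
```

A direct cast such as (int) 8.799379E-4f truncates to 0, which matches the all-zero display; scaling keeps the relative order visible but says nothing about why the raw scores are small.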

Luke




Highlighter compile error

2005-03-10 Thread Andy Roberts

I've searched the archives for this error, but found no matches...

I'm trying to get hold of the Highlighter code as this could be relevant 
to my earlier post. I've checked out the highlight repo to my PC and 
tried to build.

I get the following error:

$ ant
Buildfile: build.xml

init:
 [echo] Building highlighter

compile:
[javac] Compiling 17 source files to /home/andyr/programming/java/lucene/highlighter/build/classes

[javac] /home/andyr/programming/java/lucene/highlighter/src/java/org/apache/lucene/search/highlight/TokenSources.java:19: cannot find symbol
[javac] symbol  : class TermVectorOffsetInfo
[javac] location: package org.apache.lucene.index
[javac] import org.apache.lucene.index.TermVectorOffsetInfo;
[javac]^

[javac] /home/andyr/programming/java/lucene/highlighter/src/java/org/apache/lucene/search/highlight/TokenSources.java:124: cannot find symbol
[javac] symbol  : class TermVectorOffsetInfo
[javac] location: class org.apache.lucene.search.highlight.TokenSources
[javac] TermVectorOffsetInfo[] offsets=tpv.getOffsets(t);
[javac] ^

[javac] /home/andyr/programming/java/lucene/highlighter/src/java/org/apache/lucene/search/highlight/TokenSources.java:124: cannot find symbol
[javac] symbol  : method getOffsets(int)
[javac] location: interface org.apache.lucene.index.TermPositionVector
[javac] TermVectorOffsetInfo[] offsets=tpv.getOffsets(t);
[javac]   ^
[javac] Note: Some input files use unchecked or unsafe operations.
[javac] Note: Recompile with -Xlint:unchecked for details.
[javac] 3 errors

BUILD FAILED
/home/andyr/programming/java/lucene/common.xml:107: Compile failed; see the 
compiler error output for details.

It may not be obvious to those not using fixed-width fonts, but basically it
can't find the TermVectorOffsetInfo class. Which is hardly surprising, since
it doesn't seem to exist! I've also downloaded and successfully built the
code in the lucene-1.4.2-dev branch, but that doesn't contain that class
either!

Any hints? Google didn't shed any light, btw.

Cheers,
Andy Roberts


case insensitive searches

2005-03-10 Thread Srimant Mishra

Hi all,

I have a field that has been populated as a keyword, e.g.
populated via doc.add(Field.Keyword("ID", "Xyz Abc"));

Is it possible to perform a case-insensitive query, so that if
I do a search for xyz, the document is returned?

I know that this is possible for UnStored fields as they are
stored in lower case.

-Srimant

Test Fails

2005-03-10 Thread Hari Kodungallur

Hi,

FYI:

The Lucene test fails with the following error:

compile-test:
[mkdir] Created dir: /opt/lucene/lucene/build/classes/test
[javac] Compiling 79 source files to /opt/lucene/lucene/build/classes/test
[javac] /opt/lucene/lucene/src/test/org/apache/lucene/index/TermInfosTest.java:89: cannot resolve symbol
[javac] symbol  : constructor TermInfosWriter (org.apache.lucene.store.Directory,java.lang.String,org.apache.lucene.index.FieldInfos)
[javac] location: class org.apache.lucene.index.TermInfosWriter
[javac] TermInfosWriter writer = new TermInfosWriter(store, "words", fis);


I see that TermInfosWriter was changed yesterday to add another
argument to its constructor.

(I am new to this list; could you please reply-all? Mails to
[EMAIL PROTECTED] are bouncing back and so I
could not subscribe to the list)

Thanks much!
-Hari


Re: case insensitive searches

2005-03-10 Thread Otis Gospodnetic

What typically makes searches case insensitive are Analyzers that
lowercase/normalize tokens, perhaps with LowerCaseTokenizers.  Since
Field.Keyword doesn't get analyzed, you'd have to manually
normalize/lowercase field values before indexing, or just add the raw +
the normalized value under the same field name.  It looks like you'll
have to reindex.
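
A minimal sketch of the manual normalization step, to be applied both when indexing and when building the query term. The helper name KeywordNormalizer is made up; only the Field.Keyword call in the comment comes from the original question.

```java
import java.util.Locale;

/** Illustration only: one normalization used for both indexed values and query terms. */
final class KeywordNormalizer {

    private KeywordNormalizer() {}  // static helper, not meant to be constructed

    /** Trims and lower-cases with a fixed locale so results do not depend on the JVM default. */
    static String normalize(String raw) {
        return raw.trim().toLowerCase(Locale.ROOT);
    }

    // At index time (call taken from the question, shown as a comment only):
    //   doc.add(Field.Keyword("ID", KeywordNormalizer.normalize("Xyz Abc")));
    // At search time, normalize the user's input the same way before building the term.
}
```

Note that the keyword value is still a single token, so after normalization a search for "xyz abc" would match, while "xyz" on its own would presumably need a prefix-style query or a tokenized field.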

Otis

--- Srimant Mishra <[EMAIL PROTECTED]> wrote:
> Hi all, 
> 
>  
> 
>   I have a field that has been populated as a keyword
> e.g.
> populated via doc.add(Field.Keyword("ID", "Xyz Abc"));
> 
>  
> 
>   Is it possible to perform a case insensitive query that
> is if
> I do a search for xyz, the document is returned.
> 
>  
> 
>  
> 
>   I know that this is possible for UnStored fields as
> they are
> stored in lower case formats. 
> 
>  
> 
>  
> 
> -Srimant
> 
> 


Re: Highlighter compile error

2005-03-10 Thread Otis Gospodnetic

Andy,

Judging from your path, it looks like you didn't check things out of
SVN.

You'll need a SVN client, and then you could:

  svn co http://svn.apache.org/repos/asf/lucene/java/trunk/

In there you will see a contrib/ directory, and highlighter underneath
it.  Running ant from there builds the Highlighter without errors for me
now.

Andy Roberts is a familiar name.  jTokeniser?  I just 'Simpied' you
earlier today: http://www.simpy.com/simpy/User.do?username=otis

Otis


--- Andy Roberts <[EMAIL PROTECTED]> wrote:
> I've search the archives for this error, but it reported no
> matches...
> 
> I'm trying to get hold of the Highlighter code as this could be
> relevant 
> to my earlier post. I've checked out the highlight repo to my PC and 
> tried to build.
> 
> I get the following error:
> 
> $ ant
> Buildfile: build.xml
> 
> init:
>  [echo] Building highlighter
> 
> compile:
> [javac] Compiling 17 source files 
> to /home/andyr/programming/java/lucene/hig
> hlighter/build/classes
> 
> [javac]
> /home/andyr/programming/java/lucene/highlighter/src/java/org/apache/
> lucene/search/highlight/TokenSources.java:19: cannot find symbol
> [javac] symbol  : class TermVectorOffsetInfo
> [javac] location: package org.apache.lucene.index
> [javac] import org.apache.lucene.index.TermVectorOffsetInfo;
> [javac]^
> 
> [javac]
> /home/andyr/programming/java/lucene/highlighter/src/java/org/apache/
> lucene/search/highlight/TokenSources.java:124: cannot find symbol
> [javac] symbol  : class TermVectorOffsetInfo
> [javac] location: class
> org.apache.lucene.search.highlight.TokenSources
> [javac] TermVectorOffsetInfo[]
> offsets=tpv.getOffsets(t);
> [javac] ^
> 
> [javac]
> /home/andyr/programming/java/lucene/highlighter/src/java/org/apache/
> lucene/search/highlight/TokenSources.java:124: cannot find symbol
> [javac] symbol  : method getOffsets(int)
> [javac] location: interface
> org.apache.lucene.index.TermPositionVector
> [javac] TermVectorOffsetInfo[]
> offsets=tpv.getOffsets(t);
> [javac]   ^
> [javac] Note: Some input files use unchecked or unsafe
> operations.
> [javac] Note: Recompile with -Xlint:unchecked for details.
> [javac] 3 errors
> 
> BUILD FAILED
> /home/andyr/programming/java/lucene/common.xml:107: Compile failed;
> see the 
> compiler error output for details.
> 
> It may not be obvious to those not using fixed-width fonts, but
> basically it 
> can't find the 
> TermVectorOffsetInfo class. Which is hardly surprising, since it
> doesn't seem 
> to exist! I've
> also downloaded and successfully built the code in the
> lucene-1.4.2-dev 
> branch,
> but that doesn't contain that class either!
> 
> Any hints? Google didn't shed any light, btw.
> 
> Cheers,
> Andy Roberts
> 