on big indexes.
* <p>
* Default: false.
*/
@Override
public void setAllowLeadingWildcard(boolean allowLeadingWildcard) {
  this.allowLeadingWildcard = allowLeadingWildcard;
}
And the default is false (leading wildcards are not allowed).
-- Jack Krupansky
-Original Message-
From: Carlos de Luna Saenz
Out of curiosity, what is your use case? I mean, the normal use of this
filter is to permit a shorthand reference to a long term, but why would
you necessarily want to preclude direct reference to the full term?
-- Jack Krupansky
-Original Message-
From: Alex Parvulescu
Sent
It sounds like you either need to have a custom analyzer or a field-aware
analyzer.
-- Jack Krupansky
-Original Message-
From: Scott Smith
Sent: Tuesday, September 17, 2013 4:26 PM
To: java-user@lucene.apache.org
Subject: Can you escape characters you don't want the analyzer
The trailing asterisk in your query input is escaped with a backslash, so
the query parser will not treat it as a wildcard.
-- Jack Krupansky
-Original Message-
From: Ankit Murarka
Sent: Thursday, September 12, 2013 10:19 AM
To: java-user@lucene.apache.org
Subject: Query type always
You're not escaping white space, so your input will be a sequence of terms,
which should generate a BooleanQuery. What is the last clause of the BQ? It
should be your PrefixQuery.
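A minimal sketch of why unescaped white space produces several clauses. This is plain Java, not Lucene code: ClauseSketch, splitTerms and endsWithUnescapedStar are hypothetical names, and the logic only mimics the white space and backslash handling described above, not the real query parser grammar.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper, not part of Lucene: it only imitates how unescaped
// white space separates terms and how a backslash protects a character.
class ClauseSketch {

    // Split raw query input on white space, keeping backslash-escaped
    // characters (including escaped spaces) inside the current term.
    static List<String> splitTerms(String input) {
        List<String> terms = new ArrayList<>();
        StringBuilder current = new StringBuilder();
        for (int i = 0; i < input.length(); i++) {
            char c = input.charAt(i);
            if (c == '\\' && i + 1 < input.length()) {
                // Keep the backslash and the character it protects together.
                current.append(c).append(input.charAt(++i));
            } else if (Character.isWhitespace(c)) {
                if (current.length() > 0) {
                    terms.add(current.toString());
                    current.setLength(0);
                }
            } else {
                current.append(c);
            }
        }
        if (current.length() > 0) {
            terms.add(current.toString());
        }
        return terms;
    }

    // True when the term ends with a '*' that is not escaped, i.e. the '*'
    // is preceded by an even number of backslashes.
    static boolean endsWithUnescapedStar(String term) {
        if (!term.endsWith("*")) {
            return false;
        }
        int backslashes = 0;
        for (int i = term.length() - 2; i >= 0 && term.charAt(i) == '\\'; i--) {
            backslashes++;
        }
        return backslashes % 2 == 0;
    }
}
```

With this sketch, the input `error in wor*` splits into three terms and only the last one ends in an unescaped asterisk, while `error\ in\ wor*` stays a single term.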
-- Jack Krupansky
-Original Message-
From: Ankit Murarka
Sent: Thursday, September 12, 2013 11:25 AM
want, then you need to escape it.
-- Jack Krupansky
-Original Message-
From: Ankit Murarka
Sent: Thursday, September 12, 2013 11:36 AM
To: java-user@lucene.apache.org
Subject: Re: Query type always Boolean Query even if * and ? are present.
Bingo! This has solved my case... Thanks a ton
Please send Solr-related inquiries to the Solr user list - this is the
Lucene (Java) user list.
-- Jack Krupansky
-Original Message-
From: Manuel Le Normand
Sent: Sunday, September 08, 2013 7:03 AM
To: java-user@lucene.apache.org
Subject: Profiling Solr Lucene for query
Hello all
The limit of 2 is hard-coded precisely because good performance for editing
distances above 2 cannot be guaranteed.
-- Jack Krupansky
-Original Message-
From: Michael Tobias
Sent: Wednesday, August 14, 2013 1:00 AM
To: java-user@lucene.apache.org
Subject: Fuzzy Searching on Lucene
Check out the discussion on:
https://issues.apache.org/jira/browse/LUCENE-4583
StraightBytesDocValuesField fails if bytes > 32k
-- Jack Krupansky
-Original Message-
From: Nicolas Guyot
Sent: Friday, August 09, 2013 5:57 PM
To: java-user@lucene.apache.org
Subject: DV limited to 32766
it would be nice to be able to emit classic Lucene query parser queries
where possible
Yeah, but then we hit the problem of the Query terms having been through
analysis.
Maybe it would be nice if we had query syntax to indicate that terms had
already been analyzed.
-- Jack Krupansky
Why not start with something simple? Like, index each log line as a
tokenized text field and then do PhraseQuery against that text field? Is
there something else you need beyond that?
-- Jack Krupansky
-Original Message-
From: Ankit Murarka
Sent: Saturday, August 03, 2013 3:22 AM
to
serialization (I've done something similar myself.) This is what is output
in the parsedquery section of debugQuery output for a Solr query response.
-- Jack Krupansky
-Original Message-
From: Denis Bazhenov
Sent: Sunday, July 28, 2013 1:59 AM
To: java-user@lucene.apache.org
Subject
PerFieldAnalyzerWrapper
http://lucene.apache.org/core/4_4_0/analyzers-common/org/apache/lucene/analysis/miscellaneous/PerFieldAnalyzerWrapper.html
This analyzer is used to facilitate scenarios where different fields
require different analysis techniques.
-- Jack Krupansky
-Original
though it matches everything - no
scoring.
-- Jack Krupansky
-Original Message-
From: Arjen van der Meijden
Sent: Thursday, July 25, 2013 3:06 PM
To: java-user@lucene.apache.org
Subject: Re: Performance measurements
Hi Sriram,
I don't see any obvious mistakes, although you don't need
, but more
than that is uncharted territory that risks queries taking more than half
a second or even multiple seconds and requires a proof of concept
implementation to validate reasonable query times.
-- Jack Krupansky
-Original Message-
From: Sriram Sankar
Sent: Wednesday, July 24
-search.
As Adrien indicates, try using raw Lucene filters and you should get much
better results. Whether even that will compete with a use-case-specific
(graph) search engine remains to be seen.
-- Jack Krupansky
-Original Message-
From: Sriram Sankar
Sent: Wednesday, July 24, 2013
be wrapped as a CSQ for search so that no scoring would be done.
-- Jack Krupansky
-Original Message-
From: Sriram Sankar
Sent: Wednesday, July 24, 2013 3:58 PM
To: java-user@lucene.apache.org
Subject: Re: Performance measurements
On Wed, Jul 24, 2013 at 10:24 AM, Jack Krupansky
j
and manipulate and then regenerate a true
source query that doesn't have analysis or enrichment (except as the
application may explicitly have performed on the tree.)
-- Jack Krupansky
-Original Message-
From: Beale, Jim (US-KOP)
Sent: Tuesday, July 23, 2013 10:07 AM
To: java-user
text at query time. What is the exact query text and what are the exact
analyzer tokens for that query text and how many are there?
-- Jack Krupansky
-Original Message-
From: Ankit Murarka
Sent: Monday, July 22, 2013 10:29 AM
To: java-user@lucene.apache.org
Subject: Re: Trying to search
to forget to do that. If you don't, then
you will have to hand-analyze the query string and simulate exactly what the
standard analyzer did at index time.
So, please clarify your situation.
-- Jack Krupansky
-Original Message-
From: Ankit Murarka
Sent: Monday, July 22, 2013 6:24 AM
To: java
Sorry, but you need to resend this message to the Solr user list - this is
the Lucene user list.
-- Jack Krupansky
-Original Message-
From: Beale, Jim (US-KOP)
Sent: Thursday, July 18, 2013 12:34 PM
To: java-user@lucene.apache.org
Subject: Indexing into SolrCloud
Hey folks,
I've
it, or are
you using some other analyzer?
-- Jack Krupansky
-Original Message-
From: ABlaise
Sent: Thursday, July 18, 2013 9:19 PM
To: java-user@lucene.apache.org
Subject: Searching for words begining with or
Hi everyone,
I am new to this forum, I have made some research for my question
(
a, an, and, are, as, at, be, but, by,
for, if, in, into, is, it,
no, not, of, on, or, such,
that, the, their, then, there, these,
they, this, to, was, will, with
);
And... "or" is on that list! So, the standard analyzer is removing "or" from
the index! That's why the query can't find it.
Unless you really want these stop words removed, construct your own analyzer
that does not do stop word removal.
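To make the effect concrete, here is a rough sketch in plain Java. It is only a stand-in for what a stop filter does, not Lucene's analyzer; StopWordSketch and analyze are hypothetical names, and the tokenization (lowercase, split on white space) is a simplifying assumption.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Locale;
import java.util.Set;

// Illustration only: imitates stop word removal, not Lucene's implementation.
class StopWordSketch {

    // The stop word list quoted above.
    static final Set<String> STOP_WORDS = new HashSet<>(Arrays.asList(
        "a", "an", "and", "are", "as", "at", "be", "but", "by",
        "for", "if", "in", "into", "is", "it",
        "no", "not", "of", "on", "or", "such",
        "that", "the", "their", "then", "there", "these",
        "they", "this", "to", "was", "will", "with"));

    // Lowercase the text, split it on white space, and optionally drop
    // every token that appears in the stop word list.
    static List<String> analyze(String text, boolean removeStopWords) {
        List<String> tokens = new ArrayList<>();
        for (String token : text.toLowerCase(Locale.ROOT).trim().split("\\s+")) {
            if (token.isEmpty()) {
                continue;
            }
            if (removeStopWords && STOP_WORDS.contains(token)) {
                continue;
            }
            tokens.add(token);
        }
        return tokens;
    }
}
```

Calling `analyze("black or white", true)` keeps only "black" and "white", so a later search for "or" has nothing to match; with `removeStopWords` set to false, "or" survives.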
-- Jack Krupansky
-Original Message-
From: ABlaise
Sent
class. Unfortunately, that has less Javadoc, although it does cite a
key paper on that approach.
-- Jack Krupansky
-Original Message-
From: Erick Erickson
Sent: Wednesday, July 17, 2013 8:17 AM
To: java-user
Subject: Re: What is text searching algorithm in Lucene 4.3.1
Note: as of Lucene
:
http://en.wikipedia.org/wiki/Query_expansion
Lucid:
http://docs.lucidworks.com/display/help/Unsupervised+Feedback+Options
http://docs.lucidworks.com/display/lweug/Understanding+and+Improving+Relevance#UnderstandingandImprovingRelevance-UnsupervisedFeedback
-- Jack Krupansky
-Original Message
The source code is what most people use to understand how Lucene actually
works. In some cases the Javadoc comments will point to published papers or
web sites for algorithms or approaches.
-- Jack Krupansky
-Original Message-
From: Vinh Đặng
Sent: Tuesday, July 16, 2013 10:54 PM
that anybody on this mailing
list would engage in such an unethical or unprofessional activity.
-- Jack Krupansky
-Original Message-
From: Ramakrishna
Sent: Monday, July 15, 2013 9:13 AM
To: java-user@lucene.apache.org
Subject: Re: [ANNOUNCE] Web Crawler
Hi..
I'm trying nutch to crawl
into underlying Lucene concepts for Solr users, such as the
structure of Query objects and tokenization and token filtering, mostly
since advanced Solr users run into issues there, but those areas are
difficult for Lucene users as well.
My e-book:
http://www.lulu.com/shop/jack-krupansky/solr-4x-deep-dive
are still valid.
-- Jack Krupansky
-Original Message-
From: Ivan Brusic
Sent: Wednesday, July 10, 2013 10:41 AM
To: java-user@lucene.apache.org
Subject: Re: Lucene in Action
Jack, don't you also have a book coming out on O'Reilly?
http://shop.oreilly.com/product/0636920028765.do
Lucene
To be clear, Lucene and Solr are search engines, NOT storage engines.
Has someone claimed otherwise to you?
What is your query performance in 4.x vs. 3.x? That's the true, proper
measure of Lucene and Solr performance.
-- Jack Krupansky
-Original Message-
From: Chris Zhang
Sent
of the getters, you create a new Query object of
the current type (the type you referenced in the instanceof) and return
that new Query object. Recursion would return the new Query object.
-- Jack Krupansky
-Original Message-
From: Puneet Pawaia
Sent: Saturday, July 06, 2013 12:54 PM
There is a Lucene filter that you can use to check efficiently for whether a
field has a value or not.
new ConstantScoreQuery(new FieldValueFilter(String field, boolean negate))
-- Jack Krupansky
-Original Message-
From: David Carlton
Sent: Wednesday, July 03, 2013 4:27 PM
To: java
Try asking your question on the “Solr user” email list – this is the Lucene
user list!
-- Jack Krupansky
From: Adrien RUFFIE
Sent: Monday, July 01, 2013 4:36 AM
To: java-user@lucene.apache.org
Subject: highlighting component to searchComponent
Hello all I had the following configuration
The user could use a regular expression query to match the numbers, but
otherwise, you will have to write some specialized token filter to recognize
numeric tokens and generate extra tokens at the same position for each token
variant that you want to search for.
-- Jack Krupansky
sending the document to Solr. Tika also has language detection, so you
could call Tika from an external process before sending the document to
Solr.
-- Jack Krupansky
-Original Message-
From: Hang Mang
Sent: Thursday, June 27, 2013 11:45 AM
To: java-user@lucene.apache.org
Subject
Oops... sorry, I just realized this was on the Lucene-user list. My response
was for Solr-ONLY!
-- Jack Krupansky
-Original Message-
From: Jack Krupansky
Sent: Thursday, June 27, 2013 1:11 PM
To: java-user@lucene.apache.org
Subject: Re: Language detection
You can use
do a better job with punctuation.
-- Jack Krupansky
-Original Message-
From: Todd Hunt
Sent: Thursday, June 27, 2013 1:14 PM
To: java-user@lucene.apache.org
Subject: Questions about doing a full text search with numeric values
I am working on an application that is using Tika to index
is to retrieve an encrypted blob based on an
encrypted key, why are you even considering Lucene?
-- Jack Krupansky
-Original Message-
From: Rafaela Voiculescu
Sent: Wednesday, June 26, 2013 5:06 AM
To: java-user@lucene.apache.org
Subject: Re: Securing stored data using Lucene
Hello,
Thank you
Try starting with Solr. You can have your search server up and running
without writing any code. And Solr's Data Import Handler can load data
direct from the database.
-- Jack Krupansky
-Original Message-
From: raghavendra.k@barclays.com
Sent: Monday, June 17, 2013 5:03 PM
technique for detecting plagiarism where a lot of the
text is similar if not identical.
Once you get experience using this technique in Solr, then simply look at
the parsed query that edismax generates and do the same in your Lucene Java
code.
-- Jack Krupansky
-Original Message-
From
code using Lucene. Otherwise, you won't have enough
context to understand or even ask intelligent questions.
-- Jack Krupansky
-Original Message-
From: nikhil desai
Sent: Monday, June 10, 2013 1:24 PM
To: java-user@lucene.apache.org
Subject: Lucene Indexes explanantion
Hello,
My
Your stored value could be very different from your indexed (searchable)
value. You can also associate payloads with an indexed term. And there are
DocValues as well.
-- Jack Krupansky
-Original Message-
From: nikhil desai
Sent: Monday, June 10, 2013 8:06 PM
To: java-user
:
- Indexed terms
- Stored values
- Payloads
- DocValues
-- Jack Krupansky
-Original Message-
From: nikhil desai
Sent: Monday, June 10, 2013 8:36 PM
To: java-user@lucene.apache.org
Subject: Re: Lucene Indexes explanantion
I don't think I could get much from what you said, could you please
It might be nice to inquire as to the largest position for a field in a
document. Is that information kept anywhere? Not that I know of, although I
suppose it can be calculated at runtime by running through all the terms of
the field. Then he could just divide by 1000.
-- Jack Krupansky
Take a look at the Term Vectors Component:
http://wiki.apache.org/solr/TermVectorComponent
-- Jack Krupansky
-Original Message-
From: Igor Shalyminov
Sent: Thursday, May 23, 2013 9:54 AM
To: java-user@lucene.apache.org
Subject: Re: Getting position increments directly from
If you add a special end of document term then some of these calculations
might be easier.
And, give that special term a payload of the sentence count.
While you're at it, insert end of sentence terms that could have a
payload of the sentence number.
-- Jack Krupansky
-Original
to trim exterior white space and normalize interior white
space.
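A small sketch of that kind of normalization in plain Java. KeyNormalizer is a hypothetical helper, and the lowercasing step is an assumption taken from the subject line (case insensitivity); the same function would have to be applied to the value at index time and to the term at query time.

```java
import java.util.Locale;

// Hypothetical helper: normalizes a value before it is indexed or queried
// as a single untokenized string.
class KeyNormalizer {

    // Trim exterior white space, collapse interior runs of white space to a
    // single space, and lowercase (the lowercasing is an assumption here).
    static String normalize(String raw) {
        return raw.trim().replaceAll("\\s+", " ").toLowerCase(Locale.ROOT);
    }
}
```

For example, `normalize("  New   York ")` and `normalize("new york")` produce the same string, so an exact-match lookup on the normalized value would treat them as equal.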
-- Jack Krupansky
-Original Message-
From: Shahak Nagiel
Sent: Tuesday, May 21, 2013 10:06 AM
To: java-user@lucene.apache.org
Subject: Case insensitive StringField?
It appears that StringField instances are treated
Just escape embedded spaces with a backslash.
-- Jack Krupansky
-Original Message-
From: Ross Simpson
Sent: Tuesday, May 21, 2013 8:08 PM
To: java-user@lucene.apache.org
Subject: Query with phrases, wildcards and fuzziness
Hi all,
I'm trying to create a fairly complex query
Yes it is. It always will. But... you can escape the spaces with a
backslash:
Query q = qp.parse("new\\ york");
-- Jack Krupansky
-Original Message-
From: Shahak Nagiel
Sent: Tuesday, May 21, 2013 10:09 PM
To: java-user@lucene.apache.org
Subject: Re: Case insensitive StringField?
Jack
Yeah, just go ahead and escape the slash, either with a backslash or by
enclosing the whole term in quotes. Otherwise the slash (even embedded in
the middle of a term!) indicates the start of a regex query term.
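Both options can be sketched in plain Java. QueryEscapeSketch, escapeTerm and quoteTerm are hypothetical names, and the set of special characters is an assumption modeled on the classic query parser syntax; the classic QueryParser also offers its own static escape method, which is the safer choice in real code.

```java
// Hypothetical helper, not Lucene code: shows the two ways of protecting a
// term that contains query syntax characters such as '/'.
class QueryEscapeSketch {

    // Characters treated as query syntax (an assumption, see the lead-in).
    private static final String SPECIAL = "\\+-!():^[]\"{}~*?|&/";

    // Option 1: put a backslash in front of every special character.
    // White space is escaped too, so the term stays in one piece.
    static String escapeTerm(String term) {
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < term.length(); i++) {
            char c = term.charAt(i);
            if (SPECIAL.indexOf(c) >= 0 || Character.isWhitespace(c)) {
                sb.append('\\');
            }
            sb.append(c);
        }
        return sb.toString();
    }

    // Option 2: enclose the whole term in double quotes, escaping only
    // embedded backslashes and quotes.
    static String quoteTerm(String term) {
        return "\"" + term.replace("\\", "\\\\").replace("\"", "\\\"") + "\"";
    }
}
```

Here `escapeTerm("a/b")` yields `a\/b` and `quoteTerm("a/b")` yields `"a/b"`. Note that quoting changes how the text is parsed (it becomes a phrase), so the two options are not always interchangeable.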
-- Jack Krupansky
-Original Message-
From: Scott Smith
Sent: Sunday
Revolution, but my proposal was not accepted.)
See:
http://www.datastax.com/what-we-offer/products-services/datastax-enterprise
As it says, DataStax Enterprise is completely free for development work.
-- Jack Krupansky
-Original Message-
From: Rider Carrion Cleger
Sent: Tuesday, May 14, 2013
a Jira for a new Lucene Query for phrase and or span
queries that measures distance by offsets rather than positions.
-- Jack Krupansky
-Original Message-
From: wgggfiy
Sent: Monday, May 13, 2013 3:47 AM
To: java-user@lucene.apache.org
Subject: Re: [PhraseQuery] Can jakarta apache~10
Do you mean the raw character offsets of the starting and ending characters
of the terms?
No.
Although, if you index the text as a raw string, you might be able to come
up with a regex query like jakarta.{1,10}apache
-- Jack Krupansky
-Original Message-
From: wgggfiy
Sent: Monday
length, say 500. The SpanNearQuery would use n-1 times the sentence
separation plus the maximum sentence length. Well, you have to adjust that
for how you count sentences - is 1 the current sentence or is that 0?
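The arithmetic can be written down as a tiny helper. This is only a sketch: SentenceSlopSketch and slopForSentences are hypothetical names, the numbers in the usage note are placeholders, and it assumes that n = 1 means "within the current sentence".

```java
// Hypothetical helper: computes a slop value for "within n sentences",
// assuming sentences are separated by a fixed position gap.
class SentenceSlopSketch {

    // (n - 1) sentence separations plus one maximum sentence length.
    // Assumes n = 1 means the current sentence only.
    static int slopForSentences(int n, int sentenceSeparation, int maxSentenceLength) {
        if (n < 1) {
            throw new IllegalArgumentException("n must be at least 1");
        }
        return (n - 1) * sentenceSeparation + maxSentenceLength;
    }
}
```

With a placeholder separation of 1000 and a maximum sentence length of 500, `slopForSentences(1, 1000, 500)` is 500 and `slopForSentences(3, 1000, 500)` is 2500. If you count the current sentence as 0 instead, pass n + 1.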
-- Jack Krupansky
-Original Message-
From: Igor Shalyminov
Sent: Thursday
I didn't read your code, but do you have the reset that is now mandatory
and throws AIOOBE if not present?
-- Jack Krupansky
-Original Message-
From: andi rexha
Sent: Monday, April 15, 2013 10:21 AM
To: java-user@lucene.apache.org
Subject: WhitespaceTokenizer, incrementToke
of a
contract violation). So, I should have said that the contract was mandatory
but not enforced... which from a practical perspective negates its mandatory
contractual value.
-- Jack Krupansky
-Original Message-
From: Uwe Schindler
Sent: Monday, April 15, 2013 11:53 AM
To: java-user
,
or maybe even send each file directly into Tika and then directly index the
content into Lucene, if that's what you want. In any case, MCF handles the
SharePoint access and crawling.
See:
http://manifoldcf.apache.org/en_US/index.html
-- Jack Krupansky
-Original Message-
From: Álvaro
In a statistical sense, for the majority of documents, yes, but you could
probably find quite a few outlier examples where the results from A to B or
from B to A are significantly or even completely different or even
non-existent.
-- Jack Krupansky
-Original Message-
From: Peter
or
terrible - the selected/query document may not have any representation in
the target corpus.
-- Jack Krupansky
-Original Message-
From: Peter Lavin
Sent: Thursday, April 04, 2013 1:06 PM
To: java-user@lucene.apache.org
Subject: MLT Using a Query created in a different index
Dear Users,
I
whether v(i) or v(j) are or are not present as keywords.
-- Jack Krupansky
-Original Message-
From: Paul Bell
Sent: Sunday, March 31, 2013 8:21 AM
To: java-user@lucene.apache.org
Subject: Indexing a long list
Hi All,
Suppose I need to index a property whose value is a long list of terms
Multivalued fields are the other approach to keyword value pairs.
And if you can denormalize your data, storing structure as separate
documents can make sense and support more powerful queries. Although the
join capabilities are rather limited.
-- Jack Krupansky
-Original Message
I don't think there is a way of identifying which of the values of a
multivalued field matched. But... I haven't checked the code to be
absolutely certain whether there isn't some expert way.
Also, realize that multiple values could match, such as if you queried for
B*.
-- Jack Krupansky
Try the ASCII Folding Filter:
https://lucene.apache.org/core/4_2_0/analyzers-common/org/apache/lucene/analysis/miscellaneous/ASCIIFoldingFilter.html
-- Jack Krupansky
-Original Message-
From: Jerome Blouin
Sent: Friday, March 22, 2013 12:22 PM
To: java-user@lucene.apache.org
Subject
Start with the Standard Tokenizer:
https://lucene.apache.org/core/4_2_0/analyzers-common/org/apache/lucene/analysis/standard/StandardTokenizer.html
-- Jack Krupansky
-Original Message-
From: Jerome Blouin
Sent: Friday, March 22, 2013 12:53 PM
To: java-user@lucene.apache.org
Subject
I don't have time right now to debug your code, but make sure that
the analysis is consistent between index and query. For example, Apache
vs. apache.
-- Jack Krupansky
-Original Message-
From: Bratislav Stojanovic
Sent: Saturday, March 16, 2013 7:29 AM
To: java-user
Could you give us some examples of what you expect? I mean, how is your
suggested set of documents any different from simply executing a query with
the list of suggested terms (using q.op=OR)?
Or, maybe you want something like MoreLikeThis?
-- Jack Krupansky
-Original Message-
From
Let's refine this...
If a top suggestion is X, do you simply want to know a few of the documents
which have the highest term frequency for X?
Or is there some other term-oriented metric you might propose?
-- Jack Krupansky
-Original Message-
From: Bratislav Stojanovic
Sent
Try detailing both your expected behavior and the actual behavior. Try
providing an actual code snippet and actual index and query data. Is it
failing for all types and titles or just for some?
-- Jack Krupansky
-Original Message-
From: saisantoshi
Sent: Tuesday, February 26, 2013 6
Any reason that you are not using the DirectSpellChecker?
See:
http://lucene.apache.org/core/4_0_0/suggest/org/apache/lucene/search/spell/DirectSpellChecker.html
-- Jack Krupansky
-Original Message-
From: Samuel García Martínez
Sent: Wednesday, February 20, 2013 3:34 PM
To: java-user
Please clarify exactly what you want to group by - give a specific example
that makes it clear what terms should affect grouping and which shouldn't.
-- Jack Krupansky
-Original Message-
From: Ramprakash Ramamoorthy
Sent: Monday, February 18, 2013 6:12 AM
To: java-user
Okay, so, fields that would normally need to be tokenized must be stored as
both raw strings for grouping and tokenized text for keyword search. Simply
use copyField to copy from one to the other.
-- Jack Krupansky
-Original Message-
From: Ramprakash Ramamoorthy
Sent: Monday
Oops, sorry for the Solr answer. In Lucene you need to simply index the
same value, once as a raw string and a second time as a tokenized text
field. Grouping would use the raw string version of the data.
-- Jack Krupansky
-Original Message-
From: Jack Krupansky
Sent: Monday
a document if EITHER term matches. So, if
NEITHER matches (within an editing distance of 2), the document is not a
match.
-- Jack Krupansky
-Original Message-
From: Pierre Antoine DuBoDeNa
Sent: Saturday, February 09, 2013 12:52 PM
To: java-user@lucene.apache.org
Subject: Re: fuzzy queries
an analyzer that preserves them since they generally will be treated
as spaces.
-- Jack Krupansky
-Original Message-
From: Nicolas Roduit
Sent: Friday, February 08, 2013 2:49 AM
To: java-user@lucene.apache.org
Subject: Wildcard in a text field
I'm looking for a way of making a query
Ah, okay... some people call that prospective search. In any case, there
is no direct Lucene support that I know of.
There are some references here:
http://lucene.apache.org/core/4_0_0/memory/org/apache/lucene/index/memory/MemoryIndex.html
-- Jack Krupansky
-Original Message-
From
of the box.
-- Jack Krupansky
-Original Message-
From: Andrew Gilmartin
Sent: Thursday, January 31, 2013 9:04 AM
To: java-user@lucene.apache.org
Subject: Re: How to find related words ?
wgggfiy wrote:
en, it seems nice, but I'm puzzled by you and Andrew Gilmartina above,
what's the difference
keyword(s) and ask Lucene to
extract relevant terms from the top document(s).
-- Jack Krupansky
-Original Message-
From: wgggfiy
Sent: Wednesday, January 30, 2013 12:27 PM
To: java-user@lucene.apache.org
Subject: How to find related words ?
In short,
you put in a term like Lucene
That depends on the value of ed, and the indexed data.
Another factor to take into consideration is that a case change (Star vs.
star) also counts as an edit.
-- Jack Krupansky
-Original Message-
From: George Kelvin
Sent: Tuesday, January 29, 2013 11:49 AM
To: java-user
and the query -
all the literals. In other words, construct a minimal test case that shows
the failure.
-- Jack Krupansky
-Original Message-
From: George Kelvin
Sent: Tuesday, January 29, 2013 12:28 PM
To: java-user@lucene.apache.org
Subject: Re: Questions about FuzzyQuery in Lucene 4.x
Let's see your code that calls FuzzyQuery. If you happen to pass a
prefixLength (3rd parameter) of 3 or more, then "ster" would not match
"star" (but a prefixLength of 2 would match).
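A rough sketch of the prefix rule in plain Java. FuzzyMatchSketch is a hypothetical class, not Lucene's FuzzyQuery; it uses plain Levenshtein distance (no transpositions) as a simplifying assumption and requires the first prefixLength characters of the query term to match exactly.

```java
// Hypothetical illustration of fuzzy matching with a required exact prefix.
class FuzzyMatchSketch {

    // Plain Levenshtein distance: insertions, deletions, substitutions.
    static int editDistance(String a, String b) {
        int[] prev = new int[b.length() + 1];
        int[] curr = new int[b.length() + 1];
        for (int j = 0; j <= b.length(); j++) {
            prev[j] = j;
        }
        for (int i = 1; i <= a.length(); i++) {
            curr[0] = i;
            for (int j = 1; j <= b.length(); j++) {
                int cost = a.charAt(i - 1) == b.charAt(j - 1) ? 0 : 1;
                curr[j] = Math.min(Math.min(curr[j - 1] + 1, prev[j] + 1),
                                   prev[j - 1] + cost);
            }
            int[] tmp = prev;
            prev = curr;
            curr = tmp;
        }
        return prev[b.length()];
    }

    // The indexed term must share the first prefixLength characters of the
    // query term and be within maxEdits edits of it.
    static boolean fuzzyMatches(String queryTerm, String indexedTerm,
                                int maxEdits, int prefixLength) {
        int p = Math.min(prefixLength, queryTerm.length());
        if (!indexedTerm.startsWith(queryTerm.substring(0, p))) {
            return false;
        }
        return editDistance(queryTerm, indexedTerm) <= maxEdits;
    }
}
```

In this sketch `fuzzyMatches("ster", "star", 2, 3)` is false because the required prefix "ste" is missing, while a prefix length of 2 matches. The comparison is case-sensitive, so `editDistance("Star", "star")` is 1.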
-- Jack Krupansky
-Original Message-
From: George Kelvin
Sent: Monday, January 28, 2013 5:31 PM
To: java
it works:
http://wiki.apache.org/solr/ExtractingRequestHandler
-- Jack Krupansky
-Original Message-
From: Adrien Grand
Sent: Sunday, January 27, 2013 12:53 PM
To: java-user@lucene.apache.org
Subject: Re: Readers for extracting textual info from pd/doc/excel for
indexing the actual content
it, and Solr is based on Lucene.
-- Jack Krupansky
-Original Message-
From: saisantoshi
Sent: Sunday, January 27, 2013 2:09 PM
To: java-user@lucene.apache.org
Subject: Re: Readers for extracting textual info from pd/doc/excel for
indexing the actual content
We are not using Solr
Send the same input text to two different analyzers for two separate fields.
The first analyzer emits only the first attribute. The second analyzer emits
only the second attribute. The document position in one will correspond to
the document position in the other.
-- Jack Krupansky
+1
I think that accurately states the semantics of the operation you want.
-- Jack Krupansky
-Original Message-
From: Alan Woodward
Sent: Friday, January 18, 2013 1:08 PM
To: java-user@lucene.apache.org
Subject: Re: SpanNearQuery with two boundaries
Hi Igor,
You could try wrapping
You need to express the boolean query solely in terms of SpanOrQuery and
SpanNearQuery. If you can't, ... then it probably can't be done, but you
should be able to.
How about starting with a plain English description of the problem you are
trying to solve?
-- Jack Krupansky
-Original
exclude terms from a span.
-- Jack Krupansky
-Original Message-
From: Michel Conrad
Sent: Thursday, January 17, 2013 12:14 PM
To: java-user@lucene.apache.org
Subject: Re: Combine two BooleanQueries by a SpanNearQuery.
The problem I would like to solve is to have two queries that I will
get
setMinTermFreq. The default is 2. You don't have any terms with a
term frequency above 1.
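A sketch of the effect in plain Java. MinTermFreqSketch and candidateTerms are hypothetical names and the white space tokenization is a simplification; the point is only that terms occurring fewer than minTermFreq times in the source document are discarded before any query is built.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Locale;
import java.util.Map;

// Hypothetical illustration of a minimum term frequency cutoff.
class MinTermFreqSketch {

    // Count how often each token occurs and keep only the tokens that occur
    // at least minTermFreq times, in order of first appearance.
    static List<String> candidateTerms(String text, int minTermFreq) {
        Map<String, Integer> freq = new LinkedHashMap<>();
        for (String token : text.toLowerCase(Locale.ROOT).trim().split("\\s+")) {
            if (!token.isEmpty()) {
                freq.merge(token, 1, Integer::sum);
            }
        }
        List<String> kept = new ArrayList<>();
        for (Map.Entry<String, Integer> e : freq.entrySet()) {
            if (e.getValue() >= minTermFreq) {
                kept.add(e.getKey());
            }
        }
        return kept;
    }
}
```

With a cutoff of 2, a short text in which every word appears once yields no candidate terms at all; lowering the cutoff to 1 keeps every term.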
-- Jack Krupansky
-Original Message-
From: Thomas Keller
Sent: Tuesday, January 15, 2013 3:22 PM
To: java-user@lucene.apache.org
Subject: Lucene-MoreLikethis
Hey,
I have a question about
FWIW, new FuzzyQuery(term, 2, 0) is the same as new FuzzyQuery(term), given
the current values of defaultMaxEdits (2) and defaultPrefixLength (0).
-- Jack Krupansky
-Original Message-
From: Ian Lea
Sent: Wednesday, January 09, 2013 9:44 AM
To: java-user@lucene.apache.org
Subject: Re
MoreLikeThis?
Or, possibly arv appears later in a document on the second run, after the
number of tokens specified by maxNumTokensParsed.
-- Jack Krupansky
-Original Message-
From: Peter Lavin
Sent: Tuesday, January 08, 2013 1:46 PM
To: java-user@lucene.apache.org
Subject: Differences in MLT
You need a reset method that calls the super reset to reset the parent
state and then reset your own state.
http://lucene.apache.org/core/4_0_0/core/org/apache/lucene/analysis/TokenStream.html#reset()
You probably don't have one, so only the parent state gets reset.
-- Jack Krupansky
Ah! You're quoting full phrases. You weren't clear about that originally.
Thanks for the clarification.
-- Jack Krupansky
-Original Message-
From: Tom
Sent: Wednesday, December 26, 2012 5:54 PM
To: java-user@lucene.apache.org
Subject: Re: Which token filter can combine 2 terms into 1
:
String querystr = "product:(Spring Framework Core) vendor:(SpringSource)";
to
String querystr = "product:\"Spring Framework Core\" vendor:(SpringSource)";
-- Jack Krupansky
-Original Message-
From: Jeremy Long
Sent: Wednesday, December 26, 2012 5:52 PM
To: java-user
(org.apache.lucene.search.Query,
int)
-- Jack Krupansky
-Original Message-
From: Vishwas Goel
Sent: Tuesday, December 25, 2012 11:30 PM
To: java-user@lucene.apache.org
Subject: Retrieving granular scores back from Lucene/SOLR
Hi,
I am looking to get a bit more information back from SOLR/Lucene
And to be more specific, most query parsers will have already separated the
terms and will call the analyzer with only one term at a time, so no term
recombination is possible for those parsed terms, at query time.
-- Jack Krupansky
-Original Message-
From: Erick Erickson
Sent
You still have the query parser's parsing before analysis to deal with, no
matter what magic you code in your analyzer.
-- Jack Krupansky
-Original Message-
From: Tom
Sent: Friday, December 21, 2012 2:24 PM
To: java-user@lucene.apache.org
Subject: Re: Which token filter can combine 2
/4_0_0/core/org/apache/lucene/search/BooleanQuery.html#setMinimumNumberShouldMatch(int)
-- Jack Krupansky
-Original Message-
From: 김한규
Sent: Wednesday, December 19, 2012 2:36 AM
To: java-user@lucene.apache.org
Subject: NGramPhraseQuery with missing terms
Hi.
I am trying to make
that fail.
-- Jack Krupansky
-Original Message-
From: Ramon Casha
Sent: Tuesday, December 18, 2012 9:14 AM
To: java-user@lucene.apache.org
Subject: Help needed: search is returning no results
I have just downloaded and set up Lucene 4.0.0 to implement a search
facility for a web app I'm
table that trie keeps beyond the
raw data values or the data values themselves.
-- Jack Krupansky
Thanks, you answered the main question - 26 doesn't simply lop off the time
of day. Although, I still don't completely follow how trie works (without
reading the paper itself.)
-- Jack Krupansky
-Original Message-
From: Uwe Schindler
Sent: Friday, December 14, 2012 5:58 PM
To: java
, but unexpected term.
-- Jack Krupansky
-Original Message-
From: Carsten Schnober
Sent: Thursday, December 13, 2012 10:49 AM
To: java-user@lucene.apache.org
Subject: Boolean and SpanQuery: different results
Hi,
I'm following Grant's advice on how to combine BooleanQuery and
SpanQuery
(http://mail