Yes, I failed to notice that the removal of the slash was yet another
instance of the analyzer transforming its input. But the bottom line is that
you must do 100% of the same steps that analysis performs. If in doubt, pass
your literals through the standard analyzer itself.
-- Jack Krupansky
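
To make the point concrete, here is a minimal sketch in plain Java. It is not Lucene code, and `normalize` is a hypothetical stand-in for the analyzer, not StandardAnalyzer itself: it only imitates the two transformations mentioned in this thread (lower-casing, and dropping characters such as the slash). The idea is that a literal placed in a hand-built query should go through the same steps as the indexed text. In a real application you would run the literal through the same analyzer instance that was used at index time.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;

public class TermNormalizer {
    // Hypothetical stand-in for an analyzer (an assumption, not Lucene's API):
    // split on anything that is not a letter or digit, then lower-case each piece.
    static List<String> normalize(String text) {
        List<String> terms = new ArrayList<>();
        for (String piece : text.split("[^\\p{L}\\p{N}]+")) {
            if (!piece.isEmpty()) {
                terms.add(piece.toLowerCase(Locale.ROOT));
            }
        }
        return terms;
    }

    public static void main(String[] args) {
        // The literals from the thread, normalized before being used as query terms.
        System.out.println(normalize("GET"));     // prints [get]
        System.out.println(normalize("/blank"));  // prints [blank]
    }
}
```

A term built from the raw literal ("GET" or "/blank") would not equal the indexed form, which is why the hand-built query matched nothing.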
I tried changing the case to lower case, but still the BooleanQuery doesn't
return any documents.
I see that the text "/blank" is converted to "blank" in the QueryParser.
But in BooleanQuery it remains the same. When I remove the forward slash
sign from the input string, I get the matched document.
On the boosting approach, you can have a mandatory match on title and an
optional match on userId with a very high boost. This would produce duplicates,
but you don't need to sort to remove them. Just keep adding results in the
order they come, and if you see that the title is already there in
Hello Everyone,
We have a legacy system that uses Lucene 2.4.1. Back then we ported a small
hack to the Lucene source code so that the underlying Lucene segment
merger code wouldn't reuse deleted docids. This let us use Lucene docids
as persistent dbids as well. But we want to upgrade Lucene
Thanks for the reply. I thought of using boosting, for example "((userId:14
AND title:have)^10 OR (title:have))" or "((userId:14^10 AND title:have) OR
(title:have))" or something like that. However, there would still be
duplicates (all 3 docs for "To Have and To Have Not" would be included whe
Hmmm, what about simply boosting very high on owner, and probably
grouping on title?
If you boosted on owner, you wouldn't even have to index the title
separately for each user, your "owner" field could be multivalued and
contain _all_ the owner IDs. In that case you wouldn't have to group
at all.
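
A rough sketch of that idea in plain Java, as a toy in-memory model and not Lucene API. The `Book` class, the `search` method, the boost value of 10 and the sample data in the comments are all assumptions made for illustration. Each title is stored once with all of its owner IDs, the title match is mandatory, and an owner match only adds to the score, so no grouping or de-duplication step is needed.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Locale;
import java.util.Set;

public class OwnerBoostSketch {
    // Hypothetical document: one entry per title, with a multivalued owner field.
    static final class Book {
        final String title;
        final Set<Integer> owners;

        Book(String title, Integer... owners) {
            this.title = title;
            this.owners = new HashSet<>(Arrays.asList(owners));
        }
    }

    // Assumed boost value; anything large relative to the base score works here.
    static final double OWNER_BOOST = 10.0;

    // Base score of 1 for a title match, plus the boost if the user owns the book.
    static double score(Book book, int userId) {
        return 1.0 + (book.owners.contains(userId) ? OWNER_BOOST : 0.0);
    }

    // Title match is mandatory; ownership only affects ordering.
    static List<String> search(List<Book> books, String word, int userId) {
        String wanted = word.toLowerCase(Locale.ROOT);
        List<Book> hits = new ArrayList<>();
        for (Book book : books) {
            List<String> words = Arrays.asList(book.title.toLowerCase(Locale.ROOT).split("\\s+"));
            if (words.contains(wanted)) {
                hits.add(book);
            }
        }
        // Highest score first; the sort is stable, so ties keep their input order.
        hits.sort((a, b) -> Double.compare(score(b, userId), score(a, userId)));
        List<String> titles = new ArrayList<>();
        for (Book book : hits) {
            titles.add(book.title);
        }
        return titles;
    }
}
```

With made-up data such as "To Have and To Have Not" owned by users 7 and 9 and a second title containing "have" owned by user 14, a search for "have" as user 14 lists the second title first and each title exactly once.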
The query parser/analyzer is lower-casing the query terms automatically. You
have to do the same with the terms for BooleanQuery: Term("cs-method",
"GET") should be Term("cs-method", "get").
StandardAnalyzer is doing the lower-casing.
-- Jack Krupansky
-Original Message-
From: De
Hi,
I have the following dataset indexed in Lucene.
2010-04-21 02:24:01 GET /blank 200 120
2010-04-21 02:24:01 GET /US/registrationFrame 200 605
2010-04-21 02:24:02 GET /US/kids/boys 200 785
2010-04-21 02:24:02 POST /blank 304 56
2010-04-21 02:24:04 GET /blank 304 233
2010-04-21 02:24:04 GET /blank 50
SpanNearQuery can be used to allow an arbitrary number of terms between
sub-phrases of a larger phrase. But, that is between terms, not at the
beginning or end of a phrase.
See:
http://lucene.apache.org/core/3_6_0/api/core/org/apache/lucene/search/spans/SpanNearQuery.html
You can use SpanMulti
I also posted this to StackOverflow, apologies if you see this twice.
I have a data set whereby documents are associated to a user id. Say that the
documents represent books, and each book can have one or more owners. I am
indexing the titles with Lucene. When searching, I want all results owned
22 July 2012, Apache Lucene™ 3.6.1 available
The Lucene PMC is pleased to announce the release of Apache Lucene 3.6.1.
Apache Lucene is a high-performance, full-featured text search engine
library written entirely in Java. It is a technology suitable for nearly
any application that requires full-
It can be both.
-Original Message-
From: Doron Yaacoby [mailto:dor...@gingersoftware.com]
Sent: Sunday, 22 July 2012 11:48
To: java-user@lucene.apache.org
Subject: RE: using phrase query with wildcard
Is * a placeholder for a term or a part of a term?
-Original Message-
From: Levi
Is * a placeholder for a term or a part of a term?
-Original Message-
From: Levin, Ilya [mailto:ilya.le...@hp.com]
Sent: 22 July 2012 11:29
To: java-user@lucene.apache.org
Subject: using phrase query with wildcard
Hi,
I'm trying to create a phrase query with wildcard, from the forums it
Hi,
I'm trying to create a phrase query with a wildcard; from the forums it seems
that the solution is not trivial.
I'm trying to create the following queries: "this is a phrase*" OR "*This is
a phrase", and get hits on every possibility where the * resides.
What is the best way to achieve this?
14 matches