.-Meier-Allee 63, D-28213 Bremen
http://www.thetaphi.de eMail:
u...@thetaphi.de
-Original Message- From: Christian Reuschling
[mailto:reuschl...@dfki.uni-kl.de]
Sent: Monday, October 06, 2014 6:06 PM To: java-user@lucene.apache.org
Subject:
query.extractTerms(..) on rewritten queries
Hi,
I am currently migrating to Lucene 4. In the past, I used a trick to get the
index-specific terms for a given (wildcard) query (see below). But it doesn't
work anymore:
String queryString = "n*"; // gives no result
// String queryString = "nöä"; //
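To make the goal concrete, here is a minimal sketch of what "index-specific terms for a wildcard query" means for a prefix such as n*. It does not use Lucene at all: the sorted term dictionary is modelled as a plain SortedSet, and the class and method names (PrefixExpansion, expand) are hypothetical, chosen only for this illustration.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.SortedSet;
import java.util.TreeSet;

/** Illustration only: expands a prefix against a sorted term dictionary. */
final class PrefixExpansion {

    /** Returns all dictionary terms that start with the given prefix, in sorted order. */
    static List<String> expand(SortedSet<String> dictionary, String prefix) {
        List<String> matches = new ArrayList<>();
        // tailSet(prefix) starts at the first term >= prefix; all matching terms are contiguous from there.
        for (String term : dictionary.tailSet(prefix)) {
            if (!term.startsWith(prefix)) {
                break; // left the block of terms sharing the prefix
            }
            matches.add(term);
        }
        return matches;
    }

    public static void main(String[] args) {
        // Hypothetical term dictionary, standing in for the terms of one field.
        SortedSet<String> dictionary =
                new TreeSet<>(Arrays.asList("mond", "nacht", "nebel", "nuss", "oben"));
        System.out.println(expand(dictionary, "n"));
    }
}
```

The sketch only shows the expected outcome of the expansion; whether a rewritten Lucene query still exposes those terms depends on the rewrite method in use.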
Hello,
I am trying to get the scorer for a result document, for further computation.
List<AtomicReaderContext> leafContexts = indexReader.leaves();
int n = ReaderUtil.subIndex(scoreDoc.doc, leafContexts);
AtomicReaderContext ctx = leafContexts.get(n);
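The lookup in the snippet above maps a top-level document number to the leaf (segment) that contains it. As a rough illustration of that mapping, and not Lucene's actual implementation, the following sketch does the same with a plain array of leaf start offsets. LeafLookup and its docBases parameter are hypothetical names, and the sketch assumes strictly ascending offsets (no empty leaves).

```java
import java.util.Arrays;

/** Illustration only: finds the leaf that holds a top-level document number. */
final class LeafLookup {

    /**
     * @param globalDoc top-level document number (e.g. scoreDoc.doc)
     * @param docBases  strictly ascending start offsets of the leaves, the first being 0
     * @return index of the leaf whose range contains globalDoc
     */
    static int subIndex(int globalDoc, int[] docBases) {
        int pos = Arrays.binarySearch(docBases, globalDoc);
        // Exact hit: globalDoc is the first document of that leaf.
        // Miss: binarySearch returns -(insertionPoint) - 1, and the wanted leaf
        // is the one just before the insertion point.
        return pos >= 0 ? pos : -pos - 2;
    }

    public static void main(String[] args) {
        int[] docBases = {0, 100, 250}; // three leaves: [0,100), [100,250), [250,...)
        int globalDoc = 130;
        int leaf = subIndex(globalDoc, docBases);
        int localDoc = globalDoc - docBases[leaf]; // leaf-local document number
        System.out.println("leaf=" + leaf + ", localDoc=" + localDoc); // prints "leaf=1, localDoc=30"
    }
}
```

Per-leaf APIs generally expect the leaf-local number, so with the Lucene objects above the corresponding value would be scoreDoc.doc - ctx.docBase.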
, at 10:17 AM, Christian Reuschling
reuschl...@dfki.uni-kl.de wrote:
We are currently migrating a project to Lucene 4 and noticed that the method
IndexSearcher.setDefaultFieldSortScoring(..) disappeared in Lucene 4.0. We
can't find
anything about this in the migration guide. Further
We are currently migrating a project to Lucene 4 and noticed that the method
IndexSearcher.setDefaultFieldSortScoring(..) disappeared in Lucene 4.0. We
can't find anything
about this in the migration guide. Further, it was never deprecated in Lucene
3,
an exotic case. Or
is it?
Thanks from the whole DFKI Lucene crew!
Christian
- --
__
Christian Reuschling, Dipl.-Ing.(BA)
Software Engineer
Knowledge Management Department
German Research Center for Artificial
I remember that there was a general Searcher interface, with the standard
IndexSearcher as a
subclass, plus a subclass that enabled RMI-based remote access to an index.
If you used Searcher in your codebase, the code was independent of
I have a small set of document numbers as a query result, collected with a
non-scoring collector.
Now I want to run fast follow-up queries restricted to this set of document
numbers, as part
of a customized Similarity implementation
Hello,
what is the best way to score documents similarly to the default similarity,
but with the
document frequency calculated per query against the matching result document
set, not statically
against the whole corpus?
Didn't find a good and
very complex.
Thanks a lot!
Christian Reuschling
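To illustrate the idea of a document frequency taken from the result set rather than from the whole corpus, here is a minimal sketch outside of Lucene. Each matching document is modelled as a set of terms, and the idf formula 1 + ln(numDocs / (docFreq + 1)) follows the classic default similarity. ResultSetIdf, docFreqs and idf are hypothetical names for this illustration; wiring such values into a real Similarity is not shown.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

/** Illustration only: idf computed against a result set instead of the whole corpus. */
final class ResultSetIdf {

    /** Counts, for every term, in how many of the matching documents it occurs. */
    static Map<String, Integer> docFreqs(List<Set<String>> resultDocs) {
        Map<String, Integer> freqs = new HashMap<>();
        for (Set<String> terms : resultDocs) {
            for (String term : terms) {
                freqs.merge(term, 1, Integer::sum);
            }
        }
        return freqs;
    }

    /** Classic idf, 1 + ln(numDocs / (docFreq + 1)), with numDocs being the result set size. */
    static double idf(int docFreq, int numDocs) {
        return Math.log(numDocs / (double) (docFreq + 1)) + 1.0;
    }

    public static void main(String[] args) {
        // Hypothetical result set of three documents.
        List<Set<String>> resultDocs = Arrays.asList(
                new HashSet<>(Arrays.asList("lucene", "index")),
                new HashSet<>(Arrays.asList("lucene")),
                new HashSet<>(Arrays.asList("lucene", "score")));
        Map<String, Integer> freqs = docFreqs(resultDocs);
        for (Map.Entry<String, Integer> e : freqs.entrySet()) {
            System.out.println(e.getKey() + " idf=" + idf(e.getValue(), resultDocs.size()));
        }
    }
}
```

A term that occurs in every matching document gets a low weight here even if it is rare in the corpus as a whole, which is the effect asked about above.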
On 15.11.2013 18:49, Michael McCandless wrote:
Hmm, I'm not sure offhand why that change gives you no results.
The fullPrefixPaths should have been a super-set of the original
prefix paths, since the LevA just adds further paths.
Mike
?
On 14.11.2013 17:05, Michael McCandless wrote:
On Wed, Nov 13, 2013 at 12:04 PM, Christian Reuschling
christian.reuschl...@gmail.com wrote:
We started to implement named entity recognition on the basis of
AnalyzingSuggester, which
offers great support for synonyms, stopwords, etc
We started to implement named entity recognition on the basis of
AnalyzingSuggester, which offers
great support for synonyms, stopwords, etc.
For this, we slightly modified AnalyzingSuggester.lookup() to return only the
exactFirst hits
(considering the exactFirst code block only, skipping
placeholder value (like -1,
infinity, NaN). If you only need it in the stored fields, just store it but
don't index it.
Uwe
-
Uwe Schindler
H.-H.-Meier-Allee 63, D-28213 Bremen
http://www.thetaphi.de
eMail: u...@thetaphi.de
-Original Message-
From: Christian Reuschling
http://www.thetaphi.de
eMail: u...@thetaphi.de
-Original Message-
From: Christian Reuschling [mailto:christian.reuschl...@gmail.com]
Sent: Wednesday, February 15, 2012 12:58 PM
To: java-user
Subject: Empty numeric field
Hi all,
for some reason, we need empty numeric
values by looking at the first
few bits, which contains the precision.
-
Uwe Schindler
H.-H.-Meier-Allee 63, D-28213 Bremen
http://www.thetaphi.de
eMail: u...@thetaphi.de
-Original Message-
From: Christian Reuschling [mailto:christian.reuschl...@gmail.com]
Sent: Monday
presume) to int or long or whatever.
Maybe that will help.
--
Ian.
On Wed, Nov 2, 2011 at 7:19 PM, Christian Reuschling
christian.reuschl...@gmail.com wrote:
Hi,
maybe it is an easy question - I searched the lucene-user
archive, but sadly didn't find an answer :(
I currently
.
Maybe that will help.
--
Ian.
On Wed, Nov 2, 2011 at 7:19 PM, Christian Reuschling
christian.reuschl...@gmail.com wrote:
Hi,
maybe it is an easy question - I searched the lucene-user
archive, but sadly didn't find an answer :(
I am currently changing our field logic from string
Hi,
maybe it is an easy question - I searched the lucene-user
archive, but sadly didn't find an answer :(
I am currently changing our field logic from string to numeric fields.
Until now, I managed to find the min-max values of a field by
iterating over the field with a TermEnum
(termEnum =
Hello Michael,
I would also prefer B - it also shortens the time until our applications
benefit from new Lucene features.
It forces us lazy programmers (I am one, of course ;) ) to deal with them - and
reduces the effort of switching to a major release afterwards.
Maybe some minimum time waiting
Hi guys,
in our app we offer the possibility to search inside a set of documents which
is the result list of a former search. Thus, someone can narrow down a search
according to different criteria.
For this, we implemented a simple Filter that gets a TopDocs object and
creates a bitSet out
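The core of such a "search within previous results" filter can be sketched without Lucene as a bit set built from the document numbers of the former result list. This is only an illustration under stated assumptions: DocIdScope, fromDocIds and restrict are hypothetical names, and java.util.BitSet stands in for whatever bit set type the actual Filter uses.

```java
import java.util.BitSet;

/** Illustration only: restricts candidate documents to those of a former result list. */
final class DocIdScope {

    /** Builds the scope: one bit per document number of the former result list. */
    static BitSet fromDocIds(int[] docIds, int maxDoc) {
        BitSet bits = new BitSet(maxDoc);
        for (int docId : docIds) {
            bits.set(docId);
        }
        return bits;
    }

    /** Returns the candidates that also lie inside the scope; inputs stay unmodified. */
    static BitSet restrict(BitSet candidates, BitSet scope) {
        BitSet result = (BitSet) candidates.clone();
        result.and(scope);
        return result;
    }

    public static void main(String[] args) {
        BitSet scope = fromDocIds(new int[] {2, 5, 9}, 16);         // former result list
        BitSet candidates = fromDocIds(new int[] {1, 2, 9, 10}, 16); // matches of the new query
        System.out.println(restrict(candidates, scope));              // prints "{2, 9}"
    }
}
```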
Hi,
our application enables sorting the result lists according to field values,
currently all represented as Strings (we plan to also migrate to the new
numeric type capabilities of Lucene 2.9 at a later time)
For this, the documents will be sorted e.g. according to the author, which
works fine
Hi,
I saw similar behaviour. On a self-built index of the German Wikipedia I searched
for the phrase "blaue blume". I got 2 results. When I searched for +blaue
blume vogel I got 59 results... strange.
I found out that when I create a plain BooleanQuery with just the phrase "blaue
blume" gives
Hi,
looking up the different terms with a common stem can be useful in different
scenarios - so I don't want to judge whether someone needs it or not.
E.g., if you have multilingual documents in your index, it is
straightforward to determine the language of the documents in order to
Hi Prashant,
we let the scores converge toward 1, without ever reaching it, so that the
ratings stay correct for higher Lucene scores, which are more or less
open-ended:
normalizedScore = 1 - [ 1 / (1+luceneScore) ]
best
Christian
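The formula above can be written as a small helper. This is a minimal sketch: the class name ScoreNormalizer and the rejection of negative scores are assumptions made for this illustration, not part of the original mail.

```java
/** Illustration only: maps an open-ended, non-negative score into [0, 1). */
final class ScoreNormalizer {

    /** normalizedScore = 1 - 1 / (1 + luceneScore), which equals luceneScore / (1 + luceneScore). */
    static double normalize(double luceneScore) {
        if (luceneScore < 0) {
            // Assumption of this sketch: scores are non-negative.
            throw new IllegalArgumentException("score must be >= 0, got " + luceneScore);
        }
        return 1.0 - 1.0 / (1.0 + luceneScore);
    }

    public static void main(String[] args) {
        System.out.println(normalize(0.0)); // prints "0.0"
        System.out.println(normalize(1.0)); // prints "0.5"
        System.out.println(normalize(3.0)); // prints "0.75"
    }
}
```

The mapping is strictly increasing, so the ranking order of the hits is preserved, while differences between very high scores get compressed close to 1.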
On Sun, 16 Aug 2009 19:04:44 +0530
prashant
Hello,
when searching over multiple indices, we create one IndexReader for each index,
and wrap them into a MultiReader, that we use for IndexSearcher creation.
This is fine for searching multiple indices on one machine, but if the
indices are distributed over the (intra)net, this
Is there a fast way to determine the total number of terms inside an index?
So far I have only found the way of walking through the TermEnum, i.e.
TermEnum termEnum4TermCount = reader.terms();
int iTermCount = 0;
// count every term by stepping through the whole enumeration
while (termEnum4TermCount.next()) {
    iTermCount++;
}
termEnum4TermCount.close();
is a statement of the problem you're trying to solve, because I'm
having trouble understanding the underlying use-cases..
Best
Erick
On Wed, Nov 12, 2008 at 10:17 AM, Christian Reuschling
[EMAIL PROTECTED] wrote:
Hello Erick,
thank you very much for this interesting idea - but I'm
behaviour, you need some kind of logical 'grouping' of one
dataset.
whereby a query 'term1 term4' should NOT match, 'term1 term2' must match.
Stefan Trcek schrieb:
On Wednesday 12 November 2008 14:58:53 Christian Reuschling wrote:
In order to offer some simple 1:n matching, currently we create
not being important:
attName:startDelimiter myterm2 myterm1 endDelimiter...should also match
Did you really mean to have myterm2 in front of myterm1?
Best
Erick
On Wed, Nov 12, 2008 at 8:58 AM, Christian Reuschling
[EMAIL PROTECTED] wrote:
Hello Friends,
In order to offer some
, or do I
have to write my own Query implementation - and what would be the best way in
this case.
Thanks in advance
Christian Reuschling
, greetings
Christian Reuschling
package org.dynaq;
import org.apache.lucene.analysis.KeywordAnalyzer;
import org.apache.lucene.analysis.PerFieldAnalyzerWrapper;
import org.apache.lucene.analysis.WhitespaceAnalyzer;
import org.apache.lucene.document.Document;
import
in the past, I had really good experiences with the svn versions of Lucene -
I never had problems, and everything felt stable.
Currently, I get unexpected exceptions from time to time:
java.lang.RuntimeException: after flush: fdx size mismatch: 1 docs vs 0 length
in bytes of _3g6n.fdx
Hello people,
yes, there were several threads about this topic, but I sadly have to revive
it, I'm sorry.
The first one I found was a discussion from May 2005:
http://mail-archives.apache.org/mod_mbox/lucene-java-user/200505.mbox/[EMAIL
PROTECTED]
There the final solution suggestion from Hoss
Hello people,
I'm sorry if I have sent this message twice - my Gmail interface merges the
mails in the 'sent' folder with incoming mails from my address - strange, but
I can't tell whether the mail was sent - I only see it in the sent folder (with
only one label on it, which leads me to send it again
Hello out there,
We have implemented an open source desktop search app based on Lucene
http://sourceforge.net/projects/dynaq
Development keeps moving forward, and we are currently experimenting with the
file-lock based writer (/reader) synchronization capabilities of Lucene, in
order to
yes, look at the 'contributions' link on the Lucene homepage.
The 'Phonetix' project provides implementations of Soundex,
Metaphone and Double Metaphone. Simply use their analyzer. I am
not sure what the behaviour is in the case of wildcards. Does
anyone have an answer?
regards
Christian
Steven