If you don't have a lot of entries for each invoice you can duplicate the
invoice for each entry - you'll have some field duplications (and bigger
index size) between the different invoices but it'll be easy to find exactly
what you want.
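A minimal sketch of that duplication, assuming each index document can be pictured as a flat map of field names to values. This is plain Java rather than Lucene code, and the class, method and field names (`InvoiceFlattener`, `flatten`, `invoiceId`, `customer`, `entry`) are made up for illustration:

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical sketch: turn one invoice with several entries into one flat
// "document" per entry. All names here are invented for illustration.
final class InvoiceFlattener {

    static List<Map<String, String>> flatten(String invoiceId, String customer,
                                             List<String> entries) {
        List<Map<String, String>> docs = new ArrayList<Map<String, String>>();
        for (String entry : entries) {
            Map<String, String> doc = new LinkedHashMap<String, String>();
            doc.put("invoiceId", invoiceId); // duplicated in every document
            doc.put("customer", customer);   // duplicated in every document
            doc.put("entry", entry);         // the only per-document value
            docs.add(doc);
        }
        return docs;
    }

    public static void main(String[] args) {
        List<String> entries = new ArrayList<String>();
        entries.add("widget");
        entries.add("gadget");
        for (Map<String, String> doc : flatten("INV-1", "ACME", entries)) {
            System.out.println(doc);
        }
    }
}
```

The invoice-level fields are repeated once per entry, which is where the extra index size comes from.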
If you have too many different values, I built a solution
[EMAIL PROTECTED] wrote:
On an index of around 20 gigs I've been seeing a performance drop of
around 35% after upgrading to 2.4 (measured on ~1
identical requests, executed in parallel against a threaded lucene /
apache setup, after a roughly 1 query warmup). The principal
Thanks for the quick answer!
I haven't specified the analyzer so it should be the StandardAnalyzer. I
forgot to mention that I'm using Lucene via Hibernate Search, where I can
easily define the fields in the Hibernate POJO classes. But as far as I
know this shouldn't change things that much
Hi,
Here are some differences I noticed between InstantiatedIndex and
RAMDirectory:
- RAMDirectory seems to reset the TokenStreams the first time, which
allows some objects to be initialised before streaming starts;
InstantiatedIndex does not.
- I can serialize a RAMDirectory but I
I'm going to have to punt on what Hibernate does/doesn't do since I have no
experience there.
But in general analyzers are very important. StandardAnalyzer, for instance,
tries to recognize e-mail addresses. So it'll create some very interesting
tokens, some that are unexpected unless you really
Hi David,
thanks for the report! I suppose you speak of IndexWriter vs
InstantiatedIndexWriter? These are definitely considered discrepancy
problems. I've created a new issue in the tracker:
http://issues.apache.org/jira/browse/LUCENE-1462
For what reason do you try to serialize the
Hi Karl,
The reset() problem is not a big issue; I can adapt our TokenStreams.
As for serialization: since we need to share very small indexes (200 docs
max) in a cluster, we need to serialize something.
I was planning to use Java serialization, maybe with some compression
on the
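A rough sketch of what Java serialization with compression could look like, assuming GZIP from the standard library as the compression step. The payload below is a stand-in list of strings, not an index, and `CompressedSerialization`, `toBytes` and `fromBytes` are hypothetical names:

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.util.ArrayList;
import java.util.zip.GZIPInputStream;
import java.util.zip.GZIPOutputStream;

// Hypothetical helper: serialize any Serializable object through a GZIP
// stream and read it back. Checked exceptions are wrapped for brevity.
final class CompressedSerialization {

    static byte[] toBytes(Serializable value) {
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        try {
            ObjectOutputStream out =
                    new ObjectOutputStream(new GZIPOutputStream(buffer));
            try {
                out.writeObject(value);
            } finally {
                out.close(); // closing also finishes the GZIP stream
            }
        } catch (IOException e) {
            throw new IllegalStateException("serialization failed", e);
        }
        return buffer.toByteArray();
    }

    static Object fromBytes(byte[] bytes) {
        try {
            ObjectInputStream in = new ObjectInputStream(
                    new GZIPInputStream(new ByteArrayInputStream(bytes)));
            try {
                return in.readObject();
            } finally {
                in.close();
            }
        } catch (IOException e) {
            throw new IllegalStateException("deserialization failed", e);
        } catch (ClassNotFoundException e) {
            throw new IllegalStateException("unknown class in stream", e);
        }
    }

    public static void main(String[] args) {
        ArrayList<String> payload = new ArrayList<String>();
        payload.add("doc-1");
        payload.add("doc-2");
        byte[] bytes = toBytes(payload);
        System.out.println(bytes.length + " bytes, restored: " + fromBytes(bytes));
    }
}
```

For very small payloads the GZIP header can outweigh the savings, so it is worth measuring before committing to compression.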
Hi everybody,
as far as I know the lucene score is an arbitrary number between 0.0 and
1.0.
Is this correct, that the scores in my resultset are always normalised to
this spread or is it possible to get higher scores?
Regards,
John W.
excitingComm2 wrote:
Hi everybody,
as far as I know the lucene score is an arbitrary number between 0.0 and
1.0.
Is this correct, that the scores in my resultset are always normalised to
this spread or is it possible to get higher scores?
Regards,
John W.
Hits is the class that did the
Hi everybody:
I need to search with Lucene 2.3.2, taking dates into account. When I built
the index I created a date field in which I stored the year the document was
created. At search time I would like to retrieve documents that were created
before a year or
Hi - sounds like you need a range query.
http://lucene.apache.org/java/2_3_2/queryparsersyntax.html#Range%20Searches
--
Ian.
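For illustration, a small plain-Java sketch of how a range such as year:[2003 TO 2008] matches terms, assuming lexicographic comparison as described in the query parser documentation. Square brackets are inclusive and curly braces exclusive; `TermRange` and `inRange` are hypothetical names and no Lucene classes are used:

```java
// Hypothetical sketch of range matching on string terms.
final class TermRange {

    // True if term lies between lower and upper, compared as strings.
    static boolean inRange(String term, String lower, String upper,
                           boolean inclusive) {
        int lo = term.compareTo(lower);
        int hi = term.compareTo(upper);
        return inclusive ? (lo >= 0 && hi <= 0) : (lo > 0 && hi < 0);
    }

    public static void main(String[] args) {
        System.out.println(inRange("2005", "2003", "2008", true));
        System.out.println(inRange("2008", "2003", "2008", false));
        // Strings of different lengths compare character by character,
        // so this five-character term also falls inside the range.
        System.out.println(inRange("20045", "2003", "2008", true));
    }
}
```

Because the comparison is on strings, values should have a fixed width (four-digit years are fine) for the range to behave numerically.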
On Wed, Nov 19, 2008 at 4:02 PM, Ariel [EMAIL PROTECTED] wrote:
Hi everybody:
I need to search with Lucene 2.3.2, taking dates into account. When
Hello All
I'm writing an application to move full text search out of my RDBMS. Today
the app hits the db twice: 1) to do the search itself, and 2) to format
the output of the search results. In my plan I'm moving everything to
Lucene documents that contain fields where I will be doing the
Hello,
Is there any way to obtain a raw hit score?
I understand the deprecated Hits.getScore()
returns normalized scores, relative to each
query. Is TopDocs.scoreDocs[i].score
also normalized, or raw?
I'd like to compare confidence levels
of hits among different queries.
Thanks.
T. Kuro
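To make the raw versus normalized distinction concrete, here is a plain-Java sketch of per-query normalization. It assumes the scores are scaled by the top score only when that top score exceeds 1.0, which is my understanding of what Hits did; treat that as an assumption. `ScoreNormalization` and `normalize` are hypothetical names, not Lucene API:

```java
// Hypothetical sketch of per-query score normalization.
final class ScoreNormalization {

    // Scales all scores by 1/max when max > 1.0, otherwise leaves them as-is.
    static float[] normalize(float[] rawScores) {
        float max = 0.0f;
        for (float score : rawScores) {
            max = Math.max(max, score);
        }
        float factor = max > 1.0f ? 1.0f / max : 1.0f;
        float[] normalized = new float[rawScores.length];
        for (int i = 0; i < rawScores.length; i++) {
            normalized[i] = rawScores[i] * factor;
        }
        return normalized;
    }

    public static void main(String[] args) {
        float[] scaled = normalize(new float[] {4.0f, 2.0f, 1.0f});
        for (float score : scaled) {
            System.out.println(score);
        }
    }
}
```

Since the scaling factor depends on each query's own top score, values normalized this way say nothing about how two different queries compare.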
On Wednesday 19 November 2008 03:39:01, [EMAIL PROTECTED] wrote:
...
Our design is roughly as follows: we have some pre-query filters,
queries typically involving around 25 clauses, and some
post-processing of hits. We collect counts and filter post query
using a hit collector, which uses
Thanks, that was very helpful, but I have a question: when I run the
searches, the results are not sorted according to the range. For example,
with year:[2003 TO 2008], the first page shows 2003 documents, the second
2005 documents, and the third 2004 documents. I don't see any
sort
I have a couple quick questions...it might just be because I haven't looked
at this in a week now (got pulled away onto some other stuff that had to
take priority).
In the searching phase, I would run the search across all page documents,
and then for each of those pages, do a search with
Please ignore this question.
I've noticed it was answered in
another thread just before
I posted my question.
Answer: use TopDocs.scoreDocs[i].score
T. Kuro Kurosaka, Basis Technology
San Francisco, California, U.S.A.
It's more than possible, it's probable. Cache thrashing would definitely be
my first guess; with so many copies of the exact same data you're not only
missing out on significant gains with the L2 cache, you're also taking a
major hit with every cache miss (which probably happens every context
Lucene is supposed to sort lexicographically, but this is not happening.
Could you tell me what I'm doing wrong?
I hope you can help me.
Regards
On Wed, Nov 19, 2008 at 11:56 AM, Ariel [EMAIL PROTECTED] wrote:
Thanks, that was very helpful, but I have a question: when I run the
searches
Tim,
On Wednesday 19 November 2008 02:32:40, Tim Sturge wrote:
...
This is less than 2x slower than the dedicated bitset and more
than 50x faster than the range boolean query.
Mike, Paul, I'm happy to contribute this (ugly but working) code
if there is interest. Let me know and I'll
Are you using one of the search methods that includes sorting? If
not, then do. If you are, then you need to tell us exactly what you
are doing and exactly what you reckon is going wrong.
--
Ian.
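As a plain-Java illustration of the point about sorting: a range clause only selects documents, and the order has to be requested separately. The sketch below assumes each hit is a {docId, year} pair and sorts on the year string; `YearSortSketch` and `sortByYear` are hypothetical names and no Lucene classes are involved:

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.Comparator;
import java.util.List;

// Hypothetical sketch: order already-matched hits by their year field.
final class YearSortSketch {

    // Each hit is a {docId, year} pair; returns a sorted copy.
    static List<String[]> sortByYear(List<String[]> hits) {
        List<String[]> sorted = new ArrayList<String[]>(hits);
        Collections.sort(sorted, new Comparator<String[]>() {
            public int compare(String[] a, String[] b) {
                return a[1].compareTo(b[1]); // string comparison on the year
            }
        });
        return sorted;
    }

    public static void main(String[] args) {
        List<String[]> hits = new ArrayList<String[]>();
        hits.add(new String[] {"a", "2003"});
        hits.add(new String[] {"b", "2005"});
        hits.add(new String[] {"c", "2004"});
        for (String[] hit : sortByYear(hits)) {
            System.out.println(hit[0] + " " + hit[1]);
        }
    }
}
```

Without an explicit sort, hits come back in relevance order, which is unrelated to the year value.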
On Wed, Nov 19, 2008 at 6:23 PM, Ariel [EMAIL PROTECTED] wrote:
it is supposed lucene make a
Well, this is what I am doing:
queryString=year:[2003 TO 2005]
[CODE]
Query pquery = null;
Hits hits = null;
Analyzer analyzer = null;
analyzer = new SnowballAnalyzer("English");
try {
pquery = MultiFieldQueryParser.parse(new String[] {queryString,
queryString}, new
Well, MultiSearcher is just a Searcher, so you have available
all of the search methods on Searcher. One of which is:
search
public TopFieldDocs search(Query
Unfortunately, not yet. There have been discussions about this,
including this issue for column-stride fields:
https://issues.apache.org/jira/browse/LUCENE-1231
But no real progress on it lately...
Mike
Diego Cassinera wrote:
Hello All
I'm writing an application to move full
hi
Is there any documentation that says that scores obtained from
TopDocs.scoreDocs[i].score
are comparable across queries? I am having this problem myself, so I would
really appreciate it if anyone has some pointers.
At [1], it seems like they are not. Is there any solution to enable this