While we are in constant sync due to the merge, Lucene would still be
updated multiple times before a Solr 4 release, and that could
happen at any time - so it's really not any different.
On Wednesday, December 7, 2011, Jamie Johnson wrote:
> Yeah, biggest issue for us is we're using t
Can you file a JIRA issue, Markus? This is probably related to the new code
that uses Directory for replication.
- Mark
On Nov 2, 2012, at 6:53 AM, Markus Jelsma wrote:
> Hi,
>
> For what it's worth, we have seen similar issues with Lucene/Solr from this
> week's trunk. The issue manifests itself w
We are hoping for 4.1 very soon! With the holidays it will be difficult to say
- but 4.1 talk has been going on for some time now. It's really a matter of
wrapping up some short term work and getting some guys to do the release work.
I don't think anyone can give you a date, but it's certainly in
If anyone is able to donate some effort, a nice future scenario could be that
Luke comes fully up to date with every Lucene release:
https://issues.apache.org/jira/browse/LUCENE-2562
- Mark
On Mar 15, 2013, at 5:58 AM, Eric Charles wrote:
> For the record, I happily use Luke (with Lucene 4.1)
April 2013, Apache Lucene™ 4.2.1 available
The Lucene PMC is pleased to announce the release of Apache Lucene 4.2.1.
Apache Lucene is a high-performance, full-featured text search engine
library written entirely in Java. It is a technology suitable for nearly
any application that requires full-te
If you haven't heard, there is a Lucene/Solr meetup in New York next
week: http://www.meetup.com/NYC-Apache-Lucene-Solr-Meetup/calendar/13325754/
The scheduled talks are (in addition to lightning talks):
Solr 1.5 and Beyond:
Yonik Seeley, author of Solr, co-founder, Lucid Imagination Topics w
On 6/1/10 9:34 AM, Mindaugas Žakšauskas wrote:
It's just an early
observation as historically Lucene has been doing an amazing job in
terms of API stability.
Yes it has :)
Get ready for even more change in that area though :)
--
- Mark
http://www.lucidimagination.com
---
Hey all - apologies for the quick cross-post - just to let you know,
Andrzej is giving a free webinar this Wednesday. His presentations are always
fantastic, so check it out:
Lucid Imagination Presents a free technical webinar: Mastering the
Lucene Index
Wednesday, August 11, 2010 11:00 AM PST / 2:00 P
er if you do.
FVH: works with fewer query types and requires that you store term vectors -
but scales better than the std Highlighter to very large documents
- Mark Miller
lucidimagination.com
Lucene/Solr User Conference
May 25-26, San Francisco
www.lucenerevolution.org
On Apr 1, 2011, at 8:
T-consistency-tp2801878p2801878.html
> Sent from the Lucene - Java Users mailing list archive at Nabble.com.
>
> -
> To unsubscribe, e-mail: java-user-unsubscr...@lucene.apache.org
> For additional commands, e-mail: java
- Mark Mill
- Amazon
Dynamo uses vector clocks for this.
>
> Otis
>
> Sematext :: http://sematext.com/ :: Solr - Lucene - Nutch
> Lucene ecosystem search :: http://search-lucene.com/
>
>
>
> - Original Message
>> From: Mark Miller
>> To: java-user@lucene.
icitly set it higher than 0 for now.
Feel free to create a JIRA issue and we can give it its own default greater
than 0.
- Mark Miller
lucidimagination.com
On Jul 6, 2011, at 5:34 PM, Jahangir Anwari wrote:
> I have a CustomHighlighter that extends the SolrHighlighter and overrides
sp);
} else if (query instanceof TermQuery) {
- extractWeightedTerms(terms, query);
+ extractWeightedSpanTerms(terms, new
SpanTermQuery(((TermQuery)query).getTerm()));
} else if (query instanceof SpanQuery) {
extractWeightedSpanTerms(terms, (SpanQuery) query);
On Jul 8, 2011, at 5:43 AM, Jahangir Anwari wrote:
> I don't think this is the best
> solution, am open to other alternatives.
Could also make it static public where it is? Either way.
- Mark Miller
lucidimag
e Lucene EuroCon 2011 is presented by Lucid Imagination, the commercial
entity for Apache Solr/Lucene Open Source Search; proceeds of the conference
benefit The Apache Software Foundation.
"Lucene" and "Apache Solr" are trademarks of the Apache Software Foundation.
- Mark
My advice: Don't close the IndexWriter - just call commit. Don't worry about
forcing merges - let them happen as they do when you call commit.
If you are going to use the IndexWriter again, you generally do not want to
close it. Calling commit is the preferred option.
- M
nd I
think the limitation that I ate was that the word could belong to both its
true sentence, and the one after it.
- Mark Miller
lucidimagination.com
On Jul 20, 2011, at 7:44 PM, Mark Miller wrote:
>
> On Jul 20, 2011, at 11:27 AM, Peter Keegan wrote:
>
>> Mark Miller's 'SpanWithinQuery' patch
>> seems to have the same issue.
>
> If I remember right (it's been more than a couple of years),
.length, 1);
>
> clauses[1] = makeSpanTermQuery("3");
> allKeywords = new SpanNearQuery(clauses, Integer.MAX_VALUE, false); //
> SpanAndQuery equivalent
> query = new SpanWithinQuery(allKeywords, endSentence, 0);
> System.out.println("query: "+query);
> hits =
t there.
>
> Peter
>
> On Thu, Jul 21, 2011 at 3:07 PM, Mark Miller wrote:
>
>> Hey Peter,
>>
>> Getting sucked back into Spans...
>>
>> That test should pass now - I uploaded a new patch to
>> https://issues.apache.org/jira/browse/LUCENE-777
I just uploaded a patch for 3X that will work for 3.2.
On Jul 21, 2011, at 4:25 PM, Mark Miller wrote:
> Yeah, it's off trunk - I'll submit a 3X patch in a bit - just have to change
> that to an IndexReader I believe.
>
> - Mark
>
> On Jul 21, 2011, at 4:01 PM, Pe
Thanks Peter - if you supply the unit tests, I'm happy to work on the fixes.
I can likely look at this later today.
- Mark Miller
lucidimagination.com
On Jul 25, 2011, at 10:14 AM, Peter Keegan wrote:
> Hi Mark,
>
> Sorry to bug you again, but there's another case that
y use even more tests before feeling too confident here…
I've attached a patch for 3X with the new test and fix (changed that include
back to exclude).
- Mark Miller
lucidimagination.com
On Jul 25, 2011, at 10:29 AM, Mark Miller wrote:
> Thanks Peter - if you supply the unit tests, I'
case tests like I likely
should try if I was going to commit this thing.
- Mark Miller
lucidimagination.com
On Jul 26, 2011, at 8:56 AM, Peter Keegan wrote:
> Thanks Mark! The new patch is working fine with the tests and a few more. If
> you have particular test cases in mind, I'd
On Jul 26, 2011, at 9:52 AM, Clemens Wyss wrote:
> Side note: I am using threads when writing and these threads are (by design)
> interrupted (from time to time)
Perhaps you are seeing this: https://issues.apache.org/jira/browse/LUCENE-2239
- Mark Miller
lucidimaginati
> we should correct the javadocs for expungeDeletes here I think: so
> that it's more consistent with the javadocs for optimize?
>
> "Requests an expunge operation..." ?
>
+1 - it's a documentation bug now.
- Mark Miller
lu
The XML query parser can map to Lucene one to one as well - hasn't seemed
to pick up enough steam to be included with Solr yet, but there has been
some commotion so it's likely to go in at some point. Not enough demand yet
I guess. https://issues.apache.org/jira/browse/SOLR-839 XML Query Parser
Sup
Have you considered storing your indexes server-side? I haven't used
compression but usually the trade-off of compression is CPU usage which
will also be a drain on battery life. Or maybe consider how important the
highlighter is to your users - is it worth the trade-off of either disk
space or bat
October 2013, Apache Lucene™ 4.5.1 available
The Lucene PMC is pleased to announce the release of Apache Lucene 4.5.1
Apache Lucene is a high-performance, full-featured text search engine
library written entirely in Java. It is a technology suitable
case, please try another mirror. This also goes for Maven access.
Happy Holidays,
Mark Miller
http://www.about.me/markrmiller
Nature abhors being anything but an author by name on a second tech book.
The ruse is up after one when you have the inputs crystalized and the
hourly wage in hand. Hard to find anything but executive producers after
that. I’d shoot for a persuasive crowdfunding attempt.
Dino Korah wrote:
Hi All,
If I am to completely avoid the query parser and use the BooleanQuery along
with TermQuery, RangeQuery, PrefixQuery, PhraseQuery, etc., do the search
words still get to the Analyzer, before actually doing the real search?
Many thanks,
Dino
Answer: no
The Q
Andy Goodell wrote:
I thought I understood phrases and slop until one of my coworkers
brought by the following example
For a document that contains
"quick brown fox"
"quick brown fox"~0
"quick fox brown"~2
"fox quick brown"~3
all match.
I would have expected "fox quick brown" to require a 4 i
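The slop values in the example above can be checked with a small model. This is a minimal sketch, not Lucene's own code: it assumes every term occurs exactly once in the document, shifts each matched document position back by the term's position in the phrase, and treats the spread between the largest and smallest shifted position as the slop the phrase needs. The class and method names are made up for illustration.

```java
import java.util.HashMap;
import java.util.Map;

/** Simplified model of sloppy phrase matching; assumes each term occurs once in the document. */
final class PhraseSlopSketch {

    /** Returns the smallest slop at which the phrase matches, or -1 if a phrase term is missing. */
    static int requiredSlop(String document, String phrase) {
        // Record the position of every document term (0-based).
        Map<String, Integer> docPositions = new HashMap<>();
        String[] docTerms = document.trim().split("\\s+");
        for (int i = 0; i < docTerms.length; i++) {
            docPositions.put(docTerms[i], i);
        }

        String[] phraseTerms = phrase.trim().split("\\s+");
        int min = Integer.MAX_VALUE;
        int max = Integer.MIN_VALUE;
        for (int i = 0; i < phraseTerms.length; i++) {
            Integer docPos = docPositions.get(phraseTerms[i]);
            if (docPos == null) {
                return -1; // a phrase term does not occur in the document at all
            }
            // Shift the document position back by the term's offset in the phrase.
            int shifted = docPos - i;
            min = Math.min(min, shifted);
            max = Math.max(max, shifted);
        }
        // An exact phrase has every shifted position equal, so the spread is 0.
        return max - min;
    }

    public static void main(String[] args) {
        String doc = "quick brown fox";
        System.out.println(requiredSlop(doc, "quick brown fox")); // 0
        System.out.println(requiredSlop(doc, "quick fox brown")); // 2
        System.out.println(requiredSlop(doc, "fox quick brown")); // 3
    }
}
```

Under this model a term that moves left costs as much as the others moving right, which is why "fox quick brown" comes out at 3 rather than 4.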
Andre Rubin wrote:
Hi all,
Most of our queries are very simple, of the type:
Query query = new PrefixQuery(new Term(LABEL_FIELD, prefix));
Hits hits = searcher.search(query, new Sort(new SortField(LABEL_FIELD)))
You might want to check out Solr's ConstantScorePrefixQuery and compare
performa
Andre Rubin wrote:
On Tue, Sep 2, 2008 at 10:16 AM, Mark Miller <[EMAIL PROTECTED]> wrote:
Andre Rubin wrote:
Hi all,
Most of our queries are very simple, of the type:
Query query = new PrefixQuery(new Term(LABEL_FIELD, prefix));
Hits hits = searcher.search(query, new So
You should really close the IndexSearcher rather than the directory.
Andy33 wrote:
I have a memory leak in my lucene search code. I am able to run a few queries
fine, but I eventually run out of memory. Please note that I do close and
set to null the ivIndexSearcher object elsewhere. Here is the
Sounds like it's more in line with what you are looking for. If I
remember correctly, the phrase query factors in the edit distance in
scoring, but the SpanNearQuery will just use the combined idf for each
of the terms in it, so distance shouldn't matter with spans (I'm sure
Paul will correct me
Paul Elschot wrote:
Op Thursday 04 September 2008 20:39:13 schreef Mark Miller:
Sounds like it's more in line with what you are looking for. If I
remember correctly, the phrase query factors in the edit distance in
scoring, but the SpanNearQuery will just use the combined idf for
each of the
SpanScorer will use the similarity slop factor for each matching
span size to adjust the effective frequency.
Regards,
Paul Elschot
You have pointed this out to me before. One day I will remember.
Every time I look things over again I miss it, and I couldn't find that
email in the archive
You might check out the tagindex issue in JIRA as well. Haven't looked at
it myself, but I believe it's supposed to be an option for this.
Gerardo Segura wrote:
I think the important question is: in general how to cope with
frequently changing fields.
Karl Wettin wrote:
Hi Wojciech,
can you
[EMAIL PROTECTED] wrote:
Hello
Is it possible to exclude numbers using StandardAnalyzer just like
SimpleAnalyzer?
It's possible bu
e a token filter?
>
> On Mon, Sep 22, 2008 at 8:36 PM, Mark Miller <[EMAIL PROTECTED]> wrote:
>
>
>> [EMAIL PROTECTED] wrote:
>>
>>
>>> Hello
>>>
>>> Is it possible to e
simon litwan wrote:
hi all
i tried to reuse the IndexSearcher among all of the threads that are
doing searches as described in
(http://wiki.apache.org/lucene-java/LuceneFAQ#head-48921635adf2c968f7936dc07d51dfb40d638b82)
this works fine. but our application does continuous indexing. so the
Right, just don't share the same instance across threads.
- Mark
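One common way to follow that advice without constructing a parser for every request is to give each thread its own instance. The sketch below uses only the JDK; StatefulParser is a hypothetical stand-in for any parser that keeps mutable state between calls, not a Lucene class, and the other names are made up as well.

```java
import java.util.Locale;

/** Sketch: one parser instance per thread, so no instance is ever shared across threads. */
final class PerThreadParserSketch {

    /** Hypothetical stand-in for a parser that is not thread-safe. */
    static final class StatefulParser {
        // Mutable scratch state is what makes sharing one instance unsafe.
        private final StringBuilder buffer = new StringBuilder();

        String parse(String input) {
            buffer.setLength(0);
            buffer.append(input.trim().toLowerCase(Locale.ROOT));
            return buffer.toString();
        }
    }

    // Each thread lazily creates and then reuses its own parser.
    private static final ThreadLocal<StatefulParser> PARSER =
            ThreadLocal.withInitial(StatefulParser::new);

    static StatefulParser parser() {
        return PARSER.get();
    }

    /** Returns true if a second thread is handed a different parser than the caller. */
    static boolean otherThreadGetsDifferentInstance() {
        StatefulParser mine = parser();
        StatefulParser[] theirs = new StatefulParser[1];
        Thread worker = new Thread(() -> theirs[0] = parser());
        worker.start();
        try {
            worker.join(); // join() makes the worker's write visible to this thread
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return theirs[0] != null && theirs[0] != mine;
    }

    public static void main(String[] args) {
        System.out.println(parser().parse("  Title:Lucene  "));
        System.out.println(otherThreadGetsDifferentInstance());
    }
}
```

Creating a fresh parser per query is the simpler alternative when construction is cheap; the per-thread variant only pays off if setup cost matters.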
On Oct 18, 2008, at 3:11 PM, "Rafael Almeida" <[EMAIL PROTECTED]>
wrote:
QueryParser's documentation says:
"Note that QueryParser is not thread-safe."
it only means that the same instance of QueryParser can't be used by
mu
Richard Marr wrote:
Hi all,
Is there a mailing-list-appropriate way to hire coders with Lucene
experience? I don't want to just spam the list because I don't want to
crap where I live. I'm a programmer not a recruiter if that makes any
difference.
Cheers,
Rich
It sounds like you might have some thread synchronization issues outside
of Lucene. To simplify things a bit, you might try just using one
IndexWriter. If I remember right, the IndexWriter is now pretty
efficient, and there isn't much need to index to smaller indexes and
then merge. There is a
Glen Newton wrote:
2008/10/23 Mark Miller <[EMAIL PROTECTED]>:
It sounds like you might have some thread synchronization issues outside of
Lucene. To simplify things a bit, you might try just using one IndexWriter.
If I remember right, the IndexWriter is now pretty efficient, and there
Just change it. Merges will start obeying the new merge factor
seamlessly.
- Mark
On Oct 27, 2008, at 1:07 PM, Tom Saulpaugh <[EMAIL PROTECTED]>
wrote:
Hello,
We are currently using lucene v2.1 and we are planning to upgrade to
lucene v2.4.
Can we change the merge factor for an existi
How many fields are you sorting on? Lots of unique terms in those
fields?
- Mark
On Oct 29, 2008, at 6:03 PM, "Todd Benge" <[EMAIL PROTECTED]> wrote:
Hi,
I'm the lead engineer for search on a large website using lucene for
search.
We're indexing about 300M documents in ~ 100 indices.
The term, terminfo, indexreader internals stuff is probably on the low end
compared to the size of your field caches (needed for sorting). If you
are sorting by String I think the space needed is 32 bits x number of
docs + an array to hold all of the unique terms. So checking 300 million
docs (I kn
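As a rough worked version of that estimate: the 32-bit entry per document comes straight from the description above, while the cost of the unique-terms array depends on term length and JVM overhead, so the sketch takes it as an assumed bytes-per-term figure. The class and method names, and the 50-byte figure in the demo, are hypothetical.

```java
/** Back-of-the-envelope memory estimate for sorting on one String field. */
final class StringSortCacheEstimate {

    /** One 32-bit (4-byte) entry per document. */
    static long ordinalArrayBytes(long numDocs) {
        return 4L * numDocs;
    }

    /**
     * Per-document entries plus the array of unique terms.
     * assumedBytesPerTerm is a guess covering characters and object overhead.
     */
    static long estimateBytes(long numDocs, long uniqueTerms, long assumedBytesPerTerm) {
        return ordinalArrayBytes(numDocs) + uniqueTerms * assumedBytesPerTerm;
    }

    public static void main(String[] args) {
        long numDocs = 300_000_000L; // the 300M documents mentioned in the thread

        // Per-document part alone: 4 bytes x 300M docs = 1,200,000,000 bytes per sort field.
        System.out.println(ordinalArrayBytes(numDocs));

        // Adding a hypothetical 1M unique terms at an assumed 50 bytes each.
        System.out.println(estimateBytes(numDocs, 1_000_000L, 50L));
    }
}
```

The per-document part is paid once for every field sorted on, so it multiplies quickly when there are many sort fields.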
adoop, nutch, solr &
terracotta for possibilities such as index sharding.
Has anyone implemented a solution using hadoop or terracotta for a
large scale system? Just wondering the pros / cons of the various
approaches.
Thanks,
Todd
On Wed, Oct 29, 2008 at 6:07 PM, Mark Miller
John G wrote:
I have an index with a particular document marked as deleted. If I use the
search method that returns TopDocs and that deleted document satisfies the
search criteria, will it be included in the returned TopDocs object even
though it has been marked as deleted?
Thanks in advance.
J
20 fields on a huge index? Wow - not sure there is a ton you can do with
that...anyone have any suggestions for that one? Distributed should help
I suppose, but that's a lot of sort fields for a large index.
If LUCENE-831 ever gets off the ground you will be able to change the
cache used, and p
Am I missing your benchmark algorithm somewhere? We need it. Something
doesn't make sense.
- Mark
Justus Pendleton wrote:
Howdy,
I have a couple of questions regarding some Lucene benchmarking and
what the results mean[3]. (Skip to the numbered list at the end if you
don't want to read the
will ensure you are reusing the same reader for each search. Hope
to analyze further soon.
- Mark
Justus Pendleton wrote:
On 03/11/2008, at 11:07 PM, Mark Miller wrote:
Am I missing your benchmark algorithm somewhere? We need it.
Something doesn't make sense.
I thought I had includ
Or nabble or markmail
- Mark
On Nov 7, 2008, at 3:33 PM, Dragon Fly <[EMAIL PROTECTED]>
wrote:
http://www.gossamer-threads.com/lists/lucene/java-user/
Date: Fri, 7 Nov 2008 14:27:38 -0700
From: [EMAIL PROTECTED]
To: java-user@lucene.apache.org
Subject: searchable archives
Hey,
Is thi
Not out of the box, but it's fairly trivial to copy MultiSearcher and
modify it so that a different query goes to each sub-searcher.
- Mark
On Nov 8, 2008, at 5:45 AM, "Shishir Jain" <[EMAIL PROTECTED]>
wrote:
Hi,
Doc1: Field1, Field2
Doc2: Field1, Field2
If I create Index such that Fie
people are interested in rather than all matching docs.
Sorry for the confusion there - need to double check what I write...
Mark Miller wrote:
There is definitely some stale javadoc in Lucene here and there. All
of what you're talking about has been shaken up recently with the
deprecation of Hits
Check out the SpanScorer.
- Mark
On Nov 10, 2008, at 8:25 AM, "Sertic Mirko, Bedag" <[EMAIL PROTECTED]
> wrote:
[EMAIL PROTECTED]
I am searching for a solution to make the Highlighter run properly in
combination with phrase queries.
I want to highlight text with a phrase query like "w
Michael McCandless wrote:
But: it's slow to load a field for the first time. LUCENE-1231
(column-stride fields) aims to greatly speed up the load time.
Test it out though. In some recent testing I was doing it was *way*
faster than I thought it would be based on what I had been reading. Of
c
, it works just
like the non phrase/span aware Highlighter.
- Mark
Sertic Mirko, Bedag wrote:
Hi
Thank you for your response.
Are there examples available?
Regards
Mirko
-Original Message-
From: Mark Miller [mailto:[EMAIL PROTECTED]
Sent: Monday, November 10, 2008 14:45
-
From: Mark Miller [mailto:[EMAIL PROTECTED]
Sent: Monday, November 10, 2008 15:38
To: java-user@lucene.apache.org
Subject: Re: AW: Highlighter and Phrase Queries
Check out the unit tests for the highlighter - there are a bunch of
examples there.
It's pretty much the same as using the standard
There is definitely some stale javadoc in Lucene here and there. All of
what you're talking about has been shaken up recently with the deprecation
of Hits. Hits used to pretty much be considered the non-expert API, but
it's been tossed in favor of the TopDocs APIs.
The HitCollector stuff has been
Nice! An 8 core machine with a test ready to go!
How about trying the read only mode that was added to 2.4 on your
IndexReader?
And if you are on Unix and could try trunk and use the new
NIOFSDirectory implementation...that would be awesome.
Those two additions are our current hope for
And if you are on Unix and could try trunk and use the new
NIOFSDirectory implementation...that would be awesome.
Woah...that made 2.4 too. A 2.4 release will allow both optimizations.
Many thanks!
an FSDirectory.
That's a good point, and points out a bug in Solr trunk for me. Frankly I
don't see how it's done. There is no code I can see/find to use it rather
than FSDirectory. Still assuming there must be a way, but I don't see it...
- Mark
Any ideas?
Cheers,
Dmitri
On Tue, Nov
Mark Miller wrote:
That's a good point, and points out a bug in Solr trunk for me. Frankly
I don't see how it's done. There is no code I can see/find to use it
rather than FSDirectory. Still assuming there must be a way, but I
don't see it...
Ah - brain freeze. What else is new :) Y
r? Or something?
Mike
Mark Miller wrote:
Mark Miller wrote:
That's a good point, and points out a bug in Solr trunk for me.
Frankly I don't see how it's done. There is no code I can see/find
to use it rather than FSDirectory. Still assuming there must be a
way, but I don't see it
I'm thinking about it, so if someone else doesn't get something together
before I have some free time...
It's just not clear to me at the moment how best to do it.
Michael McCandless wrote:
Any takers for pulling a patch together...?
Mike
Mark Miller wrote:
+1
- Mark
On No
If you're new to Lucene, this might be a little much (and maybe I don't
fully understand the problem), but you might try:
Add the attributes to the words in a payload with a PayloadAnalyzer. Do
searching as normal. Use the new PayloadSpanUtil class to get the
payloads for the matching words. (T
the class fully
working. That said, if it can give me serious speed improvements it's
definitely worth considering.
- Greg
On Wed, Nov 12, 2008 at 12:01 PM, Mark Miller <[EMAIL PROTECTED]> wrote:
If you're new to Lucene, this might be a little much (and maybe I don't
fully understand
10 at a time. Depends on
your use case if it's feasible or not though. Most find it efficient
enough to do highlighting with, so I'm assuming it should be good enough
here.
Thanks again for your help on this one.
- Greg
On Wed, Nov 12, 2008 at 12:52 PM, Mark Miller <[EMAIL PROTECTED]> w
It's hard to predict the future of LUCENE-831. I would bet that it will
end up in Lucene at some point in one form or another, but it's hard to
say if that form will be what's in the available patches (I'm a contrib
committer so I won't have any real say in that, so take that prediction
with a gra
Like I said, it's pretty easy to add this, but it's also going to suck.
Kind of exposes the fact that it's missing the right extensibility at the
moment. Things are still a bit ugly overall.
You're going to need new CacheKeys for the data types you want to support.
A CacheKey builds and provides a
Check out the docs at:
http://lucene.apache.org/java/2_4_0/api/contrib-instantiated/index.html
There is a performance graph there to check out.
The code should be fairly straightforward - you can make an
InstantiatedIndex that's empty, or seed it with an IndexReader. Then you
can make an Inst
tedIndex(reader)
ireader = iindex.indexReaderFactory()
isearcher = IndexSearcher(ireader)
Kind of a roundabout way to get an InstantiatedIndex I guess, but maybe
there's a briefer way?
Thank you.
Darren
On Sun, 2008-11-16 at 10:50 -0500, Mark Miller wrote:
Check out the docs at:
http://lu
excitingComm2 wrote:
Hi everybody,
as far as I know the lucene score is an arbitrary number between 0.0 and
1.0.
Is this correct, that the scores in my resultset are always normalised to
this spread or is it possible to get higher scores?
Regards,
John W.
Hits is the class that did the norma
Yeah, discussion came up on order and I believe we punted - it's up to
you to track order and sort at the moment. I think that was to prevent
those that didn't need it from paying the sort cost, but I have to go
find that discussion again (maybe it's in the issue?) I'll look at the
whole idea agai
There is not much impact as long as you turn off Norms for the
majority of them.
- Mark
On Dec 2, 2008, at 8:47 AM, Darren Govoni <[EMAIL PROTECTED]> wrote:
Hi,
I saw this question asked before without a clear answer. Pardons if I
missed it in the archive elsewhere.
Is there a serious deg
Careful here. Not only do you need to pass -server, but you need the
ability to use it :) It will silently not work if it's not there I
believe. Oddly, the JRE doesn't seem to come with the server hotspot
implementation. The JDK always appears to. Probably varies by OS to
some degree.
Some
Sounds familiar. This may actually be in JIRA already.
- Mark
On Dec 3, 2008, at 6:25 PM, "Teruhiko Kurosaka" <[EMAIL PROTECTED]>
wrote:
Mike,
You are right. There was an error on my part. I think
I was, in effect, making a SpanNearQuery object of:
new SpanNearQuery(new SpanQuery[0], 0,
Chris Bamford wrote:
So does that mean if you don't explicitly open an IndexReader, the
IndexSearcher will do it for you? Or what?
Right. The IndexReader takes a Directory, and the IndexSearcher takes an
IndexReader - there are sugar constructors though - An IndexSearcher
will also accept
Ian Vink wrote:
Is there a way to get phrases counted in the list of fragments that come
back from Highlighter.GetBestFragments() in general.
It seems to only take words into account.
Ian
Not sure I fully understand, but have you tried the SpanScorer? It
allows the Highlighter to work with
any of my own
logic... Is there a suitable subclass I can use? The documented ones
- FilterIndexReader, InstantiatedIndexReader, MultiReader,
ParallelReader - all seem too complicated for what I need. My only
requirement is to open it read-only!
Am I missing something?
Mark Miller wrote
ory that takes a String, make a Directory, and use it to make an
IndexReader that you build the IndexSearcher with. If it's using a
Directory, use that directory to make the IndexReader that is used for
your IndexSearcher.
Thanks for your continued help with this :-)
Chris
Mark Miller wrote:
L
http://issues.apache.org/jira/browse/LUCENE-522
note the bugs mentioned at the bottom.
- Mark
Paul Libbrecht wrote:
Hello again list,
has anyone tried to port or simply run the QueryParser of Lucene to GWT?
It would look like a very nice thing to do to provide direct rendering
of the query interpretation (it could be made into a whole editor
probably, e.g. removing or selecting parts
Drops positions as well.
- Mark
On Dec 18, 2008, at 4:57 PM, "John Wang" wrote:
Hi:
In lucene 2.4, when Field.omitTF() is called, payload is disabled as
well. Is this intentional? My understanding is payload is
independent from
the term frequencies.
Thanks
-John
for
this field.
*/
void setOmitTf(boolean omitTf);
- Mark
John Wang wrote:
Thanks Mark! I don't think it is documented (at least the ones I've read),
should this be considered as a bug or ... ?
Thanks
-John
On Thu, Dec 18, 2008 at 2:05 PM, Mark Miller wrote:
Drops positi
Well look at the issues and see for yourself :)
It's a subjective call I think. Here's my take:
There are not going to be too many sweeping changes in the next release.
There are tons of little bug fixes and improvements, but not a lot of
the bullet point type stuff that you mention in your wish
Mark Miller wrote:
TrieRangeQuery has been added to contrib. Super awesome, super
efficient, large scale sorting.
Sorry. It's way past my bedtime. Large scale numerical range searching.
Sorting on the brain.
e. My understanding is certainly less than yours though :)
- Mark
Michael McCandless wrote:
The new extensible TokenStream API (based on AttributeSource) is also
in 2.9.
Mike
Mark Miller wrote:
Well look at the issues and see for yourself :)
It's a subjective call I think. Here's my take:
Ther
Lebiram wrote:
Also, what are norms
Norms are a byte value per field stored in the index that is factored
into the score. It's used for length normalization (shorter documents =
more important) and index time boosting. If you want either of those,
you need norms. When norms are loaded up into a
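Since a norm is a single byte, a ballpark figure for what norms cost in memory is simple arithmetic. This is a minimal sketch that assumes one byte per document for every field that has norms enabled (my understanding of how Lucene releases of this era laid them out); the class name, method name, and the document and field counts are hypothetical.

```java
/** Rough in-memory size of norms: one byte per document per field with norms enabled. */
final class NormsMemoryEstimate {

    static long normsBytes(long numDocs, int fieldsWithNorms) {
        return numDocs * fieldsWithNorms;
    }

    public static void main(String[] args) {
        long numDocs = 10_000_000L; // hypothetical index size

        // 50 fields with norms: 10M docs x 50 bytes = 500,000,000 bytes.
        System.out.println(normsBytes(numDocs, 50));

        // Norms left on for only 2 of those fields: 20,000,000 bytes.
        System.out.println(normsBytes(numDocs, 2));
    }
}
```

The cost scales with the number of fields as well as the number of documents, so omitting norms on fields that need neither length normalization nor index-time boosts is the main lever.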
Mark Miller wrote:
Lebiram wrote:
Also, what are norms
Norms are a byte value per field stored in the index that is factored
into the score. It's used for length normalization (shorter documents =
more important) and index time boosting. If you want either of those,
you need norms. When norms
n norms data in scoring somehow?
I'm just stumped as to how Luke is able to do a search (with limit) on the docs
but in my code it just dies with OutOfMemory errors.
How does Luke not allocate these norms?
________
From: Mark Miller
To: java-user@lucene.apac
Erick Erickson wrote:
> The number of documents
> is irrelevant here, what is relevant is the number of
> distinct terms in your "fieldName" field.
>
Depending on the size of your index, the number of docs will matter
though. You have to store the unique terms in a String[] array, but you
also s
Welcome Patrick!
+1 for LocalLucene.
patrick o'leary wrote:
Thanks Folks
I'm in the business well over a decade now; Started my career in my country
of origin in Ireland, and have since lived & worked in UK and the US. I've
also traveled extensively establishing development groups in remote of
Okay, Koji, hopefully I'll have more luck suggesting this this time.
Have you tried http://issues.apache.org/jira/browse/LUCENE-1448 yet? I am
not sure if it's in an applicable state, but I hope that covers your issue.
On Fri, Jan 16, 2009 at 7:15 PM, Koji Sekiguchi wrote:
> Hello,
>
> I'm writi
Group-by in Lucene/Solr has not been solved in a great general way yet
to my knowledge.
Ideally, we would want a solution that does not need to fit into memory.
However, you need the value of the field for each document to do the
grouping. As you are finding, this is not cheap to get. Currentl