On Thu, Oct 12, 2017 at 8:53 AM, Chellasamy G wrote:
> Could anyone please explain the difference between SortedDocValues and
> SortedSetDocValues.
SortedDocValues has at most 1 value per document (single-valued).
SortedSetDocValues supports a set of values per
On Thu, Aug 4, 2016 at 9:35 AM, Michael McCandless
wrote:
> Lucene's merging is concurrent, but Solr unfortunately uses
> UninvertingReader on each DBQ ... I'm not sure why.
It looks like DeleteByQueryWrapper was added by
Use getSortedDocValues for a single-valued field, or
getSortedSetDocValues for multi-valued.
-Yonik
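To make the single-valued versus multi-valued distinction concrete, here is a toy model in plain Java. It is only a sketch: `DocValuesModel` and its methods are hypothetical stand-ins and not Lucene's classes, and the use of -1 for "no value" is a convention of this sketch.

```java
import java.util.*;

// Toy model only: hypothetical stand-in, not Lucene's SortedDocValues / SortedSetDocValues.
final class DocValuesModel {
    // Sorted, de-duplicated term dictionary shared by both views.
    static List<String> dictionary(Collection<String> values) {
        return new ArrayList<>(new TreeSet<>(values));
    }

    // Single-valued view: at most one ordinal per document (-1 means "no value" here).
    static int[] singleValued(List<String> dict, String[] perDoc) {
        int[] ords = new int[perDoc.length];
        for (int doc = 0; doc < perDoc.length; doc++) {
            ords[doc] = perDoc[doc] == null ? -1 : dict.indexOf(perDoc[doc]);
        }
        return ords;
    }

    // Multi-valued view: a sorted set of ordinals per document.
    static List<SortedSet<Integer>> multiValued(List<String> dict, String[][] perDoc) {
        List<SortedSet<Integer>> ords = new ArrayList<>();
        for (String[] values : perDoc) {
            SortedSet<Integer> set = new TreeSet<>();
            for (String v : values) {
                set.add(dict.indexOf(v));
            }
            ords.add(set);
        }
        return ords;
    }
}
```

In both views a document stores ordinals into one shared sorted dictionary; the only difference is whether a document can hold one ordinal or a set of them.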
On Fri, Jul 8, 2016 at 12:29 PM, paule_lecuyer wrote:
> Many Thanks Yonik, I will try that.
>
> For my understanding, what is the difference between SortedSetDocValues
>
Use the docValues interface by calling getSortedSetDocValues on the
leaf reader. That will either
1) use real docValues if you have indexed them
2) use the FieldCache to uninvert an indexed field and make it look
like docValues.
-Yonik
On Thu, Jul 7, 2016 at 1:33 PM, paule_lecuyer
On Sun, Sep 13, 2015 at 4:23 PM, Selva Kumar
wrote:
> Mutable, "Immutable" interface of BitSet seems to be defined based on
> specific things like live docs and documents with DocValue etc. Any plan to
> add general purpose readonly interface to BitSet?
We already
> Similarly, BitSet
> has many more write methods compared to MutableBits. So, as I said, this
> seems to be based on internal requirement like live docs, documents with
> DocValues etc.
>
> Thanks for your time, Yonik
>
>
> On Sun, Sep 13, 2015 at 4:43 PM, Yonik Seeley
Yes, if you do a commit with waitSearcher=true (and it succeeds) then
any adds before that point will be visible.
-Yonik
On Mon, Jul 20, 2015 at 8:25 PM, Bhawna Asnani bhawna.asn...@gmail.com wrote:
Hi,
I am using solr to update a document and read it back immediately through
search.
I do
Hey Folks,
If you're interested in going to Lucene/Solr Revolution this year in Austin,
please vote for the sessions you would like to see!
https://lucenerevolution.uservoice.com/
-Yonik
For queries with many terms, where each term matches few documents
(actually a single document for ID filters in my tests), I saw
speedups between 4x and 8x
http://heliosearch.org/solr-terms-query/ (the 3rd chart)
-Yonik
http://heliosearch.org - native code faceting, facet functions,
sub-facets,
On Thu, Mar 6, 2014 at 6:28 PM, Furkan KAMACI furkankam...@gmail.com wrote:
Hi;
The Tf-Idf explanation says that:
*idf(t)* appears for *t* in both the query and the document, hence it is
squared in the equation.
DefaultSimilarity does not square it. What is the explanation for that?
I think
On Mon, Oct 14, 2013 at 9:43 PM, Darren Hoffman dar...@jnamics.com wrote:
Can anyone tell me if a search based on a ConstantScoreQuery should return
the results in the order that the documents were added to the index?
The order will be internal docid, which used to be the order that docs
were
On Wed, Jul 31, 2013 at 2:51 PM, Nicolas Guyot sfni...@gmail.com wrote:
I have written a quick test to reproduce the slower sorting with numeric DV.
In this test case, it happens only when reverse sorting.
Right - I bet your numeric field is relatively ordered in the index.
When this happens,
On Mon, Dec 17, 2012 at 12:58 AM, lukai lukai1...@gmail.com wrote:
Hi, guys:
Does a query plugin implementation impact caching? I have implemented a new
query parser which just takes the input query string and returns my own query
object. But the problem is, when I apply this logic to solr, it
On Wed, Jul 11, 2012 at 9:34 AM, Jamie ja...@stimulussoft.com wrote:
I am busy attempting to integrate Lucene 4.0 Alpha into my code base. I
have a custom QueryParser that extends QueryParser and overrides
newRangeQuery and newTermQuery
Random pointer: for most special case field handling,
On Fri, May 25, 2012 at 5:23 AM, Nikolay Zamosenchuk
nikolaz...@gmail.com wrote:
IndexWriter.deleteDocument(..) is not final,
but doesn't return any result.
Deleted terms are buffered for good performance, so at the time of
IndexWriter.deleteDocument(Term) we don't know how many documents
match
On Mon, Mar 5, 2012 at 1:53 PM, Benson Margulies bimargul...@gmail.com wrote:
There's no javadoc on here yet, and I am a little puzzled by the fact
that it is returning null for me. Does that imply that there can't be
any deleted docs known to the reader?
Right, see AtomicReader
/** Returns
On Sat, Dec 31, 2011 at 11:52 AM, Lance Java lance.j...@googlemail.com wrote:
Hi, I am new to Lucene and I am trying to use spatial search.
The old tier-based stuff in Lucene is broken and considered deprecated.
For Lucene, this may currently be your best hope:
On Thu, Nov 17, 2011 at 2:53 PM, Simon Willnauer
simon.willna...@googlemail.com wrote:
dude, look at this query... its insane isn't it :)
Sorry... what's the equivalent you'd like instead?
Or if you're just unjustifiably bitching about Solr again, maybe I
should take a stroll through Lucene land
On Thu, Nov 17, 2011 at 3:18 PM, Uwe Schindler u...@thetaphi.de wrote:
Sorry, this query is really ununderstandable. Those complex queries should
have a meaningful language, e.g. a JSON object structure
There are upsides and downsides to that. A big JSON object graph
would be easier to *read*
On Thu, Nov 17, 2011 at 3:40 PM, Mark Harwood markharw...@yahoo.co.uk wrote:
JSON or XML can reflect more closely the hierarchy in the underlying Lucene
query objects.
We normally use the Lucene QueryParser syntax itself for that (not
HTTP parameters).
Other parameters such as filters,
On Thu, Nov 17, 2011 at 3:44 PM, Michael McCandless
luc...@mikemccandless.com wrote:
Maybe someone can post the equivalent query in ElasticSearch?
I don't think it's possible. Hoss threw the kitchen sink into his
'contrived' example.
Here's a super simple example:
JSON:
{
sort : [
On Wed, Nov 16, 2011 at 10:36 AM, Shashi Kant sk...@sloan.mit.edu wrote:
I had posted this earlier on this list, hope this provides some answers
http://engineering.socialcast.com/2011/05/realtime-search-solr-vs-elasticsearch/
Except it's an out of date comparison.
We have NRT (near real time
On Fri, May 20, 2011 at 2:46 PM, Doron Cohen cdor...@gmail.com wrote:
I stumbled upon the 'Explain' function yesterday though it returns a crowded
message using debug in SOLR admin. Is there another method or interface
which returns more or cleaner info?
I am not familiar with the use of
On Tue, Apr 5, 2011 at 10:06 AM, Shai Erera ser...@gmail.com wrote:
Can we use TermEnum to skip to the first term 'after 3 weeks'? If so, we can
pull the first doc that appears in the TermDocs of that Term (if it's a
valid term).
Yep. Try this to get the term you want to use to seek:
On Tue, Apr 5, 2011 at 2:24 AM, Antony Bowesman a...@thorntothehorn.org wrote:
Seems like SortedVIntList can be used to store the info, but it has no
methods to build the list in the first place, requiring an array or bitset
in the constructor.
It has a constructor that takes DocIdSetIterator
Solr has a hyphenated word filter you could copy.
http://lucene.apache.org/solr/api/org/apache/solr/analysis/HyphenatedWordsFilterFactory.html
On trunk, this has been folded into the analysis module.
-Yonik
http://www.lucenerevolution.org -- Lucene/Solr User Conference, May
25-26, San Francisco
On Sun, Feb 27, 2011 at 2:15 PM, Bernd Fehling
bernd.fehl...@uni-bielefeld.de wrote:
Jepp, its back online.
Just did a short test and reported my results to jira, but is the
error from the xml output still a jetty problem or is it from XMLwriter?
The patch has been committed, so you should
On Fri, Feb 25, 2011 at 8:48 AM, Bernd Fehling
bernd.fehl...@uni-bielefeld.de wrote:
So Solr trunk should already handle Unicode above BMP for field type string?
Strange...
One issue is that jetty doesn't support UTF-8 beyond the BMP:
/opt/code/lusolr/solr/example/exampledocs$ ./test_utf8.sh
know how to add a char
above the BMP to utf8-example.xml?
-Yonik
http://lucidimagination.com
Regards,
Bernd
Am 25.02.2011 14:54, schrieb Yonik Seeley:
On Fri, Feb 25, 2011 at 8:48 AM, Bernd Fehling
bernd.fehl...@uni-bielefeld.de wrote:
So Solr trunk should already handle Unicode above BMP
That's exactly what the CSF feature is for, right? (docvalues branch)
-Yonik
http://lucidimagination.com
On Wed, Feb 2, 2011 at 1:03 PM, Jason Rutherglen jason.rutherg...@gmail.com
wrote:
I'm curious if there's a new way (using flex or term states) to store
IDs alongside a document and
On Wed, Feb 2, 2011 at 9:23 PM, Jason Rutherglen jason.rutherg...@gmail.com
wrote:
Is it? I thought it would load the values into heap RAM like the
field cache and in addition save the values to disk? Does it also
read the values directly from disk?
Loading into memory is a separate
On Fri, Dec 17, 2010 at 11:18 AM, Michael McCandless
luc...@mikemccandless.com wrote:
If you are using Lucene's trunk (nightly build) release, read on...
I just committed a change (for LUCENE-2811) that changes the index
format on trunk, thus breaking (w/ likely strange exceptions on
reading
On Mon, Dec 13, 2010 at 2:51 PM, Robert Muir rcm...@gmail.com wrote:
On Mon, Dec 13, 2010 at 2:43 PM, Yonik Seeley
yo...@lucidimagination.com wrote:
On Mon, Dec 13, 2010 at 2:10 PM, Brian Hurt bhur...@gmail.com wrote:
I was just wondering what the logic was for defaulting to or instead
On Mon, Dec 13, 2010 at 3:07 PM, Robert Muir rcm...@gmail.com wrote:
On Mon, Dec 13, 2010 at 3:04 PM, Yonik Seeley
yo...@lucidimagination.com wrote:
I think of the Lucene QueryParser like SQL. SQL is text based and also
meant for human entered text - but for either very expert users
We're holding a free webinar about relevancy enhancements in our
commercial version of Solr. Details below.
-Yonik
http://www.lucidimagination.com
-
Join us for a free technical webcast
Better Search Results Faster with
On Mon, Nov 22, 2010 at 12:49 PM, Uwe Schindler u...@thetaphi.de wrote:
(Fuzzy scores on
MultiSearcher and Solr are totally wrong because each shard uses another
rewritten query).
Hmmm, really? I thought that fuzzy scoring should just rely on edit distance?
Oh wait, I think I see - it's
On Mon, Nov 22, 2010 at 12:17 PM, Uwe Schindler u...@thetaphi.de wrote:
The latest discussion was more about MultiReader vs. MultiSearcher.
But you are right, 1.4 B documents is not easy to handle, especially when your
index grows and you get to the 2.1 B marker, then no MultiSearcher or
whatever
On Sun, Nov 21, 2010 at 6:33 PM, Luca Rondanini
luca.rondan...@gmail.com wrote:
Hi everybody,
I really need some good advice! I need to index in lucene something like 1.4
billion documents. I have experience with lucene but I've never worked with
such a big number of documents. Also this is
On Fri, Nov 19, 2010 at 5:41 PM, Mark Kristensson
mark.kristens...@smartsheet.com wrote:
Here's the changes I made to org.apache.lucene.util.StringHelper:
//public static StringInterner interner = new SimpleStringInterner(1024,8);
As Mike said, the real fix for trunk is to get rid of
We're holding a free webinar on migration from FAST to Solr. Details below.
-Yonik
http://www.lucidimagination.com
=
Solr To The Rescue: Successful Migration From FAST ESP to Open Source
Search Based on Apache Solr
It turns out that the prepareCommit() is the slow call here, taking several
seconds to complete.
I've done some reading about it, but have not found anything that might be
helpful here. The fact that it is slow
every single time, even when I'm adding exactly one document to the index, is
On Fri, Oct 29, 2010 at 3:32 PM, Cabansag, Ronald-Alvin R
ronald-alvin.caban...@cengage.com wrote:
We use a QueryWrapperFilter.getDocIdSet(indexReader) to get the DocIdSet and
compute the hit count using its iterator.
If you want to avoid double-caching of norms, then you should call
On Mon, Oct 25, 2010 at 7:00 PM, Dennis Kubes ku...@apache.org wrote:
A curiosity. Some of the documentation for function queries says they match
every document in the index. When running a query that has boolean required
clauses and an optional ValueSourceQuery or function query is the
On Tue, Sep 21, 2010 at 12:53 AM, Lance Norskog goks...@gmail.com wrote:
If an index file is not completely written to disk, it never becomes
available. Lucene has a file describing the current active index segments.
It writes all new files to the disk, and changes the description file
This is working as designed.
Note this method:
public DocIdSet getDocIdSet(IndexReader indexReader) throws IOException {
return openBitSet;
}
You must pay attention to the IndexReader passed - and the DocIdSet
returned must always be based on that reader (and the first document
of
On Mon, Jul 19, 2010 at 6:14 AM, Naveen Kumar id.n...@gmail.com wrote:
Is there any API using which I can retrieve search results, such that they
are neither scored nor sorted (for performance reasons). I just need the
results, don't need any extra computation on that.
Use your own custom
On Mon, Jul 19, 2010 at 9:53 AM, Philippe mailer.tho...@gmail.com wrote:
is there a possibility to retrieve the lengthNorm for all (or a specific)
fields in a specific document?
See IndexReader: public abstract byte[] norms(String field) throws IOException;
And Similarity: public float
Yes, all of that still applies to Lucene 3x and 4x, and is unlikely to
change any time soon.
-Yonik
http://www.lucidimagination.com
On Thu, Jun 24, 2010 at 1:51 PM, Zhang, Lisheng
lisheng.zh...@broadvision.com wrote:
Hi,
I remembered I tested earlier lucene 1.4 and 2.4, and found the
On Tue, Jun 15, 2010 at 5:23 AM, Michael McCandless
luc...@mikemccandless.com wrote:
CheckIndex is not able to recover from this corruption (missing
segments_N file); this would be a nice addition...
But it sounds like you've worked out a way to write your own segments_N?
Use
On Wed, Jun 2, 2010 at 1:10 PM, jan.kure...@nokia.com wrote:
that's probably because I move from lucene to solr.
We will need to filter them from the result manually then first.
Solr has a function range query that can filter out any values outside
of the given range.
On Sun, May 30, 2010 at 1:33 PM, Visual Logic visual.lo...@gmail.com wrote:
JSON is the format used for all the configuration and property files in the
RIA application we are developing. Is Lucene able to create a document from a
given JSON file and index it? Is Lucene able to provide a JSON
On Sun, May 30, 2010 at 2:27 PM, Visual Logic visual.lo...@gmail.com wrote:
Solr is embeddable but does that not just mean that SolrJ only provides the
ability to call Solr running on some server?
Nope - embeddable as in running in the same JVM as your application.
For some of my use cases
It seems like there should be a formula for estimating the total
number of unique terms given that you know the unique term counts for
each segment, and make certain assumptions like random document
distribution across segments.
-Yonik
http://www.lucidimagination.com
On Thu, May 27, 2010 at 9:17
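The message does not give such a formula, and none is proposed here. What can be stated without any distribution assumption are hard bounds: the total is at least the largest per-segment count (every other segment's terms already appear there) and at most the sum of the counts (no term is shared). A minimal sketch, with `UniqueTermBounds` as a hypothetical helper name:

```java
// Hypothetical helper: bounds on the index-wide unique term count
// derived only from per-segment unique term counts.
final class UniqueTermBounds {
    // Lower bound: the largest segment's count (all other terms overlap with it).
    static long lowerBound(long[] perSegmentCounts) {
        long max = 0;
        for (long c : perSegmentCounts) {
            max = Math.max(max, c);
        }
        return max;
    }

    // Upper bound: the plain sum (no term occurs in more than one segment).
    static long upperBound(long[] perSegmentCounts) {
        long sum = 0;
        for (long c : perSegmentCounts) {
            sum += c;
        }
        return sum;
    }
}
```

Any estimator built on the random-distribution assumption would have to land between these two values.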
On Thu, May 27, 2010 at 2:32 PM, kannan chandrasekaran
ckanna...@yahoo.com wrote:
I was wondering if there is a way to retrieve the number of unique terms in
the lucene ( version 2.4.0) ... I am aware of the terms() terms(Term)
method that returns an enumeration (TermEnum) but that involves
On Mon, May 17, 2010 at 5:00 PM, Shay Banon kim...@gmail.com wrote:
I wanted to verify if my understanding is correct. Assuming that I use
NRT, and refresh, say, every 1 second, caching based on IndexReader, such as
what is used in the CachingWrapperFilter, is basically useless
No, it's fine.
.getSequentialSubReaders() != null) {
System.err.println("Should not be more readers...");
}
}
}
}
indexWriter.close();
}
On Tue, May 18, 2010 at 12:30 AM, Yonik Seeley
yo...@lucidimagination.com wrote:
On Mon, May 17, 2010 at 5:00 PM, Shay Banon
On Mon, May 17, 2010 at 9:00 PM, Shay Banon kim...@gmail.com wrote:
Great, so I am not imagining things this late into the night ... ;), not so
great, since using NRT with field cache (like sorting) or caching filters,
or anything that caches based on IndexReader not really an option. This
On Mon, May 17, 2010 at 9:12 PM, Shay Banon kim...@gmail.com wrote:
Just saw that you opened a case for that. I think that it's important in your
test case to also test for object identity, not just equals. This is because
the IndexReader (or the FieldCacheKey) are used as keys in weak hash
looking now at what it does,
its new...
-shay.banon
On Tue, May 18, 2010 at 4:04 AM, Yonik Seeley
yo...@lucidimagination.com wrote:
On Mon, May 17, 2010 at 9:00 PM, Shay Banon kim...@gmail.com wrote:
Great, so I am not imagining things this late into the night ... ;), not
so
great
You are requesting the FieldCache entry from the top-level reader and
hence a whole new FieldCache entry must be created.
Lucene 2.9 sorting requests FieldCache entries at the segment level
and hence reuses entries for those segments that haven't changed.
-Yonik
Apache Lucene Eurocon 2010
18-21
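The effect of the cache key can be sketched with a toy model. Everything below is an assumption made for illustration: `Segment`, `TopReader`, and `ReaderCacheDemo` are hypothetical stand-ins, and an identity map replaces the real FieldCache. The sketch only counts how many cache entries have to be built when a reopen adds one segment.

```java
import java.util.*;

// Toy model: counts cache loads when entries are keyed by the top-level
// reader versus by each segment. Not Lucene code.
final class ReaderCacheDemo {
    static final class Segment {
        final String name;
        Segment(String name) { this.name = name; }
    }

    static final class TopReader {
        final List<Segment> segments;
        TopReader(Segment... segments) { this.segments = Arrays.asList(segments); }
    }

    // Keys are compared by object identity, as a reader-keyed cache would do.
    private final Map<Object, Boolean> cache = new IdentityHashMap<>();
    int loads = 0;

    private void touch(Object key) {
        if (!cache.containsKey(key)) {
            loads++;                      // a new entry has to be built
            cache.put(key, Boolean.TRUE);
        }
    }

    // One entry for the whole index: every reopened reader is a new key.
    void requestTopLevel(TopReader reader) { touch(reader); }

    // One entry per segment: unchanged segments are found in the cache.
    void requestPerSegment(TopReader reader) {
        for (Segment s : reader.segments) {
            touch(s);
        }
    }
}
```

With segments a and b before a reopen and a, b, c after it, the top-level key builds two whole-index entries, while the per-segment key builds three small ones and reuses a and b.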
Yes on all counts. Lucene doesn't modify query objects, so they are
safe for reuse among multiple threads.
-Yonik
Apache Lucene Eurocon 2010
18-21 May 2010 | Prague
2010/5/10 Mindaugas Žakšauskas min...@gmail.com:
Hi,
Can anybody confirm whether MatchAllDocsQuery can be used as an
2010/5/5 José Ramón Pérez Agüera jose.agu...@gmail.com:
[...]
The consequence is that a document
matching a single query term over several fields could score much
higher than a document matching several query terms in one field only,
One partial workaround that people use is
Forwarding to lucene only - the big cross-post caused my gmail filters
to file it.
-Yonik
-- Forwarded message --
From: Grant Ingersoll gsing...@apache.org
Date: Wed, Mar 24, 2010 at 8:03 PM
Subject: Apache Lucene EuroCon Call For Participation: Prague, Czech
Republic May 20 21,
On Thu, Mar 11, 2010 at 4:10 PM, Peter Keegan peterlkee...@gmail.com wrote:
I want the TFC to do all the cool things it does like custom sorting, saving
the field values, max score, etc. I suppose the custom Collector could
explicitly delegate all TFC's methods, but this doesn't seem right.
No
On Fri, Feb 26, 2010 at 3:33 PM, Ivan Vasilev ivasi...@sirma.bg wrote:
Does the precision step matter when I use NumericRangeQuery for exact
matches?
No. There is a full-precision version of the value indexed regardless
of the precision step, and that's used for an exact match query.
I mean
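Why the precision step does not affect exact matches can be seen from a simplified sketch of trie-style indexing. This is an illustration under assumptions, not Lucene's encoder: `TrieTermsSketch` is a hypothetical name, the sortable-bits conversion and term prefix bytes are left out, and `precisionStep` is assumed to be at least 1.

```java
import java.util.*;

// Simplified sketch of the levels at which a 64-bit value is indexed.
final class TrieTermsSketch {
    // Shift amounts used for a given precision step: 0, step, 2*step, ... below 64.
    static List<Integer> shifts(int precisionStep) {
        List<Integer> out = new ArrayList<>();
        for (int shift = 0; shift < 64; shift += precisionStep) {
            out.add(shift);
        }
        return out;
    }

    // The value with its lowest `shift` bits dropped, i.e. the prefix stored at that level.
    static long prefix(long value, int shift) {
        return value >>> shift;
    }
}
```

Whatever the step, the first level is always shift 0, where the prefix is the full value; that is the term an exact-match query uses. A larger or smaller step only changes how many coarser levels are added for range queries.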
On Wed, Feb 3, 2010 at 1:40 PM, tsuraan tsur...@gmail.com wrote:
Is there any way to run a search where I provide a Query, a Sort, and
a Collector? I have a case where it is sometimes, but rarely,
necessary to get all the results from a query, but usually I'm
satisfied with a smaller amount.
Perhaps this is just a huge index, and not enough of it can be cached in RAM.
Adding additional clauses to a boolean query incrementally destroys locality.
104GB of index and 4GB of RAM means you're going to be hitting the
disk constantly. You need more hardware - if your requirements are
low
On Sun, Jan 3, 2010 at 10:42 AM, Karl Wettin karl.wet...@gmail.com wrote:
3 jan 2010 kl. 16.32 skrev Yonik Seeley:
Perhaps this is just a huge index, and not enough of it can be cached in
RAM.
Adding additional clauses to a boolean query incrementally destroys
locality.
104GB of index
On Thu, Nov 19, 2009 at 1:04 AM, Daniel Noll dan...@nuix.com wrote:
I take it the existing numeric fields can't already do stuff like
this?
Nope, it's a fundamental limitation of the current TermEnums.
-Yonik
http://www.lucidimagination.com
On Wed, Nov 18, 2009 at 10:48 PM, Daniel Noll dan...@nuix.com wrote:
But what if I want to find the highest? TermEnum can't step backwards.
I've also wanted to do the same. It's coming with the new flexible
indexing patch:
On Mon, Nov 16, 2009 at 11:38 AM, Jeff Plater
jpla...@healthmarketscience.com wrote:
Thanks - so if my sort field is a single term then I should be ok with
using an analyzer (to lowercase it for example).
Correct - the key is that there is not more than one token per
document for the field
On Mon, Nov 16, 2009 at 1:02 AM, John Wang john.w...@gmail.com wrote:
I did some performance analysis for different ways of doing numeric
ranging with lucene. Thought I'd share:
FYI, the second approach is already implemented in both Lucene and Solr.
On Wed, Nov 11, 2009 at 8:54 AM, Shai Erera ser...@gmail.com wrote:
I index documents with numeric fields using the new Numeric package. I
execute two types of queries: range queries (for example, [1 TO 20}) and
equality queries (for example 24.75). Don't mind the syntax.
Currently, to
On Tue, Nov 10, 2009 at 11:43 AM, Jamie Band ja...@stimulussoft.com wrote:
As an aside note, is there any way for Lucene to support simultaneous writes
to an index?
The indexing process is highly parallelized... just use multiple
threads to add documents to the same IndexWriter.
-Yonik
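The pattern described above, several threads feeding one shared writer, might be sketched as follows. The nested `Writer` class is a hypothetical thread-safe stand-in used so the sketch runs without Lucene on the classpath; it is not IndexWriter and only counts documents.

```java
import java.util.concurrent.*;
import java.util.concurrent.atomic.AtomicInteger;

// Sketch of "many threads, one writer". Writer is a stand-in, not Lucene's IndexWriter.
final class SharedWriterDemo {
    static final class Writer {
        private final AtomicInteger count = new AtomicInteger();
        void addDocument(String doc) { count.incrementAndGet(); }
        int numDocs() { return count.get(); }
    }

    // Runs `threads` workers that all add documents to the same writer instance.
    static int indexAll(int threads, int docsPerThread) {
        Writer writer = new Writer();
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        for (int t = 0; t < threads; t++) {
            final int id = t;
            pool.submit(() -> {
                for (int i = 0; i < docsPerThread; i++) {
                    writer.addDocument("doc-" + id + "-" + i);
                }
            });
        }
        pool.shutdown();
        try {
            pool.awaitTermination(1, TimeUnit.MINUTES);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return writer.numDocs();
    }
}
```

The point of the design is that the writer is created once and shared; the threads do not each open their own.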
On Tue, Oct 27, 2009 at 9:07 PM, Luis Alves lafa...@gmail.com wrote:
But there needs to be some forced push for these shorter major release
cycles,
to allow for code clean cycles to also be shorter.
Maybe... or maybe not.
There's also value in a more stable API over a longer period of time.
How many processors do you have on this system?
If you are CPU bound, 100 threads is going to be 10 times slower (at a
minimum) than 10 threads (unless you have more than 10 CPUs).
-Yonik
http://www.lucidimagination.com
On Fri, Oct 23, 2009 at 2:18 AM, Wilson Wu songzi0...@gmail.com wrote:
Dear
2009/10/20 Teruhiko Kurosaka k...@basistech.com:
My Tokenizer started showing an error when I switched
to Solr 1.4 dev version. I am not too confident but
it seems that Solr 1.4 calls close() on my Tokenizer
before calling reset(Reader) in order to reuse
the Tokenizer. That is, close() is
On Tue, Oct 20, 2009 at 5:03 PM, Nathan Howard natehowa...@gmail.com wrote:
This is sort of related to the above question, but I'm trying to update some
(now deprecated) Java/Lucene code that I've become aware of once we started
using 2.4.1 (we were previously using 2.3.2):
Hits results =
Hmm, yes, I should have thought of quoting the javadoc :-)
The Hits javadoc has been updated though... we shouldn't be pushing
people toward collectors unless they really need them:
* TopDocs topDocs = searcher.search(query, numHits);
* ScoreDoc[] hits = topDocs.scoreDocs;
* for (int i =
On Fri, Oct 16, 2009 at 4:54 AM, Jukka Zitting jukka.zitt...@gmail.com wrote:
Hi,
On Fri, Oct 16, 2009 at 10:23 AM, Danil ŢORIN torin...@gmail.com wrote:
What about creating major version more often?
+1 We're not going to run out of version numbers, so I don't see a
reason not to upgrade
Are you using any custom query types? Anything to help us reproduce
(like the actual query this happened on) would be greatly appreciated.
-Yonik
http://www.lucidimagination.com
On Thu, Oct 15, 2009 at 1:17 PM, Peter Keegan peterlkee...@gmail.com wrote:
I'm using Lucene 2.9 and sometimes get
Guys, please - you're not new at this... this is what JavaDoc is for:
/**
* Returns a readonly reader containing all
* current updates. Flush is called automatically. This
* provides near real-time searching, in that changes
* made during an IndexWriter session can be made
*
On Mon, Oct 12, 2009 at 4:35 PM, Jake Mannix jake.man...@gmail.com wrote:
It may be surprising, but in fact I have read that
javadoc.
It was not your email I responded to.
It talks about not needing to close the
writer, but doesn't specifically talk about what
the relationship between
Good point on isCurrent - I think it should only be with respect to
the latest index commit point? and we should clarify that in the
javadoc.
[...]
// but what does the nrtReader say?
// it does not have access to the most recent commit
// state, as there's been a commit (with documents)
//
On Wed, Sep 16, 2009 at 12:33 PM, Uwe Schindler u...@thetaphi.de wrote:
How should we proceed? Stop the final artifact build and voting or proceed
with the release of 2.9? We waited so long and for most people it is faster
than slower!
I think we know that 2.9 will not be faster for everyone:
It's been a while since I wrote that benchmarker... is it OK that the
answer is different? Did you use the same test file?
-Yonik
http://www.lucidimagination.com
On Tue, Sep 15, 2009 at 2:18 PM, Mark Miller markrmil...@gmail.com wrote:
The results:
config: impl=SeparateFile serial=false
we
need to revert FSDir.open to return SimpleFSDir again, on non-Windows
hosts. But then we don't have good concurrency...
Mike
On Tue, Sep 15, 2009 at 2:59 PM, Yonik Seeley
yonik.see...@lucidimagination.com wrote:
It's been a while since I wrote that benchmarker... is it OK
Here's my results in my quad core phenom, with ondemand CPU freq
scaling disabled (clocks locked at 3GHz)
Ubuntu 9.04, filesystem=ext4 on 7200RPM IDE drive, testfile=95MB fully cached.
Linux odin 2.6.28-15-generic #49-Ubuntu SMP Tue Aug 18 19:25:34 UTC
2009 x86_64 GNU/Linux
Java(TM) SE Runtime
On Tue, Sep 15, 2009 at 4:12 PM, Yonik Seeley
yo...@lucidimagination.com wrote:
Note that when nthreads>1 I sometimes get wrong answers for SimpleFile...
s/SimpleFile/SingleFile/g
- everyone knows a jackalope is faster
than a koala.
- Mark
Yonik Seeley wrote:
Here's my results in my quad core phenom, with ondemand CPU freq
scaling disabled (clocks locked at 3GHz)
Ubuntu 9.04, filesystem=ext4 on 7200RPM IDE drive, testfile=95MB fully
cached.
Linux odin 2.6.28-15
OK, I see the issue - SingleFile doesn't have its own file pointer.
I'll update the original issue. (for large files, this shouldn't
change the times any).
-Yonik
http://www.lucidimagination.com
On Tue, Sep 15, 2009 at 4:13 PM, Yonik Seeley
yo...@lucidimagination.com wrote:
On Tue, Sep 15
On Wed, Sep 9, 2009 at 8:57 AM, Peter Keegan peterlkee...@gmail.com wrote:
Using JProfiler, I observe that the improvement
is due to a huge reduction in the number of calls to TermDocs.next and
TermDocs.skipTo (about 65% fewer calls).
Indexes are searched per-segment now (i.e. MultiTermDocs
On Wed, Sep 9, 2009 at 9:17 AM, Yonik
Seeley yonik.see...@lucidimagination.com wrote:
On Wed, Sep 9, 2009 at 8:57 AM, Peter Keegan peterlkee...@gmail.com wrote:
Using JProfiler, I observe that the improvement
is due to a huge reduction in the number of calls to TermDocs.next and
TermDocs.skipTo
On Sun, Sep 6, 2009 at 4:42 AM, Shai Erera ser...@gmail.com wrote:
I've resisted using payloads for this purpose in Solr because it felt
like an interim hack until CSF is implemented.
I don't see it as a hack, but as a proper use of a great feature in Lucene.
It's proper use for an application
On Fri, Sep 4, 2009 at 12:33 AM, Shai Erera ser...@gmail.com wrote:
2) Contribute my payload-based sorting package. Currently it only reads from
disk during searches, and I'd like to enhance it to use in-memory cache as
well. It's a moderate-size package, so this one will need to wait until (1)
stuff from the index using a query as well as adding?
Does Solr also remember the deletions as well?
It used to - but now it delegates all that to IndexWriter as well (and
lucene buffers them instead).
-Yonik
http://www.lucidimagination.com
Daniel Shane
Yonik Seeley wrote:
On Fri, Aug 21
On Fri, Aug 21, 2009 at 12:49 AM, Chris
Hostetter hossman_luc...@fucit.org wrote:
: But in that case, I assume Solr does a commit per document added.
not at all ... it computes a signature and then uses that as a unique key.
IndexWriter.updateDocument does all the hard work.
Right - Solr used
Anyone have any numbers? I couldn't find complete info in the Trie*
JIRA issues, esp relating to size increase in the index.
There was this:
The indexes each contain 13 numeric, tree encoded fields (doubles and Dates).
Index size (including the normal fields) was:
* 8bit: 4.8 GiB
*
Could this perhaps have anything to do with the changes to DocIdSetIterator?
Glancing at the default implementation of advance makes me wince a bit:
public int advance(int target) throws IOException {
while (nextDoc() < target) {}
return doc;
}
IMO, this is a back-compatibility
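The cost of a linear-scan `advance` like the one quoted can be shown with a small counter. This is a sketch under assumptions: `ArrayIterator` is a hypothetical iterator over a sorted array of docids, `NO_MORE_DOCS` is a sentinel chosen for the sketch, and none of it is Lucene's DocIdSetIterator.

```java
// Sketch: counts how many nextDoc() calls a linear-scan advance() needs.
final class LinearAdvanceSketch {
    static final int NO_MORE_DOCS = Integer.MAX_VALUE; // sentinel used by this sketch

    static final class ArrayIterator {
        private final int[] docs;   // sorted docids
        private int idx = -1;
        int nextDocCalls = 0;

        ArrayIterator(int[] sortedDocs) { this.docs = sortedDocs; }

        // Steps to the next docid, or returns the sentinel when exhausted.
        int nextDoc() {
            nextDocCalls++;
            idx++;
            return idx < docs.length ? docs[idx] : NO_MORE_DOCS;
        }

        // Linear scan: visits every docid between the current position and target.
        int advance(int target) {
            int doc;
            while ((doc = nextDoc()) < target) {
                // keep stepping
            }
            return doc;
        }
    }
}
```

Advancing to the last entry of a six-element list takes six `nextDoc()` calls here, which is why an iterator that can skip is expected to override `advance` rather than inherit a scan of this kind.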
On Wed, Jul 15, 2009 at 4:37 PM, Uwe Schindler u...@thetaphi.de wrote:
And the fix only affects custom DocIdSetIterators.
And custom Queries (via Scorer) since Scorer inherits from DISI.
But as Mike says, it shouldn't be the issue behind this thread.
-Yonik
http://www.lucidimagination.com