Solr 1.4.x uses Lucene 2.9.x.
You could try the trunk, which uses Lucene 3.0.3 and should be compatible,
if I'm correct.
Regards,
Peter.
I have the exact opposite problem where Luke won't even load the index but
Solr starts fine. I believe there are major differences between the two
indexes
First, it's more Solandra now (although the project is still named
lucandra) ;)
Second, it can help because data which is written to the index is
immediately (configurable) available for search.
solandra is distributed + real time solr, with no changes required on
client side (be it SolrJ or
Am 18.01.2011 22:33, schrieb Steven A Rowe:
[] ASF Mirrors (linked in our release announcements or via the Lucene
website)
[x] Maven repository (whether you use Maven, Ant+Ivy, Buildr, etc.)
[x] I/we build them from source via an SVN/Git checkout.
take a look also into icu4j which is one of the contrib projects ...
converting on the fly is not supported by Solr but should be relatively
easy in Java.
Also scanning is relatively simple (accept only a range). Detection too:
http://www.mozilla.org/projects/intl/chardet.html
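A sketch of how such a conversion could look in Java (in-memory here for brevity; for large files you would wrap Reader/Writer streams instead, and the source charset would come from a detector like the one linked above):

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

public class Recode {

    // Decode the raw bytes with the detected (or assumed) source charset
    // and re-encode them as UTF-8 before pushing them to Solr.
    public static byte[] toUtf8(byte[] input, Charset source) {
        return new String(input, source).getBytes(StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        // "M\u00fcller" encoded as ISO-8859-1, as it might arrive from a legacy source
        byte[] latin1 = "M\u00fcller".getBytes(StandardCharsets.ISO_8859_1);
        byte[] utf8 = toUtf8(latin1, StandardCharsets.ISO_8859_1);
        System.out.println(new String(utf8, StandardCharsets.UTF_8)); // prints: Müller
    }
}
```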
We've created an
Dinesh,
it will stay 'real time' even if you convert it. Converting should be
done in the millisecond range if at all measurable (e.g. if you apply
streaming).
Beware: To use the real features you'll need the latest trunk of solr IMHO.
I've done similar log-feeding stuff here (with code!):
We've created an index from a number of different documents that are
supplied by third
Hi all!
Would you mind writing about your Solr project if it has an uncommon
approach or if it is somehow exciting?
I would like to extend my list for a new blog post.
Examples I have in mind at the moment are:
loggly (real time + big index),
solandra (nice solr + cassandra combination),
haiti
Am 04.01.2011 21:43, schrieb Ahmet Arslan:
Is that supported? Pointer(s)
to how to do it?
perhaps http://wiki.apache.org/solr/LukeRequestHandler ?
or via
ssh u...@host -X
;-)
how did you remove the term? In the spellcheck file?
did you rebuild the spellcheck index?
Regards,
Peter.
Hi,
I have configured spellchecker in solrconfig.xml and it is working fine for
existing terms. However, if i delete a term, it is still being returned as a
suggestion from the
you should try fq=Product:"Electric Guitar" (note the quotes, otherwise the whitespace splits the query into two clauses)
How do I handle facet values that contain whitespace? Say I have a field
Product that I want to facet on. A value for Product could be Electric
Guitar. How should I handle the white space in Electric Guitar during
indexing? What about when I
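A sketch of two common ways to keep the whitespace from being parsed as a clause separator (field name and value taken from the question above):

```
fq=Product:"Electric Guitar"     // phrase query
fq=Product:Electric\ Guitar      // backslash-escaped whitespace
```

For faceting itself no escaping is needed; the escaping only matters when the facet value is fed back into a filter query.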
facet=true&facet.field=field // SELECT count(distinct(field))
fq=field:[* TO *] // WHERE length(field) > 0
q=other_criteriaA&fq=other_criteriaB // AND other_criteria
advantage: you can look into several fields at one time when adding
another facet.field
disadvantage: you get the counts split by
the current stable release is 1.4.1 (before there was 1.4)
it has nothing to do with java's version numbers! (it has its own release cycle)
the next release will be 3.x:
https://svn.apache.org/repos/asf/lucene/dev/branches/branch_3x/
and then 4.x (current trunk):
Building on optimize is not possible as index optimization is done on
the master and the slaves don't even run an optimize but only fetch
the optimized index.
isn't the spellcheck index replicated to the slaves too?
--
http://jetwick.com open twitter search
Hi Hamid,
try to avoid autowarming when indexing (see solrconfig.xml:
caches-autowarm + newSearcher + maxSearcher).
If you need to query and indexing at the same time,
then probably you'll need one read-only core and one for writing with no
autowarming configured.
See:
Am 07.12.2010 13:01, schrieb Hamid Vahedi:
Hi Peter
Thanks a lot for reply. Actually I need real time indexing and query at the same
time.
Here told:
You can run multiple Solr instances in separate JVMs, with both having their
solr.xml configured to use the same index folder.
Now
Q1: I'm
for dismax just pass an empty query (q=) or none at all
Hello,
shouldn't that query syntax be *:* ?
Regards,
-- Savvas.
On 6 December 2010 16:10, Solr Usersolr...@gmail.com wrote:
Hi,
First off thanks to the group for guiding me to move from default search
handler to dismax.
I have a
I'm unsure but maybe you mean something like clustering? Then carrot^2
can do this (at index time I think):
http://search.carrot2.org/stable/search?query=jetwickview=visu
(There is a plugin for solr)
Or do you already know the categories of your docs. E.g. you already
have a category tree and
QueryElevationComponent requires the
schema to have a uniqueKeyField implemented using StrField
you should use the type StrField ('string') for the field used in
uniqueKeyField
for 1) use the tomcat configuration in conf/server.xml (the Connector
element: address=127.0.0.1, port=8080, ...)
for 2) if they have direct access to solr either insert a middleware
layer or create a write lock ;-)
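In Tomcat that connector definition might look like this (port and the other attribute values are just an assumption, adapt them to your setup):

```
<Connector address="127.0.0.1" port="8080" protocol="HTTP/1.1"
           connectionTimeout="20000" redirectPort="8443" />
```

With address bound to 127.0.0.1, only clients on the same machine can reach Solr.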
Hello all,
1)
I want to restrict access to Solr to localhost only. How to achieve that?
2)
Hi,
also take a look at solandra:
https://github.com/tjake/Lucandra/tree/solandra
I don't have it in prod yet but regarding administration overhead it
looks very promising.
And you'll get some other neat features like (soft) real time, for free.
So it's the same as A) + C) + X) - Y) ;-)
also try to minimize maxWarmingSearchers to 1(?) or 2.
And decrease cache usage (especially autowarming) if possible at all.
But again: only if it doesn't affect performance ...
Regards,
Peter.
On Tue, Nov 30, 2010 at 6:04 PM, Robert Petersenrober...@buy.com wrote:
My question is this.
take a look into this:
http://vimeo.com/16102543
for that amount of data it isn't that easy :-)
We are looking into building a reporting feature and investigating solutions
which will allow us to search though our logs for downloads, searches and
view history.
Each log item is relatively
Hi,
another way is to use facets for the tagcloud as we did it in jetwick.
Every document then needs a tag field (multivalued).
See:
https://github.com/karussell/Jetwick/blob/master/src/main/java/de/jetwick/ui/TagCloudPanel.java
for an example with wicket and SolrJ. With that you could also
Jetwick is now available under the Apache 2 license:
http://www.pannous.info/2010/11/jetwick-is-now-open-source/
Regards,
Peter.
PS:
features http://www.pannous.info/products/jetwick-twitter-search/
installation https://github.com/karussell/Jetwick/wiki
for devs
Hi,
the final solution is explained here in context:
http://mail-archives.apache.org/mod_mbox/lucene-dev/201011.mbox/%3caanlktimatgvplph_mgfbsughdoedc8tc2brrwxhid...@mail.gmail.com%3e
If you are using Solr branch_3x or trunk, you can turn this off, by
setting autoGeneratePhraseQueries to
Hi Rajani,
some notes:
* try spellcheck.q=curst or completely without spellcheck.q but with q
* compared to the normal q parameter spellcheck.q can have a different
analyzer/tokenizer and is used if present
* do not do spellcheck.build=true for every request (creating the
spellcheck index
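One way to avoid building on every request is to rebuild the spellcheck index on commit instead; a sketch of the relevant part of solrconfig.xml (the component and field names follow the standard example config and are assumptions):

```
<searchComponent name="spellcheck" class="solr.SpellCheckComponent">
  <lst name="spellchecker">
    <str name="name">default</str>
    <str name="field">spell</str>
    <str name="buildOnCommit">true</str>
  </lst>
</searchComponent>
```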
Hi Peter!
* I believe the NRT patches are included in the 4.x trunk. I don't
think there's any support as yet in 3x (uses features in Lucene 3.0).
I'll investigate how much effort it is to update to solr4
* For merging, I'm talking about commits/writes. If you merge while
commits are going
Hi,
I am going crazy but which config is necessary to include the missing doc 2?
I have:
doc1 tw:aBc
doc2 tw:abc
Now a query aBc returns only doc 1 although when I try doc2 from
admin/analysis.jsp
then the term text 'abc' of the index gets highlighted as intended.
I even indexed a simple
Does yours need to be once a day?
no, I only thought you use one day :-)
so you don't or do you have 31 shards?
having a look at Solr Cloud or Katta - could be useful
here in dynamically allocating shards.
ah, thx! I will take a look at it (after trying solr4)!
Regards,
Peter.
Hi,
Please add preserveOriginal=1 to your WDF [1] definition and reindex (or
just try with the analysis page).
but it is already there!?
<filter class="solr.WordDelimiterFilterFactory" protected="protwords.txt"
generateWordParts="1" generateNumberParts="1"
catenateAll="0"
Peter,
I recently had this issue, and I had to set splitOnCaseChange=0 to
keep the word delimiter filter from doing what you describe. Can you
try that and see if it helps?
- Ken
Hi Ken,
yes, this would solve my problem,
but then I would lose a match for 'SuperMario' if I query 'mario',
You are applying the sort against a (tokenized) text field?
You should rather sort against a number or a string, probably using the
copyField directive.
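A sketch of that copyField setup in schema.xml (the field and type names are invented):

```
<field name="title"      type="text"   indexed="true" stored="true"/>
<field name="title_sort" type="string" indexed="true" stored="false"/>
<copyField source="title" dest="title_sort"/>
```

Queries would then sort with sort=title_sort asc while still searching the tokenized title field.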
Regards,
Peter.
hi all:
I configured a solr application and there is a field of type text, with
some content like this: 123456, that is a
take a look if the 'more like this' handler can solve your problem.
Hi.
I wonder, is it possible in a built-in way to make a context search in
Solr?
I have about 50k documents (mainly 'name' of char(150)), so i receive
a content of a page and should show found documents.
Of course i
Hi Peter,
thanks for your response. I will dig into the sharding stuff asap :-)
This may have changed recently, but the NRT stuff - e.g. per-segment
commits etc. is for the latest Solr 4 trunk only.
Do I need to turn something 'on'?
Or do you know whether the NRT patches are documented
Hi,
I wanted to provide my indexed docs (tweets) relative fast: so 1 to 10
sec or even 30 sec would be ok.
At the moment I am using the read only core scenario described here
(point 5)*
with a commit frequency of 180 seconds which was fine until some days.
(I am using solr1.4.1)
Now the
Just in case someone is interested:
I put the emails of Peter Sturge with some minor edits in the wiki:
http://wiki.apache.org/solr/NearRealtimeSearchTuning
I found myself searching the thread again and again ;-)
Feel free to add and edit content!
Regards,
Peter.
Hi Erik,
I thought this
in 1.4, and generally no longer takes a lot of memory -- for facets
with many unique values, method fc in fact should take less than
enum, I think?
Peter Karich wrote:
Just in case someone is interested:
I put the emails of Peter Sturge with some minor edits in the wiki:
http
=enum is still valid in
Solr 1.4+. The fc facet.method was changed significantly in 1.4, and
generally no longer takes a lot of memory -- for facets with many unique
values, method fc in fact should take less than enum, I think?
Peter Karich wrote:
Just in case someone is interested:
I put
take a look here
http://stackoverflow.com/questions/33956/how-to-get-facet-ranges-in-solr-results
I am able to facet on a particular field because I have index on that field.
But I am not sure how to facet on a price range when I have the exact price
in the 'price' field. Can anyone help
Hi Ron,
how do I know what the starting row
Always 0.
especially if the original SolrQuery object has them all
that's the point. Solr will normally cache it for you. This is your friend:
<queryResultWindowSize>40</queryResultWindowSize>
<!-- Maximum number of documents to cache for any entry
what you can try: maxSegments=2 or more as a 'partial' optimize:
If the index is so large that optimizes are taking longer than desired
or using more disk space during optimization than you can spare,
consider adding the maxSegments parameter to the optimize command. In
the XML message, this
Hi,
don't know if the python package provides one, but SolrJ offers to start
solr embedded (EmbeddedSolrServer) and
setting up a different schema + config is possible. For this see:
https://karussell.wordpress.com/2010/06/10/how-to-test-apache-solrj/
if you need an 'external solr' (via jetty
From the user perspective I wouldn't delete it, because the down-voting
could have happened by mistake (or spam or something) and up-voting can
resurrect it.
It could also be wise to keep the docs to see which content (from which
users?) gets down-voted, e.g. to detect spam accounts.
From the dev perspective
we have an identically-sized index and it takes ~5 minutes
It takes about one hour to replicate a 6G index for solr in my env. But
my network can transfer files at about 10-20M/s using scp. So solr's http
replication is too slow; is that normal or am I doing something wrong?
In case someone is interested:
http://karussell.wordpress.com/2010/10/27/feeding-solr-with-its-own-logs/
a lot of TODOs but: it is working. I could also imagine that this kind
of example would be suited for an intro-tutorial,
because it covers dynamic fields, rapid solr prototyping, filter and
Hi Xin,
from the wiki:
http://wiki.apache.org/solr/SolrConfigXml
The URL of the ping query is /admin/ping
* You can also check (via wget) the number of documents. It might look
like a rusty hack but it works for me:
wget -T 1 -q "http://localhost:8080/solr/select?q=*:*" -O - | tr '/'
Hi,
See this:
http://wiki.apache.org/solr/CoreAdmin#RELOAD
Solr will also load the new configuration (without restarting the webapp)
on the slaves when using replication:
http://wiki.apache.org/solr/SolrReplication
Regards,
Peter.
Hi Everybody,
If I change my schema.xml to, do I have to
Hi,
we had the following problem. We added a field to schema.xml and fed our
master with the new data.
After that querying on the master is fine. But when we replicated
(solr1.4.0) to our slaves.
All slaves said they cannot find the new field (standard exception for
missing fields).
And that
Hi,
you can try to parse the xml via Java yourself and then push the
SolrInputDocuments via SolrJ to solr.
Setting the format to binary + using the streaming update processor should
improve performance,
but I am not sure... and performant (+less mem!) reading of xml in Java is
another topic ... ;-)
I asked this myself ... here could be some pointers:
http://lucene.472066.n3.nabble.com/SolrJ-and-Multi-Core-Set-up-td1411235.html
http://lucene.472066.n3.nabble.com/EmbeddedSolrServer-in-Single-Core-td475238.html
Hi everyone,
I'm trying to write some code for creating and using multi cores.
Hi,
answering my own question(s).
Result grouping could be the solution as I explained here:
https://issues.apache.org/jira/browse/SOLR-385
http://www.cs.cmu.edu/~ddash/papers/facets-cikm.pdf (the file is dated to Aug
2008)
yonik implemented this here:
just a blind shot (didn't read the full thread):
what is your maxWarmingSearchers setting? For large indices we set it
to 2 (maximum)
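In solrconfig.xml that setting looks like this (the value 2 is the one from the advice above):

```
<maxWarmingSearchers>2</maxWarmingSearchers>
```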
Regards,
Peter.
just update on this issue...
we turned off the new/first searchers (upgrade to Solr 1.4.1), and ran
benchmark tests, there is no noticeable
Should I create a JIRA ticket?
already there:
https://issues.apache.org/jira/browse/SOLR-2005
we should provide a patch though ...
Regards,
Peter.
With solrj doing a more like this query for a missing document:
/mlt?q=docId:SomeMissingId
always throws a null pointer exception:
Caused
Hi Jason,
Hi, all.
I got some question about solrconfig.xml.
I have 10 fields in a document for index.
(Suppose that field names are f1, f2, ... , f10.)
Some user will want to search in field f1 and f5.
Another user will want to search in field f2, f3 and f7.
I am going to use dismax
Hi,
are you using moreLikeThis for that feature?
I have no suggestion for a reliable threshold, I think this depends
on the domain you are operating in and is IMO only solvable with a heuristic.
It also depends on fields, boosts, ...
It could be that there is a 'score gap' between duplicates and
I'm not sure ... just reading it yesterday night ...
but isn't the unapplied patch from Harish
https://issues.apache.org/jira/secure/attachment/12400054/SOLR-680.patch
what you want?
Regards,
Peter.
Running 1.4.1.
I'm able to execute stats queries against multi-valued fields, but when
given
Hi Olivier,
maybe the slave replicates after startup? check replication status here:
http://localhost/solr/admin/replication/index.jsp
what is your poll frequency (could you paste the replication part)?
Regards,
Peter.
Hello,
I setup a server for the replication of Solr. I used 2 cores and
Hi Olivier,
the index size is relatively big and you enabled replication after startup:
<str name="replicateAfter">startup</str>
This could explain why the slave is replicating from the very beginning.
Are the index versions/generations the same? (via command or
admin/replication)
If not, the slaves
Hi,
I need a feature which is well explained by Mr Goll at this site
So, it then would be nice to do sth. like:
facet.stats=sum(fieldX)&facet.stats.sort=fieldX
And the output (sorted against the sum-output) can look sth. like this:
<lst name="facet_counts">
<lst name="facet_fields">
<lst
Hi,
there are two relatively similar solutions for this problem.
I will describe one of them:
* create a multivalued string field called 'category'
* you have a category tree. so make sure a document gets not only the
leaf category, but all categories (name or id) up to the root
* now facet
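The steps above could look like this in practice (category names and the path convention are invented for illustration):

```
// a document in "electronics > audio" gets all ancestors:
category: ["electronics", "electronics/audio"]

// facet over the whole tree, then drill down via filter queries:
facet=true&facet.field=category
fq=category:"electronics"
```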
Hi,
there is a solution without the patch. Here it should be explained:
http://www.lucidimagination.com/blog/2010/08/11/stumped-with-solr-chris-hostetter-of-lucene-pmc-at-lucene-revolution/
If not, I will do it on 9.10.2010 ;-)
Regards,
Peter.
I've a similar problem with a project I'm working on
also take a look at:
http://wiki.apache.org/solr/HierarchicalFaceting
+ SOLR-64, SOLR-792
+ http://markmail.org/message/jxbw2m5a6zq5jhlp
Regards,
Peter.
Take a look at Mastering the Power of Faceted Search with Chris
Hostetter
(http://www.lucidimagination.com/solutions/webcasts/faceting). I
How long does it take to get 1000 docs?
Why not ensure this while indexing?
I think besides your suggestion or the suggestion of Luke there is no
other way...
Regards,
Peter.
Hello,
What would be the best way to check Solr index against original system
(Database) to make sure index is up to
Jonathan,
this field described here from Chantal:
2.) create an additional field that uses the
String type with the same content (use copyField to fill either)
can be multivalued. Or what did you mean?
BTW: The nice thing about facet.prefix is that you can add an arbitrary
(filter)
see
http://stackoverflow.com/questions/88235/how-to-deal-with-java-lang-outofmemoryerror-permgen-space-error
and the links there. There seems to be no good solution :-/
The only reliable solution is a restart before you run out of
permgen space (use jvisualvm to monitor)
And try to increase
Hi,
if you index your doc with text='operating system' with an additional
keyword field='linux'
(of type string, can be multivalued) then solr facetting should be what
you want:
solr/select?q=*:*&facet=true&facet.field=keyword&rows=10 or rows=0
depending on your needs
Does this help?
Regards,
, as the RO instance is simply another shard in the pack.
On Sun, Sep 12, 2010 at 8:46 PM, Peter Karich peat...@yahoo.de wrote:
Peter,
thanks a lot for your in-depth explanations!
Your findings will be definitely helpful for my next performance
improvement tests :-)
Two questions:
1. How
Peter,
thanks a lot for your in-depth explanations!
Your findings will be definitely helpful for my next performance
improvement tests :-)
Two questions:
1. How would I do that:
or a local read-only instance that reads the same core as the indexing
instance (for the latter, you'll need
Hi there,
I don't know if my idea is perfect but it seems to work ok in my
twitter-search prototype:
http://www.jetwick.com
(keep in mind it is a vhost and only one fat index, no sharding, etc...
so performance isn't perfect ;-))
That said, type in 'so' and you will get 'soldier', 'solar', ...
Hi,
Solr is only able to handle unicode (UTF-8).
Make really sure that you push it into the index in the correct encoding.
See my (accepted ;-)) answer:
http://stackoverflow.com/questions/3086367/how-to-view-the-xml-documents-sent-to-solr/3088515#3088515
Regards,
Peter.
I have an index that
aaah okay.
so SolrDocument is never used in a normal search? it's only for other
solr-plugins?
SolrDocument is under org.apache.solr.common which is for the
solr-solrj.jar and not available for the solr-core.jar
see e.g.:
Hi,
that issue is not really related to solr. See this:
http://stackoverflow.com/questions/88235/how-to-deal-with-java-lang-outofmemoryerror-permgen-space-error
Increasing maxpermsize -XX:MaxPermSize=128m does not really solve this
issue but you will see fewer errors :-)
I have written a mini
Hi!
What do you mean? You want a quickstart?
Then see
http://lucene.apache.org/solr/tutorial.html
(But I thought you already got solr working (from previous threads)!?)
Or do you want to know if solr is running? Then try the admin view:
http://localhost:8080/solr/admin/
Regards,
Peter.
Hi
Hi Ankita,
first: thanks for trying apache solr.
does all the data to be indexed have to be in the exampledocs folder?
No. And there are several ways to push data into solr: via indexing,
dataimporthandler, solrj, ...
I know that getting comfortable with a new project is a bit complicated
at
Thanks a lot Yonik! Rounding makes sense.
Is there a date math for the 'LAST_COMMIT'?
Peter.
On Tue, Aug 17, 2010 at 6:29 PM, Peter Karich peat...@yahoo.de wrote:
my queryResultCache has no hits. But if I am removing one line from the
bf section in my dismax handler all is fine. Here
Hi Yonik,
would you point me to the Java classes where solr handles a commit or an
optimize and then the date math definitions?
Regards,
Peter.
On Wed, Aug 18, 2010 at 4:34 PM, Peter Karich peat...@yahoo.de wrote:
Thanks a lot Yonik! Rounding makes sense.
Is there a date math
forget to say: thanks again! Now the cache gets hits!
Regards,
Peter.
On Wed, Aug 18, 2010 at 4:34 PM, Peter Karich peat...@yahoo.de wrote:
Thanks a lot Yonik! Rounding makes sense.
Is there a date math for the 'LAST_COMMIT'?
No - but it's an interesting idea!
-Yonik
http
Is there a way to verify that I have added it correctly?
on linux you can do
ps -elf | grep Boot
and see if the java command has the parameters added.
@all: why and when do you get those OOMs? while querying? which queries
in detail?
Regards,
Peter.
Hi Wenca,
I am not sure whether my information here is really helpful for you,
sorry if not ;-)
I want only hotels that have room with 2 beds and the room has a
package with all inclusive boarding and price lower than 400.
you should tell us what you want to search and filter? Do you want only
is just
5-6 GB yet that particular error is seldom observed... (SEVERE ERROR : JAVA
HEAP SPACE , OUT OF MEMORY ERROR )
I could see one lock file generated in the data/index path just after this
error.
On Tue, Aug 17, 2010 at 4:49 PM, Peter Karich peat...@yahoo.de wrote
.
I am new to Solr so excuse me if I don't use the right terminology
yet, but I hope that my description of the use case is quite clear
now. ;-)
Thanks
Wenca
Dne 17.8.2010 13:46, Peter Karich napsal(a):
Hi Wenca,
I am not sure whether my information here is really helpful for you,
sorry
Hi all,
my queryResultCache has no hits. But if I am removing one line from the
bf section in my dismax handler all is fine. Here is the line:
recip(ms(NOW,date),3.16e-11,1,1)
According to
http://wiki.apache.org/solr/SolrRelevancyFAQ#How_can_I_boost_the_score_of_newer_documents
this should be
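The fix that emerges later in this thread is rounding NOW so the function value stays constant between commits and the queryResultCache can actually hit; a sketch of the rounded bf (the NOW/HOUR granularity is an assumption, pick what fits your commit frequency):

```
recip(ms(NOW/HOUR,date),3.16e-11,1,1)
```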
Hi Robert!
Since the example given was http being slow, it's worth mentioning that if
queries are one-word urls [for example http://lucene.apache.org] these
will actually form slow phrase queries by default.
do you mean that http://lucene.apache.org will be split up into http
lucene
<filter class="solr.CommonGramsQueryFilterFactory" words="new400common.txt"/>
</analyzer>
</fieldType>
Tom
-Original Message-
From: Peter Karich [mailto:peat...@yahoo.de]
Sent: Tuesday, August 10, 2010 3:32 PM
To: solr-user@lucene.apache.org
Subject: Re: Improve Query Time For Large Index
words list. (Details on CommonGrams
here:
http://www.hathitrust.org/blogs/large-scale-search/slow-queries-and-common-words-part-2)
Tom Burton-West
-Original Message-
From: Peter Karich [mailto:peat...@yahoo.de]
Sent: Tuesday, August 10, 2010 9:54 AM
To: solr-user
I wonder, too, that there isn't a special tool which analyzes solr
logfiles (e.g. parses QTime, the parameters q, fq, ...)
Because there are some other open source log analyzers out there:
http://yaala.org/ http://www.mrunix.net/webalizer/
Another free tool is newrelic.com (you will
Hi,
I have 5 Million small documents/tweets (= ~3GB) and the slave index
replicates itself from master every 10-15 minutes, so the index is
optimized before querying. We are using solr 1.4.1 (patched with
SOLR-1624) via SolrJ.
Now the search speed is slow: 2s for common terms which hit more than
)
Tom Burton-West
-Original Message-
From: Peter Karich [mailto:peat...@yahoo.de]
Sent: Tuesday, August 10, 2010 9:54 AM
To: solr-user@lucene.apache.org
Subject: Improve Query Time For Large Index
Hi,
I have 5 Million small documents/tweets (= ~3GB) and the slave index
replicates
Ophir,
this sounds a bit strange:
CommonsHttpSolrServer.java, line 416 takes about 95% of the application's
total search time
Is this only for heavy load?
Some other things:
* with lucene you accessed the indices with MultiSearcher in a LAN, right?
* did you look into the logs of the
The default solr solution is client-side load balancing.
Is there a solution that provides server-side load balancing?
No. Most of us stick a HTTP load balancer in front of multiple Solr servers.
E.g. mod_jk is a very easy solution (maybe too simple/stupid?) for a
load balancer,
but it
to be reopened, and this happens on commit.
Otis
Sematext :: http://sematext.com/ :: Solr - Lucene - Nutch
Lucene ecosystem search :: http://search-lucene.com/
- Original Message
From: Peter Karich peat...@yahoo.de
To: solr-user@lucene.apache.org
Sent: Fri, July 30, 2010 6:19
before the warmup queries
from the previous commit have done their magic, you might be getting
into a death spiral.
HTH
Erick
On Thu, Jul 29, 2010 at 7:02 AM, Peter Karich peat...@yahoo.de wrote:
Hi,
I am indexing a solr 1.4.0 core and committing gets slower and slower.
Starting from 3-5
Both approaches are ok, I think. (although I don't know the python API)
BTW: If you query q=*:* then add rows=0 to avoid some traffic.
Regards,
Peter.
I want to programmatically retrieve the number of indexed documents. I.e.,
get the value of numDocs.
The only two ways I've come up with are
Hi Peter :-),
did you already try other values for
hl.maxAnalyzedChars=2147483647
? Also regular expression highlighting is more expensive, I think.
What does the 'fuzzy' variable mean? If you use this to query via
~someTerm instead someTerm
then you should try the trunk of solr which is a lot
is pretty frequent for Solr.
Otis
Sematext :: http://sematext.com/ :: Solr - Lucene - Nutch
Lucene ecosystem search :: http://search-lucene.com/
- Original Message
From: Peter Karich peat...@yahoo.de
To: solr-user@lucene.apache.org
Sent: Fri, July 30, 2010 4:06:48 PM
Hi,
I am indexing a solr 1.4.0 core and committing gets slower and slower.
Starting from 3-5 seconds for ~200 documents and ending with over 60
seconds after 800 commits. Then, if I reloaded the index, it is as fast
as before! And today I have read a similar thread [1] and indeed: if I
set
Hi Muneeb,
I fear you'll have no chance: replicating an index will use more disc
space on the slave nodes.
Of course, you could minimize disc usage AFTER the replication via the
'optimize-hack'.
But are you sure the reason the slave node dies is due to disc
limitations?
Try to observe the
We have three dedicated servers for solr, two for slaves and one for master,
all with linux/debian packages installed.
I understand that replication does always copies over the index in an exact
form as in master index directory (or it is supposed to do that at least),
and if the master
Hi Girish,
I am not aware of such a thing.
But you could use a middleware to avoid certain fields from being
retrieved via the 'fl' parameter:
http://wiki.apache.org/solr/CommonQueryParameters#fl
E.g. for your customers the query looks like q=hello&fl=title and for
your admin the query looks like
did you try an optimize on the slave too?
Yes I always run an optimize whenever I index on master. In fact I just ran
an optimize command an hour ago, but it didn't make any difference.