I am running Solr 3.4 on Tomcat 7.
Our index is very big: two cores, each 120 GB. We are searching the slaves,
which are replicated every 30 minutes.
I am using the filter cache only, and we have more than 90% cache hits. We use
a lot of filter queries; queries are usually pretty big, with 10-20 fq
parameters.
Hi
After studying the Apache Solr documentation, I think the only way to know
about updated records (modify, delete, and insert actions) is to develop a
class that extends org.apache.solr.servlet.SolrUpdateServlet.
In this class, I can access the updated record information going into the
Apache Solr server.
Somebody can
Hi,
I am going to upgrade to Solr 4.1 from version 3.6, and I want to set up two
shards.
I use ConcurrentUpdateSolrServer to index the documents in Solr 3.6.
I saw the API CloudSolrServer in 4.1, but:
1: CloudSolrServer uses LBHttpSolrServer to issue requests, but
LBHttpSolrServer should NOT be
Hi
I have an id which is a string like this:
tx-20130130-4599
I'm using a field without processing, which I got confirmed via the analysis tool.
But when I search for it, it gets split up, so instead of finding that specific
entry with that unique id,
it finds all entries with tx in them.
Any idea
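For reference, an id that must never be tokenized is usually mapped to solr.StrField. A sketch of what that could look like in schema.xml (the field and type names here are assumptions; adjust to your schema):

```xml
<!-- schema.xml sketch: StrField is indexed verbatim, never tokenized -->
<fieldType name="string" class="solr.StrField" sortMissingLast="true"/>
<field name="id" type="string" indexed="true" stored="true" required="true"/>
```

If searches against such a field still split the term, the query is likely hitting a different (analyzed) field, not this one.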
Hello,
After more tests, we were able to identify our indexing problem (Solr 4.0.0).
Our problems are in fact OutOfMemoryErrors; blaming Zookeeper connection
problems was a mistake. We had suspected Zookeeper because the OOMEs sometimes
appear in the logs after errors during Zookeeper leader election.
Does debugQuery=true tell you anything useful for these? For example, which
component is taking most of the 30 seconds? Do you have evictions in your Solr
caches?
Dmitry
On Thu, Jan 31, 2013 at 10:01 AM, Mou mouna...@gmail.com wrote:
I am running solr 3.4 on tomcat 7.
Our index is very big , two
It could be a foolish question or concern, but I have no option :-) . We do
have an e-commerce site where we consume the feed from the CSE partners and
index it into Solr for our search. Instead of the traditional
auto-suggest, the predictive search in the header search box recommends the
Hi list,
I noticed that the result order is FIFO when documents have the same score.
I think this is because documents that are indexed later get a higher
internal document ID, and the output for documents with the same score starts
with the lowest internal document ID and goes up from there.
Which analyzer are you using to index that field? You can verify that
from the schema file.
Thanks
On Thu, Jan 31, 2013 at 2:35 PM, b.riez...@pixel-ink.de
b.riez...@pixel-ink.de wrote:
Hi
I have an id wich is a string like this.
tx-20130130-4599
i'm using a field without processing,
Part of this is a rant, part is a plea to others who've run successful
production deployments.
Solr is a second-class citizen when it comes to production deployment. Every
recipe I've seen (RPM, DEB, Chef, or Puppet) makes assumptions that in one way
or another run afoul of best practices when
Hi,
So am I correct in thinking that I add the JIRA issue myself? If so, can I add
it to the 4.2 release? Also, I have further questions about the scope of my
patch; should that be left to the comments of the JIRA issue itself?
Phil
-Original Message-
From: Otis Gospodnetic
Is there a way to do an atomic update (inc by 1) and retrieve the updated value
in one operation?
You can also do all this via HTTP commands, see:
http://wiki.apache.org/solr/SolrReplication#HTTP_API
That allows you to control _all_ replication from the master (i.e., tell the
master not to do any replication) or just tell a slave not to replicate
any more, as well as a lot of other stuff.
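The replication commands on that wiki page are plain HTTP GETs; for example (host names and port here are placeholders, and on a multi-core setup the path includes the core name):

```
http://master:8983/solr/replication?command=disablereplication
http://master:8983/solr/replication?command=enablereplication
http://slave:8983/solr/replication?command=disablepoll
http://slave:8983/solr/replication?command=enablepoll
```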
Best
I'm really surprised you're hitting OOM errors; I suspect you have
something else pathological in your system. So, I'd start checking things
like:
- how many concurrent warming searchers you allow
- how big your indexing RAM is set to (we find very little gain over 128M,
BTW)
- other load on your
Hi,
I am stuck trying to index only the nouns of German and English texts.
(very similar to http://wiki.apache.org/solr/OpenNLP#Full_Example)
First try was to use UIMA with the HMMTagger:
<processor class="org.apache.solr.uima.processor.UIMAUpdateRequestProcessorFactory">
  <lst name="uimaConfig">
So, it depends on your business requirements, right? If a document has
matches in more searchable fields, at least for me, that document is more
important than another document that has fewer matches.
Example:
Put this in your schema:
<similarity class="com.your.namespace.NoIDFSimilarity"/>
And create
Hi,
I solved the issue by setting up two different virtual network adapters in
Ubuntu Server.
Case closed ;)
thanks for the help!!
Hi people,
First of all, this forum is a godsend!!!
Second:
I have a master/slave configuration, using replication.
Currently in production I have only one server; there is no backup server
(really...).
The web application is public; everyone can see it.
- How often, in
On Thu, Jan 31, 2013 at 5:13 AM, Scott Stults
sstu...@opensourceconnections.com wrote:
Right now that blessed container is Jetty version 8.1.2.v20120308.
I'd really like some confirmation from the devs that there really is a
blessed status for a given container that provides advantages over
Hello Erick,
Thanks for your answer.
After reading previous subjects on the user list, we had already tried to
change the parameters we mentioned.
- concurrent warming searchers: we have set the maxWarmingSearchers attribute
to 2:
<maxWarmingSearchers>2</maxWarmingSearchers>
- we have tried 32
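For reference, both of these settings live in solrconfig.xml. A sketch follows; the ramBufferSizeMB element is only a guess at what the "32" above refers to:

```xml
<!-- solrconfig.xml sketch; ramBufferSizeMB value assumed from this thread -->
<maxWarmingSearchers>2</maxWarmingSearchers>
<indexConfig>
  <ramBufferSizeMB>32</ramBufferSizeMB>
</indexConfig>
```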
- How often, in your experience, and why, would solr crash?
Not very often. Typically if your heap is too small, you'll end up going OOM.
- If I kill solr master and slave, usually do I need to also delete the
indexes? Or everything should be fine upon restarting?
Restarts are fine. Order
Fantastic! Thanks very much.. I will do so accordingly and will let you
know the results.
Thanks again,
Sandeep
On 31 January 2013 13:54, Felipe Lahti fla...@thoughtworks.com wrote:
So, it depends of your business requirement, right? If a document has
matches in more searchable fields, at
Thanks for your reply.
No, there are no evictions yet.
The time is spent mostly in org.apache.solr.handler.component.QueryComponent
processing the request.
Again, the time varies widely for the same query.
Are you using eDismax? Maybe your ID field is not part of the search fields,
or not a high priority. And, just maybe, you are doing a copyField * to
text, and the text field splits the ID into parts. Enable debug on your query
and you should be able to figure it out.
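A quick way to check, assuming a local Solr and the id from earlier in this thread:

```
http://localhost:8983/solr/select?q=tx-20130130-4599&debugQuery=true
```

The parsedquery section of the debug output shows which field the terms ended up in and how they were tokenized.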
Regards,
Alex.
Jack, thanks for your response.
We have a deals web application with a free-text search in it. Here, free
text means you can type anything into it.
We have deals of different categories, tagged at different
merchant locations.
As per the requirement, I have to do some tweaks in
We have a Chef regime here, and I've written Tomcat and Solr recipes
to be played against Ubuntu 12.04 Server.
We do mostly the same: Chef to install Tomcat (with configuration
appropriate to Solr), but then instead of deploying Solr via Chef, we use
an Ant script to package and deploy a war
UIMA:
I just found this issue https://issues.apache.org/jira/browse/SOLR-3013
Now I am able to use this analyzer for English texts and filter (un)wanted
token types :-)
<fieldType name="uima_nouns_en" class="solr.TextField"
    positionIncrementGap="100">
  <analyzer>
    <tokenizer
Thanks Shawn. Actually, now that I think about it, Yonik also mentioned
something about the Lucene number representation once, in reply to one of my
questions. Here it is:
Could you also tell me what these `#8;#0;#0;#0;#1; strings represent in the
debug output?
That's internally how a number is
Hello,
I have a field text with type text_general here.
<fieldType name="text_general" class="solr.TextField"
    positionIncrementGap="100">
  <analyzer type="index">
    <tokenizer class="solr.StandardTokenizerFactory"/>
+text:a +b
-- Jack Krupansky
-Original Message-
From: Bing Hua
Sent: Thursday, January 31, 2013 12:59 PM
To: solr-user@lucene.apache.org
Subject: Search match all tokens in Query Text
Hello,
I have a field text with type text_general here.
fieldType name=text_general
Thanks for the quick reply. It seems like you are suggesting to explicitly add
the AND operator. I don't think this solves my problem.
I found <solrQueryParser defaultOperator="AND"/> somewhere, and this
works.
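For completeness: the same behavior can also be requested per query with the q.op parameter instead of the schema-wide default (the URL assumes a local server):

```
http://localhost:8983/solr/select?q=term1%20term2&q.op=AND
```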
On 1/31/2013 1:01 AM, Mou wrote:
I am running solr 3.4 on tomcat 7.
Our index is very big , two cores each 120G. We are searching the slaves
which are replicated every 30 min.
I am using filtercache only and We have more than 90% cache hits. We use
lot of filter queries, queries are usually
Thank you Shawn for reading all of my previous entries and for a detailed
answer.
To clarify, the third shard is used to store the recently added/updated
data. Two main big cores take very long to replicate ( when a full
replication is required) so the third one helps us to return the newly
I'm having an issue getting the splitBy construct from the regex
transformer to work in a very basic case (with either Solr 3.6 or
4.1).
I have a field defined like this:
<field stored="true" name="type" type="string" multiValued="true"/>
The entity is defined like this:
<entity name="item"
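For comparison, a minimal data-config entity that exercises splitBy might look like this (the SQL query and column names are assumptions for illustration):

```xml
<!-- data-config.xml sketch: splitBy turns one DB column into multiple values -->
<entity name="item" query="SELECT id, type FROM item"
        transformer="RegexTransformer">
  <field column="type" splitBy=","/>
</entity>
```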
In your unit test, you have:
field column=\"type\" name=\"type\" splitBy=\"\\|\" / +
And also:
runner.update("INSERT INTO test VALUES (1, 'foo,bar,baz')");
So you need to decide if you want to delimit with a pipe or a comma.
James Dyer
Ingram Content Group
(615) 213-4311
-Original Message-
On 1/31/2013 12:47 PM, Mou wrote:
To clarify, the third shard is used to store the recently added/updated
data. Two main big cores take very long to replicate ( when a full
replication is required) so the third one helps us to return the newly
indexed documents quickly. It gets deleted every
Sorry about that - even if I switch the splitBy to "," it still
doesn't work. Here's the corrected unit test:
http://pastie.org/5995399
On Thu, Jan 31, 2013 at 12:30 PM, Dyer, James
james.d...@ingramcontent.com wrote:
In your unit test, you have:
field column=\type\ name=\type\ splitBy=\\\|\ / +
Shawn Heisey [s...@elyograg.org] wrote:
[...]
If you have a total index size for this JVM of 240GB, then you may not
have enough RAM to let the OS disk cache work efficiently. For that
size of index, I would plan on a system with at least 128GB of RAM,
256GB would be better.
[...]
One of
The ping handler is how we tell our load balancers that our Solr cores
are healthy. I guess if you're running more than one core behind the
same balancer, it would make sense to drop a webapp in there that ran
the ping queries for all your cores and only responded OK if they all
came back OK.
Or
On Jan 31, 2013, at 10:15 AM, Michael Della Bitta
michael.della.bi...@appinions.com wrote:
I'd really like some confirmation from the devs that there really is a
blessed status for a given container that provides advantages over
others.
IMO: Jetty is what all of our unit/integration tests
Hi,
I believe each stemmer implementation decides that itself. At least the
MinimalNorwegianStemmer has built-in logic which stems certain suffixes only
if the token is N characters long.
If you want external control, you can look at
That's surprising to me, mostly because a number of the Solr wiki
pages don't really make that strong of a case for it:
http://wiki.apache.org/solr/SolrInstall
http://wiki.apache.org/solr/SolrTomcat
http://wiki.apache.org/solr/SolrJetty
Would it make sense to spell that out somewhere?
I do
Thanks for confirming my suspicions, the custom
TokenLengthMarkerFilterFactory sounds like the best approach for doing this.
On Thu, Jan 31, 2013 at 5:12 PM, Jan Høydahl jan@cominvent.com wrote:
Hi,
I believe each stemmer implementation decides that themselves. At least
the
On 1/31/2013 3:21 PM, Michael Della Bitta wrote:
I do notice that it seems like the version of Jetty that ships with
Solr isn't the preferred one according to the wiki, so that would be
an extra dependency for a config management system like Chef.
Near as I can tell, the versions of jetty that
For what it's worth, Google has done some pretty interesting research into
coping with the idea that particular shards might very well be busy doing
something else when your query comes in.
Check out this slide deck: http://research.google.com/people/jeff/latency.html
Lots of interesting
It is possible to do this with IP Multicast. The query goes out on the
multicast and all query servers read it. The servers wait for a random
amount of time, then transmit the answer. Here's the trick: it's
multicast. All of the query servers listen to each other's responses,
and drop out when
Thanks, Kai!
About removing non-nouns: the OpenNLP patch includes two simple
TokenFilters for manipulating terms with payloads. The
FilterPayloadFilter lets you keep or remove terms with given payloads.
In the demo schema.xml, there is an example type that keeps only
nouns and verbs.
There is a
Thank you again.
Unfortunately, the index files will not fit in RAM. I have to try using the
document cache. I am also moving my index to SSD again; we took our index
off SSD when the Fusion-io cards failed twice during indexing and the index
was corrupted. Now, with the BIOS upgrade and new driver, it is