Hello,
I am using Solr 4.1.0 and I have used SolrCloud in my product. I have found that
at first everything seems good: the search time is fast and the latency is low. But it
becomes very slow after a few days. Does anyone know of any params or
optimizations to use with SolrCloud?
Could you give more info about your index size and the technical details of
your machine? Maybe you are indexing more data day by day and your RAM
capacity is not enough anymore?
2013/4/19 qibaoyuan qibaoy...@gmail.com
Hello,
I am using Solr 4.1.0 and I have used SolrCloud in my product. I
There are 6 shards and they are all on one machine, and the JVM params are very
big. The physical memory is 16GB, the total number of docs is about 150k, and the index size of
each shard is about 1GB. And there is indexing while searching: I use autoCommit
every 10 minutes, and the data comes in at about 100 docs per minute.
Well, to consume 120GB of RAM with a 120GB index, you would have to query
over every single GB of data.
If you only actually query over, say, 500MB of the 120GB of data in your dev
environment, you would only use 500MB worth of RAM for caching. Not 120GB.
On Fri, Apr 19, 2013 at 7:55 AM, David Parks wrote:
Interesting. I'm trying to correlate this new understanding with what I see on
my servers. I've got one server with 5GB dedicated to Solr; the Solr dashboard
actually reports a 167GB index.
When I do many typical queries I see between 3MB and 9MB of disk reads
(watching iostat).
But Solr's dashboard
Can happen for various reasons.
Can you recreate the situation? Meaning, would restarting the servlet or server
start with a good qTime and degrade from that point? How fast does
this happen?
Start by monitoring the JVM process, with Oracle VisualVM for example.
Monitor for frequent garbage collections.
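If you want hard numbers as well, GC logging is cheap to turn on (these are standard HotSpot flags; the log path is yours to choose):
args=-verbose:gc -XX:+PrintGCDetails -XX:+PrintGCTimeStamps -Xloggc:/path/to/gc.log
Long or frequent pauses in that log tend to line up with qTime spikes.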
Thanks Manu, I will check it.
On 2013-04-19, at 4:26 PM, Manuel Le Normand manuel.lenorm...@gmail.com wrote:
Can happen for various reasons.
Can you recreate the situation? Meaning, would restarting the servlet or server
start with a good qTime and degrade from that point? How fast does
this happen?
Can you instead use a paging mechanism?
On Thu, Apr 18, 2013 at 8:03 PM, Jie Sun jsun5...@yahoo.com wrote:
Hi -
when I execute a shard query like:
On 4/19/2013 1:34 AM, John Nielsen wrote:
Well, to consume 120GB of RAM with a 120GB index, you would have to query
over every single GB of data.
If you only actually query over, say, 500MB of the 120GB of data in your dev
environment, you would only use 500MB worth of RAM for caching. Not
On 4/19/2013 2:15 AM, David Parks wrote:
Interesting. I'm trying to correlate this new understanding with what I see on
my servers. I've got one server with 5GB dedicated to Solr; the Solr dashboard
actually reports a 167GB index.
When I do many typical queries I see between 3MB and 9MB of disk
OK, I understand better now.
The physical memory is 90% utilized (21.18GB of 23.54GB). Solr has a dark grey
allocation of 602MB, and a light grey allocation of an additional 108MB, for a JVM total
of 710MB allocated. If I understand correctly, Solr memory utilization is
*not* for caching (unless I configured
Hi all, help~~~
How do I specify a schema for a collection in SolrCloud?
I have a SolrCloud setup with 3 collections, and each config file is uploaded to ZooKeeper
like this:
args=-Xmn3000m -Xms5000m -Xmx5000m -XX:MaxPermSize=384m
-Dbootstrap_confdir=/workspace/solr/solrhome/doc/conf
I have plenty of docs, and each doc may be connected to many user-defined tags. I
have used SolrCloud and use join to do this kind of job, and recently I learned that
join does not support distributed search in SolrCloud. So this is a big problem so
far. And decomposition is quite impossible, because docs
On Fri, 2013-04-19 at 06:51 +0200, Shawn Heisey wrote:
Using SSDs for storage can speed things up dramatically and may reduce
the total memory requirement to some degree,
We have been using SSDs in our servers for several years. It is our
clear experience that "to some degree" should be replaced
Wow, thank you for those benchmarks, Toke; that really gives me some firm
footing to stand on in knowing what to expect and thinking out which path to
venture down. It's tremendously appreciated!
Dave
-Original Message-
From: Toke Eskildsen [mailto:t...@statsbiblioteket.dk]
Sent:
Ashok:
You really, _really_ need to dive into the admin/analysis page.
That'll show you exactly what WDFF (WordDelimiterFilterFactory) and all the
other elements of your chain do to input tokens. Understanding the index- and
query-time implications of all the settings in WDFF takes a while.
But from what you're describing,
When I add a schema property to a core:
<core name="pic" instanceDir="pic/" loadOnStartup="true" transient="false"
collection="picCollection" config="solrconfig.xml" schema="../picconf/schema.xml"/>
it seems there is a default path to the schema, that is /configs/docconf/.
The exception is:
[18:59:09.211]
Hello,
Thank you for your answer.
We have solved our problem now. I'll describe it for anyone who might encounter a
similar problem.
Some of our fields are dynamic, and the name of one of these fields was not
correct: it was sent to Solr as a Java object, e.g.
I copy the 3 schema.xml files and solrconfig.xml to $solrhome/conf/*.xml, and
upload this file dir to ZooKeeper like this:
args=-Xmn1000m -Xms2000m -Xmx2000m -XX:MaxPermSize=384m
-Dbootstrap_confdir=/home/app/workspace/solrcloud/solr/solrhome/conf
-Dcollection.configName=conf
Hmmm. There has been quite a bit of work lately to support a couple of
things that might be of interest (4.3, which Simon cut today, probably
available to all mid next week at the latest). Basically, you can
choose to pre-define all the cores in solr.xml (so-called old style)
_or_ use the
I'm guessing that your timestamp is a tdate, which stores extra
information in the index for fast range searches. What happens if you
try to facet on just a date field?
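For reference, the stock example schema defines both types; they differ only in precisionStep, and the higher step is what adds the extra index terms for fast ranges:
<fieldType name="date" class="solr.TrieDateField" precisionStep="0" positionIncrementGap="0"/>
<fieldType name="tdate" class="solr.TrieDateField" precisionStep="6" positionIncrementGap="0"/>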
Best
Erick
On Thu, Apr 18, 2013 at 8:37 AM, J Mohamed Zahoor zah...@indix.com wrote:
Hi
I am using Solr 4.1 with 6 shards.
updateLog is _required_ if you're in SolrCloud mode. Assuming that
you're not using SolrCloud, you can freely disable it.
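The bit to comment out lives inside <updateHandler> in solrconfig.xml; in the stock 4.x config it looks like this:
<updateLog>
  <str name="dir">${solr.ulog.dir:}</str>
</updateLog>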
Why do you want to? It's not a bad idea necessarily, but this might be
an XY problem.
Best
Erick
On Thu, Apr 18, 2013 at 10:47 AM, Jamel ESSOUSSI
I am trying to understand update request processor chains. Do they run one
by one when indexing a document? Can I define multiple update request
processor chains? Also, what are LogUpdateProcessorFactory and
RunUpdateProcessorFactory?
How are you committing data? With 4.0, CommitWithin is now a soft commit,
which means that the transaction log will grow until you do a hard commit.
You need to periodically do a hard commit if you are continually updating
the index. How much updating are you doing?
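A common setup is to let Solr do the hard commits for you in solrconfig.xml; a sketch, with illustrative values you would tune against your write rate:
<autoCommit>
  <maxTime>300000</maxTime>            <!-- hard commit at least every 5 minutes -->
  <openSearcher>false</openSearcher>   <!-- truncate the tlog without opening a new searcher -->
</autoCommit>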
Also, check how much heap
I'm using Solr 4.2. I have changed my text field definition to use
solr.PatternTokenizerFactory instead of solr.StandardTokenizerFactory, and
changed my schema definition as below:
<fieldType name="text_token" class="solr.TextField"
positionIncrementGap="100">
  <analyzer type="index">
Faceting on a high-cardinality string field, like url, on a 120 million
record index is going to be very memory intensive.
You will very likely need to shard the index to get the performance that
you need.
In Solr 4.2, you can make the url field a disk-based DocValues field and shift the
memory from
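In schema.xml that is just one extra attribute on the field (4.2+); a sketch, to be adjusted to your own field definition:
<field name="url" type="string" indexed="true" stored="true" docValues="true"/>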
I want to update (delta-import) one specific item. Is there any query to do
that?
Like I can delete a specific item with the following query:
localhost:8080/solr/devices/update?stream.body=<delete><query>id:46</query></delete>&commit=true
Thanks.
Hi,
I'm executing a search including a search for similar documents
(mlt=true&mlt.fl=...) which works fine so far. I would like to get the
similarity value for each document. I expected this to be quite common and
simple, but I could not find a hint how to do it. Any hint how to do it would be appreciated.
I guess the first thing I'd do is to set maxCollationTries to zero. This
means it will only run your main query once and not re-run it to check the
collations. Now see if your queries have consistent qtime. One easy
explanation is that with maxCollationTries=10, it may be running your query
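For reference, that knob usually sits in your handler's spellcheck defaults in solrconfig.xml; a one-line sketch:
<str name="spellcheck.maxCollationTries">0</str>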
On 4/19/2013 3:48 AM, David Parks wrote:
The physical memory is 90% utilized (21.18GB of 23.54GB). Solr has a dark grey
allocation of 602MB, and a light grey allocation of an additional 108MB, for a JVM total
of 710MB allocated. If I understand correctly, Solr memory utilization is
*not* for caching (unless
(13/04/19 23:24), Achim Domma wrote:
Hi,
I'm executing a search including a search for similar documents
(mlt=true&mlt.fl=...) which works fine so far. I would like to get the
similarity value for each document. I expected this to be quite common and simple,
but I could not find a hint how
I want to do a phrase search in Solr without analyzers being applied to it.
E.g., if I search for "DelhiDareDevil" (i.e., with inverted commas) it
should search the exact text and not apply any analyzers or tokenizers on
this field.
However, if I search for DelhiDareDevil (without inverted commas) it should use tokenizers
On 16 April 2013 11:35, Steve Woodcock steve.woodc...@gmail.com wrote:
We have a simple SolrCloud setup (4.2.1) running with a single shard and
two nodes, and it's working fine except whenever we send an update request,
the leader logs this error:
SEVERE: shard update error StdNode:
On Apr 19, 2013, at 16:59 , vicky desai vicky.de...@germinait.com wrote:
I want to do a phrase search in Solr without analyzers being applied to it.
E.g., if I search for "DelhiDareDevil" (i.e., with inverted commas) it
should search the exact text and not apply any analyzers or tokenizers on
By definition, phrase search is one of two things: 1) match on a string
field literally, or 2) analyze as a sequence of tokens as per the field type
index analyzer.
You could use the keyword tokenizer to store the whole field as one string,
with filtering for the whole string. Or, just make
Is there any documentation that explains the pros and cons of using RAID, or of
different RAID levels?
Oops... that's the query analyzer, not the index analyzer, so it's:
By definition, phrase search is one of two things: 1) match on a string
field literally, or 2) analyze as a sequence of tokens as per the field type
query analyzer.
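For case 1), a field type along these lines is the usual sketch (the name is mine; drop the lowercase filter if the literal match must be case-sensitive):
<fieldType name="string_exact" class="solr.TextField">
  <analyzer>
    <tokenizer class="solr.KeywordTokenizerFactory"/>
    <filter class="solr.LowerCaseFilterFactory"/>
  </analyzer>
</fieldType>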
-- Jack Krupansky
-Original Message-
From: Jack Krupansky
I want to search so that:
- if I type a single letter, it returns all the items that start with that
letter (a returns apple, aspire, etc.);
- if I ask for a whole string, it returns just the results with the exact
string (a search for Samsung S3 returns only samsung s3);
- if I ask for
Joel,
Thanks for your kind reply. The problem is solved by sharding and using
facet.method=enum. I am curious about the difference between enum
and fc, such that enum works but fc does not. Do you know something about
this?
Thank you!
Regards,
Ming
On Fri, Apr 19, 2013 at 6:18 AM,
Yes, you can do all of that... but it would be a non-trivial amount of
effort - the kind of thing consultants get paid real money to do. You should
also consider doing it in a middleware application layer, using possibly
multiple queries of separate Solr collections. Otherwise, your index might
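For the starts-with piece specifically, an edge n-gram index analyzer is the standard building block; this is a sketch only, and it does not by itself give you the exact-whole-string behavior:
<fieldType name="text_prefix" class="solr.TextField" positionIncrementGap="100">
  <analyzer type="index">
    <tokenizer class="solr.StandardTokenizerFactory"/>
    <filter class="solr.LowerCaseFilterFactory"/>
    <filter class="solr.EdgeNGramFilterFactory" minGramSize="1" maxGramSize="25"/>
  </analyzer>
  <analyzer type="query">
    <tokenizer class="solr.StandardTokenizerFactory"/>
    <filter class="solr.LowerCaseFilterFactory"/>
  </analyzer>
</fieldType>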
You can have multiple update chains defined and use only one of them per update
request.
LogUpdateProcessor logs the update request, and RunUpdateProcessor is where
the index is actually updated.
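A minimal custom chain in solrconfig.xml looks like this (the name is mine), and you pick it per request with update.chain=mychain:
<updateRequestProcessorChain name="mychain">
  <processor class="solr.LogUpdateProcessorFactory"/>
  <processor class="solr.RunUpdateProcessorFactory"/>
</updateRequestProcessorChain>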
Erik
On Apr 19, 2013, at 07:49 , Furkan KAMACI wrote:
I am trying to understand
Yes, thank you Erick. The analysis/document handlers hold the key to deciding
the type and order of the filters to employ, given one's document set and the
subject matter at hand. The finalized terms they produce for Solr search,
MLT, etc. are crucial to the quality of the results.
- ashok
Give us some examples of the tokens that you are expecting that pattern to
produce, and express the pattern in simple English as well. And some
actual input data.
I suspect that Solr is working fine, but that you may not have precisely
specified your pattern. But we don't know what your pattern is
James,
Thanks for the reply. I see your point, and sure enough, reducing
maxCollationTries does reduce time; however, it may not produce results.
It seems like the time is taken by the collation re-runs. Is there any
way we can activate caching for collations? The same query repeatedly takes
the
I would like to know the answer to this as well.
Michael Della Bitta
Appinions
18 East 41st Street, 2nd Floor
New York, NY 10017-6271
www.appinions.com
Where Influence Isn’t a Game
On Thu, Apr 18, 2013 at 8:15 PM, Manuel Le Normand
You can use the Eclipse plug-in for ZooKeeper:
http://www.massedynamic.org/mediawiki/index.php?title=Eclipse_Plug-in_for_ZooKeeper
-Msj.
On Fri, Apr 19, 2013 at 1:53 PM, Michael Della Bitta
michael.della.bi...@appinions.com wrote:
I would like to know the answer to this as well.
Michael
I do not know what it would take to have the collation tests make better use of
the QueryResultCache. However, outside of a test scenario, I do not know if
this would help a lot.
Hopefully you wouldn't have a lot of users issuing the exact same query with
the exact same misspelled words over
Right. I am wondering if/how we can download a specific file from
ZooKeeper, modify it, and then upload it to overwrite the original. Anyone?
Thanks,
Ming
On Fri, Apr 19, 2013 at 10:53 AM, Michael Della Bitta
michael.della.bi...@appinions.com wrote:
I would like to know the answer to this as well.
I've used ZooKeeper's CLI to do this. I doubt it's the right way, and I have
no idea if it'll work for clusterstate.json, but it seems to work for
certain things.
cd /opt/zookeeper/bin
# quote the command substitution so the whole file arrives as a single argument
./zkCli.sh -server 127.0.0.1:2183 set /configs/collection1/schema.xml "`cat
/tmp/newschema.xml`"
sleep 10 # give a
Hello,
We are using Solr 3.6.2 single core (both index and query on the same machine),
and randomly the server fails to query correctly. If we query from the
admin console the query is not even applied, and it returns a numFound count
equal to the total docs in the index as if no query were made, and if we use
Hi Team,
I am trying to configure the auto-suggest feature for the businessProvince
field in my schema.
I followed the instructions here: http://wiki.apache.org/solr/Suggester
But then I got the following error: INFO: Could not find an instance of
QueryComponent. Disabling collation
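The component config I adapted from that page is roughly this (quoting from memory, so treat it as a sketch):
<searchComponent name="suggest" class="solr.SpellCheckComponent">
  <lst name="spellchecker">
    <str name="name">suggest</str>
    <str name="classname">org.apache.solr.spelling.suggest.Suggester</str>
    <str name="lookupImpl">org.apache.solr.spelling.suggest.tst.TSTLookup</str>
    <str name="field">businessProvince</str>
    <str name="buildOnCommit">true</str>
  </lst>
</searchComponent>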
How many segments does each shard have, and what is the reason for running
multiple shards on one machine?
Alex.
-Original Message-
From: qibaoyuan qibaoy...@gmail.com
To: solr-user solr-user@lucene.apache.org
Sent: Fri, Apr 19, 2013 12:26 am
Subject: Re: solr-cloud performance
: Thanks for your kind reply. The problem is solved by sharding and using
: facet.method=enum. I am curious about the difference between enum
: and fc, such that enum works but fc does not. Do you know something about
: this?
method=fc/fcs uses the field caches (or uninverted fields
: I am trying to understand update request processor chains. Do they run one
: by one when indexing a document? Can I define multiple update request
: processor chains? Also, what are LogUpdateProcessorFactory and
: RunUpdateProcessorFactory?
Thanks. I was expecting an answer that could help me choose analyzers or
tokenizers. Any help with any of the scenarios?
Thanks for the detailed answers.
2013/4/19 Chris Hostetter hossman_luc...@fucit.org
: I am trying to understand update request processor chains. Do they run one
: by one when indexing a document? Can I define multiple update request
: processor chains? Also, what are
On 4/19/2013 12:55 PM, Ravi Solr wrote:
We are using Solr 3.6.2 single core (both index and query on the same machine),
and randomly the server fails to query correctly. If we query from the
admin console the query is not even applied, and it returns a numFound count
equal to the total docs in the index
I need some explanation of how ValueSource and related classes work.
There is already an implemented ExternalFileField, and an example of how to load data
from a database
(http://sujitpal.blogspot.com/2011/05/custom-sorting-in-solr-using-external.html).
We had a rogue query take out several replicas in a large 4.2.0 cluster
today, due to OOMs (we use the JVM args to kill the process on OOM).
After recovering, when I execute the match all docs query (*:*), I get a
different count each time.
In other words, if I execute q=*:* several times in a
Hi Maciek,
I think a custom ValueSource is definitely what you want because you
need to compute some derived value based on an indexed field and some
external value.
The trick is figuring out how to make the lookup to the external data
very, very fast. Here's a rough sketch of what we do:
We have a
Again, thank you for this incredible information, I feel on much firmer
footing now. I'm going to test distributing this across 10 servers,
borrowing a Hadoop cluster temporarily, and see how it does with enough
memory to have the whole index cached. But I'm thinking that we'll try the
SSD route
Yeah, but as far as I know, there is nothing Solr-specific about that.
See http://www.acnc.com/raid
Otis
--
Solr ElasticSearch Support
http://sematext.com/
On Fri, Apr 19, 2013 at 11:19 AM, Furkan KAMACI furkankam...@gmail.com wrote:
Is there any documentation that explains pros and cons
On 19 April 2013 19:50, hassancrowdc hassancrowdc...@gmail.com wrote:
I want to update(delta-import) one specific item. Is there any query to do
that?
No.
Regards,
Gora