On Wed, 2014-03-19 at 22:01 +0100, tradergene wrote:
I have a Solr index with about 32 million docs. Each doc is relatively
small but has multiple dynamic fields that are storing INTs. The initial
problem that I had to resolve is that we were running into OOMs (on a 48GB
heap, 130GB on-disk
Yup!
On Thu, Mar 20, 2014 at 5:13 AM, Otis Gospodnetic
otis.gospodne...@gmail.com wrote:
Hi,
Guessing it's surround query parser's support for within backed by span
queries.
Otis
Solr ElasticSearch Support
http://sematext.com/
On Mar 19, 2014 4:44 PM, T. Kuro Kurosaka
thanks!
On Tue, Mar 18, 2014 at 4:37 PM, Erick Erickson erickerick...@gmail.comwrote:
Avishai:
It sounds like you already understand mmap. Even so you might be
interested in this excellent writeup of MMapDirectory and Lucene by
Uwe:
Is there a way to tell NGramFilterFactory while indexing that numbers shall
never be tokenized? Then the query should be able to find numbers.
Or do I have to change the ngram-min for numbers (not alpha) to 1, if that is
possible? So to speak, put the whole number in as a token and not all possible
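NGramFilterFactory has no built-in switch to skip numeric tokens, so one common workaround is simply guaranteeing single-character grams. A hedged sketch of such a field type (the field name and gram sizes are illustrative, not from the original mail); setting minGramSize="1" keeps every number reachable, at the cost of index size:

```xml
<fieldType name="text_ngram" class="solr.TextField" positionIncrementGap="100">
  <analyzer type="index">
    <tokenizer class="solr.StandardTokenizerFactory"/>
    <filter class="solr.LowerCaseFilterFactory"/>
    <!-- minGramSize="1" guarantees single-character grams, so any number is findable -->
    <filter class="solr.NGramFilterFactory" minGramSize="1" maxGramSize="15"/>
  </analyzer>
  <analyzer type="query">
    <tokenizer class="solr.StandardTokenizerFactory"/>
    <filter class="solr.LowerCaseFilterFactory"/>
  </analyzer>
</fieldType>
```

An alternative is to copyField the content into a second field whose analyzer has no ngram filter, so numbers stay whole there and can be searched exactly.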
Nope. There is no line break in the string and it is not fed from a file.
What else could be the reason?
On 19 March 2014 17:57, Erick Erickson erickerick...@gmail.com wrote:
It looks to me like you're feeding this from some
kind of text file and you really _do_ have a
line break after
The Suggest Search Component that comes preconfigured in Solr 4.7.0
solrconfig.xml seems to thread dump when I call it:
http://localhost:8983/solr/suggest?spellcheck=on&q=ac&wt=json&indent=true
msg: "No suggester named default was configured"
Can someone tell me what's going on there?
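For what it's worth, that error usually means the requested spellcheck.dictionary (which defaults to "default") does not match the name of any configured suggester. A hedged sketch of a spellcheck-based suggester whose name matches the default; the field name here is hypothetical:

```xml
<searchComponent name="suggest" class="solr.SpellCheckComponent">
  <lst name="spellchecker">
    <!-- "name" must match spellcheck.dictionary, which defaults to "default" -->
    <str name="name">default</str>
    <str name="classname">org.apache.solr.spelling.suggest.Suggester</str>
    <str name="lookupImpl">org.apache.solr.spelling.suggest.fst.FSTLookupFactory</str>
    <str name="field">name</str>
    <str name="buildOnCommit">true</str>
  </lst>
</searchComponent>
```

The other route is leaving the shipped config alone and passing spellcheck.dictionary on the request with whatever name the stock config actually uses.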
However,
Hi,
I have a requirement to index a database table with CLOB content. Each row
in my table has a column which is XML stored as a CLOB. I want to read the
contents of the XML through DIH and map each of the XML tags to a separate
Solr field,
Below is my clob content.
<root>
  <author>A</author>
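For reference, the usual DIH pattern for this is a ClobTransformer on the outer JDBC entity plus an XPathEntityProcessor reading from a FieldReaderDataSource. A rough sketch, in which the driver, URL, table and column names are all hypothetical:

```xml
<dataConfig>
  <dataSource name="db" driver="com.mysql.jdbc.Driver" url="jdbc:mysql://localhost/mydb"/>
  <dataSource name="xmlReader" type="FieldReaderDataSource"/>
  <document>
    <!-- outer entity pulls the row; clob="true" turns the CLOB into a string -->
    <entity name="row" dataSource="db" transformer="ClobTransformer"
            query="select id, xml_clob from mytable">
      <field column="xml_clob" clob="true"/>
      <!-- inner entity parses that string as XML and maps each tag to a field -->
      <entity name="xml" dataSource="xmlReader" processor="XPathEntityProcessor"
              dataField="row.xml_clob" forEach="/root">
        <field column="author" xpath="/root/author"/>
      </entity>
    </entity>
  </document>
</dataConfig>
```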
On 20 March 2014 14:53, Prasi S prasi1...@gmail.com wrote:
Hi,
I have a requirement to index a database table with CLOB content. Each row
in my table has a column which is XML stored as a CLOB. I want to read the
contents of the XML through DIH and map each of the XML tags to a separate
Solr field,
Sathya,
I assume you're using SolrCloud. Please provide your clusterstate.json from
while you're seeing this issue, and check your logs for any exceptions. With
no information from you it's hard to troubleshoot any issues!
Thanks,
Greg
On Mar 20, 2014, at 12:44 AM, Sathya
Hi,
I would like some advice about the best way to bootstrap from scratch a
SolrCloud cluster housing at least two collections with different
sharding/replication setups.
Going through the docs and the 'Solr In Action' book, what I have seen so far
is that there is a way to bootstrap a SolrCloud cluster
Well, the error message really looks like your input is
getting chopped off.
It's vaguely possible that you have some super-low limit
in your servlet container configuration that is only letting very
small packets through.
What I'd do is look in the Solr log file to see exactly what
is coming
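On the Solr side, the request-size knobs live in solrconfig.xml's requestDispatcher section; a sketch with illustrative values (the servlet container may impose its own, lower limits on top of these):

```xml
<requestDispatcher handleSelect="false">
  <!-- sizes are in KB; raise these if large POST bodies are being truncated -->
  <requestParsers enableRemoteStreaming="true"
                  multipartUploadLimitInKB="2048000"
                  formdataUploadLimitInKB="2048"/>
</requestDispatcher>
```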
Honestly, the best approach is to start with no collections defined and use the
collections api.
If you want to preconfigure (which has its warts and will likely go away as
an option), it's tricky to do with different numShards, as that is a global
property per node.
You would basically
You might find this useful:
http://heliosearch.org/solrcloud-assigning-nodes-machines/
It uses the collections API to create your collection with zero
nodes, then shows how to assign your leaders to specific
machines (well, at least specify the nodes the leaders will
be created on, it doesn't
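The flow described in that article looks roughly like the following pair of Collections API calls (host names are hypothetical, and per the note elsewhere in this thread, ADDREPLICA requires Solr 4.8+):

```
# 1. create the collection with no cores placed anywhere yet
http://host1:8983/solr/admin/collections?action=CREATE&name=mycoll&numShards=2&replicationFactor=1&createNodeSet=EMPTY

# 2. add each replica on the node you want; the first added per shard becomes the leader
http://host1:8983/solr/admin/collections?action=ADDREPLICA&collection=mycoll&shard=shard1&node=host1:8983_solr
```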
On our Drupal multilingual system we use Apache Solr 3.5.
The problem is well known on different blogs and sites I have read.
The search results are not the ones we want.
In our code, in hook_apachesolr_query_alter, we override the default operator:
$query->replaceParam('mm', '90%');
The requirement is, when
Will it work for multi-valued fields? It is giving a "Field Cache will not
work for multi value fields" error. Most of the data in the index is in
multi-valued fields.
Thanks,
Jilani
On Thu, Mar 20, 2014 at 1:53 AM, Ahmet Arslan iori...@yahoo.com wrote:
Hi,
If you just need counts, maybe you can
I want the info simplified so that the user can see why a doc was found.
Below is the output for a doc:
0.085597195 = (MATCH) sum of:
0.083729245 = (MATCH) max of:
0.0019158133 = (MATCH) weight(plain_text:test^10.0 in 601)
[DefaultSimilarity], result of:
0.0019158133 =
Hi Folks,
I am using shingles to index bigrams/trigrams. The same is also used
on the query side in the schema.xml file. But when I run a query in debug
mode for a collection, I don't see the bigrams in the parsed_query. Any idea
what I might be missing?
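One common cause in 4.x: the query parser splits the input on whitespace before analysis runs, so the query-side ShingleFilter only ever sees one token at a time and never builds a bigram; quoting the terms as a phrase is the usual workaround. For reference, a field type sketch (the name and shingle sizes are illustrative):

```xml
<fieldType name="text_shingle" class="solr.TextField" positionIncrementGap="100">
  <analyzer>
    <tokenizer class="solr.StandardTokenizerFactory"/>
    <filter class="solr.LowerCaseFilterFactory"/>
    <!-- emits unigrams plus 2- and 3-word shingles -->
    <filter class="solr.ShingleFilterFactory" minShingleSize="2"
            maxShingleSize="3" outputUnigrams="true"/>
  </analyzer>
</fieldType>
```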
Thanks Shawn. When we run any SolrJ application, the below message is
displayed:
org.apache.solr.client.solrj.impl.HttpClientUtil createClient
INFO: Creating new http client,
config: maxConnections=128&maxConnectionsPerHost=32&followRedirects=false
and while restarting Solr we are getting this
Please note that although the article talks about the ADDREPLICA command,
that feature is coming in Solr 4.8, so don't be confused if you can't find
it yet. See https://issues.apache.org/jira/browse/SOLR-5130
On 3/20/14, 7:45 AM, Erick Erickson erickerick...@gmail.com wrote:
You might find
Hi,
Please provide some more pointers to go ahead in addressing this.
Thnks,
Jilani
On Thu, Mar 20, 2014 at 8:50 PM, Jilani Shaik jilani24...@gmail.com wrote:
Will it work for multi-valued fields? It is giving a "Field Cache will
not work for multi value fields" error. Most of the data is
Hi there
Is there a limit on the number of collections SolrCloud can support? Can
ZooKeeper/SolrCloud handle thousands of collections?
Also, I see that the bootup time of SolrCloud increases with the number
of cores. I do not have any expensive warm-up queries. How do I speed up
Solr startup?
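One knob that may be worth checking is coreLoadThreads in solr.xml, which controls how many cores load in parallel at startup. A sketch of the new-style solr.xml (the value is illustrative; other sections are omitted):

```xml
<solr>
  <!-- load up to 8 cores in parallel at startup instead of the small default -->
  <int name="coreLoadThreads">8</int>
  <!-- solrcloud and shardHandlerFactory sections unchanged -->
</solr>
```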
--
Best
--
C
There are no arbitrary limits on the number of collections but yes
there are practical limits. For example, the cluster state can become
a bottleneck. There is a lot of work happening on finding and
addressing these problems. See
https://issues.apache.org/jira/browse/SOLR-5381
Boot up time is
I'm getting a similar exception when writing documents (on the client
side). I can write one document fine, but the second (which is being
routed to a different shard) generates the error. It happens every time
- definitely not a resource issue or timing problem since this database
is
Thanks, Shalin. Making clusterstate.json per-collection sounds
awesome.
I am not having problems with #2. #3 is a major time hog in my
environment. I have over 300+ collections and restarting the entire cluster
takes on the order of hours (2-3 hours). Can you explain more about the
Hi,
I suggest you start a new thread describing your use case. Just describe the
problem without assumptions, with an appropriate title/subject.
Ahmet
On Thursday, March 20, 2014 10:01 PM, Jilani Shaik jilani24...@gmail.com
wrote:
Hi,
Please provide some more pointers to go ahead in
How many total replicas are we talking here?
As in how many shards and, for each shard,
how many replicas? I'm not asking for a long list
here, just if you have a bazillion replicas in aggregate.
Hours is surprising.
Best,
Erick
On Thu, Mar 20, 2014 at 2:17 PM, Chris W chris1980@gmail.com
Hours sounds too long indeed. We recently had a client with several
thousand collections, but restart wasn't taking hours...
Otis
Solr ElasticSearch Support
http://sematext.com/
On Mar 20, 2014 5:49 PM, Erick Erickson erickerick...@gmail.com wrote:
How many total replicas are we talking here?
The replication factor is two. I have sharded all collections equally
across all nodes. We have a 6-node cluster: 300 collections x 6 shards and 2
replicas per shard. I have almost 600 cores per machine.
Also, one fact is that my ZooKeeper timeout is on the order of 2-3 minutes. I
see ZooKeeper responses being very slow and
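For reference, a 2-3 minute timeout is far above the usual 15-30 second range and can hide a node that has actually stalled. In the new-style solr.xml it is set like this (sketch; other required solrcloud settings omitted):

```xml
<solr>
  <solrcloud>
    <!-- typical values are 15000-30000 ms, not minutes -->
    <int name="zkClientTimeout">${zkClientTimeout:30000}</int>
  </solrcloud>
</solr>
```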
I need some clarification on how to define explicit mappings in the
synonyms.txt file.
I have been using equivalent synonyms for a while and it works as expected.
I am confused by explicit mapping.
I have the below synonyms added to the query analyzer.
I want the search on the keyword 'watch' to
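For reference, the two synonyms.txt styles differ like this (the 'timepiece'/'wristwatch' terms are just illustrative):

```
# equivalent synonyms: with expand="true", each term matches all of them
watch, timepiece, wristwatch

# explicit mapping: a token matching the left side is replaced by the right side;
# include "watch" on the right if a plain match on 'watch' should still work
watch => watch, timepiece, wristwatch
```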
Hi,
I am looking for advice on handling a large volume of documents with a very
high incoming rate. The size of each document is about 0.5 KB, the incoming
rate could be more than 20K per second, and we want to store about one year's
documents in Solr for near real-time searching. The goal
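Back-of-envelope numbers for that rate (rounded):

```
20,000 docs/s x 0.5 KB         ~ 10 MB/s sustained ingest
20,000 docs/s x 86,400 s/day   ~ 1.7 billion docs/day
x 365 days                     ~ 630 billion docs/year
x 0.5 KB                       ~ 315 TB of raw input per year
```

Since a single Lucene index tops out at about 2.1 billion documents, that volume implies hundreds of shards at an absolute minimum, which is why time-partitioned collections (e.g. one per day or week, dropping the oldest) are usually considered at this scale.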
I'm transitioning my index from a 3.x version to 4.6. I'm running a large
heap (20G), primarily to accommodate a large facet cache (~5G), but have
been able to run it on 3.x stably.
On 4.6.0 after stress testing I'm finding that all of my shards are
spending all of their time in GC. After taking
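One thing worth checking, assuming the facet fields are the heap hogs: in 4.x, a field marked docValues="true" is faceted from Lucene's column-stride files via the OS page cache instead of an on-heap FieldCache structure. This requires a reindex, and the field name below is hypothetical:

```xml
<field name="my_facet_field" type="string" indexed="true" stored="false"
       multiValued="true" docValues="true"/>
```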
When doing complex boosting/bq we are getting rounding errors on the score.
To get the score to be consistent I needed to use rint on sort:
sort=rint(product(sum($p_score,$s_score,$q_score),100)) desc,s_query asc
<str name="p_score">recip(priority,1,.5,.01)</str>
str
Yeah. optimize() also used to come back immediately if the index was
already optimized. It just reopened the index.
We used to use that for cleaning up the old directories quickly. But now it
does another optimize() even though the index is already optimized.
Very strange.
On Tue, Mar 18, 2014
Please add me too.
On Tue, Mar 18, 2014 at 8:33 AM, Erick Erickson erickerick...@gmail.comwrote:
Done, thanks!
On Tue, Mar 18, 2014 at 3:54 AM, Anders Gustafsson
anders.gustafs...@pedago.fi wrote:
Yes, please. My Wiki ID is Anders Gustafsson
But yes, please, add the howto to Wiki. You
That's not right. Which Solr versions are you on (question for both
William and Chris)?
On Fri, Mar 21, 2014 at 8:07 AM, William Bell billnb...@gmail.com wrote:
Yeah. optimize() also used to come back immediately if the index was
already optimized. It just reopened the index.
We used to use
What's your wiki username?
On Fri, Mar 21, 2014 at 8:12 AM, William Bell billnb...@gmail.com wrote:
PLease add me too.
On Tue, Mar 18, 2014 at 8:33 AM, Erick Erickson
erickerick...@gmail.comwrote:
Done, thanks!
On Tue, Mar 18, 2014 at 3:54 AM, Anders Gustafsson
On 3/20/2014 6:54 PM, Harish Agarwal wrote:
I'm transitioning my index from a 3.x version to 4.6. I'm running a large
heap (20G), primarily to accommodate a large facet cache (~5G), but have
been able to run it on 3.x stably.
On 4.6.0 after stress testing I'm finding that all of my shards