Hello searchers,
I did some searching for TTL on Solr, and found only a way to do it with a delete query. But that ~sucks, because you have to do a lot of inserts (and queries).
The other (kinda better) way to do it is to set a collection-level TTL; then, when index segments are merged, they will drop the expired documents.
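For what it's worth, Solr (4.8+) also ships DocExpirationUpdateProcessorFactory, which computes an expiration date from a per-document TTL and periodically deletes expired documents in the background. A minimal solrconfig.xml sketch; the field names _ttl_ and expire_at and the 300-second period are illustrative assumptions, not requirements:

<updateRequestProcessorChain name="add-expiration" default="true">
  <!-- Reads a TTL such as +10DAYS from the _ttl_ field/param and writes
       the computed expiration date into expire_at -->
  <processor class="solr.processor.DocExpirationUpdateProcessorFactory">
    <str name="ttlFieldName">_ttl_</str>
    <str name="expirationFieldName">expire_at</str>
    <!-- Background task that deletes documents whose expire_at has passed -->
    <long name="autoDeletePeriodSeconds">300</long>
  </processor>
  <processor class="solr.LogUpdateProcessorFactory"/>
  <processor class="solr.RunUpdateProcessorFactory"/>
</updateRequestProcessorChain>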
Thanks a lot, Shawn.
We'll consider your suggestion to tune our solr servers. Will let you know
the result.
Thanks!
The primary difference in later versions, starting from Solr 4.0, has been the move from standalone Solr to SolrCloud. And what happens if you try starting Solr in standalone mode? SolrCloud does not use 'core' anymore; it uses 'collection' as the param.
On Thu, Dec 15, 2016 at 11:05 PM, Manan Sheth
There's an ecommerce features checklist of what Solr can do listed here:
https://lucidworks.com/blog/2011/01/25/implementing-the-ecommerce-checklist-with-apache-solr-and-lucidworks/
That should be a good start, and then there are some more reference links listed below; I would try all of them.
Thanks Reth. As noted, this is the same MapReduce-based indexer tool that ships with the Solr distribution by default.
It only takes the zk_host details and extracts all required information from there. It does not have core-specific configuration. The same tool released with Solr
This issue is in the solarium-client PHP code, which is likely not traversing further to pick up results from the collation tag of the Solr response,
at line 190:
https://github.com/solariumphp/solarium/blob/master/library/Solarium/QueryType/Suggester/Result/Result.php#L190
Verify whether this is the issue and open a pull request.
Are you indexing XML files through Nutch? This exception looks purely like processing of an incorrectly formatted XML file.
On Mon, Dec 12, 2016 at 11:53 AM, KRIS MUSSHORN
wrote:
> I've scoured my Nutch and Solr config files and I can't find any cause.
> Suggestions?
> Monday,
It looks like the command line tool that you are using to initiate the index process is expecting a Solr core name via a command line param. Use -help on the command line tool you are using, check the solr-core-name parameter key, and pass that as well with a value.
On Tue, Dec 13,
I think the shard index size is huge and the shard should be split.
On Wed, Dec 14, 2016 at 10:58 AM, Chetas Joshi
wrote:
> Hi everyone,
>
> I am running Solr 5.5.0 on HDFS. It is a solrCloud of 50 nodes and I have
> the following config.
> maxShardsPerNode: 1
> replicationFactor:
Thanks for pointing out the java.lang.Character. I did find the existence of
org.apache.lucene.analysis.CharacterUtils, but I was not able to find the
needed methods in it.
Sean
On 12/15/16, 8:58 PM, "Shawn Heisey" wrote:
On 12/15/2016 6:20 PM, Xie, Sean wrote:
If you need a full-fidelity solution taking care of multiple edge cases, it could be worth looking at commercial solutions.
http://www.basistech.com/ has one, including a free-tier SaaS plan.
Regards,
Alex.
http://www.solr-start.com/ - Resources for Solr users, new and experienced
Hi all,
Thanks for the replies,
@Erick, Ahmet: since those stemmers are logical stemmers, they won't work on words such as caught, ran and so on. So in our case it won't work.
@Susheel: Yes, I thought about it, but the problem we have is that the documents we index are somewhat large text, so copy
On 12/15/2016 6:20 PM, Xie, Sean wrote:
> We have implemented some customized filter/tokenizer, that is using
> org.apache.lucene.analysis.util.CharacterUtils. After upgrading to
> Solr 6.3, the class is no longer available. Is there any reason the
> utility class is removed?
This is not really
Dear user group,
We have implemented some customized filter/tokenizer, that is using
org.apache.lucene.analysis.util.CharacterUtils. After upgrading to Solr 6.3,
the class is no longer available. Is there any reason the utility class is
removed?
What I had to do was copy the class
bq: shouldn't the two replicas have the same number of deletions
Not necessarily. We're back to the fact that commits on the replicas in
a single shard fire at different wall clock times. Plus, when segments
are merged, the deleted docs are purged. So it's quite common that
two replicas in the
Right, so if I'm doing the math right you have 2,400 replicas per JVM?
I'm not clear whether each node has a single JVM or not.
Anyway. 2048 is indeed much too high. If nothing else, dropping it to,
say, 64 would show whether this was the real root of your problem or not.
Even if it slowed
Hi Furkan,
in order to change the BM25 parameter values k1 and b, the following XML snippet needs to be added to your schema.xml configuration file:

<similarity class="solr.BM25SimilarityFactory">
  <float name="k1">1.3</float>
  <float name="b">0.7</float>
</similarity>
It is even possible to specify the SimilarityFactory on individual index
fields. See [1] for more details.
Best
Sascha
[1]
Hi,
bumping my question after 10 days. Any clarification is appreciated.
Best
Sascha
Hi folks,
my Solr index consists of one document with a single valued field "title" of type
"text_general". The title field was index with the content: 1 2 3 4 5 6 7 8 9. The field
type text_general uses
Yes, I changed the value of coreLoadThreads.
With the default value a node takes like 40 minutes to be available with all
replicas up.
Right now I have ~1.2K collections with 12 shards each, 2 replicas spread in 12
nodes. Indeed, the value I configured may be too high (2048), but I can start
Something I hadn't known until now. The source CDCR collection has 2 shards
with 1 replica, our target cloud has 2 shards with 2 replicas
Both Source and Target have indexes that are not current
Also we have set all of our collections to ignore external commits
On Thu, Dec 15, 2016 at 1:31 PM,
Looking through our replicas I noticed that in one of our shards (each
shard has 2 replicas)
1 replica shows:
"replicas": [
{
"name": "core_node1",
"core": "sial-catalog-material_shard2_replica2",
"baseUrl": "http://ae1b-ecom-msc04:8983/solr;,
"nodeName": "ae1b-ecom-msc04:8983_solr",
"state":
: Well, I can work with this really fine knowing this, but does it make
: sense? I did assume (or was wrong in doing so) that fl=minhash:[binstr]
: should mean get that field and pass it through the transformer. At least
: I just now fell for it; maybe others shouldn't :)
that's what it *can*
I am trying to find the reported inconsistencies now.
The timestamp I have was created by our ETL process, which may not be in
exactly the same order as the indexing occurred
When I tried to sort the results by _docid_ desc, Solr threw a 500 error:
{ "responseHeader":{ "zkConnected":true,
On 12/15/2016 10:32 AM, tesm...@gmail.com wrote:
> I am getting the following exception while creating a Solr client. Any help
> is appreciated
>
> =This is the code snippet to create a SolrClient===
>
> public void populate (String args) throws IOException, SolrServerException
> {
>
On 12/14/2016 7:36 AM, GW wrote:
> I understand accessing solr directly. I'm doing REST calls to a single
> machine.
>
> If I have a cluster of five servers and say three Apache servers, I can
> round robin the REST calls to all five in the cluster?
>
> I guess I'm going to find out. :-) If so I
Hi,
I am getting the following exception while creating a Solr client. Any help
is appreciated
=This is the code snippet to create a SolrClient===
public void populate (String args) throws IOException, SolrServerException
{
String urlString = "http://localhost:8983/solr";
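For reference, a minimal, self-contained SolrJ 6.x sketch of creating a client and adding one document; the collection name "gettingstarted" and the use of HttpSolrClient.Builder are illustrative assumptions, not taken from the snippet above:

import java.io.IOException;
import org.apache.solr.client.solrj.SolrServerException;
import org.apache.solr.client.solrj.impl.HttpSolrClient;
import org.apache.solr.common.SolrInputDocument;

public class Populate {
    public void populate(String id) throws IOException, SolrServerException {
        // Point the client at a specific collection, not just the /solr root
        String urlString = "http://localhost:8983/solr/gettingstarted";
        HttpSolrClient client = new HttpSolrClient.Builder(urlString).build();
        try {
            // Index a single document with the given id
            SolrInputDocument doc = new SolrInputDocument();
            doc.addField("id", id);
            client.add(doc);
            client.commit();
        } finally {
            client.close();
        }
    }
}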
We did an extensive comparison in the past of Snowball, KStem and Hunspell, and there are cases where one of them works better than the others, or vice versa. You may utilise all three of them by having three different fields (fieldTypes) and, during query, searching in all of them (see the schema sketch below).
For some of the cases where
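As an illustration of the three-fields approach, a schema.xml sketch; all field and type names here are made up, and the three fieldTypes (one per stemmer: Snowball, KStem, Hunspell) are assumed to be defined elsewhere in the schema:

<field name="body" type="text_general" indexed="true" stored="true"/>
<!-- One unstored copy per stemmer; at query time search across all three,
     e.g. via the edismax qf parameter -->
<field name="body_snowball" type="text_snowball" indexed="true" stored="false"/>
<field name="body_kstem" type="text_kstem" indexed="true" stored="false"/>
<field name="body_hunspell" type="text_hunspell" indexed="true" stored="false"/>
<copyField source="body" dest="body_snowball"/>
<copyField source="body" dest="body_kstem"/>
<copyField source="body" dest="body_hunspell"/>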
>
> Interesting I don't recall a bug like that being fixed.
> Anyway, glad it works for you now!
> -Yonik
Then it’s probably because it’s Christmas time! :-)
Hmmm, have you changed coreLoadThreads? We had a problem with this a
while back with loading lots and lots of cores, see:
https://issues.apache.org/jira/browse/SOLR-7280
But that was fixed in 6.2, so unless you changed the number of threads
used to load cores it shouldn't be a problem on 6.3...
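In case it helps, coreLoadThreads is set in solr.xml; a minimal sketch assuming the newer (Solr 5+) solr.xml format, with the much lower value suggested earlier in the thread:

<solr>
  <!-- Number of threads used to load cores in parallel at startup -->
  <int name="coreLoadThreads">64</int>
</solr>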
Phrase queries and slop and positionIncrementGap ;)
The fieldType has a positionIncrementGap. This is the token delta
between the end token of one entry and the beginning of the next.
so the first entry: IFREMER, Ctr Brest, DRO Geosci Marines, F-29280
Plouzane, France
IFREMER would have a
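To illustrate, a schema.xml sketch; the field name follows the thread, while the analyzer chain and the gap of 100 (the common example-schema default) are assumptions:

<field name="idx_affilliation" type="text_general" indexed="true" stored="true" multiValued="true"/>
<!-- With positionIncrementGap="100", the last token of one value and the first
     token of the next value are 100 positions apart, so phrase/slop queries
     will not match across values unless the slop exceeds the gap -->
<fieldType name="text_general" class="solr.TextField" positionIncrementGap="100">
  <analyzer>
    <tokenizer class="solr.StandardTokenizerFactory"/>
    <filter class="solr.LowerCaseFilterFactory"/>
  </analyzer>
</fieldType>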
Hi,
KStemFilter returns legitimate English words, please use it.
Ahmet
On Thursday, December 15, 2016 6:17 PM, Lasitha Wattaladeniya
wrote:
Hello devs,
I'm trying to develop this indexing and querying flow where it converts the
words to its original form (lemmatization).
What about things like PorterStemFilterFactory,
EnglishMinimalStemFilterFactory and the like?
Best,
Erick
On Thu, Dec 15, 2016 at 7:16 AM, Lasitha Wattaladeniya
wrote:
> Hello devs,
>
> I'm trying to develop this indexing and querying flow where it converts the
> words to its
About ten years ago, I accidentally put indexes on an NFS volume. Solr ran
about 100X slower, so I haven’t tried it since.
wunder
Walter Underwood
wun...@wunderwood.org
http://observer.wunderwood.org/ (my blog)
> On Dec 15, 2016, at 8:17 AM, Michael Kuhlmann wrote:
>
> Yes,
NFS isn't the first choice. That said, numbers of organizations _do_ use NFS for their Lucene indexes (though you may have to manually remove stale lock files). See the recommendations here:
https://lucene.apache.org/core/5_4_0/core/org/apache/lucene/store/NativeFSLockFactory.html
What it really amounts to is that you may find
Thanks Tom,
It looks like there is a PHP extension on Git; it seems like a phpized C lib to create a Zend module to work with ZK. No mention of Solr, but I'm guessing I can poll the ensemble for pretty much anything in ZK.
Thanks for the direction! A ZK-aware app is the way I need to go. I'll give it
Yes, and we're doing such things at my company. However we most often do
things you shouldn't do; this is one of these.
Solr needs to load data quite fast, otherwise you'll have a performance killer. It's often recommended to use an SSD instead of a normal hard disk; a network share would be
Interesting I don't recall a bug like that being fixed.
Anyway, glad it works for you now!
-Yonik
On Thu, Dec 15, 2016 at 11:01 AM, Chantal Ackermann
wrote:
> Hi Yonik,
>
> after upgrading to Solr 6.3.0, the nested function works as expected! (Both
> with and
Hi Yonik,
after upgrading to Solr 6.3.0, the nested function works as expected! (Both
with and without docValues.)
"facets":{
"count":3179500,
"all_pop":1.5901646171168616E8,
"shop_cat":{
"buckets":[{
"val":"Kontaktlinsen > Torische Linsen",
"count":75168,
Hello all,
Can the Solr indexes be safely stored and used via mounted NFS shares?
-Mike
Hello devs,
I'm trying to develop this indexing and querying flow where it converts words to their original form (lemmatization). I was doing a bit of research lately, but the information on the internet is very limited. I tried using the Hunspell factory, but it doesn't convert the word to its
Hi Yonik,
are you certain that nesting a function works as documented on
http://yonik.com/solr-subfacets/?
top_authors:{
type: terms,
field: author,
limit: 7,
sort: "revenue desc",
facet:{
revenue: "sum(sales)"
}
}
I’m getting
On Thu, Dec 15, 2016 at 12:37 PM, GW wrote:
> While my client is all PHP it does not use a solr client. I wanted to stay
> with the latest SolrCloud and the PHP clients all seemed to have some kind
> of issue being unaware of newer Solr Cloud versions. The client makes pure
I think queries would usually not contain more than one phrase per query,
but there isn't a fixed list.
Anyways, your solution is very very good for us. We could write a
QueryParser or a SearchComponent that edits the Lucene Query object in the
ResponseBuilder to include the relevant
Hi Yonik,
here is an update on what I’ve tried so far, unfortunately without any more
luck.
The field directive is (should have included this when asking the question):
/query?
json.facet={
  num_pop: {query: "popularity:[* TO *]"},
  all_pop: "sum(popularity)",
  shop_cat: {type:terms,
Hi,
Span query family would be a pure query-time solution, SpanNotQuery in
particular.
SpanQuery include = new SpanTermQuery(new Term(FIELD, "world"));
SpanQuery exclude = new SpanNearQuery(new SpanQuery[] {
    new SpanTermQuery(new Term(FIELD, "hello")),
    new SpanTermQuery(new Term(FIELD, "world"))}, 0, true);
SpanQuery query = new SpanNotQuery(include, exclude);
Hi,
Solr's default similarity is BM25 now. Its parameters are defined as k1=1.2 and b=0.75 by default. However, is there any way to check the effect of using different coefficients for BM25, to find the optimal values?
Kind Regards,
Furkan KAMACI
While my client is all PHP, it does not use a Solr client. I wanted to stay with the latest SolrCloud, and the PHP clients all seemed to have some kind
of issue being unaware of newer Solr Cloud versions. The client makes pure
REST calls with Curl. It is stateful through local storage. There is no
You should be able to filter "(word1 in field OR word2 in field) AND NOT (word1 in field AND word2 in field)". Translate that into the right syntax.
I don't know if Lucene is smart enough to execute the filter only once (it should be, I guess).
Makes sense?
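For example, as a Solr filter query (the field name "body" and the words "hello" and "world" are placeholders):
fq=(body:hello OR body:world) AND NOT (body:hello AND body:world)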
On Thu, Dec 15, 2016 at 12:12 PM, Leo
Hi,
I have a multivalued field in my schema called "idx_affilliation".
IFREMER, Ctr Brest, DRO Geosci Marines,
F-29280 Plouzane, France.
Univ Lisbon, Ctr Geofis, P-1269102
Lisbon, Portugal.
Univ Bretagne Occidentale, Inst Univ
Europeen Mer, Lab Domaines Ocean, F-29280 Plouzane, France.
Total
Hi,
I'm getting this error in my log
12/15/2016, 9:28:18 AM ERROR true ExecutorUtil Uncaught exception
java.lang.StackOverflowError thrown by thread:
coreZkRegister-1-thread-48-processing-n:XXX.XXX.XXX.XXX:8983_solr
x:collection1_shard3_replica2 s:shard3 c:collection1-visitors
This is happening when heavy indexing (around 100 docs/second) is going on.
On Thu, Dec 15, 2016 at 4:33 PM, Piyush Kunal
wrote:
> - We have solr6.1.0 cluster running on production with 1 shard and 5
> replicas.
> - Zookeeper quorum on 3 nodes.
> - Using a chroot in zookeeper to
- We have solr6.1.0 cluster running on production with 1 shard and 5
replicas.
- Zookeeper quorum on 3 nodes.
- Using a chroot in zookeeper to segregate the configs from other
collections.
- Using solrj5.1.0 as our client to query solr.
Usually things work fine but on and off we witness this
Hi All,
I am trying to index an HBase table into Solr using the HBase indexer and a morphline conf file.
The issue I'm facing is that one of the columns in the HBase table is a count field (with integer values), and except for this column, all the other string-type HBase columns are getting indexed in Solr as
See replies inline:
On Wed, Dec 14, 2016 at 3:36 PM, GW wrote:
> Thanks,
>
> I understand accessing solr directly. I'm doing REST calls to a single
> machine.
>
> If I have a cluster of five servers and say three Apache servers, I can
> round robin the REST calls to all
Hi Yonik,
thank you for your quick reply.
(I just sent my original e-mail a second time; I did not confirm the subscription, so I thought it might not have been sent the first time. I'm sorry.)
We are using Solr 6.1.0. Sorry, I should have mentioned.
The low number is because of the test
Hi all,
this is about using a function in nested facets, specifically the "sum()" function inside a "terms" facet using the json.facet api.
My json.facet parameter looks like this:
json.facet={shop_cat: {type:terms, field:shop_cat, facet:
{cat_pop:"sum(popularity)"}}}
A snippet of the