Right, if there's no "fixed version" mentioned and the resolution
is "unresolved", it's not in the code base at all. But that JIRA is
apparently not reproducible, especially on versions more recent than
6.2. Is it possible to test a more recent version? (6.6.2 would be my
recommendation.)
Erick
Well, you can always manually change the ZK nodes, but whether just
setting a node's state to "leader" in ZK and then starting the Solr
instance hosting that node would work... I don't know. Do consider
running CheckIndex on one of the replicas in question first, though.
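For reference, a minimal CheckIndex run looks roughly like this (jar version and index path are assumptions; run it while the node is stopped):
java -cp lucene-core-6.5.1.jar org.apache.lucene.index.CheckIndex /var/solr/data/mycoll_shard1_replica1/data/index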
Best,
Erick
On Tue, Nov 21,
My bad. I found it at https://issues.apache.org/jira/browse/SOLR-9453
But I could not find it in CHANGES.txt, perhaps because it's not yet resolved.
On Tue, Nov 21, 2017 at 9:15 AM, Erick Erickson
wrote:
> Did you check the JIRA list? Or CHANGES.txt in more recent
Hello, Roxana.
You're probably looking for TeeSinkTokenFilter, but I believe the idea is
cumbersome to implement in Solr.
Also, there is a preanalyzed field type which can keep a token stream in
external form.
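For illustration, a hedged sketch of posting a preanalyzed token stream (collection and field names are assumptions; the field must be of type solr.PreAnalyzedField):
curl 'http://localhost:8983/solr/mycoll/update?commit=true' -H 'Content-Type: application/json' \
  -d '[{"id":"1","pre_txt":"{\"v\":\"1\",\"str\":\"toto titi\",\"tokens\":[{\"t\":\"toto\",\"s\":0,\"e\":4,\"i\":1},{\"t\":\"titi\",\"s\":5,\"e\":9,\"i\":1}]}"}]'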
Hi,
I have encountered this error during the merging of 3.5TB of index.
What could be the cause that led to this?
Exception in thread "main" Exception in thread "Lucene Merge Thread #8"
java.io.IOException: background merge hit exception: _6f(6.5.1):C7256757 _6e(6.5.1):C6462072
Hi All - sorry for the repeat, but I'm at a complete loss on this. I
have a collection with 100 shards and 3 replicas each. 6 of the shards
will not elect a leader. I've tried the FORCELEADER command, but
nothing changes.
The log shows 'Force leader attempt 1. Waiting 5 secs for an active
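For reference, the FORCELEADER call takes roughly this form (collection and shard names are assumptions):
curl 'http://localhost:8983/solr/admin/collections?action=FORCELEADER&collection=mycoll&shard=shard14'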
I opened https://issues.apache.org/jira/browse/SOLR-11664 to track this.
I should be able to look into this shortly if no one else does.
-Yonik
On Tue, Nov 21, 2017 at 6:02 PM, Yonik Seeley wrote:
> Thanks for the complete info that allowed me to easily reproduce this!
> The
One other data point I just saw on one of the nodes. It has the
following error:
2017-11-21 22:59:48.886 ERROR
(coreZkRegister-1-thread-1-processing-n:leda:9100_solr) [c:UNCLASS
s:shard14 r:core_node175 x:UNCLASS_shard14_replica3]
o.a.s.c.ShardLeaderElectionContext There was a problem trying
Thanks for the complete info that allowed me to easily reproduce this!
The bug seems to extend beyond hll/unique... I tried min(string_s) and
got wonky results as well.
-Yonik
On Tue, Nov 21, 2017 at 7:47 AM, Volodymyr Rudniev wrote:
> Hello,
>
> I've encountered 2 issues
Thanks Erick!
As I said, user error! ;)
Tom
On 21/11/17 22:41, Erick Erickson wrote:
I think you're confusing shards with replicas.
numShards is 2, each with one replica. Therefore half of your docs
will wind up on one replica and half on the other. If you're adding a
single doc, by
Hi Hendrik - the shards in question have three replicas. I tried
restarting each one (one by one) - no luck. No leader is found. I
deleted one of the replicas and added a new one, and the new one also
shows as 'down'. I also tried the FORCELEADER call, but that had no
effect. I checked
I think you're confusing shards with replicas.
numShards is 2, each with one replica. Therefore half of your docs
will wind up on one replica and half on the other. If you're adding a
single doc, by definition it'll be placed on only one of the two
shards. If your shards had multiple replicas,
I have submitted a patch to make the query generated for overlapping query
terms somewhat configurable (with the default being SynonymQuery), based on
practices I've seen in the field. I'd love to hear feedback:
https://issues.apache.org/jira/browse/SOLR-11662
On Tue, Nov 21, 2017 at 12:37 PM Doug
Thank you Erick. I've set the RamBufferSize to 1G; perhaps higher would
be beneficial. One more data point: if I restart a node, more
often than not it goes into recovery, beats up the network for a while,
and then goes green. This happens even if I do no indexing between
restarts.
Hi folks
I can't find an answer to this, and it's clearly user error. We have a
collection in SolrCloud that was created with numShards=2 and
replicationFactor=1. Solr seems happy, the collection seems happy. Yet when
we post an update to it and then look at the record again, it seems to only
affect one
bq: We are doing lots of soft commits for NRT search...
It's not surprising that this is slower than local storage, especially
if you have any autowarming going on. Opening new searchers will need
to read data from disk for the new segments, and HDFS may be slower
here.
As far as the commit
We sometimes also have replicas not recovering. If one replica is left
active, the easiest fix is to delete the bad replica and create a new
one. When all replicas are down, it helps most of the time to restart one
of the nodes that contains a replica in the down state. If that also doesn't
get the
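For reference, the delete-and-recreate step goes through the Collections API, roughly like this (collection, shard and replica names are assumptions):
curl 'http://localhost:8983/solr/admin/collections?action=DELETEREPLICA&collection=mycoll&shard=shard1&replica=core_node5'
curl 'http://localhost:8983/solr/admin/collections?action=ADDREPLICA&collection=mycoll&shard=shard1'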
We've never run an index this size in anything but HDFS, so I have no
comparison. What we've been doing is keeping two main collections - all
data, and the last 30 days of data. Then we handle queries based on
date range. The 30 day index is significantly faster.
My main concern right now
We actually also have some performance issues with HDFS at the moment. We
are doing lots of soft commits for NRT search. Those seem to be slower
than with local storage. The investigation is however not very far along yet.
We have a setup with 2000 collections, with one shard each and a
Unfortunately I cannot upload my cleanup code, but the steps I'm doing
are quite simple. I wrote it in Java using the HDFS API and Curator for
ZooKeeper. The steps are (a rough shell equivalent is sketched below):
- read the children of /collections in ZK so you know all the
collection names
- read /collections//state.json to get
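A hedged shell equivalent of the first two steps, using Solr's ZK CLI (ZK host and collection name are assumptions):
bin/solr zk ls /collections -z localhost:2181
bin/solr zk cp zk:/collections/mycoll/state.json ./mycoll-state.json -z localhost:2181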
We set the hard commit time long because we were having performance
issues with HDFS, and thought that since the block size is 128M, having
a longer hard commit interval made sense. That was our hypothesis anyway. Happy
to switch it back and see what happens.
I don't know what caused the cluster to
Hi,
I'm trying to set a replica placement rule on an existing collection
and am getting an NPE. It looks like the update code is assuming there's
a current value.
Collection: highspot_test operation: modifycollection
failed:java.lang.NullPointerException
at
Hello all,
I would like to reuse the token stream generated for one field to create a
new token stream for another field, without executing the whole
analysis again.
The particular application is:
- I have a field *tokens* with an analyzer that generates the tokens (and
maintains the token type
Frankly, with HDFS I'm a bit out of my depth, so listen to Hendrik ;)...
I need to back up a bit. Once nodes are in this state it's not
surprising that they need to be forcefully killed. I was more thinking
about how they got into this situation in the first place. _Before_ you
get into the nasty
A clever idea. Normally what we do when we need to do a restart is to
halt indexing and then wait about 30 minutes. If we do not wait and
stop the cluster, the default script's 180-second timeout is not enough
and we'll have lock files to clean up. We use puppet to start and stop
the
Hi,
the write.lock issue I see as well when Solr has not been stopped
gracefully. The write.lock files are then left in HDFS as they do
not get removed automatically when the client disconnects, unlike an
ephemeral node in ZooKeeper. Unfortunately Solr also does not realize
that it should be
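For reference, clearing a leftover lock by hand would look roughly like this (the HDFS path is an assumption):
hdfs dfs -rm /solr/mycoll/core_node1/data/index/write.lock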
Erick - thank you very much for the reply. I'm still working through
restarting the nodes one by one.
I'm stopping the nodes with the script, but yes - they are being killed
forcefully because they are in this recovery, failed, retry loop. I
could increase the timeout, but they never seem
We help clients that perform semantic expansion to hypernyms at
index time. For example, they will have a synonyms file that does the
following:
wing_tips => wing_tips, dress_shoes, shoes
dress_shoes => dress_shoes, shoes
oxfords => oxfords, dress_shoes, shoes
Then at query time, we
On 11/21/2017 9:17 AM, Walter Underwood wrote:
> All our customizations are in solr.in.sh. We’re using the one we configured
> for 6.3.0. I’ll check for any differences between that and the 6.5.1 script.
The order looks correct to me -- the arguments for the OOM killer are
listed *before* the
Did you check the JIRA list? Or CHANGES.txt in more recent versions?
On Tue, Nov 21, 2017 at 1:13 AM, S G wrote:
> Hi,
>
> We are running 6.2 version of Solr and hitting this error frequently.
>
> Error while trying to recover.
Walter:
Yeah, I've seen this on occasion. IIRC, the OOM exception will be
specific to running out of stack space, or at least slightly different
from the "standard" OOM error. That would be the "smoking gun" for too
many threads.
Erick
On Tue, Nov 21, 2017 at 9:00 AM, Walter Underwood
How are you stopping Solr? Nodes should not go into recovery on
startup unless Solr was killed un-gracefully (i.e. kill -9 or the
like). If you use the bin/solr script to stop Solr and see a message
about "killing XXX forcefully" then you can lengthen out the time the
script waits for shutdown
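For example, a hedged sketch using the stop-wait setting read by bin/solr (value in seconds):
SOLR_STOP_WAIT=300 bin/solr stop -all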
One thing you might do is use the termfreq function to see what it
looks like in the index. Also, the schema/analysis page will put terms
in "buckets" by power-of-2, so that might help too.
Best,
Erick
On Tue, Nov 21, 2017 at 7:55 AM, Barbet Alain wrote:
> You rock,
I do have one theory about the OOM. The server is running out of memory because
there are too many threads. Instead of queueing up overload in the load
balancer, it is queued in new threads waiting to run. Setting
solr.jetty.threads.max to 10,000 guarantees this will happen under overload.
New
bq: but those use analyzing infix, so they are search indexes, not in-memory
Sure, but they still can consume heap. Most of the index is MMapped of
course, but there are some control structures, indexes and the like
still kept on the heap.
I suppose not using the suggester would nail it though.
We hopefully will switch to Kubernetes/Rancher 2.0 from Rancher
1.x/Docker, soon.
Here are some utilities that we've used as run-once containers to start
everything up:
https://github.com/odoko-devops/solr-utils
Using a single image, run with many different configurations, we have
been able to
All our customizations are in solr.in.sh. We’re using the one we configured for
6.3.0. I’ll check for any differences between that and the 6.5.1 script.
I don’t see any arguments at all in the dashboard. I do see them in a ps
listing, right at the end.
java -server -Xms8g -Xmx8g -XX:+UseG1GC
I am using the IndexMergeTool from Lucene, with the command below:
java -classpath lucene-core-6.5.1.jar;lucene-misc-6.5.1.jar
org.apache.lucene.misc.IndexMergeTool
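For reference, the full invocation takes roughly this form (target and source index paths are assumptions; ';' is the Windows classpath separator):
java -Xmx32g -classpath "lucene-core-6.5.1.jar;lucene-misc-6.5.1.jar" org.apache.lucene.misc.IndexMergeTool C:\merged\index C:\core1\data\index C:\core2\data\index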
The heap size is 32GB. There are more than 20 million documents in the two
cores.
Regards,
Edwin
On 21 November 2017 at 21:54,
You rock, thank you so much for this clear answer. I lost 2 days for
nothing, as I already had the term freq, but now I have an answer :-)
(And yes, I checked: it's the doc freq, not the term freq.)
Thank you again!
2017-11-21 16:34 GMT+01:00 Emir Arnautović :
> Hi Alain,
Hi Alain,
As explained in the previous mail, that is doc frequency, and each doc is
counted once. I am not sure if Luke can provide information about overall
term frequency - the sum of term frequencies over all docs.
Regards,
Emir
Hi Alain,
I haven’t been using the Luke UI in a while, but if you are talking about top
terms for some field, that might be doc freq, not term freq, and every doc is
counted once - that is equivalent to “Load Term Info” under “Schema” in the
Solr Admin console.
HTH,
Emir
$ cat add_test.sh
DATA='
666
toto titi tata toto tutu titi
'
$ sh add_test.sh
0484
$ curl 'http://localhost:8983/solr/alian_test/terms?terms.fl=titi_txt_fr=index'
00
So it's not only on the Luke side, it comes from Solr. Does that sound normal?
2017-11-21 11:43
Thank you very much for your answer.
It was a copy/paste error in my mail, sorry about that!
So it was already a text field, and omitTermFreqAndPositions was
already set to “false”.
So I dropped my custom analyzer and tried to test with an already defined
field type (text_fr), and I see the same
On 11/20/2017 9:35 AM, Zheng Lin Edwin Yeo wrote:
Does anyone know how long the merging in Solr will usually take?
I am currently merging about 3.5TB of data, and it has been running for
more than 28 hours and is not completed yet. The merging is running on an
SSD disk.
The following will
Hi Edwin,
I’ll let somebody with more knowledge about merging comment on the merge
aspects. What do you use to merge those cores - the merge tool, or do you run
it using Solr’s core API? What is the heap size? How many documents are in those two cores?
Regards,
Emir
On 11/20/2017 6:17 PM, Walter Underwood wrote:
When I ran load benchmarks with 6.3.0, an overloaded cluster would get super
slow but keep functioning. With 6.5.1, we hit 100% CPU, then start getting
OOMs. That is really bad, because it means we need to reboot every node in the
cluster.
Also,
Hi Emir,
Thanks for your reply.
There is only 1 host, 1 node and 1 shard for these 3.5TB.
The merging has already written an additional 3.5TB to another segment.
However, it is still not a single segment, and the size of the folder where
the merged index is supposed to be is now 4.6TB. This
Hi All - we have a system with 45 physical boxes running Solr 6.6.1,
using HDFS for the index. The current index size is about 31TBytes.
With 3x replication that takes up 93TBytes of disk. Our main collection
is split across 100 shards with 3 replicas each. The issue that we're
running into
Hello,
I've encountered 2 issues while trying to apply the unique()/hll() functions to
a string field inside a range facet:
1. Results are incorrect for a single-valued string field.
2. I’m getting an ArrayIndexOutOfBoundsException for a multi-valued string
field.
How to reproduce:
1.
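For context, a hedged sketch of the kind of request involved, via the JSON Facet API (collection and field names are assumptions):
curl 'http://localhost:8983/solr/mycoll/select' \
  --data-urlencode 'q=*:*' --data-urlencode 'rows=0' \
  --data-urlencode 'json.facet={prices:{type:range,field:price_i,start:0,end:100,gap:50,facet:{u:"unique(str_s)"}}}'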
I was about to suggest the same - the Analysis panel is the savior in such
cases of doubt.
-Atita
On Tue, Nov 21, 2017 at 7:26 AM, Rick Leir wrote:
> Chirag
> Look in Solr Admin, the Analysis panel. Put spider-man in the left and
> right text inputs, and see how it gets
Chirag
Look in Solr Admin, the Analysis panel. Put spider-man in the left and right
text inputs, and see how it gets analysed. Cheers -- Rick
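For reference, the same check can be scripted against the field analysis handler (collection and field type are assumptions):
curl 'http://localhost:8983/solr/mycoll/analysis/field?analysis.fieldtype=text_general&analysis.fieldvalue=spider-man'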
On November 20, 2017 10:00:49 PM EST, Chirag garg wrote:
>Hi Rick,
>
>Actually my spell field also contains text with hyphens, i.e. it
Zara,
If you're looking for custom search components, request handlers or update
processors, you can check out my github repo with examples here:
https://github.com/bdalal/SolrPluginsExamples/
On Tue, Nov 21, 2017 at 3:58 PM Emir Arnautović <
emir.arnauto...@sematext.com> wrote:
> Hi Zara,
>
Hi Alain,
You did not provide the definition of the field type used - you use the “nametext”
type but pasted the “text_ami” field type. It is possible that you have
omitTermFreqAndPositions=“true” on the nametext field type. The default value
for text fields should be false.
HTH,
Emir
Hi,
I built a custom analyzer and set it up in Solr, but it doesn't work as I expect.
I always get 1 as the frequency for each word, even if it's present
multiple times in the text.
So I tried with the default analyzer and found the same behavior:
My schema
alian@yoda:~/solr> cat add_test.sh
DATA='
Hi Zara,
What sort of plugins are you trying to build? What sort of issues did you run
into? Maybe you are not too far from having a running custom plugin. I would
recommend you try running some of the existing plugins as your own - just to make
sure that you are able to build and configure custom
Hi,
I have spent too much time learning to write plugins for Solr. I am about to
give up. If someone has experience writing them, please contact me. I am open
to all options. I want to learn it at any cost.
Thanks
Zara
Hi,
We are running version 6.2 of Solr and hitting this error frequently.
Error while trying to recover. core=my_core:java.lang.NullPointerException
at org.apache.solr.update.PeerSync.handleUpdates(PeerSync.java:605)
at
Hi Edwin,
How many hosts/nodes/shards are those 3.5TB? I am not familiar with the merge
code, but I'm trying to think what it might involve, so don't take any of the
following as ground truth.
Merging will for sure include rewriting segments, so you had better have an
additional 3.5TB free if you are merging it to a