Which version of Solr is this on?
On Thu, Jan 14, 2016 at 4:10 PM, Gili Nachum wrote:
> Clarification: If we restart nodes after reloading the collection and before
> pausing, then recovery works fine.
>
> On Thu, Jan 14, 2016 at 12:08 PM, Gili Nachum
Jack:
I think that was for faceting? SOLR-8096 maybe?
On Thu, Jan 14, 2016 at 12:25 AM, Toke Eskildsen
wrote:
> On Wed, 2016-01-13 at 15:01 -0700, Anria B. wrote:
>
> [256GB RAM]
>
>> 1. Collection has 20-30 million docs.
>
> Just for completeness: How large is the
You have to prototype. Fortunately you can do that on a very small cluster,
say 2 shards.
Here's the long form:
https://lucidworks.com/blog/sizing-hardware-in-the-abstract-why-we-dont-have-a-definitive-answer/
Best,
Erick
On Thu, Jan 14, 2016 at 4:38 AM, Mugeesh Husain
Yep. Here's Mike's classic video:
http://blog.mikemccandless.com/2011/02/visualizing-lucenes-segment-merges.html
The third visualization down "TieredMergePolicy" is the default.
Best,
Erick
On Wed, Jan 13, 2016 at 6:52 PM, Zheng Lin Edwin Yeo
wrote:
> Hi Erick,
>
> Thanks
Tell us a lot more. What exact error are you seeing in the Solr log?
On Wed, Jan 13, 2016 at 11:50 PM, Zap Org wrote:
> I have 2 running Solr nodes in my cluster; one node got down. I restarted the
> Tomcat server and it's throwing an exception while initializing solrconfig.xml
>
The issue linked by Erick is really interesting.
Gia, to answer your further question:
For such a scenario we need to plan for the worst case, where everything is lost.
> With Master Slave it's just a matter of recreating the machines, reconfiguring the
> core, and restoring a backup, and the game is done,
On 1/14/2016 10:22 AM, Mugeesh Husain wrote:
> I have a question: I want to create 2-3 clusters using SolrCloud with a single
> ZooKeeper instance, is it possible?
Yes, if you use a chroot on the zkHost parameter for each collection.
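For illustration, a minimal sketch of what that could look like; the hostnames,
ports, and chroot names below are made-up examples, not taken from this thread:

# create one chroot per cluster in the shared ZooKeeper ensemble
server/scripts/cloud-scripts/zkcli.sh -zkhost zk1:2181,zk2:2181,zk3:2181 -cmd makepath /cluster1
server/scripts/cloud-scripts/zkcli.sh -zkhost zk1:2181,zk2:2181,zk3:2181 -cmd makepath /cluster2

# start each cluster's nodes against its own chroot
bin/solr start -c -p 8983 -z zk1:2181,zk2:2181,zk3:2181/cluster1
bin/solr start -c -p 8984 -z zk1:2181,zk2:2181,zk3:2181/cluster2

Each cluster then keeps its configs and collection state under its own path in
the shared ensemble.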
Hello,
I have a question: I want to create 2-3 clusters using SolrCloud with a single
ZooKeeper instance, is it possible?
And also, bin/post can be your friend when it comes to troubleshooting or
introspecting Tika parsing via /update/extract. Like this:
$ bin/post -c test -params "extractOnly=true&wt=ruby&indent=yes" -out yes
docs/SYSTEM_REQUIREMENTS.html
java -classpath
hi all,
We did try q=queryA AND queryB, vs q=queryA&fq=queryB. For all tests,
we commented out caching, and reloaded the core between queries to be ultra sure
that we are getting good comps on time.
we have so many unique Fq and such frequent commits that caches are always
invalidated, so our
hi Mugeesh
It's best to use Zookeeper as it was intended. Install, or run 3 of them
independent of any Solr, then point Solr to the zookeeper cluster.
You can have 1, but then, if anything happens to that 1 single node of
Zookeeper, all of your Solr will be dead, until you can properly revive
On Wed, 2016-01-13 at 15:01 -0700, Anria B. wrote:
[256GB RAM]
> 1. Collection has 20-30 million docs.
Just for completeness: How large is the collection in bytes?
> 2. q=*:*&fq=someField:SomeVal ---> takes 2.5 seconds
> 3. q=someField:SomeVal --> 300ms
> 4. as numFound -> infinity,
1) Point your new Solr to the cloud's ZooKeeper using the -DzkHost parameter
(a rough sketch follows below). That's all.
2) what is the exact error? Stack trace?
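A minimal sketch of point 1; the ZooKeeper hosts and the Tomcat detail below
are assumptions, not taken from your setup:

# when starting Solr with the bundled script
bin/solr start -c -z zk1:2181,zk2:2181,zk3:2181

# when running the webapp under Tomcat, pass the same value as a system property
export CATALINA_OPTS="$CATALINA_OPTS -DzkHost=zk1:2181,zk2:2181,zk3:2181"

Either way the new node registers itself with the existing cloud on startup.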
On Thu, 14 Jan 2016, 13:18 Zap Org wrote:
> i have 2 nodes where one got down and after restarting the server it shows
> error in initializing
On Thu, 2016-01-14 at 00:18 +0000, Lewin Joy (TMS) wrote:
> I am working on Solr 4.10.3 on Cloudera CDH 5.4.4 and am trying to
> group results on a multivalued field, let's say "interests".
...
> But after I just re-indexed the data, it started working.
Grouping is not supposed to be supported
I should add to Erick's point that the test framework allows you to test
HTTP APIs through an embedded Jetty instance, so you should be able to do
anything that you do with a remote Solr instance from code.
On 12 Jan 2016 18:24, "Erick Erickson" wrote:
> And a neater
Hi,
I have following definition for WordDelimiterFilter.
The analysis of 3d shows following four tokens and their positions.
token position
3d 1
3 1
3d 1
d 2
Please help me understand why d is at position 2. Shouldn't it also be at position
It's true that SolrCloud is adding some complexity.
But few observations :
> SolrCloud has some disadvantages and can't beat the easiness and simpleness of
> Master Slave Replica. So I can only encourage to keep Master Slave Replica
> in future versions.
I agree, there can be situations when
Hi,
Our Solr cluster is running VMs that could freeze for more than the ZK tick
time (it's a non critical CI/CD pipeline running on an overloaded ESX).
When this happens the node's shards will be registered as down. Then when
the node is back recovery takes place, and all shard replicas end up
Hi there,
I installed and started Solr following instructions from the Solr wiki as this
... (on a Red Hat server)
cd ~/
tar zxf /tmp/solr-5.3.1.tgz
cd solr-5.3.1/bin
./solr start -f
Solr starts fine. But when opening the console in a browser
("http://server-ip:8983/solr/admin.html"), it shows a
Clarification: If we restart nodes after reloading the collection and before
pausing, then recovery works fine.
On Thu, Jan 14, 2016 at 12:08 PM, Gili Nachum wrote:
> Hi,
>
> Our Solr cluster is running VMs that could freeze for more than the ZK
> tick time (it's a non
I've tried out your settings and here's what I get:
3d 1
3 1
d 2
3d 2
1) Can you confirm if you've made a typo while typing out your results?
2) You'll get d and 3d at position 2 since they're the 2nd token once 3d is
split.
Try the same thing with d3 and you'll get 3 and d3 at position 2
On
A few days ago I had a NullPointerException with Solr 5.4.0.
This was the exception.
java.lang.NullPointerException at
org.apache.solr.search.QParser.getParser(QParser.java:315) at
org.apache.solr.handler.component.QueryComponent.prepare(QueryComponent.java:159)
at
I agree that SolrCloud doesn't have only advantages; I really understand that it
offers many more features, but it introduces some complexity.
One of the problems I've found is that there is no simple way to back up the
content of a collection and restore it in a disaster recovery situation. With
Hi Modassar,
Why do you think it should be at position 1? In that case searching for
"3 d" would not find anything. Is it what you expect?
Thanks,
Emir
On 14.01.2016 10:15, Modassar Ather wrote:
Hi,
I have following definition for WordDelimiterFilter.
The analysis of 3d shows following
I've got an extension jar that contains a class which extends from
org.apache.solr.handler.dataimport.DataSource
But it only works if it's within the solr/dist folder. However, when it's stored
in the lib/ folder within Solr home and it tries to load the class, it
cannot find its parent:
Exception
Irrespective of that, what I want to understand is why there is an increment in
position. Shouldn't all the terms be at the same position, as they are yielded
from the same term/token?
No they won't.
The positions are incremented because typically these splits are used in
phrase queries, which Solr might
>
> In the Linux script is an option called AUTHC_CLIENT_CONFIGURER_ARG but I
> don't find anything similar for Windows...
>
I just realized that this option is not used anyway for the status check
during startup. Any ideas what a solution would look like to make the status
check pass on Windows?
Hi,
I have a big amount of documents (a billion); I am looking for how many shards I
have to create in a core.
As I know the capacity of a core is 100M (approx)?
Do I need to create another core and make a distributed search (SolrCloud) on
it?
Actually I am looking for an architecture for how I should design my
Thanks for your responses.
Why do you think it should be at position 1? In that case searching for "3
d" would not find anything. Is it what you expect?
During search some of the results returned are not wanted. Following is the
example.
Search query: "3d image"
Search results with 3-d image/3 d
Hi,
It seems to me that you don't want to split on numbers. Maybe there are
other cases where you need to, so it is turned on. If there are such
cases I would suggest you create tests with expectations so you can check
what works best for you. It is highly likely that you will not be
able
Actually there are situations where a restore is needed: suppose that someone
makes an error and deletes all documents from a collection, or maybe deletes a
series of documents, etc. I know that this is not likely to happen, but in a
mission-critical enterprise system, we always need a detailed
Hi Callum,
you can create a directory for your jar file anywhere, and you must set the jar
file location in a <lib> tag in solrconfig.xml.
And be careful to add your lib location at the end of the default lib tags in
solrconfig, because sometimes your jar needs classes that Solr must load from
its own jars first, and yours after.
Hi,
You can contribute to this JIRA issue:
https://issues.apache.org/jira/browse/SOLR-8048
--
Jan Høydahl, search solution architect
Cominvent AS - www.cominvent.com
> On 14 Jan 2016, at 13:02, Kristine Jetzke wrote:
>
>>
>> In the Linux script is an option
If I start a backup operation using the location parameter
http://localhost:8983/solr/mycore/replication?command=backup&name=mycore&location=z:\temp\backupmycore
How can I monitor when the backup operation is finished? Issuing a standard
details operation
http://localhost:8983/solr/mycore
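For what it's worth, a hedged sketch of the documented approach, polling the
replication handler's details command (same core and location as in the example
above; how much the details response actually reports, and its exact field
names, may depend on your Solr version, as discussed further below):

# start the backup
curl "http://localhost:8983/solr/mycore/replication?command=backup&name=mycore&location=z:\temp\backupmycore"

# poll the handler and inspect the "backup" section of the response
curl "http://localhost:8983/solr/mycore/replication?command=details&wt=json"

Between polls, check whether the backup section reports a completion timestamp
or a success status.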
re: SolrCloud backup/restore: https://issues.apache.org/jira/browse/SOLR-5750
not committed yet, but getting attention.
On Thu, Jan 14, 2016 at 6:19 AM, Gian Maria Ricci - aka Alkampfer
wrote:
> Actually there are situation where a restore is needed, suppose that
I think the doc is wrong or at least misleading:
https://cwiki.apache.org/confluence/display/solr/Making+and+Restoring+Backups+of+SolrCores
"The backup operation can be monitored to see if it has completed by
sending the details command to the /replication handler..."
From reading the code, it
Which release of Solr are you using? Last year (or so) there was a Lucene
change that had the effect of keeping all terms for WDF at the same
position. There was also some discussion about whether this was either a
bug or a bug fix, but I don't recall any resolution.
-- Jack Krupansky
On Thu,
No good way except to try them. For getting details on Tika parsing
failures, I much prefer the SolrJ process that the link I sent you
outlines.
Best,
Erick
On Thu, Jan 14, 2016 at 7:52 AM, kostali hassan
wrote:
> thank you Erick, I have a problem with these files; last
That's what I did:
My solrconfig.xml has the following (I've hardcoded the version numbers for
now to get regexes out of the picture):
No warnings whatsoever for not finding the jars. And the jars themselves
are in the right order (the second depends on the first).
If I move the data import
I suppose that /get is the query-by-id API. I wonder if it's reasonable to
expect it to be smart in SolrCloud usage.
On Thursday, January 14, 2016, Doug Turnbull <
dturnb...@opensourceconnections.com> wrote:
> Stupid thought/question. Is there a query by id API that understands
> SolrCloud
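For what it's worth, /get (real-time get) is the by-id lookup, and in SolrCloud
it routes on the document id, so a plain lookup like the one below (collection
name and id are placeholders) should only hit the shard that owns the document:

curl "http://localhost:8983/solr/collection1/get?id=doc123"

# several ids can be fetched in one call
curl "http://localhost:8983/solr/collection1/get?ids=doc123,doc456"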
> Solr problem. You probably have some kind of system-level classpath
> problem where the wrong version of a critical jar is being used instead
> of the jar that's included with Jetty in the Solr download.
Since our bin/solr script starts Jetty using java -jar, any CLASSPATH
environment
On 1/14/2016 5:20 PM, Shivaji Dutta wrote:
> I am working with a customer that has about a billion documents on 20 shards.
> The documents are extremely small about 100 characters each.
> The insert rate is pretty good, but they are trying to fetch the document by
> using SolrJ SolrQuery
Stupid thought/question. Is there a query by id API that understands
SolrCloud routing and can simply fwd the query to the shard that would hold
said document? Barring that, can one use SolrJ's routing brains to see what
shard a given id would be routed to and only query that shard?
-Doug
On
Sounds intriguing. It would have to know for sure which query parser is
being used, which might be set in the server side defaults.
Over in Cassandra NoSQL database land we have the concept of "token aware
load balancing policy" on the client side that does the necessary magic
(requiring parsing
Team
Thanks for all the help before.
Current State
I am working with a customer that has about a billion documents on 20 shards.
The documents are extremely small about 100 characters each.
The insert rate is pretty good, but they are trying to fetch the document by
using SolrJ SolrQuery
Add debug=all to your query to see where the time is spent; the "timing"
section shows which Solr search component is consuming the time.
You may also have to add debug=track to get the shard-specific info.
In theory, 19 of the shards should return nothing and the 20th will return
a single document.
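A hedged example of such a query (host, collection, and document id are
placeholders):

curl "http://localhost:8983/solr/collection1/select?q=id:doc123&debug=all&debug=track&wt=json"

The timing section then shows the time spent per search component, and the
track section shows what each shard did with the request.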
On 1/14/2016 3:55 AM, Vincenzo D'Amore wrote:
> A few days ago I had a NullPointerException with Solr 5.4.0.
>
> This was the exception.
>
> java.lang.NullPointerException at
> org.apache.solr.search.QParser.getParser(QParser.java:315) at
>
Thanks Erick.
On 1/13/16, 10:55 AM, "Erick Erickson" wrote:
>My first thought is "yes, you're overthinking it" ;)
>
>Here's something to get you started for indexing
>through a Java program:
>https://cwiki.apache.org/confluence/display/solr/Using+SolrJ
>
>Of course
Here are some Actual examples, if it helps
wt=json&q=*:*&indent=on&fq=SolrDocumentType:"invalidValue"&fl=timestamp&rows=0&start=0&debug=timing
{
"responseHeader": {
"status": 0,
"QTime": 590,
"params": {
"q": "*:*",
"debug": "timing",
"indent": "on",
Thanks for your responses.
It seems to me that you don't want to split on numbers.
It is not with numbers only. Even if you try to analyze WiFi it will create
4 tokens, one of which will be at position 2. So basically the issue is with the
position increment, which causes a few of the queries to behave
Oops. Got omitted.
It's 4.7.2, plus it kept reproducing after upgrading to 4.9 (was trying to see
if it was fixed later on).
On Thu, Jan 14, 2016 at 5:26 PM, Shalin Shekhar Mangar <
shalinman...@gmail.com> wrote:
> Which version of Solr is this on?
>
> On Thu, Jan 14, 2016 at 4:10 PM, Gili Nachum
On 1/14/2016 5:24 PM, Shawn Heisey wrote:
> That exception, especially given the lack of an error message, is very
> unhelpful. The average person wouldn't be able to deduce that it was a
> config problem.
>
> Perhaps the code in QParser that threw the NPE needs a null check,
> logging/throwing a
hi Shawn
Thanks for your comprehensive answers. I really appreciate it. Just for
clarity, the numbers I posted here were from tests where we isolated only a
single fq and a q. These do have good times, even though it's almost 600ms.
Once we are in application mode, and other fq's and facets
On 1/14/2016 12:08 AM, Midas A wrote:
> we are continuously getting the error
> "null:org.eclipse.jetty.io.EofException"
> on slave .
>
> what could be the reason ?
This error is caused by clients that disconnect the HTTP/TCP connection
before Solr has responded to a request. Jetty logs this
On 1/14/2016 12:07 PM, Anria B. wrote:
> Here are some Actual examples, if it helps
>
> wt=json&q=*:*&indent=on&fq=SolrDocumentType:"invalidValue"&fl=timestamp&rows=0&start=0&debug=timing
> "QTime": 590,
> Now we wipe out all caches, and put the filter in q.
>
>
Very strange, a fresh install should run without issues. Perhaps Uwe Schindler
can comment on any known bugs in your IBM J9?
If I were you I’d try the following
* Install Oracle Java 8 or OpenJDK 8 and set JAVA_HOME accordingly
* Download Solr 5.4.0
* Unpack and start Solr as before
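A rough sketch of those steps on Linux; the JAVA_HOME path and the download URL
are assumptions, adjust them for your machine:

export JAVA_HOME=/usr/lib/jvm/java-8-openjdk-amd64
export PATH="$JAVA_HOME/bin:$PATH"
wget http://archive.apache.org/dist/lucene/solr/5.4.0/solr-5.4.0.tgz
tar zxf solr-5.4.0.tgz
cd solr-5.4.0
bin/solr start -f

If that starts cleanly on Oracle/OpenJDK 8, it points to the IBM J9 build as the
culprit.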
--
Jan
Here is a stacktrace of when we put a query in the autowarming, or in the
"newSearcher" listener to warm up the collection after a commit.
2016-01-12 19:00:13,216 [http-nio-19082-exec-25
vaultThreadId:http-STAGE-30518-14 vaultSessionId:1E53A095AD22704
vaultNodeId:nodeId:node-2 vaultInstanceId:2228
On 1/14/2016 1:01 PM, Anria B. wrote:
> Here is a stacktrace of when we put a query in the autowarming, or in the
> "newSearcher" listener to warm up the collection after a commit.
> org.apache.solr.core.SolrCore - org.apache.solr.common.SolrException: Error
> opening new searcher. exceeded limit of
On 1/14/2016 8:03 AM, David Cao wrote:
> The JVM is from IBM based on jre 1.7.
>
> IBM J9 VM (build 2.6, JRE 1.7.0 Linux amd64-64 Compressed References
> 20141216_227497 (JIT enabled, AOT enabled)
>
>
> The box I am using is just a dev vm box, using 'root' is temporary ...
The specific method
Hi Toke,
Thanks for the reply.
But the grouping on a multivalued field is working for me, even with multiple
values in the multivalued field.
I also tested this on the tutorial collection from the later Solr version 5.3.1,
which works as well.
Maybe the wiki needs to be updated?
-Lewin
-Original
Hi Jan,
The JVM is from IBM based on jre 1.7.
IBM J9 VM (build 2.6, JRE 1.7.0 Linux amd64-64 Compressed References
20141216_227497 (JIT enabled, AOT enabled)
The box I am using is just a dev vm box, using 'root' is temporary ...
Thanks
david
On Thu, Jan 14, 2016 at 6:53 AM, David Cao
Thank you Erick. I have a problem with these files; one last question: how do I
define or get the list of files that can't be indexed, or of bad files?
That sounds like it. Sorry my memory is so hazy.
Maybe Yonik can either confirm that that Jira is still outstanding or close
it, and confirm if these symptoms are related.
-- Jack Krupansky
On Thu, Jan 14, 2016 at 10:54 AM, Erick Erickson
wrote:
> Jack:
>
> I think
Hi,
I am using Solr 4.10.4, SolrCloud mode (single instance), with the indexes
residing in HDFS. I am currently testing performance and scalability of the
indexing process on my Hadoop cluster using the MapReduceIndexerTool.
Previously, I had been testing on a smaller cluster with 3 datanodes.
On 1/14/2016 5:36 AM, Callum Lamb wrote:
> I've got an extension jar that contains a class which extends from
>
> org.apache.solr.handler.dataimport.DataSource
>
> But it only works if it's within the solr/dist folder. However when stored
> in the lib/ folder within Solr home. When it tries to