Re: Solr with encrypted HDFS

2019-09-11 Thread Hendrik Haddorp
Hi, we have some setups that use an encryption zone in HDFS. Once you have the HDFS config set up, the rest is transparent to the client and thus Solr works just fine like that. That said, we have some general issues with Solr and HDFS. The main problem seems to be around the transaction log

Re: Question: Solr perform well with thousands of replicas?

2019-08-28 Thread Hendrik Haddorp
Hi, we are usually using Solr Clouds with 5 nodes and up to 2000 collections and a replication factor of 2. So we have close to 1000 cores per node. That is on Solr 7.6 but I believe 7.3 worked as well. We tuned a few caches down to a minimum as otherwise the memory usage goes up a lot. The Solr

NullPointerException in QueryComponent.unmarshalSortValues

2019-06-07 Thread Hendrik Haddorp
Hi, I'm doing a simple *:* search on an empty multi sharded collection using Solr 7.6 and am getting this exception: NullPointerException     at org.apache.solr.handler.component.QueryComponent.unmarshalSortValues(QueryComponent.java:1034)     at

Re: Status of solR / HDFS-v3 compatibility

2019-05-03 Thread Hendrik Haddorp
We have some Solr 7.6 setups connecting to HDFS 3 clusters. So far that did not show any compatibility problems. On 02.05.19 15:37, Kevin Risden wrote: For Apache Solr 7.x or older yes - Apache Hadoop 2.x was the dependency. Apache Solr 8.0+ has Hadoop 3 compatibility with SOLR-9515. I did some

Re: NPE deleting expired docs (SOLR-13281)

2019-03-13 Thread Hendrik Haddorp
We have the same issue on Solr 7.6. On 12.03.2019 16:05, Gerald Bonfiglio wrote: Has anyone else observed NPEs attempting to have expired docs removed? I'm seeing the following exceptions: 2019-02-28 04:06:34.849 ERROR (autoExpireDocs-30-thread-1) [ ]

Re: Increasing solr nodes

2019-02-12 Thread Hendrik Haddorp
You can use the MOVEREPLICA command: https://lucene.apache.org/solr/guide/7_6/collections-api.html Alternatively, you can also add another replica and then remove one of your old replicas. When you add a replica you can either specify the node it shall be placed on or let Solr pick a node for you.
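As a sketch, a MOVEREPLICA call could look like the following; the host, collection, replica, and node names are placeholders, not values from the thread:

```shell
# Move replica "core_node3" of "mycollection" onto node "solr2:8983_solr".
# All names here are illustrative placeholders.
curl "http://localhost:8983/solr/admin/collections?action=MOVEREPLICA&collection=mycollection&replica=core_node3&targetNode=solr2:8983_solr"
```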

Re: COLLECTION CREATE and CLUSTERSTATUS changes in SOLR 7.5.0

2019-02-10 Thread Hendrik Haddorp
Do you have something about legacyCloud in your CLUSTERSTATUS response? I have "properties":{"legacyCloud":"false"} In the legacy cloud mode, also called format 1, the state is stored in a central clusterstate.json node in ZK, which does not scale well. In the modern mode every collection has its

Re: Solr moved all replicas from node

2019-02-10 Thread Hendrik Haddorp
I opened https://issues.apache.org/jira/browse/SOLR-13240 for the exception. On 10.02.2019 01:35, Hendrik Haddorp wrote: Hi, I have two Solr clouds using Version 7.6.0 with 4 nodes each and about 500 collections with one shard and a replication factor of 2 per Solr cloud. The data is stored

Re: CloudSolrClient getDocCollection

2019-02-10 Thread Hendrik Haddorp
, 2019 at 5:23 PM Hendrik Haddorp wrote: Hi Jason, thanks for your answer. Yes, you would need one watch per state.json and thus one watch per collection. That should however not really be a problem with ZK. I would assume that the Solr server instances need to monitor those nodes to be up

Re: Solr moved all replicas from node

2019-02-10 Thread Hendrik Haddorp
m":{ "beforeAction":[], "afterAction":[], "stage":["STARTED", "ABORTED", "SUCCEEDED", "FAILED", "BEFORE_ACTION", "AFTER_ACTION", "IGNO

Solr moved all replicas from node

2019-02-09 Thread Hendrik Haddorp
Hi, I have two Solr clouds using Version 7.6.0 with 4 nodes each and about 500 collections with one shard and a replication factor of 2 per Solr cloud. The data is stored in the HDFS. I restarted the nodes one by one and always waited for the replicas to fully recover before I restarted the

Re: CloudSolrClient getDocCollection

2019-02-08 Thread Hendrik Haddorp
lection use case. The client would need to recalculate its state information for _all_ collections any time that _any_ of the collections changed, since it has no way to tell which collection was changed.) Best, Jason On Thu, Feb 7, 2019 at 11:44 AM Hendrik Haddorp wrote: Hi, when I p

CloudSolrClient getDocCollection

2019-02-07 Thread Hendrik Haddorp
Hi, when I perform a query using the CloudSolrClient the code first retrieves the DocCollection to determine to which instance the query should be sent [1]. getDocCollection [2] does a lookup in a cache, which has a 60s expiration time [3]. When a DocCollection has to be reloaded this is

Re: Large Number of Collections takes down Solr 7.3

2019-01-29 Thread Hendrik Haddorp
How much memory do the Solr instances have? Any more details on what happens when the Solr instances start to fail? We are using multiple Solr clouds to keep the collection count low(er). On 29.01.2019 06:53, Gus Heck wrote: Does it all have to be in a single cloud? On Mon, Jan 28, 2019,

Re: SolrCloud recovery

2019-01-25 Thread Hendrik Haddorp
t's not relevant if the replica in recovery belongs to a shard that already has a leader, but if you restart your entire cluster it can come into play. Best, Erick On Fri, Jan 25, 2019 at 3:32 AM Hendrik Haddorp wrote: Thanks, that sounds good. Didn't know that parameter. On 25.01.2019 11:23, Vadim Ivano

Re: SolrCloud recovery

2019-01-25 Thread Hendrik Haddorp
-Original Message- From: Hendrik Haddorp [mailto:hendrik.hadd...@gmx.net] Sent: Friday, January 25, 2019 11:39 AM To: solr-user@lucene.apache.org Subject: SolrCloud recovery Hi, I have a SolrCloud with many collections. When I restart an instance and the replicas are recovering I noticed

SolrCloud recovery

2019-01-25 Thread Hendrik Haddorp
Hi, I have a SolrCloud with many collections. When I restart an instance and the replicas are recovering I noticed that the number of replicas recovering at any one point is usually around 5. This results in the recovery taking rather long. Is there a configuration option that controls how many

Re: Solr index writing to s3

2019-01-16 Thread Hendrik Haddorp
Theoretically you should be able to use the HDFS backend, which you can configure to use s3. The last time I tried that it did, however, not work for some reason. Here is an example for that, which also seems to have ultimately failed:

Re: Improve indexing speed?

2019-01-01 Thread Hendrik Haddorp
How are you indexing the documents? Are you using SolrJ or the plain REST API? Are you sending the documents one by one or all in one request? The performance is far better if you send the 100 documents in one request. If you send them individually, are you doing any commits between them?
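The batching advice above can be sketched in SolrJ; the URL and collection name are placeholders, and the point is simply that all 100 documents go into a single update request with one commit at the end:

```java
import java.util.ArrayList;
import java.util.List;

import org.apache.solr.client.solrj.impl.HttpSolrClient;
import org.apache.solr.common.SolrInputDocument;

public class BatchIndexSketch {
    public static void main(String[] args) throws Exception {
        // Placeholder URL/collection; adjust for your setup.
        try (HttpSolrClient client = new HttpSolrClient.Builder(
                "http://localhost:8983/solr/mycollection").build()) {
            List<SolrInputDocument> docs = new ArrayList<>();
            for (int i = 0; i < 100; i++) {
                SolrInputDocument doc = new SolrInputDocument();
                doc.addField("id", Integer.toString(i));
                docs.add(doc);
            }
            client.add(docs);  // one request for all 100 documents
            client.commit();   // a single commit at the end, not per document
        }
    }
}
```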

Re: solr is using TLS1.0

2018-11-21 Thread Hendrik Haddorp
Hi Anchal, the IBM JVM behaves differently in the TLS setup than the Oracle JVM. If you search for IBM Java TLS 1.2 you find tons of reports of problems with that. In most cases you can get around that using the system property "com.ibm.jsse2.overrideDefaultTLS" as documented here:
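On a Solr install the property mentioned in the thread would typically be passed as a JVM option, for example via solr.in.sh (a sketch; the exact file and variable depend on how Solr is launched):

```shell
# solr.in.sh (IBM JVM only): make TLS 1.2 the default for outbound connections
SOLR_OPTS="$SOLR_OPTS -Dcom.ibm.jsse2.overrideDefaultTLS=true"
```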

Re: Solr JVM Memory settings

2018-10-15 Thread Hendrik Haddorp
. On 12.10.2018 19:59, Christopher Schultz wrote: -BEGIN PGP SIGNED MESSAGE- Hash: SHA256 Hendrik, On 10/12/18 02:36, Hendrik Haddorp wrote: Those constraints can be easily set if you are using Docker. The problem is however that at least up to Oracle Java 8, and I believe quite a bit further

Re: Solr JVM Memory settings

2018-10-12 Thread Hendrik Haddorp
Those constraints can be easily set if you are using Docker. The problem is however that at least up to Oracle Java 8, and I believe quite a bit further, the JVM is not at all aware of those limits. That's why when running Solr in Docker you really need to make sure that you set the memory

Re: Solr JVM Memory settings

2018-10-11 Thread Hendrik Haddorp
Beside the heap the JVM has other memory areas, like the metaspace: https://docs.oracle.com/javase/9/tools/java.htm -> MaxMetaspaceSize search for "size" in that document and you'll find tons of further settings. I have not tried out Oracle Java 9 yet. regards, Hendrik On 11.10.2018 06:08,

deprecated field types

2018-08-06 Thread Hendrik Haddorp
Hi, the Solr documentation lists deprecated field types at: https://lucene.apache.org/solr/guide/7_4/field-types-included-with-solr.html Below the table the following is stated: /All Trie* numeric and date field types have been deprecated in favor of *Point field types. Point field types are

NullPointerException in SolrMetricManager

2018-07-31 Thread Hendrik Haddorp
Hi, we are seeing the following NPE sometimes when we delete a collection right after we modify the schema: 08:47:46.407 [zkCallback-5-thread-4] INFO org.apache.solr.rest.ManagedResource 209 processStoredData - Loaded initArgs {ignoreCase=true} for /schema/analysis/stopwords/text_ar

Re: SolrJ and autoscaling

2018-06-08 Thread Hendrik Haddorp
, Jun 6, 2018 at 8:33 PM, Hendrik Haddorp wrote: Hi, I'm trying to read and modify the autoscaling config. The API on https://lucene.apache.org/solr/guide/7_3/solrcloud-autoscaling-api.html does only mention the REST API. The read part does however also work via SolrJ

Re: Running Solr on HDFS - Disk space

2018-06-07 Thread Hendrik Haddorp
The only option should be to configure Solr to just have a replication factor of 1 or HDFS to have no replication. I would go for the middle and configure both to use a factor of 2. This way a single failure in HDFS and Solr is not a problem. With a 1/3 or 3/1 option, a single server error

SolrJ and autoscaling

2018-06-06 Thread Hendrik Haddorp
Hi, I'm trying to read and modify the autoscaling config. The API on https://lucene.apache.org/solr/guide/7_3/solrcloud-autoscaling-api.html only mentions the REST API. The read part does however also work via SolrJ:     cloudSolrClient.getZkStateReader().getAutoScalingConfig() Just
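The read path mentioned in the thread can be sketched as follows; the ZooKeeper address is a placeholder, and package locations for the autoscaling classes shifted slightly across 7.x releases:

```java
import java.util.Collections;
import java.util.Optional;

import org.apache.solr.client.solrj.cloud.autoscaling.AutoScalingConfig;
import org.apache.solr.client.solrj.impl.CloudSolrClient;

public class AutoScalingReadSketch {
    public static void main(String[] args) throws Exception {
        // Placeholder ZooKeeper ensemble address.
        try (CloudSolrClient client = new CloudSolrClient.Builder(
                Collections.singletonList("zk1:2181"), Optional.empty()).build()) {
            client.connect();
            // Read the current autoscaling config straight from ZK state.
            AutoScalingConfig cfg = client.getZkStateReader().getAutoScalingConfig();
            System.out.println(cfg);
        }
    }
}
```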

managed resources and SolrJ

2018-05-08 Thread Hendrik Haddorp
Hi, we are looking into using managed resources for synonyms via the ManagedSynonymGraphFilterFactory. It seems like there is no SolrJ API for that. I would be especially interested in one via the CloudSolrClient. I found

Re: collection properties

2018-04-14 Thread Hendrik Haddorp
I opened SOLR-12224 for this: https://issues.apache.org/jira/browse/SOLR-12224 On 14.04.2018 01:49, Shawn Heisey wrote: On 4/13/2018 5:07 PM, Tomás Fernández Löbbe wrote: Yes... Unfortunately there is no GET API :S Can you open a Jira? Patch should be trivial My suggestion would be to return

collection properties

2018-04-13 Thread Hendrik Haddorp
Hi, with Solr 7.3 it is possible to set arbitrary collection properties using https://lucene.apache.org/solr/guide/7_3/collections-api.html#collectionprop But how do I read out the properties again? So far I could not find a REST call that would return the properties. I do see my property in
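Setting a property works via the COLLECTIONPROP action (a sketch; collection and property names are placeholders). As the thread notes, at the time there was no corresponding read call, which is what SOLR-12224 was opened for:

```shell
# Set an arbitrary property on a collection (Solr 7.3+); names are placeholders.
curl "http://localhost:8983/solr/admin/collections?action=COLLECTIONPROP&name=mycollection&propertyName=foo&propertyValue=bar"
```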

Re: in-place updates

2018-04-12 Thread Hendrik Haddorp
1 Apr 2018, at 07:34, Hendrik Haddorp <hendrik.hadd...@gmx.net> wrote: Hi, in http://lucene.472066.n3.nabble.com/In-Place-Updates-not-working-as-expected-tp4375621p4380035.html some restrictions on the supported fields are given. I could however not find if in-place updates are supported for

in-place updates

2018-04-10 Thread Hendrik Haddorp
Hi, in http://lucene.472066.n3.nabble.com/In-Place-Updates-not-working-as-expected-tp4375621p4380035.html some restrictions on the supported fields are given. I could however not find if in-place updates are supported for all field types or if they only work for say numeric fields. thanks,

Re: Problem accessing /solr/_shard1_replica_n1/get

2018-03-24 Thread Hendrik Haddorp
that my nodes can restart before the replicas get moved. Maybe that does then also resolve this type of problem. Issue SOLR-12114 does make changing the config a bit more tricky though but I got it updated. thanks, Hendrik On 24.03.2018 18:31, Shawn Heisey wrote: On 3/24/2018 11:22 AM, Hendrik

Re: Problem accessing /solr/_shard1_replica_n1/get

2018-03-24 Thread Hendrik Haddorp
)     at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)     at java.lang.Thread.run(Thread.java:748) On 24.03.2018 03:52, Shawn Heisey wrote: On 3/23/2018 4:08 AM, Hendrik Haddorp wrote: I did not define a /get request handler but I also don't see one being default

Problem accessing /solr/_shard1_replica_n1/get

2018-03-23 Thread Hendrik Haddorp
Hi, I have a Solr Cloud 7.2.1 setup and used SolrJ (7.2.1) to create 1000 collections with a few documents. During that I repeatedly got exceptions in the Solr logs because an access to the /get handler of a collection failed. The call stack looks like this:     at

Re: collection reload leads to OutOfMemoryError

2018-03-18 Thread Hendrik Haddorp
. Shouldn't all collections be loaded during the startup? On 18.03.2018 17:22, Hendrik Haddorp wrote: Hi, I did a simple test on a three node cluster using Solr 7.2.1. The JVMs (Oracle Corporation Java HotSpot(TM) 64-Bit Server VM 1.8.0_162 25.162-b12) have about 6.5GB heap and 1.5GB metaspace

collection reload leads to OutOfMemoryError

2018-03-18 Thread Hendrik Haddorp
Hi, I did a simple test on a three node cluster using Solr 7.2.1. The JVMs (Oracle Corporation Java HotSpot(TM) 64-Bit Server VM 1.8.0_162 25.162-b12) have about 6.5GB heap and 1.5GB metaspace. In my test I have 1000 collections with only 1000 simple documents each. I'm then triggering

Re: Solr on DC/OS ?

2018-03-15 Thread Hendrik Haddorp
Hi, we are running Solr on Marathon/Mesos, which should basically be the same as DC/OS. Solr and ZooKeeper are running in docker containers. I wrote my own Mesos framework that handles the assignment to the agents. There is a public sample that does the same for ElasticSearch. I'm not aware

Re: SolrCloud update and luceneMatchVersion

2018-03-14 Thread Hendrik Haddorp
Thanks for the detailed description! On 14.03.2018 16:11, Shawn Heisey wrote: On 3/14/2018 5:56 AM, Hendrik Haddorp wrote: So you are saying that we do not need to run the IndexUpgrader tool if we move from 6 to 7. Will the index be then updated automatically or will we get a problem once we

Re: SolrCloud update and luceneMatchVersion

2018-03-14 Thread Hendrik Haddorp
? On 14.03.2018 11:14, Shawn Heisey wrote: On 3/14/2018 3:04 AM, Hendrik Haddorp wrote: we have a SolrCloud 6.3 with HDFS setup and plan to upgrade to 7.2.1. The cluster upgrade instructions on https://lucene.apache.org/solr/guide/7_2/upgrading-a-solr-cluster.html does not contain any

SolrCloud update and luceneMatchVersion

2018-03-14 Thread Hendrik Haddorp
Hi, we have a SolrCloud 6.3 with HDFS setup and plan to upgrade to 7.2.1. The cluster upgrade instructions on https://lucene.apache.org/solr/guide/7_2/upgrading-a-solr-cluster.html do not contain any information on changing the luceneMatchVersion. If we change the luceneMatchVersion

Re: CLUSTERSTATUS API and Error loading specified collection / config in Solr 5.3.2.

2018-03-12 Thread Hendrik Haddorp
Hi, are your collections using stateFormat 1 or 2? In version 1 all state was stored in one file while in version 2 each collection has its own state.json. I assume that in the old version it could happen that the common file still contains state for a collection that was deleted. So I would

MODIFYCOLLECTION via Solrj

2018-02-07 Thread Hendrik Haddorp
Hi, I'm unable to find how I can do a MODIFYCOLLECTION via Solrj. I would like to change the replication factor of a collection but can't find it in the Solrj API. Is that not supported? regards, Hendrik
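As a fallback when SolrJ has no wrapper, MODIFYCOLLECTION can be invoked over the HTTP API (a sketch; the collection name and target replication factor are placeholders):

```shell
# Change the replication factor attribute of a collection; names are placeholders.
curl "http://localhost:8983/solr/admin/collections?action=MODIFYCOLLECTION&collection=mycollection&replicationFactor=3"
```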

HDFS replication factor

2018-01-27 Thread Hendrik Haddorp
Hi, when I configure my HDFS setup to use a specific replication factor, like 1, this only affects the index files that Solr writes. The write.lock files and backups are being created with a different replication factor. The reason for this should be that HdfsFileWriter is loading the

Re: SolrJ with Async Http Client

2018-01-03 Thread Hendrik Haddorp
There is asynchronous and non-blocking. If I use 100 threads to perform calls to Solr using the standard Java HTTP client or SolrJ I block 100 threads even if I don't block my program logic threads by using async calls. However if I perform those HTTP calls using a non-blocking HTTP client,

Re: request dependent analyzer

2017-12-18 Thread Hendrik Haddorp
Hi, how do multiple analyzers help? On 18.12.2017 10:25, Markus Jelsma wrote: Hi - That is impossible. But you can construct many analyzers instead. -Original message- From:Hendrik Haddorp Sent: Monday 18th December 2017 8:35 To: solr-user

request dependent analyzer

2017-12-17 Thread Hendrik Haddorp
Hi, currently we use a lot of small collections that all basically have the same schema. This does not scale too well. So we are looking into combining multiple collections into one. We would however like some analyzers to behave slightly differently depending on the logical collection. We

Re: Recovery Issue - Solr 6.6.1 and HDFS

2017-12-09 Thread Hendrik Haddorp
the NoLockFactory you could specify. That would allow you to share a common index, woe be unto you if you start updating the index though. Best, Erick On Sat, Dec 9, 2017 at 4:46 AM, Hendrik Haddorp <hendrik.hadd...@gmx.net> wrote: Hi, for the HDFS case wouldn't it be nice if there was a mode in

Re: Recovery Issue - Solr 6.6.1 and HDFS

2017-12-09 Thread Hendrik Haddorp
Hi, for the HDFS case wouldn't it be nice if there was a mode in which the replicas just read the same index files as the leader? I mean after all the data is already on a shared readable file system so why would one even need to replicate the transaction log files? regards, Hendrik On

Re: Solr on HDFS vs local storage - Benchmarking

2017-11-22 Thread Hendrik Haddorp
know what are the factors influence and what considerations are to be taken in relation to this? Thanks On Wed, 22 Nov 2017 at 14:16 Hendrik Haddorp <hendrik.hadd...@gmx.net> wrote: We did some testing and the performance was strangely even better with HDFS then the with the local fil

Re: Solr on HDFS vs local storage - Benchmarking

2017-11-22 Thread Hendrik Haddorp
We did some testing and the performance was strangely even better with HDFS than with the local file system. But this seems to greatly depend on what your setup looks like and what actions you perform. We now had a pattern with lots of small updates and commits and that seems to be quite a

Re: Recovery Issue - Solr 6.6.1 and HDFS

2017-11-22 Thread Hendrik Haddorp
.getData(SolrZkClient.java:354)     at org.apache.solr.cloud.ZkController.getLeaderProps(ZkController.java:1021)     ... 9 more Can I modify zookeeper to force a leader?  Is there any other way to recover from this?  Thanks very much! -Joe On 11/21/2017 3:24 PM, Hendrik Haddorp wrote: W

Re: Recovery Issue - Solr 6.6.1 and HDFS

2017-11-21 Thread Hendrik Haddorp
! -joe On 11/21/2017 2:35 PM, Hendrik Haddorp wrote: We actually also have some performance issue with HDFS at the moment. We are doing lots of soft commits for NRT search. Those seem to be slower then with local storage. The investigation is however not really far yet. We have a setup

Re: Recovery Issue - Solr 6.6.1 and HDFS

2017-11-21 Thread Hendrik Haddorp
don't want to issue the manual commit. Best, Erick On Tue, Nov 21, 2017 at 10:34 AM, Hendrik Haddorp <hendrik.hadd...@gmx.net> wrote: Hi, the write.lock issue I see as well when Solr is not been stopped gracefully. The write.lock files are then left in the HDFS as they do not get removed aut

Re: Recovery Issue - Solr 6.6.1 and HDFS

2017-11-21 Thread Hendrik Haddorp
eader assigned: http://lovehorsepower.com/SolrClusterErrors.jpg -Joe On 11/21/2017 1:34 PM, Hendrik Haddorp wrote: Hi, the write.lock issue I see as well when Solr has not been stopped gracefully. The write.lock files are then left in the HDFS as they do not get removed automatically when the client disc

Re: Recovery Issue - Solr 6.6.1 and HDFS

2017-11-21 Thread Hendrik Haddorp
Hi, the write.lock issue I see as well when Solr has not been stopped gracefully. The write.lock files are then left in the HDFS as they do not get removed automatically when the client disconnects, like an ephemeral node in ZooKeeper. Unfortunately Solr does also not realize that it should be

Re: SolrJ DocCollection is missing config name

2017-11-12 Thread Hendrik Haddorp
An option is actually to do an explicit ClusterStatus.getClusterStatus().process(solr, collectionName) request and then get the config set name out of the result. This is a bit cumbersome but works. On 12.11.2017 19:54, Hendrik Haddorp wrote: Hi, the SolrJ DocCollection object seems
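The cumbersome-but-working approach described above can be sketched in SolrJ; the collection name is a placeholder, and the casts reflect the response structure seen on 6.x/7.x, which may vary between releases:

```java
import org.apache.solr.client.solrj.SolrClient;
import org.apache.solr.client.solrj.request.CollectionAdminRequest;
import org.apache.solr.client.solrj.response.CollectionAdminResponse;
import org.apache.solr.common.util.NamedList;

public class ConfigNameSketch {
    // Fetch the config set name from a CLUSTERSTATUS response, since the
    // DocCollection object itself does not expose it.
    @SuppressWarnings("unchecked")
    static String configNameOf(SolrClient client, String collection) throws Exception {
        CollectionAdminResponse rsp = CollectionAdminRequest.getClusterStatus()
                .setCollectionName(collection)
                .process(client);
        // Navigate cluster -> collections -> <collection> -> configName.
        NamedList<Object> cluster = (NamedList<Object>) rsp.getResponse().get("cluster");
        NamedList<Object> collections = (NamedList<Object>) cluster.get("collections");
        NamedList<Object> coll = (NamedList<Object>) collections.get(collection);
        return (String) coll.get("configName");
    }
}
```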

SolrJ DocCollection is missing config name

2017-11-12 Thread Hendrik Haddorp
Hi, the SolrJ DocCollection object seems to contain all information from the cluster status except the name of the config set. Is that a bug or on purpose? The reason might be that everything in the DocCollection object originates from the state.json while the config set name is stored in

Re: solr core replication

2017-10-23 Thread Hendrik Haddorp
this hanging around I'd guess. Best, Erick On Thu, Oct 19, 2017 at 11:55 PM, Hendrik Haddorp <hendrik.hadd...@gmx.net> wrote: Hi Erick, that is actually the call I'm using :-) If you invoke http://solr_target_machine:port/solr/core/replication?command=details after that you can see the repli

Re: solr core replication

2017-10-20 Thread Hendrik Haddorp
shut down the target cluster and just copy the entire data dir from each source replica to each target replica then start all the target Solr instances up you'll be fine. Best, Erick On Thu, Oct 19, 2017 at 1:33 PM, Hendrik Haddorp <hendrik.hadd...@gmx.net> wrote: Hi, I want to transfer

solr core replication

2017-10-19 Thread Hendrik Haddorp
Hi, I want to transfer a Solr collection from one SolrCloud to another one. For that I create a collection in the target cloud using the same config set as on the source cloud but with a replication factor of one. After that I'm using the Solr core API with a "replication?command=fetchindex"
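The fetchindex call described above could look like this (a sketch; hosts and core names are placeholders, and masterUrl must point at the source core's replication handler):

```shell
# On the target core, pull the index from the corresponding source core.
curl "http://target:8983/solr/mycollection_shard1_replica_n1/replication?command=fetchindex&masterUrl=http://source:8983/solr/mycollection_shard1_replica_n1/replication"
```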

Re: streaming with SolrJ

2017-09-28 Thread Hendrik Haddorp
t;select(search(gettingstarted,\n" + "q=*:* NOT personal_email_s:*,\n" + "fl=\"id,business_email_s\",\n" + "sort=\"business_email_s asc\"),\n" + "id,\n" + "b

streaming with SolrJ

2017-09-28 Thread Hendrik Haddorp
Hi, I'm trying to use the streaming API via SolrJ but have some trouble with the documentation and samples. In the reference guide I found the below example in http://lucene.apache.org/solr/guide/6_6/streaming-expressions.html. Problem is that "withStreamFunction" does not seem to exist.

Re: generate field name in query

2017-09-13 Thread Hendrik Haddorp
You should be able to just use price_owner_float:[100 TO 200] OR price_customer_float:[100 TO 200] If the document doesn't have the field the condition is false. On 12.09.2017 23:14, xdzgor1 wrote: Rick Leir-2 wrote Peter The common setup is to use copyfield from all your fields into a 'grab

Re: Solr memory leak

2017-09-10 Thread Hendrik Haddorp
e a couple of options: 1> agitate fo ra 6.6.2 with this included 2> apply the patch yourself and compile it locally Best, Erick On Sun, Sep 10, 2017 at 6:04 AM, Hendrik Haddorp <hendrik.hadd...@gmx.net> wrote: Hi, looks like SOLR-10506 didn't make it into 6.6.1. I do however

Re: Solr memory leak

2017-09-10 Thread Hendrik Haddorp
Hi, looks like SOLR-10506 didn't make it into 6.6.1. I do however also not see it listen in the current release notes for 6.7 nor 7.0: https://issues.apache.org/jira/projects/SOLR/versions/12340568 https://issues.apache.org/jira/projects/SOLR/versions/12335718 Is there any any rough

Re: Solr memory leak

2017-08-30 Thread Hendrik Haddorp
Did you get an answer? Would really be nice to have that in the next release. On 28.08.2017 18:31, Erick Erickson wrote: Varun Thacker is the RM for Solr 6.6.1, I've pinged him about including it. On Mon, Aug 28, 2017 at 8:52 AM, Walter Underwood wrote: That would be

Solr memory leak

2017-08-28 Thread Hendrik Haddorp
Hi, we noticed that triggering collection reloads on many collections has a good chance to result in an OOM-Error. To investigate that further I did a simple test: - Start solr with a 2GB heap and 1GB Metaspace - create a trivial collection with a few documents (I used only 2 fields

Re: 700k entries in overseer q cannot addreplica or deletereplica

2017-08-22 Thread Hendrik Haddorp
17 at 1:14 PM, Hendrik Haddorp <hendrik.hadd...@gmx.net> wrote: - stop all solr nodes - start zk with the new jute.maxbuffer setting - start a zk client, like zkCli, with the changed jute.maxbuffer setting and check that you can read out the overseer queue - clear the queue - restart zk

Re: 700k entries in overseer q cannot addreplica or deletereplica

2017-08-22 Thread Hendrik Haddorp
Courtade wrote: I set jute.maxbuffer on the so hosts should this be done to solr as well? Mine is happening in a severely memory constrained end as well. Jeff Courtade M: 240.507.6116 On Aug 22, 2017 8:53 AM, "Hendrik Haddorp" <hendrik.hadd...@gmx.net> wrote: We have Sol

Re: 700k entries in overseer q cannot addreplica or deletereplica

2017-08-22 Thread Hendrik Haddorp
are the zookeeper servers residing on solr nodes? Are the solr nodes underpowered ram and or cpu? Jeff Courtade M: 240.507.6116 On Aug 22, 2017 8:30 AM, "Hendrik Haddorp" <hendrik.hadd...@gmx.net> wrote: I'm always using a small Java program to delete the nodes directly. I ass

Re: 700k entries in overseer q cannot addreplica or deletereplica

2017-08-22 Thread Hendrik Haddorp
/overseer/queue Or do i need to delete individual entries? Will rmr /overseer/queue/* work? Jeff Courtade M: 240.507.6116 On Aug 22, 2017 8:20 AM, "Hendrik Haddorp" <hendrik.hadd...@gmx.net> wrote: When Solr is stopped it did not cause a problem so far. I cleared the queue

Re: 700k entries in overseer q cannot addreplica or deletereplica

2017-08-22 Thread Hendrik Haddorp
17 8:01 AM, "Hendrik Haddorp" <hendrik.hadd...@gmx.net> wrote: Hi Jeff, we ran into that a few times already. We have lots of collections and when nodes get started too fast the overseer queue grows faster than Solr can process it. At some point Solr tries to redo things like leader

Re: 700k entries in overseer q cannot addreplica or deletereplica

2017-08-22 Thread Hendrik Haddorp
Hi Jeff, we ran into that a few times already. We have lots of collections and when nodes get started too fast the overseer queue grows faster than Solr can process it. At some point Solr tries to redo things like leader votes and adds new tasks to the list, which then gets longer and
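The cleanup procedure scattered across this thread (stop Solr, raise jute.maxbuffer on the ZooKeeper server and client, clear the queue, restart) can be sketched as; the ZooKeeper address is a placeholder:

```shell
# With all Solr nodes stopped and jute.maxbuffer raised on both the ZK
# server and this client, remove the overseer queue recursively.
zkCli.sh -server zk1:2181 rmr /overseer/queue
```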

Re: atomic updates in conjunction with optimistic concurrency

2017-07-21 Thread Hendrik Haddorp
; updateRequest = new UpdateRequest(); updateRequest.add(docs); client.request(updateRequest, collection); updateRequest = new UpdateRequest(); updateRequest.commit(client, collection); } Maybe you can let us know more details how the update been made? Amrit Sarkar Searc

Re: atomic updates in conjunction with optimistic concurrency

2017-07-21 Thread Hendrik Haddorp
witter.com/lucidworks LinkedIn: https://www.linkedin.com/in/sarkaramrit2 On Fri, Jul 21, 2017 at 9:50 PM, Hendrik Haddorp <hendrik.hadd...@gmx.net> wrote: Hi, when I try to use an atomic update in conjunction with optimistic concurrency Solr sometimes complains that the version I passed i

atomic updates in conjunction with optimistic concurrency

2017-07-21 Thread Hendrik Haddorp
Hi, when I try to use an atomic update in conjunction with optimistic concurrency Solr sometimes complains that the version I passed in does not match. The version in my request however matches what is stored, and what the exception states as the actual version does not exist in the

Re: finds all documents without a value for field

2017-07-20 Thread Hendrik Haddorp
If the range query is so much better shouldn't the Solr query parser create a range query for a token query that only contains the wildcard? For the *:* case it does already contain a special path. On 20.07.2017 21:00, Shawn Heisey wrote: On 7/20/2017 7:20 AM, Hendrik Haddorp wrote: the Solr

Re: finds all documents without a value for field

2017-07-20 Thread Hendrik Haddorp
forgot the link with the statement: https://lucene.apache.org/solr/guide/6_6/the-standard-query-parser.html On 20.07.2017 15:20, Hendrik Haddorp wrote: Hi, the Solr 6.6. ref guide states that to "finds all documents without a value for field" you can use: -field:[* TO *] While th

finds all documents without a value for field

2017-07-20 Thread Hendrik Haddorp
Hi, the Solr 6.6. ref guide states that to "finds all documents without a value for field" you can use: -field:[* TO *] While this is true I'm wondering why it is recommended to use a range query instead of simply: -field:* regards, Hendrik
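The two query forms compared in the thread, side by side (the field name is a placeholder):

```
-field:[* TO *]   range form recommended by the ref guide
-field:*          wildcard form the question asks about
```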

query rewriting

2017-03-05 Thread Hendrik Haddorp
Hi, I would like to dynamically modify a query, for example by replacing a field name with a different one. Given how complex the query parsing is, duplicating it looks error prone, so I would like to work on the Lucene Query object model instead. The subclasses of Query look

Re: Solr on HDFS: AutoAddReplica does not add a replica

2017-02-22 Thread Hendrik Haddorp
of date. Erick On Tue, Feb 21, 2017 at 10:30 PM, Hendrik Haddorp <hendrik.hadd...@gmx.net> wrote: Hi Erick, in the none HDFS case that sounds logical but in the HDFS case all the index data is in the shared HDFS file system. Even the transaction logs should be in there. So the node tha

Re: Solr on HDFS: AutoAddReplica does not add a replica

2017-02-21 Thread Hendrik Haddorp
replica, possibly using very old data. FWIW, Erick On Tue, Feb 21, 2017 at 1:12 PM, Hendrik Haddorp <hendrik.hadd...@gmx.net> wrote: Hi, I had opened SOLR-10092 (https://issues.apache.org/jira/browse/SOLR-10092) for this a while ago. I was now able to gt this feature working with a very smal

Re: Solr on HDFS: AutoAddReplica does not add a replica

2017-02-21 Thread Hendrik Haddorp
. Not really sure why one replica needs to be up though. I added the patch based on Solr 6.3 to the bug report. Would be great if it could be merged soon. regards, Hendrik On 19.01.2017 17:08, Hendrik Haddorp wrote: HDFS is like a shared filesystem so every Solr Cloud instance can access the data

Re: 6.4.0 collection leader election and recovery issues

2017-02-02 Thread Hendrik Haddorp
Might be that your overseer queue is overloaded. Similar to what is described here: https://support.lucidworks.com/hc/en-us/articles/203959903-Bringing-up-downed-Solr-servers-that-don-t-want-to-come-up If the overseer queue gets too long you get hit by this:

Re: How long for autoAddReplica?

2017-02-02 Thread Hendrik Haddorp
Hi, are you using HDFS? According to the documentation the feature should be only available if you are using HDFS. For me it did however also fail on that. See the thread "Solr on HDFS: AutoAddReplica does not add a replica" from about two weeks ago. regards, Hendrik On 02.02.2017 07:21,

Re: Solr on HDFS: AutoAddReplica does not add a replica

2017-01-19 Thread Hendrik Haddorp
he Overseer just has to move the ownership of the replica, which seems like what the code is trying to do. There just seems to be a bug in the code so that the core does not get created on the target node. Each data directory also contains a lock file. The documentation states that one should us

Re: Solr on HDFS: AutoAddReplica does not add a replica

2017-01-19 Thread Hendrik Haddorp
Hi, I'm seeing the same issue on Solr 6.3 using HDFS and a replication factor of 3, even though I believe a replication factor of 1 should work the same. When I stop a Solr instance this is detected and Solr actually wants to create a replica on a different instance. The command for that does

huge amount of overseer queue entries

2017-01-18 Thread Hendrik Haddorp
Hi, I have a 6.2.1 solr cloud setup with 5 nodes containing close to 3000 collections having one shard and three replicas each. It looks like when nodes crash the overseer queue can grow wildly until ZooKeeper is no longer working correctly. This looks pretty much like SOLR-5961

Re: ClusterStateMutator

2017-01-05 Thread Hendrik Haddorp
The UI warning was quite easy to resolve. I'm currently testing Solr with HDFS but for some reason the core ended up on the local storage of the node. After a delete and restart the problem was gone. On 05.01.2017 12:42, Hendrik Haddorp wrote: Right, I had to do that multiple times already

Re: ClusterStateMutator

2017-01-05 Thread Hendrik Haddorp
eason than it's confusing. Times past the node needed to be there even if empty. Although I just tried removing it completely on 6x and I was able to start Solr, part of the startup process recreates it as an empty node, just a pair of braces. Best, Erick On Wed, Jan 4, 2017 at 1:22 PM, Hendr

Re: ClusterStateMutator

2017-01-04 Thread Hendrik Haddorp
ifying legacyCloud=false clusterprop > > Kind of a shot in the dark... > > Erick > > On Wed, Jan 4, 2017 at 11:12 AM, Hendrik Haddorp > <hendrik.hadd...@gmx.net> wrote: >> You are right, the code looks like it. But why did I then see collection >> data in the clusters

Re: ClusterStateMutator

2017-01-04 Thread Hendrik Haddorp
teMutator is executed. > > On Wed, Jan 4, 2017 at 6:16 PM, Hendrik Haddorp <hendrik.hadd...@gmx.net> > wrote: >> Hi, >> >> in >> solr-6.3.0/solr/core/src/java/org/apache/solr/cloud/overseer/ClusterStateMutator.java >> there is the following code starting line

Re: create collection gets stuck on node restart

2017-01-04 Thread Hendrik Haddorp
Heisey wrote: On 1/3/2017 2:59 AM, Hendrik Haddorp wrote: I have a SolrCloud setup with 5 nodes and am creating collections with a replication factor of 3. If I kill and restart nodes at the "right" time during the creation process the creation seems to get stuck. Collection data is left i

ClusterStateMutator

2017-01-04 Thread Hendrik Haddorp
Hi, in solr-6.3.0/solr/core/src/java/org/apache/solr/cloud/overseer/ClusterStateMutator.java there is the following code starting line 107: //TODO default to 2; but need to debug why BasicDistributedZk2Test fails early on String znode = message.getInt(DocCollection.STATE_FORMAT, 1) == 1

HDFS support maturity

2017-01-03 Thread Hendrik Haddorp
Hi, is the HDFS support in Solr 6.3 considered production ready? Any idea how many setups might be using this? thanks, Hendrik

deleting a collection leaves empty directories in an HDFS setup

2017-01-03 Thread Hendrik Haddorp
Hi, playing around with Solr 6.3 and HDFS I noticed that after deleting a collection the directories for the Solr cores are left in HDFS. There is no data left in them but still this doesn't look clean to me. regards, Hendrik

create collection gets stuck on node restart

2017-01-03 Thread Hendrik Haddorp
Hi, I have a SolrCloud setup with 5 nodes and am creating collections with a replication factor of 3. If I kill and restart nodes at the "right" time during the creation process the creation seems to get stuck. Collection data is left in the clusterstate.json file in ZooKeeper and no

Re: Soft commit and reading data just after the commit

2016-12-19 Thread Hendrik Haddorp
Hi, the SolrJ API has this method: SolrClient.commit(String collection, boolean waitFlush, boolean waitSearcher, boolean softCommit). My assumption so far was that when you set waitSearcher to true that the method call only returns once a search would find the new data, which sounds what you
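The commit signature quoted in the thread can be sketched as; the collection name is a placeholder:

```java
import org.apache.solr.client.solrj.SolrClient;

public class CommitSketch {
    // A soft commit that blocks until the new searcher is registered, so a
    // query issued right after this returns should see the committed docs.
    static void commitAndWait(SolrClient client, String collection) throws Exception {
        // waitFlush=true, waitSearcher=true, softCommit=true
        client.commit(collection, true, true, true);
    }
}
```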
