Re: Oracle OpenJDK to Amazon Corretto OpenJDK

2020-01-31 Thread Daniel Collins
I would ask your legal team why they want to get away from Oracle. "Because it’s Oracle" is not a good enough reason. "Because we don’t want to pay support fees" is valid but not applicable here; "because we don’t want to be tied to a single vendor" is valid but questionable here, you can at least

Re: Solr7 org.apache.lucene.index.IndexUpgrader

2017-11-27 Thread Daniel Collins
Leo, the general rule of thumb here is that the Solr index should *not* be your main document store. It is the index to your document store, but if it needs to be re-indexed, you should use your document store as the place to index from. Your index will not have the full source data (unless ALL

Re: Java 9

2017-11-07 Thread Daniel Collins
Oh, blimey, have Oracle gone with Ubuntu-style numbering now? :) On 7 November 2017 at 08:27, Markus Jelsma wrote: > Shawn, > > There won't be a Java 10, we'll get Java 18.3 instead. After 9 it is a > guess when CMS and friends are gone. > > Regards, > Markus > > > >

Re: mvn test failing

2017-10-31 Thread Daniel Collins
->SolrTestCaseJ4.initCore:552->SolrTestCaseJ4.initCore:678->SolrTestCaseJ4.createCore:688 » Runtime TestICUCollationField.org.apache.solr.schema.TestICUCollationField » ThreadLeak They work fine from ant strangely... Will try to dig a bit. On 31 October 2017 at 14:53, Daniel C

Re: mvn test failing

2017-10-31 Thread Daniel Collins
Another important question is which branch did you download? I assume master as it's the default, but remember that is a development branch, so it is entirely possible to have some test issues on that. On 31 October 2017 at 13:44, Shawn Heisey wrote: > On 10/28/2017 11:48

Re: compiling Solr

2017-07-13 Thread Daniel Collins
That page was last edited in 2014, things have moved on a little since then! Solr doesn't produce a WAR file by default anymore, as running in a generic servlet container isn't a supported configuration. What is produced from ant dist is effectively the exploded form of the WAR. You can still

Re: Can SOLR-5730 patch be backported to Solr 5.5.3

2017-02-15 Thread Daniel Collins
The other question is what do you hope to gain from SortingMergePolicy and EarlyTerminatingSortingCollector, and why would you want to do that in Solr 5.5.3 and not upgrade to Solr 6? What prevents you from upgrading I guess is my real question? On 15 February 2017 at 05:06, Erick Erickson

Re: compilation error

2016-11-17 Thread Daniel Collins
Also, remember a significant number of the people on this group are in the US. Asking for a rapid response at 1am is a pretty harsh SLA expectation... On 17 November 2016 at 08:51, Daniel Collins <danwcoll...@gmail.com> wrote: > Can you be more specific? What version are you compil

Re: compilation error

2016-11-17 Thread Daniel Collins
Can you be more specific? What version are you compiling, what command do you use? That looks to me like maven output, not ant? On 17 November 2016 at 06:30, Midas A wrote: > Please reply? > > On Thu, Nov 17, 2016 at 11:31 AM, Midas A wrote: > > >

Re: Solr DeleteByQuery vs DeleteById

2016-08-09 Thread Daniel Collins
Seconding that point, we currently do DBQ to "tidy" some of our collections and time-bound them (so running "delete anything older than X"). They have similar issues with reordering and blocking from time to time. On 9 August 2016 at 14:20, danny teichthal wrote: > Hi

Re: Unique key field type in solr 6.1 schema

2016-08-09 Thread Daniel Collins
This vaguely rings a bell, though from a long time ago. We had our id field using the "lowercase" type in Solr, and that broke/changed somewhere in the 4.x series (we are on 4.8.1 now and it doesn't work there), so we have to revert to a simple "string" type instead. I know you have a very

Re: Installing Solr as a dependency

2016-07-29 Thread Daniel Collins
Can't you use Maven? I thought that was the standard dependency management tool, and Solr is published to Maven repos. There used to be a solr artifact which was the WAR file, but presumably now, you'd have to pull down org.apache.solr solr-parent and maybe then start that up. We have an

Re: Switching zk node cause load conf error

2016-05-20 Thread Daniel Collins
Zk holds more than just the Solr config, it holds a copy of the clusterstate, which includes all the sharding, hash ranges, etc as well. You will need to re-create that data on your new ZK instance, i.e. re-create the collection to populate that. You do realize that running a single ZK instance

Re: SolrCloud replicas consistently out of sync

2016-05-17 Thread Daniel Collins
Terminology question: by nodes I assume you mean machines? So "8 nodes, with 4 shards a piece, all running one collection with about 900M documents", is 1 collection split into 32 shards, with 4 shards located on each machine? Is each shard in its own JVM, or do you have 1 JVM on each machine

Re: number of zookeeper & aws instances

2016-04-13 Thread Daniel Collins
@elyograg.org> wrote: > On 4/13/2016 9:34 AM, Daniel Collins wrote: > > Just to chip in, more ZKs are probably only necessary if you are doing > NRT > > indexing. > > > > Loss of a single ZK (in a 3 machine setup) will block indexing for the > time > > it ta

Re: number of zookeeper & aws instances

2016-04-13 Thread Daniel Collins
Just to chip in, more ZKs are probably only necessary if you are doing NRT indexing. Loss of a single ZK (in a 3 machine setup) will block indexing for the time it takes to get that machine/instance back up, however it will have less impact on search, since the search side can use the existing

Re: Solr Sharding Strategy

2016-04-11 Thread Daniel Collins
I'd also ask about your indexing times, what QTime do you see for indexing (in both scenarios), and what commit times are you using (which Toke already asked). Not entirely sure how to read your table, but looking at the indexing side of things, with 2 shards, there is inherently more work to do,

Re: Deploy solr on glassfish

2016-03-21 Thread Daniel Collins
You have already asked this question, and there is an ongoing thread on this. To quote the previous thread, Solr is no longer a webapp that can be deployed on any servlet container, it is now a black-box application, so you should just deploy Solr as it is, and then connect to it yourself, which

Re: mergeFactor/maxMergeDocs is deprecated

2016-03-03 Thread Daniel Collins
See https://issues.apache.org/jira/browse/SOLR-8734, it will be fixed in the next release On 3 March 2016 at 17:38, Tom Evans wrote: > Hi all > > Updating to Solr 5.5.0, and getting these messages in our error log: > > Beginning with Solr 5.5, is deprecated, configure

Re: Solr architecture

2016-02-09 Thread Daniel Collins
So as I understand your use case, it's effectively logging actions within a user session, why do you have to do the update in NRT? Why not just log all the user session events (with some unique key, and ensuring the session Id is in the document somewhere), then when you want to do the query, you

Re: Solr 5 with java 7 or java 8

2016-01-19 Thread Daniel Collins
Solr 5.x is compiled for Java7 bytecode, so it will definitely work on a Java 7 VM. It is Solr 6 which will be Java 8 only. That said, I would advocate using Java 8. Especially as Java 7 is already publicly End of Life'd, starting a new project on Java 7 would seem short-sighted. On 19 January

Re: SolR 5.3.1 deletes index files

2016-01-15 Thread Daniel Collins
Can I just clarify something. The title of this thread implies Solr is losing data when it shuts down which would be really bad(!) The core isn't deleting any data, it is performing a merge, so the data exists, just in fewer larger segments instead of all the smaller segments you had before. So

Re: SolR 5.3.1 deletes index files

2016-01-15 Thread Daniel Collins
I know Solr used to have issues with indexes on NFS, there was a segments.gen file specifically for issues around that, though that was removed in 5.0. But you say this happens on local disks too, so that would rule NFS out of it. I still think you should look at ensuring your merge policy is

Re: Does soft commit re-opens searchers in disk?

2016-01-04 Thread Daniel Collins
If you have already done a soft commit and that opened a new searcher, then the document will be visible from that point on. The results returned by that searcher cannot be changed by the hard commit (whatever that is doing under the hood, the segment that has that document in must still be

Re: restore quorum after majority of zk nodes down

2015-10-30 Thread Daniel Collins
Aren't you asking for dynamic ZK configuration which isn't supported yet (ZOOKEEPER-107, only in 3.5.0-alpha)? How do you swap a zookeeper instance from being an observer to a voting member? On 30 October 2015 at 09:34, Matteo Grolla wrote: > Pushkar... I love this

Re: how to deployed another web project into jetty server(solr inbuilt)

2015-10-07 Thread Daniel Collins
The short answer is technically it might be possible but it's not a supported configuration. As of Solr 5.x (I forget the exact version), the use of Jetty is an implementation detail, you should treat Solr as a black box, whether it uses Jetty or not is irrelevant, and not something you can "piggy

Re: Solr Caching (documentCache) not working

2015-08-18 Thread Daniel Collins
I think this is expected. As Shawn mentioned, your hard commits have openSearcher=false, so they flush changes to disk, but don't force a re-open of the active searcher. By contrast, softCommit sets openSearcher=true; the point of softCommit is to make the changes visible, so to do that you have

Re: Query time out. Solr node goes down.

2015-08-18 Thread Daniel Collins
Ah ok, it's a ZK timeout then (org.apache.zookeeper.KeeperException$SessionExpiredException) which is because of your GC pause. The page Shawn mentioned earlier has several links on how to investigate GC issues and some common GC settings, sounds like you need to tweak those. Generally speaking, I

Re: Query time out. Solr node goes down.

2015-08-17 Thread Daniel Collins
When you say the solr node goes down, what do you mean by that? From your comment on the logs, you obviously lose the solr core at best (you do realize only having a single replica is inherently susceptible to failure, right?) But do you mean the Solr Core drops out of the collection (ZK timeout),

Re: Solr Caching (documentCache) not working

2015-08-17 Thread Daniel Collins
Just to open the can of worms, it *can* be possible to have very low commit times, we have 250ms currently and are in production with that. But it does come with pain (no such thing as a free lunch!), we had to turn off ALL the Solr caches (warming is useless at that kind of frequency, it will

Re: Solr 4.10.4 - Index is bigger before optimize for the same data in 4.6.1

2015-07-22 Thread Daniel Collins
I know some of the docValue APIs changed in 4.10, because we had to re-code some custom stuff, looks like https://issues.apache.org/jira/browse/LUCENE-5882 changed the format on disk too. The comments on that ticket don't suggest an 8% increase in disk space, so maybe you are hitting some kind of

Re: problem with index size

2015-07-22 Thread Daniel Collins
Why are most of your fields stored but not indexed? That suggests to me that you are using Solr as your primary data store, not as an index (which is not Solr's ideal use case) Secondly, I think there is confusion around the term segments. You have a field called segment in your schema, but

Re: solr blocking and client timeout issue

2015-07-21 Thread Daniel Collins
We have a similar situation: production runs Java 7u10 (yes, we know it's old!), and has custom GC options (G1 works well for us), and a 40Gb heap. We are a heavy user of NRT (sub-second soft-commits!), so that may be the common factor here. Every time we have tried a later Java 7 or Java 8, the

Re: SOLR nrt read writes

2015-07-15 Thread Daniel Collins
Just to re-iterate Charles' response with an example, we have a system which needs to be as Near RT as we can make it. So we have application level commitWithin set to 250ms. Yes, we have to turn off a lot of caching, auto-warming, etc, but it was necessary to make the index as real time as we

Re: Solr relevancy score in percentage

2015-05-26 Thread Daniel Collins
The question is more why do you want your users to see the scores? If they are wanting to affect ranking, what you want is the ability to run the same query with different boosting and see the difference (2 result sets), then see if the new ordering is better or worse. What the actual/raw score

Re: YAJar

2015-05-26 Thread Daniel Collins
I guess this is one reason why the whole WAR approach is being removed! Solr should be a black-box that you talk to, and get responses from. What it depends on and how it is deployed, should be irrelevant to you. If you are wanting to override the version of guava that Solr uses, then you'd have

Re: Limit the documents for each shard in solr cloud

2015-05-07 Thread Daniel Collins
Not sure I understand your problem. If you have 20m documents, and 8 shards, then each shard is (broadly speaking) only going to have 2.5m docs each, so I don't follow the 5m limit? That is with the default routing/hashing, obviously you can write your own hash algorithm or you can shard at your

Re: Limit the documents for each shard in solr cloud

2015-05-07 Thread Daniel Collins
Jilani, you did say "My team needs that option if at all possible"; my first response would be: why? Why do they want to limit the number of documents per shard, what's the rationale/use case behind that requirement? Once we understand that, we can explain why it's a bad idea. :) I suspect I'm

Re: Solr 5.0 - uniqueKey case insensitive ?

2015-05-06 Thread Daniel Collins
Ah, I remember seeing this when we first started using Solr (which was 4.0 because we needed Solr Cloud), I never got around to filing an issue for it (oops!), but we have a note in our schema to leave the key field a normal string (like Bruno we had tried to lowercase it which failed). We didn't

Re: Using G1 with Apache Solr

2015-03-25 Thread Daniel Collins
Interesting none the less Shawn :) We use G1GC on our servers, we were on Java 7 (64-bit, RHEL6), but are trying to migrate to Java 8 (which seems to cause more GC issues, so we clearly need to tweak our settings), will investigate 8u40 though. On 25 March 2015 at 04:23, Shawn Heisey

Re: How to select the correct number of Shards in SolrCloud

2015-01-16 Thread Daniel Collins
Sharding a query lets you parallelize the actual querying-the-index part of the search. But remember that as soon as you spread the query out more, you also need to bring all 64 result sets back together and consolidate them into a single result set for the end user. At some point, the gain of
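
The consolidation step described above can be sketched roughly as follows. This is only an illustration in Python, not how Solr itself merges shard responses; the function name merge_top_k, the (score, doc_id) tuples and the sample shard lists are all made up for the example. It assumes each shard returns its hits already sorted by descending score.

```python
import heapq
from itertools import islice


def merge_top_k(shard_results, k):
    """Merge per-shard hit lists (each sorted by descending score) into one top-k list.

    Hypothetical helper for illustration; each hit is a (score, doc_id) tuple.
    """
    # heapq.merge walks all the sorted inputs lazily; reverse=True means
    # "inputs are sorted largest first, and so is the merged output".
    merged = heapq.merge(*shard_results, key=lambda hit: hit[0], reverse=True)
    return list(islice(merged, k))


# Made-up results from three shards.
shard_a = [(9.1, "doc3"), (4.2, "doc7")]
shard_b = [(8.5, "doc12"), (7.7, "doc1")]
shard_c = [(6.0, "doc9")]

print(merge_top_k([shard_a, shard_b, shard_c], 3))
```

Every extra shard adds one more input list to that merge (plus a network round trip), which is the consolidation cost the message is pointing at.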

Re: SolrCloud shard leader elections - Altering zookeeper sequence numbers

2015-01-13 Thread Daniel Collins
Is it important where your leader is? If you just want to minimize leadership changes during rolling re-start, then you could restart in the opposite order (S3, S2, S1). That would give only 1 transition, but the end result would be a leader on S2 instead of S1 (not sure if that is important to you

Re: Slow forwarding requests to collection leader

2014-10-29 Thread Daniel Collins
I kind of think this might be working as designed, but I'll be happy to be corrected by others :) We had a similar issue which we discovered by accident, we had 2 or 3 collections spread across some machines, and we accidentally tried to send an indexing request to a node in the cloud that didn't

Re: SolrCloud config question and zookeeper

2014-10-28 Thread Daniel Collins
As Michael says, you really want an odd number of zookeepers in order to meet the quorum requirements (which based on your comments you seem to be aware of). There is nothing wrong with 4 ZKs as such, just that it doesn't buy you anything above having 3, so it's one more that might go wrong and
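
The "4 buys you nothing over 3" point follows from the majority rule: a ZooKeeper ensemble needs a strict majority of its members to be up. A minimal sketch of that arithmetic in Python (the function names are invented for this example):

```python
def quorum_size(ensemble_size: int) -> int:
    """Smallest number of servers that forms a strict majority."""
    return ensemble_size // 2 + 1


def tolerated_failures(ensemble_size: int) -> int:
    """How many servers can be lost while a majority is still available."""
    return ensemble_size - quorum_size(ensemble_size)


for n in (3, 4, 5):
    print(f"{n} ZKs: quorum={quorum_size(n)}, tolerated failures={tolerated_failures(n)}")
```

Both 3 and 4 servers tolerate exactly one failure; it takes 5 to tolerate two.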

Re: SolrCloud without NRT and indexing only on the master

2014-07-30 Thread Daniel Collins
Working backwards slightly, what do you think SolrCloud is going to give you, apart from the consistency of the index (which you want to turn off)? What are all the other benefits of SolrCloud, if you are querying separate instances that aren't guaranteed to be in sync (since you want to use the

Re: CopyField can't copy analyzers and Filters

2014-07-01 Thread Daniel Collins
Ok, firstly to say you need to fix your problem but you can't modify the schema, doesn't really help. If the schema is setup badly, then no amount of help at search time will ever get you the results you want... Secondly, from what I can see in the schema, there is no AllChamp_fr, AllChamp_en,

Re: SolrCloud multiple data center support

2014-06-23 Thread Daniel Collins
functionality would have to wait for 3.5.x and I don't think there are any estimates on when that might come out... Until that is released officially, Solr can't really depend on it. On 23 June 2014 15:36, Arcadius Ahouansou arcad...@menelic.com wrote: On 3 February 2014 22:16, Daniel Collins

Re: About Query Parser

2014-06-20 Thread Daniel Collins
Alexandre's response is very thorough, so I'm really simplifying things, I confess but here's my query parsers for dummies. :) In terms of inputs/outputs, a QueryParser takes a string (generally assumed to be human generated i.e. something a user might type in, so maybe a sentence, a set of

Re: About Query Parser

2014-06-20 Thread Daniel Collins
or lucene.? http://localhost:8983/solr/collection1/select?q=*%3A*&wt=xml&indent=true Thanks, Vivek On Fri, Jun 20, 2014 at 3:55 PM, Daniel Collins danwcoll...@gmail.com wrote: Alexandre's response is very thorough, so I'm really simplifying things, I confess

Re: Warning message logs on startup after upgrading to 4.8.1

2014-06-17 Thread Daniel Collins
I confess we had upgraded to 4.8.1 and totally missed these warnings! I'm guessing they might be related to the ManagedIndexSchemaFactory stuff, which is commented out in the example configs. We don't use any of the REST stuff ourselves, so I can't comment any further. I think you are ok as

Re: Inconsistent query times

2014-06-16 Thread Daniel Collins
Ok, so we have 2 different scenarios here, running a query through admin UI and I'm assuming there you are quoting qTimes from the UI, direct to the relevant core, and running through CloudSolrServer (what you called running a query through ZK). Questions: - What timings are you using when you

Re: Solr Master-Slave fail-over across multiple data-centers

2014-06-13 Thread Daniel Collins
Why do you need to swap the replicas from one master to another? If you have a cross DC database that ensures both Masters are in sync, why not just tie SolrSlave-B1 and SolrSlave-B2 to SolrMaster-B at all times? Then you don't have any fail-over to do at all? We have multiple DCs and a similar

Re: clusterstate.json does not reflect current state of down versus active

2014-04-16 Thread Daniel Collins
We actually have a similar scenario, we have 64 cores per machine, and even that sometimes has issues when we shut down all cores at once. We did start to write a "force election for Shard X" tool but it was harder than we expected, it's still on our to-do list. Some context, we run 256 shards

Re: Solr interface

2014-04-07 Thread Daniel Collins
I have to agree with Shawn. We have a SolrCloud setup with 256 shards, ~400M documents in total, with 4-way replication (so it's quite a big setup!) I had thought that HTTP would slow things down, so we recently trialed a JNI approach (clients are C++) so we could call SolrJ and get the benefits

Re: Fault Tolerant Technique of Solr Cloud

2014-02-27 Thread Daniel Collins
I can see what you mean, what you are expecting is a single host:port combination for "The Cloud" that always works, and you can call from your UI. That is perfectly possible, but it's really not within the scope of Solr itself. What you should understand is that what Solr provides is a cloud that has

Re: SolrCloud: How to replicate shard of another machine for failover?

2014-02-26 Thread Daniel Collins
This is only true the *first* time you start the cluster. As mentioned earlier, the correct way to assign shards to cores is to use the collection API. Failing that, you can start cores in a determined order, and the cores will assign themselves a shard/replica when they first start up. From

Re: Solr Searching Issue

2014-02-04 Thread Daniel Collins
You also said you have multiple instances ( 15) but are they all reading the same 8Gb data (in which case it must be static or you'd get locking problems) or is it partitioned/sharded somehow? I'd have the same questions as the others, query rates, how are your queries distributed over the

Re: Announce list

2014-02-03 Thread Daniel Collins
I have seen other projects that have a releases mailing list, the only use cases I can think of are: 1) users who want notifications about new releases, but don't want the flood of the full user-list. 2) historical searching to see how often releases were made. Given there isn't an official

Re: SolrCloud multiple data center support

2014-02-03 Thread Daniel Collins
Option a) doesn't really work out of the box, *if you need NRT support*. The main reason (for us at least) is the ZK ensemble and maintaining quorum. If you have a single ensemble, say 3 ZKs in 1 DC and 2 in another, then if you lose DC 2, you lose 2 ZKs and the rest are fine. But if you lose
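
The 3 + 2 split described above can be checked with a few lines of Python. This is a sketch of the majority rule only; the DC names and the helper has_quorum are placeholders for the example, not anything from Solr or ZooKeeper.

```python
def has_quorum(surviving: int, ensemble_size: int) -> bool:
    """True if the surviving servers are a strict majority of the ensemble."""
    return surviving > ensemble_size // 2


# One ensemble of 5 ZKs split across two data centres, as in the message.
zk_per_dc = {"DC1": 3, "DC2": 2}
ensemble_size = sum(zk_per_dc.values())

for lost_dc, lost_count in zk_per_dc.items():
    survivors = ensemble_size - lost_count
    state = "quorum kept" if has_quorum(survivors, ensemble_size) else "quorum lost"
    print(f"lose {lost_dc}: {survivors} of {ensemble_size} ZKs left -> {state}")
```

Losing the smaller DC leaves 3 of 5 and the ensemble carries on; losing the larger DC leaves 2 of 5, which is not a majority. Any two-way split of a single ensemble has one side whose loss breaks quorum.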

Re: Replication Error

2014-01-03 Thread Daniel Collins
We see this a lot as well, my understanding is that recovery asks the leader for a list of the files that it should download, then it downloads them. But if the leader has been merging segments whilst this is going on (recovery is taking a reasonable period of time and you have an NRT system

Re: update doc with a xml-format string

2013-12-20 Thread Daniel Collins
Yes, but you are putting the root tag (which is an XML construct) as a value of an XML element, so it has to be encoded? You've put it in quotes, but that's not valid as far as XML is concerned. I'm not an XML expert but all the XML tags (root, conditionGroup, etc) have to be encoded so they aren't

Re: update doc with a xml-format string

2013-12-20 Thread Daniel Collins
Alternatively, use something like http://www.w3schools.com/xml/xml_cdata.asp and put all your values in a CDATA block. Again, I'm not an XML guru but something like that should get you moving. On 20 December 2013 09:05, Daniel Collins danwcoll...@gmail.com wrote: Yes, but you are putting

Re: update doc with a xml-format string

2013-12-20 Thread Daniel Collins
What's the schema definition for that field? Are you stripping HTML in your analyzer chain? Can you run it through the analyzer screen in the admin UI to confirm that the raw data goes through as you expect? Can you add a document via the admin UI and see that the data in the index is correct?

Re: PeerSync Recovery fails, starting Replication Recovery

2013-12-19 Thread Daniel Collins
Are you using a NRT solution, how often do you commit? We see similar issues with PeerSync, but then we have a very active NRT system and we soft-commit sub-second, so since PeerSync has a limit of 100 versions before it decides it's too much to do, if we try and PeerSync whilst indexing is
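
To see why a busy NRT system blows through that limit so quickly, here is a deliberately simplified model in Python. The 100-version figure is the one quoted in the message; everything else (the function names, the update rates, the idea that the choice depends only on a count of missed updates) is an assumption made for illustration and not Solr's actual recovery code.

```python
PEER_SYNC_MAX_VERSIONS = 100  # limit quoted in the message above


def missed_updates(updates_per_second: float, seconds_behind: float) -> int:
    """Rough count of updates a replica has missed while it was behind."""
    return int(updates_per_second * seconds_behind)


def recovery_strategy(missed: int) -> str:
    """Simplified model: PeerSync only if the gap fits within the version limit."""
    return "peersync" if missed <= PEER_SYNC_MAX_VERSIONS else "replication"


# Hypothetical figures: 50 updates/second while indexing is active.
for seconds in (1, 2, 10):
    gap = missed_updates(50, seconds)
    print(f"{seconds}s behind -> {gap} missed updates -> {recovery_strategy(gap)}")
```

Under those made-up numbers a replica that is more than two seconds behind already falls back to full replication.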

Re: Solr-839 and version 4.5 (XmlQueryParser)

2013-12-17 Thread Daniel Collins
from the Lucene one. Will try to get our patches updated and issued over Xmas. On 17 December 2013 14:53, Puneet Pawaia puneet.paw...@gmail.com wrote: Hi All, Not being a Java expert, I used Daniel Collins' modification to patch with version 4.0 source. It works for a start. Have not been able

Re: Xml Query Parser

2013-12-06 Thread Daniel Collins
You are right that the XmlQueryParser isn't completely/yet implemented in Solr. There is the JIRA mentioned above, which is still WIP, so you could use that as a basis and extend it. If you aren't familiar with Solr and Java, you might find that a struggle, in which case you might want to

Re: SOLR 4 not utilizing multi CPU cores

2013-12-05 Thread Daniel Collins
Not sure if you are really stating the problem here. If you don't use Solr sharding, (I also assume you aren't using SolrCloud), and I'm guessing you are a single core (but can you confirm). As I understand Solr's logic, for a single query on a single core, that will only use 1 thread (ignoring

Re: SOLR 4 not utilizing multi CPU cores

2013-12-05 Thread Daniel Collins
- Index is same and optimized. However, as I said in a previous mail the issue seems to be Surround Query Parser which is parsing the query in a different format. On Thu, Dec 5, 2013 at 2:24 PM, Daniel Collins danwcoll...@gmail.com wrote: Not sure if you are really stating the problem here

Re: Questions about commits and OOE

2013-12-04 Thread Daniel Collins
I'd second the use of jstack to check your threads. Each request (be it a search or update) will generate a request handler thread on the Solr side (unless you've set the limits in the HttpShardHandlerFactory (solr.xml for solr-wide defaults and/or under the requestHandler in SolrConfig.xml), we

Re: syncronization between replicas

2013-11-27 Thread Daniel Collins
I think when a replica becomes leader, it tries to sync *from* all the other replicas to see if anyone else is more up to date than it is, then it syncs back out *to* the replicas. But that probably won't happen in your case, since when replica1 comes back (step 4) it is the only contender, so it

FYI real-time get handler is needed for Solr cloud recovery.

2013-11-25 Thread Daniel Collins
Just had an issue on our Solr cloud and wanted to point this out to the list at large. The real-time /get handler is used by Solr Cloud's sync/recovery mechanism, so *DO NOT* remove it from SolrConfig if you are using Solr Cloud! We did (because we weren't using real-time get ourselves and we

Re: Multiple data/index.YYYYMMDD.... dirs == bug?

2013-11-20 Thread Daniel Collins
In our experience (with SolrCloud), if you trigger a full replication (e.g. new replica), you get the timestamp directory, it never renames back to just index. Since index.properties gives you the name of the real directory, we had never considered that a problem/bug. Why bother with the rename

Re: Question regarding possibility of data loss

2013-11-19 Thread Daniel Collins
Regarding data loss, Solr returns an error code to the calling app (either HTTP error code, or equivalent in SolrJ), so if it fails to index for a known reason, you'll know about it. There are always edge cases though. If Solr indexes the document (returns success), that means the document is

Re: Please explain SolConfig.xml in terms of SolrAPIs (Java Psuedo Code)

2013-10-25 Thread Daniel Collins
I think what you are looking for is some kind of DTD/schema you can use to see all the possible parameters in SolrConfig.xml, short answer, there isn't one (currently) :( jetty.xml has a DTD schema, and its XMLConfiguration format is inherently designed to convert to code, so the list of possible

Re: New shard leaders or existing shard replicas depends on zookeeper?

2013-10-24 Thread Daniel Collins
Ah yes, I was about to mention that, -DnumShards is only actually used when the collection is being created for the first time. After that point (i.e. once the collection exists in ZK), passing it along the command line is redundant (Solr won't actually read it). I know preferred mechanism of

Re: SolrCloud - shard containing an invalid host:port

2013-09-03 Thread Daniel Collins
Was it a test instance that you created? 8983 is the default port, so possibly you started an instance before you had the ports setup properly, and it registered in zookeeper as a valid instance. You can use the Core API to UNLOAD it (if it is still running), if it isn't running anymore, I have

Re: Data Centre recovery/replication, does this seem plausible?

2013-08-29 Thread Daniel Collins
a remote recovery option. Which is _still_ kind of tricky, I sure hope you have identical sharding schemes. FWIW, Erick On Wed, Aug 28, 2013 at 1:12 PM, Shawn Heisey s...@elyograg.org wrote: On 8/28/2013 10:48 AM, Daniel Collins wrote: What ideally I would like

Re: What does it mean when a shard is down in solr4.4?

2013-08-29 Thread Daniel Collins
Well if it is down, it means there is an error on that particular core/instance of Solr, you would need to check the logs on that instance to see what the underlying problem is, there is no one root cause. How to recover: fix the underlying problem and restart that Solr instance? :) With the

Re: why does a node switch state ?

2013-08-28 Thread Daniel Collins
Do you see anything in the solr logs as to what the trigger for your nodes changing state was? You should see some kind of error/warning before the election is triggered. My gut feeling would be loss of communication between your leader and ZK (possibly by a GC event that locks the JVM for a

Data Centre recovery/replication, does this seem plausible?

2013-08-28 Thread Daniel Collins
We have 2 separate data centers in our organisation, and in order to maintain the ZK quorum during any DC outage, we have 2 separate Solr clouds, one in each DC with separate ZK ensembles but both are fed with the same indexing data. Now in the event of a DC outage, all our Solr instances go

Re: Data Centre recovery/replication, does this seem plausible?

2013-08-28 Thread Daniel Collins
:26, Shawn Heisey s...@elyograg.org wrote: On 8/28/2013 6:13 AM, Daniel Collins wrote: We have 2 separate data centers in our organisation, and in order to maintain the ZK quorum during any DC outage, we have 2 separate Solr clouds, one in each DC with separate ZK ensembles but both are fed

Re: Solr 4.4 Cloud always indexing to only one shard

2013-08-13 Thread Daniel Collins
I think I see the confusion. Erick is right that using collections API would sort the problem, but here is my rationale on why the confusion exists. There are 3 stages to creating a valid collection (well this is how I think of it) 1) Upload a solrconfig.xml/schema.xml (+ A N Other required

Re: Error while indexing in solrcloud

2013-08-09 Thread Daniel Collins
The shard update error in essence means the shard that received the update was trying to forward it on to the leader of that shard. Do you send all your indexing requests to 1 node (though that doesn't really matter here)? The error 503 normally means Solr is down at the remote end, are they all

Solr 4.4. creating an index that 4.3 can't read (but in LUCENE_43 mode)

2013-08-07 Thread Daniel Collins
I had been running a Solr 4.3.0 index, which I upgraded to 4.4.0 (but hadn't changed LuceneVersion, so it was still using the LUCENE_43 codec). I then had to back-out and return to a 4.3 system, and got an error when it tried to read the index. Now, it was only a dev system, so not a problem,

Re: Solr 4.4. creating an index that 4.3 can't read (but in LUCENE_43 mode)

2013-08-07 Thread Daniel Collins
...@elyograg.org wrote: On 8/7/2013 3:33 AM, Daniel Collins wrote: I had been running a Solr 4.3.0 index, which I upgraded to 4.4.0 (but hadn't changed LuceneVersion, so it was still using the LUCENE_43 codec). I then had to back-out and return to a 4.3 system, and got an error when it tried to read

Does Solr 4.4 support deploying with no cores or is that only later?

2013-07-26 Thread Daniel Collins
I think I've confused myself here (not hard these days!), I have the branch_4x code checked out, and that version definitely supports starting Solr with no cores at all. I still get an Admin UI and I can then use that to create cores/collections starting from a clean slate. Does that work in

Re: Switching to using SolrCloud with tomcat7 and embedded zookeeper

2013-07-17 Thread Daniel Collins
You've specified bootstrap_confdir and the same collection.configName on all your cores, so as each of them starts, each will be uploading its own configuration to the collection1_conf area of ZK, so they will all be overwriting each other. Are your 4 cores replicas of the same collection or are

Are analysers applied to each value in a multi-valued field separately?

2013-07-16 Thread Daniel Collins
I'm guessing the answer is yes, but here's the background. We index 2 separate fields, headline and body text for a document, and then we want to identify the top of the story, which is the headline + N words of the body (we want to weight that in scoring). So to do that: copyField src=headline

Re: SolrCloud softcommit problem

2013-07-16 Thread Daniel Collins
I think this is SOLR-4923 https://issues.apache.org/jira/browse/SOLR-4923, should be fixed in 4.4 (when it comes out) or grab the branch_4x branch from svn. On 16 July 2013 14:12, giovanni.bricc...@banzai.it giovanni.bricc...@banzai.it wrote: Hi I'm using solr version 4.3.1. I have a core

Re: Are analysers applied to each value in a multi-valued field separately?

2013-07-16 Thread Daniel Collins
token position - each successive analyzed value would have an incremented position, plus the positionIncrementGap (typically 100 for text.) -- Jack Krupansky -Original Message- From: Daniel Collins Sent: Tuesday, July 16, 2013 8:46 AM To: solr-user@lucene.apache.org Subject

Re: Are analysers applied to each value in a multi-valued field separately?

2013-07-16 Thread Daniel Collins
Self-correction, we'd need to set LimitTokenPositionFilterFactory to PI + N to give the results above because of the increment gap between values. On 16 July 2013 17:16, Daniel Collins danwcoll...@gmail.com wrote: Thanks Jack. There seem to be a never ending set of FilterFactories, I keep
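To make the position arithmetic in this thread concrete, here is a minimal sketch of how token positions might be laid out across the values of a multi-valued field. It is a simplified model only: the function name `token_positions` is hypothetical, whitespace splitting stands in for a real analyzer, and the exact offsets Lucene assigns may differ by version.

```python
def token_positions(values, gap=100):
    """Return a list of position lists, one per field value.

    Simplified model (an assumption, not Lucene's actual code): positions
    keep counting from one value to the next, and `gap` (standing in for
    positionIncrementGap) is added before every value after the first.
    """
    positions = []
    next_pos = 0
    for i, value in enumerate(values):
        if i > 0:
            next_pos += gap  # jump over the gap between values
        current = []
        for _ in value.split():  # whitespace split stands in for analysis
            current.append(next_pos)
            next_pos += 1
        positions.append(current)
    return positions


if __name__ == "__main__":
    print(token_positions(["a b", "c d e"], gap=100))
```

Under this model the second value starts well past the end of the first, which is why any position-based limit has to allow for the gap as well as the N words wanted.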

Re: Need advice on performing 300 queries per second on solr index

2013-07-16 Thread Daniel Collins
You only have a 20Gb collection but is that per machine or total collection, so 10Gb per machine? What memory do you have available on those 2 machines, is it enough to get the collection into the disk cache? What OS is it (linux/windows, etc)? What heap size does your JVM have? Is it a static

Re: Calculating Solr document score by ignoring the boost field.

2013-07-10 Thread Daniel Collins
Sorry to repeat Jack's previous answer, but x times zero is always zero :) An index boost is just what the name suggests, a factor by which the document score is boosted (multiplied). Since it is an index-time value, it is stored alongside the document, so any future scoring of the document by
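The "x times zero" point can be shown with a toy calculation. This is only an illustration of the multiplication described above, not Lucene's real scoring formula; `boosted_score` and the sample numbers are made up for the example.

```python
def boosted_score(base_score, index_boost):
    """Toy model: the boost is a plain multiplier on the base score."""
    return base_score * index_boost


if __name__ == "__main__":
    # A boost of 2.0 doubles the score; a boost of 0.0 wipes it out,
    # no matter how well the document matched.
    print(boosted_score(3.5, 2.0))
    print(boosted_score(3.5, 0.0))
```

Because the multiplier is applied to whatever the query produces, a zero boost cannot be "ignored" at query time in this model; the stored factor would have to change.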

Re: Norms

2013-07-10 Thread Daniel Collins
I don't know the full answer to your question, but here's what I can offer. Solr offers 2 types of normalisation, FieldNorm and QueryNorm. FieldNorm is as the name suggests field level normalisation, based on length of the field, and can be controlled by the omitNorms parameter on the field. In

Re: Solr Hangs During Updates for over 10 minutes

2013-07-10 Thread Daniel Collins
We had something similar in terms of update times suddenly spiking up for no obvious reason. We never got quite as bad as you in terms of the other knock on effects, but we certainly saw updates jumping from 10ms up to 3ms, all our external queues backed up and we rejected some updates, then

Re: Solr Live Nodes not updating immediately

2013-07-10 Thread Daniel Collins
What do you have your ZK Timeout set to (zkClientTimeout in solr.xml or command line if you override it)? A kill of the raw process is bad, but ZK should spot that using its heartbeat mechanism, so unless your timeout is very large, it should be detecting the node is no longer available, and then

Re: PropagateServer Implementation for Solr

2013-07-04 Thread Daniel Collins
Ok, take the scenario where the calling app uses SolrJ and creates a CloudSolrServer to send all its requests to. In that case, yes, I can see the logic that says CloudSolrServer shouldn't load balance that (it's not that type of request); it should forward it on to all the servers in the cloud.

Re: Auto Soft commit not working !!!

2013-07-04 Thread Daniel Collins
You should see the commit messages in the solr logs, do they come up at the expected frequency? On 4 July 2013 15:35, Rohit Kumar rohit.kku...@gmail.com wrote: My solr config has : <autoCommit> <maxTime>15000</maxTime> <openSearcher>false</openSearcher> </autoCommit> <!--

Re: OOM killer script woes

2013-07-02 Thread Daniel Collins
. Cheers, Tim On Wed, Jun 26, 2013 at 2:43 PM, Daniel Collins danwcoll...@gmail.com wrote: Ooh, I guess Jetty is trapping that java.lang.OutOfMemoryError, and throwing it/packaging it as a java.lang.RuntimeException. The -XX option assumes that the application doesn't handle the Errors

Re: undefined field http:// while searchi query

2013-07-02 Thread Daniel Collins
Presuming that uses the standard Lucene query parser syntax, you have asked to query the field called http, searching for the value //www.google.co.in. See http://wiki.apache.org/solr/SolrQuerySyntax for more details, but you probably want to escape the : at least, http\://www.google.co.in
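If the query string is built in client code, the escaping can be done there. The sketch below is loosely modelled on what SolrJ's ClientUtils.escapeQueryChars does; the Python function name and the exact character set are assumptions for illustration, so check them against the query parser you actually use.

```python
# Characters treated as special by the Lucene query parser (assumed list).
SPECIAL_CHARS = set('\\+-!():^[]"{}~*?|&;/')


def escape_query_chars(text):
    """Backslash-escape query syntax characters and whitespace."""
    escaped = []
    for ch in text:
        if ch in SPECIAL_CHARS or ch.isspace():
            escaped.append("\\")
        escaped.append(ch)
    return "".join(escaped)


if __name__ == "__main__":
    print(escape_query_chars("http://www.google.co.in"))
```

This escapes the slashes as well as the colon, which is more than the minimum suggested above but keeps the whole URL from being read as query syntax.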
