Re: SolrJ appears to have problems with Docker Toolbox

2017-04-11 Thread Mike Thomsen
e ip address your VM (the public network > interface where the docker instance is mapped). > > Hope you understand... :) > > Cheers, > Vincenzo > > > On Sun, Apr 9, 2017 at 2:42 AM, Mike Thomsen <mikerthom...@gmail.com> > wrote: > > > I'm running two nodes of Sol

Re: SolrJ appears to have problems with Docker Toolbox

2017-04-08 Thread Mike Thomsen
think that this question would get better help in a Docker forum. > Cheers -- Rick > > On April 8, 2017 8:42:13 PM EDT, Mike Thomsen <mikerthom...@gmail.com> > wrote: > >I'm running two nodes of SolrCloud in Docker on Windows using Docker > >Toolbox. The problem I am

SolrJ appears to have problems with Docker Toolbox

2017-04-08 Thread Mike Thomsen
I'm running two nodes of SolrCloud in Docker on Windows using Docker Toolbox. The problem I am having is that Docker Toolbox runs inside of a VM and so it has an internal network inside the VM that is not accessible to the Docker Toolbox VM's host OS. If I go to the VM's IP which is

Re: Data Import

2017-03-17 Thread Mike Thomsen
If Solr is down, then adding through SolrJ would fail as well. Kafka's new API has some great features for this sort of thing. The new client API is designed to be run in a long-running loop where you poll for new messages with a certain amount of defined timeout (ex: consumer.poll(1000) for 1s)

Re: SOLR Data Locality

2017-03-17 Thread Mike Thomsen
I've only ever used the HDFS support with Cloudera's build, but my experience turned me off using HDFS. I'd much rather use the native file system over HDFS. On Tue, Mar 14, 2017 at 10:19 AM, Muhammad Imad Qureshi < imadgr...@yahoo.com.invalid> wrote: > We have a 30 node Hadoop cluster and each

How to expose new Lucene field type to Solr

2017-03-02 Thread Mike Thomsen
Found this project and I'd like to know what would be involved with exposing its RestrictedField type through Solr for indexing and querying as a Solr field type. https://github.com/roshanp/lucure-core Thanks, Mike

Re: solr warning - filling logs

2017-02-27 Thread Mike Thomsen
It's a brittle ZK configuration. A typical ZK quorum is three nodes for most production systems. One is fine, though, for development provided the system it's on is not overloaded. On Mon, Feb 27, 2017 at 6:43 PM, Rick Leir wrote: > Hi Mike > We are using a single ZK node, I

Re: Index Segments not Merging

2017-02-27 Thread Mike Thomsen
Just barely skimmed the documentation, but it looks like the tool generates its own shards and pushes them into the collection by manipulating the configuration of the cluster. https://www.cloudera.com/documentation/enterprise/5-8-x/topics/search_mapreduceindexertool.html If that reading is

Re: solr warning - filling logs

2017-02-27 Thread Mike Thomsen
When you transition to an external zookeeper, you'll need at least 3 ZK nodes. One is insufficient outside of a development environment. That's a general requirement for any system that uses ZK. On Sun, Feb 26, 2017 at 7:14 PM, Satya Marivada wrote: > May I ask about
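The three-node figure follows from ZooKeeper's majority rule: an ensemble keeps serving only while more than half of its servers are reachable. A minimal sketch of that arithmetic (the function names are illustrative and not part of any ZooKeeper or Solr API):

```python
def quorum_size(ensemble_size: int) -> int:
    """Smallest number of servers that forms a strict majority."""
    if ensemble_size < 1:
        raise ValueError("ensemble must have at least one server")
    return ensemble_size // 2 + 1


def tolerated_failures(ensemble_size: int) -> int:
    """How many servers can be lost while a majority still remains."""
    return ensemble_size - quorum_size(ensemble_size)


# Print ensemble size, majority needed, and failures tolerated.
for n in (1, 2, 3, 5):
    print(n, quorum_size(n), tolerated_failures(n))
```

One and two nodes both tolerate zero failures, which is why three is the smallest ensemble that survives the loss of a node.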

Re: Fwd: Solr dynamic field blowing up the index size

2017-02-21 Thread Mike Thomsen
Correct me if I'm wrong, but heavy use of doc values should actually blow up the size of your index considerably if they are in fields that get sent a lot of data. On Tue, Feb 21, 2017 at 10:50 AM, Pratik Patel wrote: > Thanks for the reply. I can see that in solr 6, more

Re: Solr partial update

2017-02-09 Thread Mike Thomsen
Set the fl parameter equal to the fields you want and then query for id:(SOME_ID OR SOME_ID OR SOME_ID) On Thu, Feb 9, 2017 at 5:37 AM, Midas A wrote: > Hi, > > i want solr doc partially if unique id exist else we donot want to do any > thing . > > how can i achieve this .
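The suggestion above can be sketched as a small helper that builds the request parameters. This is only an illustration: it assumes the unique key field is named `id`, and the field name `title` and the example ids are made up.

```python
from urllib.parse import urlencode


def build_lookup_params(ids, fields):
    """Build /select parameters that return only `fields` for the given ids."""
    if not ids:
        raise ValueError("at least one id is required")
    # Quote each id so characters such as '-' or ':' are not read as query syntax.
    quoted = ['"{}"'.format(i.replace("\\", "\\\\").replace('"', '\\"')) for i in ids]
    return {
        "q": "id:({})".format(" OR ".join(quoted)),
        "fl": ",".join(fields),
        "wt": "json",
    }


params = build_lookup_params(["doc-1", "doc-2"], ["id", "title"])
query_string = urlencode(params)  # ready to append to .../select?
print(query_string)
```

If an id is missing from the response, the document does not exist and the caller can skip the update for it.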

Re: Solr Kafka DIH

2017-01-31 Thread Mike Thomsen
Probably not, but writing your own little Java process to do it would be trivial with Kafka 0.9.X or 0.10.X. You can also look at the Confluent Platform as they have tons of connectors for Kafka to directly feed into other systems. On Mon, Jan 30, 2017 at 3:05 AM, Mahmoud Almokadem

Re: Is it possible to rewrite part of the solr response?

2017-01-18 Thread Mike Thomsen
> > All that said, if there's any way you can build this into tokens in the > doc and use a standard fq clause it's usually much easier. That may > take some creative work at indexing time if it's even possible. > > Best, > Erick > > On Wed, Dec 21, 2016 at 5:56 PM, Mike Thoms

Re: Solr ACL Plugin Windows

2017-01-04 Thread Mike Thomsen
I didn't see a real Java project there, but the directions to compile on Linux are almost always applicable to Windows with Java. If you find a project that says it uses Ant or Maven, all you need to do is download Ant or Maven, the Java Development Kit and put both of them on the windows path.

Re: HDFS support maturity

2017-01-03 Thread Mike Thomsen
Cloudera defaults their Hadoop installation to use HDFS w/ their bundle of Solr (4.10.3) if that is any indication. On Tue, Jan 3, 2017 at 7:40 AM, Hendrik Haddorp wrote: > Hi, > > is the HDFS support in Solr 6.3 considered production ready? > Any idea how many setups

Re: Is it possible to rewrite part of the solr response?

2016-12-21 Thread Mike Thomsen
". Of course if your business logic is such that > you can calculate them all "fast enough", you're golden. > > All that said, if there's any way you can build this into tokens in the > doc and use a standard fq clause it's usually much easier. That may > take some creative wo

Is it possible to rewrite part of the solr response?

2016-12-21 Thread Mike Thomsen
We're trying out some ideas on locking down solr and would like to know if there is a public API that allows you to grab the response before it is sent and inspect it. What we're trying to do is something for which a filter query is not a good option to really get where we want to be. Basically,

Replica document counts out of sync

2016-11-30 Thread Mike Thomsen
In one of our environments, we have an issue where one shard has two replicas with smaller document counts than the third one. This is on Solr 4.10.3 (Cloudera's build). We've found that shutting down the smaller replicas, deleting their data folders and restarting one by one will do the trick of

Detecting schema errors while adding documents

2016-11-16 Thread Mike Thomsen
We're stuck on Solr 4.10.3 (Cloudera bundle). Is there any way to detect with SolrJ when a document added to the index violated the schema? All we see when we look at the stacktrace for the SolrException that comes back is that it contains messages about an IOException when talking to the solr

Re: Rolling backups of a collection

2016-11-09 Thread Mike Thomsen
plements this functionality outside of Solr. If you post that > > script, may be we can even ship it as part of Solr itself (for the > benefit > > of the community). > > > > Thanks > > Hrishikesh > > > > > > > > On Wed, Nov 9, 2016 at 9:17 A

Rolling backups of a collection

2016-11-09 Thread Mike Thomsen
I read over the docs ( https://cwiki.apache.org/confluence/display/solr/Making+and+Restoring+Backups) and am not quite sure what route to take. My team is looking for a way to backup the entire index of a SolrCloud collection with regular rotation similar to the backup option available in a single

Backup to HDFS while running cluster on local disk

2016-11-08 Thread Mike Thomsen
We have SolrCloud running on bare metal but want the nightly snapshots to be written to HDFS. Can someone give me some help on configuring the HdfsBackupRepository? ${solr.hdfs.default.backup.path} ${solr.hdfs.home:} ${solr.hdfs.confdir:} Not sure how to

Best way to generate multivalue fields from streaming API

2016-09-16 Thread Mike Thomsen
Read this article and thought it could be interesting as a way to do ingestion: https://dzone.com/articles/solr-streaming-expressions-for-collection-auto-upd-1 Example from the article: daemon(id="12345", runInterval="6", update(users, batchSize=10,

Update command not working

2016-02-26 Thread Mike Thomsen
I posted this to http://localhost:8983/solr/default-collection/update and it treated it like I was adding a whole document, not a partial update: { "id": "0be0daa1-a6ee-46d0-ba05-717a9c6ae283", "tags": { "add": [ "news article" ] } } In the logs, I found this: 2016-02-26
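For comparison, the atomic-update examples in the Solr Reference Guide send the documents as a JSON array, with a modifier object such as `{"add": [...]}` as the field value. The sketch below builds a body in that shape. The helper name is hypothetical, and the thread does not confirm that the array form resolves the behavior described; atomic updates also depend on schema and server configuration, which this does not cover.

```python
import json


def atomic_add_payload(doc_id, field, values):
    """Build a JSON body that appends `values` to a multi-valued field."""
    # The modifier object {"add": [...]} marks the field as a partial update
    # rather than a full replacement value.
    doc = {"id": doc_id, field: {"add": list(values)}}
    # Wrap the document in a JSON array, as in the reference guide examples.
    return json.dumps([doc])


body = atomic_add_payload(
    "0be0daa1-a6ee-46d0-ba05-717a9c6ae283", "tags", ["news article"]
)
print(body)
```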

Re: /select changes between 4 and 5

2016-02-24 Thread Mike Thomsen
uld > read/accept other content types? > > -Yonik > > > On Wed, Feb 24, 2016 at 8:48 AM, Mike Thomsen <mikerthom...@gmail.com> > wrote: > > With 4.10, we used to post JSON like this example (part of it is Python) > to > > /select: > > > > { >

/select changes between 4 and 5

2016-02-24 Thread Mike Thomsen
With 4.10, we used to post JSON like this example (part of it is Python) to /select: { "q": "LONG_QUERY_HERE", "fq": fq, "fl": ["id", "title", "date_of_information", "link", "search_text"], "rows": 100, "wt": "json", "indent": "true", "_": int(time.time()) } We just

Leader election issues after upgrade from 4.10.4 to 5.4.1

2016-02-08 Thread Mike Thomsen
We get this error on one of our nodes: Caused by: org.apache.solr.common.SolrException: There is conflicting information about the leader of shard: shard2 our state says: http://server01:8983/solr/collection/ but zookeeper says: http://server02:8983/collection/ Then I noticed this in the log:

zkCli.sh not in solr 5.4?

2016-01-19 Thread Mike Thomsen
I downloaded a build of 5.4.0 to install in some VMs and noticed that zkCli.sh is not there. I need it in order to upload a configuration set to ZooKeeper before I create the collection. What's the preferred way of doing that? Specifically, I need to specify a configuration like this because it's

Phrase query not matching exact tokens in some cases

2015-07-14 Thread Mike Thomsen
For the query "police office", our users are getting back highlighted results for police office*r* (and police office*rs*). I get why a search for "police officers" would include just "office", since the stemmer would cause that behavior. However, I don't understand why "office" is matching "officer" here when

Re: Exact phrase search on very large text

2015-06-26 Thread Mike Thomsen
limit for individual terms. Use tokenized text instead. -- Jack Krupansky On Thu, Jun 25, 2015 at 8:36 PM, Mike Thomsen mikerthom...@gmail.com wrote: I need to be able to do exact phrase searching on some documents that are a few hundred kb when treated as a single block of text. I'm

Exact phrase search on very large text

2015-06-25 Thread Mike Thomsen
I need to be able to do exact phrase searching on some documents that are a few hundred kb when treated as a single block of text. I'm on 4.10.4 and it complains when I try to put something larger than 32kb in using a textfield with the keyword tokenizer as the tokenizer. Is there any way I can

ManagedStopFilterFactory not accepting ignoreCase

2015-06-17 Thread Mike Thomsen
We're running Solr 4.10.4 and getting this... Caused by: java.lang.IllegalArgumentException: Unknown parameters: {ignoreCase=true} at org.apache.solr.rest.schema.analysis.BaseManagedTokenFilterFactory.init(BaseManagedTokenFilterFactory.java:46) at

Exact phrase search not working

2015-06-11 Thread Mike Thomsen
This is my field definition: <fieldType name="text_en_splitting" class="solr.TextField" positionIncrementGap="100" autoGeneratePhraseQueries="true"> <analyzer type="index"> <tokenizer class="solr.WhitespaceTokenizerFactory"/> <filter class="solr.LowerCaseFilterFactory"/> <filter

Re: Shard still around after calling splitshard

2015-06-04 Thread Mike Thomsen
, Jun 4, 2015 at 10:35 AM, Mike Thomsen mikerthom...@gmail.com wrote: I thought splitshard was supposed to get rid of the original shard, shard1, in this case. Am I missing something? I was expecting the only two remaining shards to be shard1_0 and shard1_1. The REST call I used

Shard still around after calling splitshard

2015-06-04 Thread Mike Thomsen
I thought splitshard was supposed to get rid of the original shard, shard1, in this case. Am I missing something? I was expecting the only two remaining shards to be shard1_0 and shard1_1. The REST call I used was /admin/collections?collection=default-collection&shard=shard1&action=SPLITSHARD if
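Collections API parameters are ordinary query-string pairs separated by `&`, so building the URL programmatically avoids separator mistakes. A minimal sketch, assuming a base URL of `http://localhost:8983/solr` (the host and port are not given in the message, and the helper name is made up):

```python
from urllib.parse import urlencode


def splitshard_url(base_url, collection, shard):
    """Return the Collections API URL that requests a SPLITSHARD."""
    params = {"action": "SPLITSHARD", "collection": collection, "shard": shard}
    # urlencode joins the pairs with '&' and escapes reserved characters.
    return "{}/admin/collections?{}".format(base_url.rstrip("/"), urlencode(params))


url = splitshard_url("http://localhost:8983/solr", "default-collection", "shard1")
print(url)
```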

Managed synonyms and Solr Java API

2015-04-29 Thread Mike Thomsen
Is there a way to manage synonyms through Solr's Java API? Google doesn't turn up any good results, and I didn't see anything in the javadocs that looked promising. Thanks, Mike

Using synonyms API

2015-04-15 Thread Mike Thomsen
We recently upgraded from 4.5.0 to 4.10.4. I tried getting a list of our synonyms like this: http://localhost/solr/default-collection/schema/analysis/synonyms/english I got a not found error. I found this page on new features in 4.8 http://yonik.com/solr-4-8-features/ Do we have to do

Re: Using synonyms API

2015-04-15 Thread Mike Thomsen
). -Yonik On Wed, Apr 15, 2015 at 11:11 AM, Mike Thomsen mikerthom...@gmail.com wrote: We recently upgraded from 4.5.0 to 4.10.4. I tried getting a list of our synonyms like this: http://localhost/solr/default-collection/schema/analysis/synonyms/english I got a not found error. I found

Re: Using synonyms API

2015-04-15 Thread Mike Thomsen
, joyful]}}} Verify that your URL has the correct port number (your example below doesn't), and that default-collection is actually the name of your default collection (and not collection1 which is the default for the 4x series). -Yonik On Wed, Apr 15, 2015 at 11:11 AM, Mike Thomsen

Re: Using the collections API to create a new collection

2015-03-15 Thread Mike Thomsen
the config set in zk with your collection. I think it would make a lot of sense for you to go through the getting started with SolrCloud section in the Solr Reference Guide for 4.5. On Sat, Mar 14, 2015 at 12:02 PM, Mike Thomsen mikerthom...@gmail.com wrote: I looked in the tree view and I have

Re: Using the collections API to create a new collection

2015-03-15 Thread Mike Thomsen
and reads the config files and starts up. The replica does _not_ copy the files locally. HTH, Erick On Sun, Mar 15, 2015 at 6:16 AM, Mike Thomsen mikerthom...@gmail.com wrote: I tried that with upconfig, and it created it under /configs. Our ZK configuration data is under /dev-local-solr

Re: Using the collections API to create a new collection

2015-03-14 Thread Mike Thomsen
solrcloud Any ideas? Is there a way to force an update into zookeeper? Or should I just purge the zookeeper data? On Sat, Mar 14, 2015 at 3:02 PM, Mike Thomsen mikerthom...@gmail.com wrote: I looked in the tree view and I have only a node called configs. Nothing called configsets. That's a serious

Re: Using the collections API to create a new collection

2015-03-14 Thread Mike Thomsen
be existing, it should be the configset. It's often a bit confusing because if the configName is not specified, the default is to look for a config set of the same name as the collection being created. Best, Erick On Sat, Mar 14, 2015 at 10:26 AM, Mike Thomsen mikerthom...@gmail.com wrote

Using the collections API to create a new collection

2015-03-14 Thread Mike Thomsen
We're running SolrCloud 4.5.0. It's just a standard version of SolrCloud deployed in Tomcat, not something like the Cloudera distribution (I note that because I can't seem to find solrctl and other things referenced in the Cloudera tutorials). I'm trying to create a new Solr collection like this:

Solr cannot find solr.xml even though it's there

2014-12-20 Thread Mike Thomsen
I'm getting the following stacktrace with Solr 4.5.0 SEVERE: null:org.apache.solr.common.SolrException: Could not load SOLR configuration at org.apache.solr.core.ConfigSolr.fromFile(ConfigSolr.java:71) at org.apache.solr.core.ConfigSolr.fromSolrHome(ConfigSolr.java:98) at

Re: Solr cannot find solr.xml even though it's there

2014-12-20 Thread Mike Thomsen
at 3:40 PM, Shawn Heisey apa...@elyograg.org wrote: On 12/20/2014 12:27 PM, Mike Thomsen wrote: at java.lang.Thread.run(Thread.java:745) /solr.xml cannot start Solrcommon.SolrException: solr.xml does not exist in /opt/solr/solr-shard1

Need some help with solr not restarting

2014-08-11 Thread Mike Thomsen
I'm very new to SolrCloud. When I tried restarting our tomcat server running SolrCloud, I started getting this in our logs: SEVERE: org.apache.zookeeper.KeeperException$ConnectionLossException: KeeperErrorCode = ConnectionLoss for

Re: Is Solr right for our project?

2010-09-28 Thread Mike Thomsen
for feature descriptions Coming to a trunk near you - see https://issues.apache.org/jira/browse/SOLR-1873 -- Jan Høydahl, search solution architect Cominvent AS - www.cominvent.com On 27. sep. 2010, at 17.44, Mike Thomsen wrote: (I apologize in advance if I missed something in your

Is Solr right for our project?

2010-09-27 Thread Mike Thomsen
(I apologize in advance if I missed something in your documentation, but I've read through the Wiki on the subject of distributed searches and didn't find anything conclusive) We are currently evaluating Solr and Autonomy. Solr is attractive due to its open source background, following and price.

Newbie question about search behavior

2010-08-16 Thread Mike Thomsen
Is it possible to set up Lucene to treat a keyword search such as title:News implicitly like title:News* so that any title that begins with News will be returned without the user having to throw in a wildcard? Also, are there any common filters and such that are generally considered a good
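One way to get this behavior without changing Lucene itself is to rewrite the user's term in application code before it reaches the query parser. The sketch below is only an illustration of that idea: the function name is hypothetical, and it ignores escaping of special characters and the fact that wildcard terms may skip analysis steps such as lowercasing.

```python
def as_prefix_query(field, term):
    """Turn a plain keyword into a prefix query such as title:News*."""
    term = term.strip()
    if not term:
        raise ValueError("term must not be empty")
    # Leave the term alone if the user already supplied a trailing wildcard.
    if term.endswith("*"):
        return "{}:{}".format(field, term)
    return "{}:{}*".format(field, term)


print(as_prefix_query("title", "News"))
```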