Re: Indexing part of Binary Documents and not the entire contents

2018-06-21 Thread Shawn Heisey
On 6/20/2018 9:05 AM, neotorand wrote: I have a specific Requirement where i need to index below things Meta Data of any document Some parts from the Document that matches some keywords that i configure The first part i am able to achieve through ERH or FilelistEntityProcessor. I am

Re: CloudSolrClient - setDefaultCollection

2018-06-21 Thread Shawn Heisey
On 6/21/2018 5:04 AM, Greenhorn Techie wrote: While indexing, is there going to be any performance benefit to set the collection name first using setDefaultCollection

Re: Trouble using the MIGRATE command in the collections API on solr 7.3.1

2018-06-21 Thread Shawn Heisey
On 6/21/2018 7:08 AM, Matthew Faw wrote: For background, I’m using solr version 7.3.1 and lucene version 7.3.1 I have a solr collection with 2 shards and 3 replicas using the compositeId router. Each solr document has “id” as its unique key, where each id is of format DERP_${X}, where ${X}

Re: Delete By Query issue followed by Delete By Id Issues

2018-06-20 Thread Shawn Heisey
On 6/20/2018 3:46 PM, sujatha sankaran wrote: > Thanks,Shawn. Very useful information. > > Please find below the log details:- Is your collection using the implicit router?  You didn't say.  If it is, then I think you may not be able to use deleteById.  This is indeed a bug, one that has been

Re: Solr Upgrade DateField to TrieDateField

2018-06-20 Thread Shawn Heisey
On 6/20/2018 12:35 PM, Yunee Lee wrote: > I have two questions. > > 1. solr index on version 4.6.0 and there are multiple date fields as the type > DateField in schema.xml > When I upgraded to version 5.2.1 with new data type Trie* for integer, float, > string and date. > Only date fields are

Re: some solr replicas down

2018-06-20 Thread Shawn Heisey
On 6/20/2018 6:39 AM, Satya Marivada wrote: Yes, there are some other errors that there is a javabin character 2 expected and is returning 60 which is "<" . This happens when the response is an error.  Error responses are sent in HTML format (so they render properly when viewed in a browser),

Re: Drive Change for Solr Setup

2018-06-20 Thread Shawn Heisey
On 6/20/2018 5:03 AM, Srinivas Muppu (US) wrote: Hi Solr Team,My Solr project installation setup and instances(including clustered solr, zk services and indexing jobs schedulers) is available in Windows 'E:\ ' drive in production environment. As business needs to remove the E:\ drive, going

Re: Solrcloud doesn't like relative path

2018-06-20 Thread Shawn Heisey
On 6/19/2018 5:47 PM, Sushant Vengurlekar wrote: Based on your suggestion I moved the helpers to be under configsets/conf so my new folder structure looks -configsets - conf helpers synonyms_vendors.txt - collection1 -conf

Re: Delete By Query issue followed by Delete By Id Issues

2018-06-20 Thread Shawn Heisey
On 6/15/2018 3:14 PM, sujatha sankaran wrote: We were initially having an issue with DBQ and heavy batch updates which used to result in many missing updates. After reading many mails in mailing list which mentions that DBQ and batch update do not work well together, we switched to DBI. But

Re: Import data from standalone solr into a solrcloud collection

2018-06-19 Thread Shawn Heisey
On 6/19/2018 11:50 AM, Sushant Vengurlekar wrote: > I created a solr cloud collection with 2 shards and a replication factor of > 2. How can I load data into this collection which I have currently stored > in a core on a standalone solr. I used the conf from this core on > standalone solr to

Re: sharding and placement of replicas

2018-06-19 Thread Shawn Heisey
On 6/15/2018 11:08 AM, Oakley, Craig (NIH/NLM/NCBI) [C] wrote: > If I start with a collection X on two nodes with one shard and two replicas > (for redundancy, in case a node goes down): a node on host1 has > X_shard1_replica1 and a node on host2 has X_shard1_replica2: when I try > SPLITSHARD,

Re: Remove schema.xml in favor of managed-schema

2018-06-19 Thread Shawn Heisey
On 6/17/2018 6:48 PM, S G wrote: > I only wanted to know if schema.xml offer anything that managed-schema does > not. The only difference between the two is that there is a different filename and the managed version can be modified by API calls.  The schema format and what you can do within that

Re: Suggestions for debugging performance issue

2018-06-14 Thread Shawn Heisey
On 6/12/2018 12:06 PM, Chris Troullis wrote: > The issue we are seeing is with 1 collection in particular, after we set up > CDCR, we are getting extremely slow response times when retrieving > documents. Debugging the query shows QTime is almost nothing, but the > overall responseTime is like 5x

Re: Changing Field Assignments

2018-06-14 Thread Shawn Heisey
On 6/14/2018 12:10 PM, Terry Steichen wrote: > I don't disagree at all, but have a basic question: How do you easily > transition from a system using a dynamic schema to one using a fixed one? Not sure you need to actually transition.  Just remove the config in solrconfig.xml that causes Solr to

Re: Indexing to replica instead leader

2018-06-14 Thread Shawn Heisey
On 6/8/2018 3:56 AM, SOLR4189 wrote: > /When a document is sent to a Solr node for indexing, the system first > determines which Shard that document belongs to, and then which node is > currently hosting the leader for that shard. The document is then forwarded > to the current leader for

Re: Changing Field Assignments

2018-06-14 Thread Shawn Heisey
On 6/11/2018 2:02 PM, Terry Steichen wrote: > I am using Solr (6.6.0) in the automatic mode (where it discovers > fields).  It's working fine with one exception.  The problem is that > Solr maps the discovered "meta_creation_date" is assigned the type > TrieDateField.  > > Unfortunately, that type

Re: 7.3.1 creates thousands of threads after start up

2018-06-13 Thread Shawn Heisey
On 6/13/2018 4:04 AM, Markus Jelsma wrote: You mentioned shard handler tweaks, thanks. I see we have an incorrect setting there for maximumPoolSize, way too high, but that doesn't account for the number of threads created. After reducing the number, for dubious reasons, twice the number of

Re: Solr 7 + HDFS issue

2018-06-13 Thread Shawn Heisey
On 6/12/2018 10:14 PM, Joe Obernberger wrote: Thank you Shawn.  It looks like it is being applied.  This could be some sort of chain reaction where: Drive or server fails.  HDFS starts to replicate blocks which causes network congestion.  Solr7 can't talk, so initiates a replication process

Re: Solr 7 + HDFS issue

2018-06-12 Thread Shawn Heisey
On 6/11/2018 9:46 AM, Joe Obernberger wrote: > We are seeing an issue on our Solr Cloud 7.3.1 cluster where > replication starts and pegs network interfaces so aggressively that > other tasks cannot talk.  We will see it peg a bonded 2GB interfaces.  > In some cases the replication fails over and

Re: Hardware-Aware Solr Coud Sharding?

2018-06-12 Thread Shawn Heisey
On 6/12/2018 9:12 AM, Michael Braun wrote: > The way to handle this right now looks to be running additional Solr > instances on nodes with increased resources to balance the load (so if the > machines are 1x, 1.5x, and 2x, run 2 instances, 3 instances, and 4 > instances, respectively). Has anyone

Re: Solr sort multivalued field

2018-06-12 Thread Shawn Heisey
On 6/12/2018 2:56 AM, Marc Lammers wrote: I want to sort my data by a multivalued field. I add this to my query "sort=field(foo,min) asc". The configuration in the schema for this field is The documentation for the field function says that the field must contain numeric docvalues.  Your

Re: clusterstate json check in Solrj

2018-06-11 Thread Shawn Heisey
On 6/11/2018 8:35 AM, Anil wrote: > for failure could be #1. but not sure why client started with different > zkHost. i used 127.0.01:2181/solr itself. zk nodes started with > 127.0.0.1:2181, 2182, 2183 > is there anyway i can figure this out and correct it ? Thanks. I suspect that the difference

Re: Get replica status in solrj

2018-06-11 Thread Shawn Heisey
On 6/11/2018 3:29 AM, y y wrote: question is I can manage to get the correct replica state if I start and stop solr using command line. how if solr failure or crashed, how can I get the correct replica state? As I have stated elsewhere on this mailing list, actual Solr crashes are extremely

Re: indexer used in solr

2018-06-11 Thread Shawn Heisey
On 6/11/2018 5:34 AM, Vivek Singh wrote: I am new to solr ,wanted to know which indexer is used in apache solr by default . I am not getting good results. It's built in.  There is no "which" ... that would imply that there's more than one choice, and as far as I am aware, once you get right

Re: clusterstate json check in Solrj

2018-06-11 Thread Shawn Heisey
On 6/11/2018 6:41 AM, Anil wrote: I was trying solrcloud cluster setup using solr 7.3.1 and it is up. Admin console looks good and queries in console are working fine. But solrj connection failing with following exception org.apache.solr.common.SolrException: Cannot connect to cluster at

Re: Relationship Between Number of Solr Replicas and Number of Zookeeper Nodes (if any)

2018-06-11 Thread Shawn Heisey
On 6/11/2018 5:47 AM, THADC wrote: Shawn, thanks. You say "at least two replicas per shard are required for high availability". So that would be a total of three nodes for that shard, correct? The smallest possible fault-tolerant Solr install is a total of three servers.  Two of them will run
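The three-server minimum mentioned above follows from ZooKeeper's majority-quorum rule: the ensemble keeps working only while more than half of its servers can reach each other. A minimal sketch of that arithmetic follows; the function names are illustrative and are not part of Solr or ZooKeeper.

```python
def quorum_size(ensemble_size: int) -> int:
    """Smallest number of ZooKeeper servers that forms a strict majority."""
    if ensemble_size < 1:
        raise ValueError("ensemble_size must be at least 1")
    return ensemble_size // 2 + 1


def tolerated_failures(ensemble_size: int) -> int:
    """Servers that can be lost while a majority is still reachable."""
    return ensemble_size - quorum_size(ensemble_size)


# Print ensemble size, quorum size, and tolerated failures side by side.
for n in (1, 2, 3, 4, 5):
    print(n, quorum_size(n), tolerated_failures(n))
```

Under this rule a two-server ensemble tolerates no failures at all, and a fourth server adds nothing over three, which is why odd ensemble sizes are the usual choice.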

Re: how to configure LastFieldValueUpdateProcessorFactory

2018-06-10 Thread Shawn Heisey
On 6/8/2018 1:33 PM, root23 wrote: > Can someone point to me what i am missing? i read the documentation but > couldn't fully understand how to configure this ? I think that this is the config you'll need:       transaction_type               lastfieldvalue   There are a couple of

Re: 7.3.1 creates thousands of threads after start up

2018-06-10 Thread Shawn Heisey
On 6/8/2018 8:59 AM, Markus Jelsma wrote: > 2018-06-08 14:02:47.382 ERROR (qtp1458849419-1263) [ ] o.a.s.s.HttpSolrCall > null:org.apache.solr.common.SolrException: Error trying to proxy request for > url: http://idx2:8983/solr/ > search/admin/ping > Caused by:

Re: Solr for Content Management

2018-06-10 Thread Shawn Heisey
On 6/7/2018 12:10 PM, Moenieb Davids wrote: > Challenges: > When performing full text searches without concurrently executing updates, > solr seems to be doing well. Running updates also does okish given the > nature of the transaction. However, when I run search and updates > simultaneously,

Re: Relationship Between Number of Solr Replicas and Number of Zookeeper Nodes (if any)

2018-06-10 Thread Shawn Heisey
On 6/8/2018 12:13 PM, THADC wrote: > I am having trouble getting a clear understanding of the relationship > between my 3-node zookeeper cluster and how those 3 nodes relate to solr > replicas (if at all). Since the replicas exist for failover purposes > (correct?) as opposed to for load balancing

Re: Setting preferred replica for query/read

2018-06-09 Thread Shawn Heisey
On 6/9/2018 8:14 AM, Zheng Lin Edwin Yeo wrote: Just to confirm my understanding, if I have 2 replicas, I should set both of them to either NRT replicas or TLOG replicas, and not one of each. Then I set one of them to be PULL replica, which will be used for searching? There are multiple

Re: UUIDUpdateProcessorFactory can cause duplicate documents?

2018-06-09 Thread Shawn Heisey
On 6/9/2018 1:15 AM, S G wrote: That means if I send {"color":"red", "size":"L"} once, UUIDUpdateProcessorFactory will generate an "id" X and if I send the same document {"color":"red", "size":"L"} again, UUIDUpdateProcessorFactory will not know that its the same document and will generate an
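The concern in the question can be illustrated outside Solr: a randomly generated id differs on every submission, so the same fields sent twice become two documents, whereas an id derived from the content stays stable. The sketch below is plain Python and is not how Solr implements either behaviour; `content_id` and the choice of namespace are assumptions made for illustration.

```python
import json
import uuid


def random_id() -> str:
    """A fresh random UUID, different on every call."""
    return str(uuid.uuid4())


def content_id(doc: dict) -> str:
    """A name-based UUID derived from the document's fields.

    The fields are serialized with sorted keys so that key order does not
    change the result. The namespace is arbitrary (an assumption).
    """
    canonical = json.dumps(doc, sort_keys=True, separators=(",", ":"))
    return str(uuid.uuid5(uuid.NAMESPACE_URL, canonical))


doc = {"color": "red", "size": "L"}
print(random_id() == random_id())            # False: two sends, two ids
print(content_id(doc) == content_id(doc))    # True: same content, same id
```

Within Solr itself, content-based de-duplication is usually approached with `SignatureUpdateProcessorFactory` rather than with client-side hashing like this.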

Re: 7.3.1 creates thousands of threads after start up

2018-06-08 Thread Shawn Heisey
On 6/8/2018 8:17 AM, Markus Jelsma wrote: > Our local test environment mini cluster goes nuts right after start up. It is > a two node/shard/replica collection starts up normally if only one node start > up. But as soon as the second node attempts to join the cluster, both nodes > go crazy,

Re: Can replace the IP with the hostname or some unique identifier for each node in Solr

2018-06-08 Thread Shawn Heisey
On 6/8/2018 6:52 AM, akshat wrote: My question -> Is it possible to some way we can trick the Solr by replacing the IP which it shows in the graph to some unique identifier so that when swarm brings the new node it should still be pointing to the unique identifier name, not the IP. Each Solr

Re: Setting preferred replica for query/read

2018-06-07 Thread Shawn Heisey
On 6/7/2018 9:17 PM, Zheng Lin Edwin Yeo wrote: Thanks for your reply. As currently we are looking at having a replica to do indexing, and another replica to be used for searching, these 2 requests look like they can achieve this purpose. Will this be implemented in the Solr 7.4 release?

Re: Apache and Apache Solr together

2018-06-07 Thread Shawn Heisey
On 6/6/2018 12:57 AM, azharuddin wrote: > I've got a question: I came across Apache Solr > as requirement for a module > I'm installing and even after reading the documentation on Apache Solr's > official homepage I'm still not sure whether Apache

Re: Solr start script

2018-06-07 Thread Shawn Heisey
On 6/7/2018 7:37 AM, Greenhorn Techie wrote: When the above settings are passed as part of start script, does that mean whenever a new collection is created, Solr is going to store the indexes in HDFS? But what if I upload my solrconfig.xml to ZK which contradicts with this and contains

Re: Running Solr on HDFS - Disk space

2018-06-07 Thread Shawn Heisey
On 6/7/2018 6:41 AM, Greenhorn Techie wrote: As HDFS has got its own replication mechanism, with a HDFS replication factor of 3, and then SolrCloud replication factor of 3, does that mean each document will probably have around 9 copies replicated underneath of HDFS? If so, is there a way to
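The arithmetic behind the "around 9 copies" estimate in the question can be sketched as follows. This only multiplies the two replication factors and assumes every Solr replica writes its own index files to HDFS; it says nothing about what the truncated reply recommends. The function names are illustrative.

```python
def physical_copies(solr_replication_factor: int, hdfs_replication_factor: int) -> int:
    """Block copies on disk if each Solr replica stores its own index in HDFS."""
    return solr_replication_factor * hdfs_replication_factor


def raw_disk_bytes(index_bytes: int, solr_replication_factor: int,
                   hdfs_replication_factor: int) -> int:
    """Raw disk consumed across the cluster for one logical index."""
    return index_bytes * physical_copies(solr_replication_factor,
                                         hdfs_replication_factor)


print(physical_copies(3, 3))               # 9
print(raw_disk_bytes(100 * 1024**3, 3, 3)) # bytes used by a 100 GiB index
```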

Re: HDP Search - Configuration & Data Directories

2018-06-07 Thread Shawn Heisey
On 6/7/2018 6:35 AM, Greenhorn Techie wrote: A quick question on configuring Solr with Hortonworks HDP. I have installed HDP and then installed HDP Search using the steps described under the link - Within the various Solr config settings on Ambari, I am a bit confused on the role of

Re: Delete then re-add a core

2018-06-07 Thread Shawn Heisey
On 6/7/2018 4:12 AM, Amanda Shuman wrote: Definitely not a permissions problem - everything is run by the solr user, which owns everything in the directories. I just can't figure out why the default working directory is in opt rather than var (which is where it should be according to a previous

Re: Dataimport performance

2018-06-07 Thread Shawn Heisey
On 6/7/2018 12:19 AM, kotekaman wrote: sorry. may i know how to code it? Code *what*? Here's the same wiki page that I gave you for your last message: https://wiki.apache.org/solr/UsingMailingLists Even if I go to the Nabble website and discover that you've replied to a topic that's SEVEN

Re: Delta Import Configuration

2018-06-07 Thread Shawn Heisey
On 6/7/2018 12:22 AM, kotekaman wrote: Is the deltaimport should use the timestamp in sql table? The text above, and the subject, are the ONLY things I can see in this message.  Which makes this an extremely vague question.  This wiki page may be relevant:

Re: Issues in Solr-7.3

2018-06-07 Thread Shawn Heisey
On 6/6/2018 7:38 PM, tapan1707 wrote: We are planning to upgrade our Solr-6.4 to Solr-7.x. While considering the appropriate minor version, I saw that there are many ongoing issues for Solr-7.3 users on the mailing list. Just wanted to take an expert opinion if it's *safe* to just upgrade to 7.3

Re: Solr Default query parser

2018-06-06 Thread Shawn Heisey
On 6/6/2018 9:52 AM, Kamal Kishore Aggarwal wrote: >> What is the default query parser (QP) for solr. >> >> While I was reading about this, I came across two links which looks >> ambiguous to me. It's not clear to me whether Standard is the default QP or >> Lucene is the default QP or they are

Re: Solr 'healthcheck' command

2018-06-05 Thread Shawn Heisey
On 6/5/2018 11:22 PM, Zheng Lin Edwin Yeo wrote: For this clusterstatus, as we are still pointing it at the Solr directly http://localhost:8983/solr/admin/collections?action=CLUSTERSTATUS It is not likely to work if the main replica is down. Let's say I have 2 replica, one in localhost:8983,
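One client-side way around the single-node problem raised in the question is to keep a list of nodes and ask each in turn until one answers. The sketch below uses the Collections API CLUSTERSTATUS action shown in the question; the node list (including `localhost:7574`) and the function names are assumptions, and it uses only the Python standard library rather than SolrJ.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen


def cluster_status_urls(nodes, collection=None):
    """Build one CLUSTERSTATUS URL per node, optionally limited to a collection."""
    params = {"action": "CLUSTERSTATUS", "wt": "json"}
    if collection:
        params["collection"] = collection
    query = urlencode(params)
    return [f"http://{node}/solr/admin/collections?{query}" for node in nodes]


def fetch_cluster_status(nodes, collection=None, timeout=5):
    """Return the parsed response from the first node that answers."""
    last_error = None
    for url in cluster_status_urls(nodes, collection):
        try:
            with urlopen(url, timeout=timeout) as response:
                return json.load(response)
        except OSError as error:  # connection refused, timeout, HTTP error
            last_error = error
    raise RuntimeError(f"no Solr node answered: {last_error}")


# Hypothetical node list; replace with the real host:port pairs.
print(cluster_status_urls(["localhost:8983", "localhost:7574"]))
```

A ZooKeeper-aware client such as CloudSolrClient avoids maintaining such a list by hand, since it discovers live nodes from the cluster state.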

Re: Windows monitoring software for Solr recommendation

2018-06-05 Thread Shawn Heisey
On 6/5/2018 10:26 PM, TK Solr wrote: I visualized the GC log with GCMV (GCVM?) and the graph shows Solr was using less than half of the heap space at the peak. This Solr doesn't get much query traffic and no indexing was running. It's really a sudden death of JVM with no trace. If you aren't

Re: Solr 'healthcheck' command

2018-06-05 Thread Shawn Heisey
On 6/5/2018 10:58 PM, Zheng Lin Edwin Yeo wrote: The healthcheck action in SolrCLI is able to return the health status of individual collection, while this http://host:port/solr/admin/info/system URL returns the overall health status of Solr. We will need the information on the health status of

Re: Solr 'healthcheck' command

2018-06-05 Thread Shawn Heisey
On 6/5/2018 9:43 PM, Zheng Lin Edwin Yeo wrote: I understand that we can do the health checking of the Solr by using the solr.cmd command under ./bin/solr, which is run from command prompt. Would like to check, is this feature available via URL from browser? I am using Solr 7.3.1. The

Re: sharding guidelines

2018-06-05 Thread Shawn Heisey
On 6/4/2018 4:36 PM, Oakley, Craig (NIH/NLM/NCBI) [C] wrote: We have a collection (one shard, two replicas, currently running Solr6.6) which sometimes becomes unresponsive on the non-leader node. It is 214 gigabytes, and we were wondering whether there is a rule of thumb how large to allow a

Re: SolrCloud Collection Backup - Solr 5.5.4

2018-06-05 Thread Shawn Heisey
On 6/4/2018 5:36 AM, Greenhorn Techie wrote: 1. In the SolrCloud, as a single host can have information about multiple shards (either leader or replica), how does the backup API handle the underlying data copy? I presume it will simply copy the data across ALL the shards (both leader and

Re: Windows monitoring software for Solr recommendation

2018-06-05 Thread Shawn Heisey
On 6/5/2018 11:12 AM, TK Solr wrote: > My client's Solr 6.6 running on a Windows server is mysteriously > crashing without any JVM crash log. No unusual activities recorded in > solr.log. GC log does not indicate the OOM situation. It's a simple > single-core, single node deployment (no

Re: deleted master index files replica did not replicate

2018-06-04 Thread Shawn Heisey
On 6/4/2018 12:15 PM, Jeff Courtade wrote: > This was strange as I would have thought the replica would have replicated > an empty index from the master. Solr actually has protections in place to specifically PREVENT index replication when the master has an empty index.  This is so that a

Re: Mysterious Solr crash

2018-06-03 Thread Shawn Heisey
On 6/3/2018 7:52 AM, Nawab Zada Asad Iqbal wrote: I am running a batch indexing job and Solr core mysteriously shut down without any particular error. How can I investigate this? I am focusing on the line which mentions "Shutting down CoreContainer instance". There are errors soon after that,

Re: URL to call ZooKeeper instead of Solr directly in SolrCloud

2018-06-03 Thread Shawn Heisey
On 6/3/2018 10:44 AM, Zheng Lin Edwin Yeo wrote: I am running Solr in Cloud Mode, there is a fault tolerant ZK setup. I understand that we can use CloudSolrClient, and it will automatically adjust when servers go down. However, I would like to check if there is a way for this to work if we are

Re: Solr Cloud (6.6.3), Zookeeper(3.4.10) and ELB's

2018-06-02 Thread Shawn Heisey
On 6/2/2018 5:20 AM, solrnoobie wrote: Thank you for pointing out our error in having an ELB on top of a zookeeper. We did this so that we could recover a node if it goes down without the need to have a rolling restart of the solr nodes. I guess we will try an elastic IP instead because part of

Re: Solr Cloud (6.6.3), Zookeeper(3.4.10) and ELB's

2018-06-02 Thread Shawn Heisey
On 6/2/2018 1:49 AM, solrnoobie wrote: Our team is having problems with our production setup in AWS. Our current setup is: - Dockerized solr nodes behind an ELB Putting Solr behind a load balancer is a pretty normal thing to do. - zookeeper with exhibitor in a docker container (3 of this

Re: SolrCloud Collection Backup - Solr 5.5.4

2018-06-02 Thread Shawn Heisey
On 6/2/2018 1:50 AM, Shawn Heisey wrote: If you provide a location parameter, it will write a new backup directory in that location. https://lucene.apache.org/solr/guide/6_6/making-and-restoring-backups.html#standalone-mode-backups I verified that this parameter is in the 5.5 docs too, I would

Re: SolrCloud Collection Backup - Solr 5.5.4

2018-06-02 Thread Shawn Heisey
On 6/1/2018 7:23 AM, Greenhorn Techie wrote: > We are running SolrCloud with version 5.5.4. As I understand, Solr > Collection Backup and Restore API are only supported from version 6 > onwards. So wondering what is the best mechanism to get our collections > backed-up on older Solr version. That

Re: Self Signed Certificate for Load Balancer and Solr Nodes

2018-06-01 Thread Shawn Heisey
On 6/1/2018 2:01 PM, Kelly Rusk wrote: > We have solr1.com and solr2.com self-signed certs that correspond to the two > servers. We also have a load balancer with an address named solrlb.com. When > we hit the load balancer it gives us an SSL error, as it is passing us back > to either

Re: search q via dynamic string depends on date

2018-06-01 Thread Shawn Heisey
On 5/31/2018 7:19 AM, servus01 wrote: what i've got: xml file with a date/description fields which are not part of the index: (start-date-time="2018-04-01T18:00:00.000+02:00" code-name="MD 28") (start-date-time="2018-04-07T15:00:00.000+02:00" code-name="MD 29")

Re: Setting up Solr Replica on different machine

2018-06-01 Thread Shawn Heisey
On 5/31/2018 11:38 PM, Zheng Lin Edwin Yeo wrote: I am planning to set up Solr with replica on different machine. How should I go about configuring the setup? Like for example, should the replica node be started on the host machine, or on the replica machine? I will be setting this in Solr

Re: Pointing 3 Solr Servers to a 3-node Zookeeper Cluster

2018-05-31 Thread Shawn Heisey
On 5/31/2018 10:30 AM, THADC wrote: > I have a three-node zookeeper cluster running on ports 2181, 2182, and 2183. > I also am creating three solr server nodes (running as solr cloud > instances). I want the three solr nodes (on ports 7574, 8983, and 8990) to > be in that zookeeper cluster. Since

Re: No solr.log in solr cloud 7.3

2018-05-31 Thread Shawn Heisey
On 5/31/2018 7:04 AM, msaunier wrote: > wget > http://apache.mirrors.ovh.net/ftp.apache.org/dist/lucene/solr/6.6.1/solr-6.6.1.tgz > tar -xzf solr-*.tgz > /opt/solr-*/bin/install_solr_service.sh /opt/solr-*.tgz > /etc/init.d/solr stop > rm -f solr-*.tgz So you did use the service installer. > 2.

Re: SolrJ, CloudSolrClient and basic authentication

2018-05-31 Thread Shawn Heisey
On 5/31/2018 8:03 AM, Dimitris Kardarakos wrote: > Following the feedback in the "Index protected zip" thread, I am > trying to add documents to the index using SolrJ API. > > The server is in SolrCloud mode with BasicAuthPlugin for authentication. > > I have not managed to figure out how to pass

Re: No solr.log in solr cloud 7.3

2018-05-31 Thread Shawn Heisey
On 5/31/2018 1:49 AM, SAUNIER Maxence wrote: What procedure did you follow to install Solr? The procedure on the documentation to install SolR Cloud You're going to have to be a lot more specific.  The only documentation that I consider to be relevant for installing Solr is NOT on the

Re: No solr.log in solr cloud 7.3

2018-05-30 Thread Shawn Heisey
On 5/30/2018 8:40 AM, msaunier wrote: > Today, I don’t understand why, but I don’t have solr.log file. I have just: > > drwxr-xr-x 1 solr solr 84 mai 30 16:19 archived > > -rw-r--r-- 1 solr solr 891352 mai 30 16:29 solr-8983-console.log > > -rw-r--r-- 1 solr solr 74068 mai 30 16:34

Re: Solr Cloud 7.3.1 backups

2018-05-30 Thread Shawn Heisey
On 5/29/2018 3:01 PM, Greg Roodt wrote: > What is the best way to perform a backup of a Solr Cloud cluster? Is there > a way to backup only the leader? From my tests with the collections admin > BACKUP command, all nodes in the cluster need to have access to a shared > filesystem. Surely that

Re: Disadvantages of having Zookeeper instance and Solr instance in the same server

2018-05-30 Thread Shawn Heisey
On 5/29/2018 11:27 PM, solr2020 wrote: What is the pros and cons of having Zookeeper instance and Solr instance in the same VM/Server in production environment? If you have sufficient CPU, memory, and I/O resources on the system for both roles, there is no problem with putting both on the

Re: sending empty request parameters to solr

2018-05-29 Thread Shawn Heisey
On 5/29/2018 5:10 AM, Riyaz wrote: We had come across a requirement to allow empty parameter values to query string(q), start and rows as part of solr search query. In solr 3.4, have added defType to edismax and it's allowing empty params http:///solr//select?=xml=true==" -->working fine in

Re: Index protected zip

2018-05-26 Thread Shawn Heisey
On 5/26/2018 4:52 AM, Tim Allison wrote: Please see Erick Erickson’s evergreen advice and linked blog post: https://mail-archives.apache.org/mod_mbox/lucene-solr-user/201805.mbox/%3ccan4yxve_0gn0a1y7wjpr27inuddo6+jzwwfgvzkfs40gh3r...@mail.gmail.com%3e The "don't use ERH in production"

Re: Different docs order in different replicas of the same shard

2018-05-25 Thread Shawn Heisey
On 5/25/2018 11:07 AM, SOLR4189 wrote: > You are right, BUT I have two indexers (one in WCF service and one in HADOOP) > and in two my indexers I'm using atomic updates in each document. According > to Atomic Update Processor Factory >

Re: Different docs order in different replicas of the same shard

2018-05-25 Thread Shawn Heisey
On 5/25/2018 7:28 AM, SOLR4189 wrote: > I use SOLR-6.5.1 and I want to start to use replicas. > > For it I want to understand something: > > 1) Can asynchronous forwarding document from leader to all replicas or some > another reasons cause that replica A may see update X then Y, and replica B >

Re: Could not load collection from ZK:

2018-05-24 Thread Shawn Heisey
On 6/20/2017 9:46 AM, Aman Deep Singh wrote: > Sorry Shawn, > It didn't copy entire stacktrace I put the stacktrace at > https://www.dropbox.com/s/zf8b87m24ei2ils/solr%20exception2?dl=0 > > Note: I have shaded the solr library under com.gdn.solr620 so all solr > class will be appear as

Re: Solaris 10

2018-05-24 Thread Shawn Heisey
On 5/24/2018 3:40 AM, Takuya Kawasaki wrote: Please let me ask a question. I would like to use Solr on Solaris 10. But I encountered a lot of errors. First, I can’t install solr using install script in .tgz. script result shows I have to install manually not using the script. Second, I can’t

Re: Unable to make IN queries on a particular field in solr

2018-05-23 Thread Shawn Heisey
On 5/23/2018 5:40 PM, RAUNAK AGRAWAL wrote: I am facing an issue where I have a collection named employee collection. Suppose I was to search employee by its id, so my query is *id:(1 2 3*) and it is working fine in solr. Now let say I want to search by their name. So my query is name:(Alice
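The preview is cut off before the failing query and the reply, so the actual cause is not shown here. One common pitfall with this style of query is that values containing spaces or special characters need to be quoted. The sketch below builds an "IN"-style clause with every value quoted and joined by an explicit OR, so it does not depend on the default operator. `in_query` is a hypothetical helper, and whether quoting helps also depends on how the field is analyzed in the schema.

```python
def in_query(field: str, values) -> str:
    """Build field:("v1" OR "v2" ...) with each value quoted and escaped."""
    quoted = []
    for value in values:
        # Escape backslashes first, then double quotes, inside the phrase.
        escaped = str(value).replace("\\", "\\\\").replace('"', '\\"')
        quoted.append(f'"{escaped}"')
    if not quoted:
        raise ValueError("at least one value is required")
    return f"{field}:({' OR '.join(quoted)})"


print(in_query("id", [1, 2, 3]))
print(in_query("name", ["Alice", "Bob Smith"]))
```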

Re: Trying to update Solrj in our app...

2018-05-23 Thread Shawn Heisey
On 5/23/2018 11:46 AM, BlackIce wrote: > Is there a list of things that have been deprecated in solr since 5.0.0? Or > do I have to read EVERY release readme till I get to 7.3.1? The javadoc for each release has pages that list all deprecations in that release on a per-module basis, so there are

Re: Zookeeper 3.4.12 with Solr 6.6.2?

2018-05-23 Thread Shawn Heisey
On 5/22/2018 10:44 AM, Walter Underwood wrote: > Is anybody running Zookeeper 3.4.12 with Solr 6.6.2? Is that a recommended > combination? Not recommended? Solr 6.6.2 shipped with ZK 3.4.10, which the ZK project released 2017-Mar-30. I asked the zk mailing list about any gotchas they're aware

Re: Solr Dates TimeZone

2018-05-23 Thread Shawn Heisey
On 5/22/2018 9:26 AM, LOPEZ-CORTES Mariano-ext wrote: > It's possible to configure Solr with a timezone other than GMT? No, at least not in the way that you're thinking. > It's possible to configure Solr Admin to view dates with a timezone other > than GMT? As far as I know, this is not
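Since Solr stores and returns date field values in UTC, a common workaround is to convert on the client when displaying results. A minimal sketch, assuming the value arrives as an ISO-8601 string ending in `Z`; the helper names and the fixed +02:00 offset are illustrative (on Python 3.9+ a `zoneinfo.ZoneInfo("Europe/Paris")` object can be passed instead to get daylight-saving rules).

```python
from datetime import datetime, timedelta, timezone, tzinfo


def parse_solr_date(value: str) -> datetime:
    """Parse a Solr UTC timestamp such as 2018-05-22T09:26:00Z."""
    fmt = "%Y-%m-%dT%H:%M:%S.%fZ" if "." in value else "%Y-%m-%dT%H:%M:%SZ"
    return datetime.strptime(value, fmt).replace(tzinfo=timezone.utc)


def to_local(value: str, zone: tzinfo) -> datetime:
    """Convert a Solr UTC timestamp to the given time zone for display."""
    return parse_solr_date(value).astimezone(zone)


summer_offset = timezone(timedelta(hours=2))  # fixed offset, for illustration
print(to_local("2018-05-22T09:26:00Z", summer_offset).isoformat())
# 2018-05-22T11:26:00+02:00
```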

Re: Index filename while indexing JSON file

2018-05-23 Thread Shawn Heisey
On 5/18/2018 1:47 PM, S.Ashwath wrote: > I have 2 directories: 1 with txt files and the other with corresponding > JSON (metadata) files (around 9 of each). There is one JSON file for > each CSV file, and they share the same name (they don't share any other > fields). > > The txt files just

Re: Trying to update Solrj in our app...

2018-05-23 Thread Shawn Heisey
On 5/23/2018 7:25 AM, BlackIce wrote: I've got an app here that posts data to Solr using Solrj... I'm trying to update all our apps dependencies, and now I've reached Solrj. Last known working version is 5.5.0, anything after that dies at compile time with: if (val instanceof Date) { val2

Re: deletebyQuery vs deletebyId

2018-05-22 Thread Shawn Heisey
On 5/22/2018 6:35 PM, Jay Potharaju wrote: I have a quick question about deletebyQuery vs deleteById. When using deleteByQuery, if query is id:123 is that same as deleteById in terms of performance. If there is absolutely nothing else happening to update the index, the difference between the

Re: Navigating through Solr Source Code

2018-05-21 Thread Shawn Heisey
On 5/21/2018 4:35 AM, Greenhorn Techie wrote: As the documentation around Solr is limited, I am thinking to go through the source code and understand the various bits and pieces. However, I am a bit confused on where to start as my developing skills are a bit limited. Any thoughts on how best

Re: Thoughts on scaling strategy for Solr deployed on AWS EC2 instances - Scale up / out and which instance type?

2018-05-21 Thread Shawn Heisey
On 5/21/2018 8:25 AM, Kelly, Frank wrote: We have an indexing heavy workload (we do more indexing than searching) and for those searches we do perform we have very few cache hits (25% of our index is in memory and the hit rate is < 0.1%) Which cache are you looking at for that hitrate?  How

Re: Used debug log level on the interface

2018-05-17 Thread Shawn Heisey
On 5/17/2018 3:03 AM, msaunier wrote: On solrCloud interface, I don't have with solr4j the info and debug level on the console. In < level > I have add my URP with INFO param and DEBUG param but never of the two work. I have just WARN and ERROR log on the interface. The admin UI won't show

Re: Default Searches not working after migrating from Solr 4.7 to 7.3

2018-05-17 Thread Shawn Heisey
On 5/17/2018 7:23 AM, THADC wrote: , however for 7.3, "defaultSearchField" apparently no longer a valid type. I switched to "df". Also, "text" is no longer default data type, but rather "_text_". So, I replaced above with: _text_ , but still default search not working properly. By the way,

Re: Question regarding TLS version for solr

2018-05-17 Thread Shawn Heisey
On 5/17/2018 1:53 AM, Anchal Sharma2 wrote: We are using solr version 5.3.0 and have been trying to enable security on our solr .We followed steps mentioned on site -https://lucene.apache.org/solr/guide/6_6/enabling-ssl.html .But by default it picks ,TLS version 1.0,which is causing an

Re: SOLR ISSUE

2018-05-16 Thread Shawn Heisey
On 5/16/2018 9:27 AM, Shah, Rimple (LNG-RDU) wrote: > https://lucene.apache.org/solr/guide/7_2/aws-solrcloud-tutorial.html > I am trying to follow these instructions for running SOLR on EC2. Somehow I > am getting this error each and every time when I try to access the dashboard. > Is anyone

Re: Solr CPU usage

2018-05-16 Thread Shawn Heisey
On 5/16/2018 7:11 AM, Александр Шестак wrote: Hi, I have a question about unpredictable CPU usage by solr. We have recently migrated our application from Solr 4.6.1 to Solr 7.1.0. We use master/slave approach. And now we have noticed that CPU usage of master/slave in passive state (no

Re: question about updates to shard leaders only

2018-05-15 Thread Shawn Heisey
On 5/15/2018 12:12 AM, Bernd Fehling wrote: OK, I have the CloudSolrClient with SolrJ now running but it seems a bit slower compared to ConcurrentUpdateSolrClient. This was not expected. The logs show that CloudSolrClient sends the docs only to the leaders. So the only advantage of

Re: Techniques for Retrieving Hits

2018-05-14 Thread Shawn Heisey
On 5/14/2018 3:13 PM, Terry Steichen wrote: > I posted this note because I've not seen list comments pertaining to the > job of actually locating and retrieving hitlist documents. How documents are retrieved will be highly dependent on your setup.  Here's how things usually go: If the original

Re: Commit too slow?

2018-05-14 Thread Shawn Heisey
On 5/14/2018 11:29 AM, LOPEZ-CORTES Mariano-ext wrote: > After injecting 200 documents into our Solr server, the commit > operation at the end of the process (using ConcurrentUpdateSolrClient) takes > 10 minutes. Is that too slow? There is a wiki page discussing slow commits:

Re: Techniques for Retrieving Hits

2018-05-14 Thread Shawn Heisey
On 5/14/2018 6:46 AM, Terry Steichen wrote: In order to allow users to retrieve the documents that match a query, I make use of the embedded Jetty container to provide file server functionality.  To make this happen, I provide a symbolic link between the actual document archive, and the Jetty

Re: ZKPropertiesWriter Could not read DIH properties

2018-05-11 Thread Shawn Heisey
On 5/11/2018 11:20 AM, tayitu wrote: > I am using Solr 6.6.0. I have created collection and uploaded the config > files to zookeeper. I can see the collection and config files from Solr > Admin UI. When I try to Dataimport, I get the following error: > > ZKPropertiesWriter Could not read DIH

Re: Performance if there is a large number of field

2018-05-11 Thread Shawn Heisey
On 5/11/2018 9:26 AM, Andy C wrote: > Why are range searches more efficient than wildcard searches? I guess I > would have expected that they just provide different mechanism for defining > the range of unique terms that are of interest, and that the merge > processing would be identical. I hope

Re: Solr soft commits

2018-05-11 Thread Shawn Heisey
On 5/10/2018 8:28 PM, Shivam Omar wrote: Thanks Shawn. So there are cases when a soft commit will not be faster than a hard commit with openSearcher=true. We have a case where we have to do bulk deletions. In that case, will soft commits be faster than hard commits? I actually have no idea

Re: Performance if there is a large number of field

2018-05-11 Thread Shawn Heisey
On 5/10/2018 2:22 PM, Deepak Goel wrote: Are there any benchmarks for this approach? If not, I can give it a spin. Also wondering if there are any alternative approaches (I guess Lucene stores data in an inverted field format) Here is the only other query I know of that can find documents missing

Re: Performance if there is a large number of field

2018-05-10 Thread Shawn Heisey
On 5/10/2018 11:49 AM, Deepak Goel wrote: Sorry but I am unclear about - "What if there is no default value and the field does not contain anything"? What does Solr pass on to Lucene? Or is the field itself omitted from the document? If there is no default value and the field doesn't exist in

Re: Performance if there is a large number of field

2018-05-10 Thread Shawn Heisey
On 5/10/2018 10:58 AM, Deepak Goel wrote: I wonder what Solr stores in the document for fields which are not being used, and whether the queries have a performance difference. https://lucene.apache.org/solr/guide/6_6/defining-fields.html (A default value that will be added automatically to any

Re: Solr soft commits

2018-05-10 Thread Shawn Heisey
On 5/10/2018 9:48 AM, Shivam Omar wrote: I need some help in understanding Solr soft commits. Soft commits are about visibility and are fast in nature, so they are advised for NRT use cases. Soft commits *MIGHT* be faster than hard commits.  There are situations where the performance of a

Re: How to replacing values on multiValued all together by using 1 query

2018-05-10 Thread Shawn Heisey
On 5/10/2018 7:51 AM, Issei Nishigata wrote: I create a field called employee_name, and use it as multiValued. If "Mr.Smith", which is part of the value of the field, is changed to "Mr.Brown", do I have to create 1 million deletion queries and update queries in the case where "Mr.Smith" appears in 1
