Re: Solr8.7 - How to optmize my index ?

2020-12-02 Thread Dave
I’m going to go against the advice SLIGHTLY, it really depends on how you have things set up as far as your solr server hosting is done. If you’re searching off the same solr server you’re indexing to, yeah don’t ever optimize it will take care of itself, people much smarter than us, like

Re: Recovering deleted files without backup

2020-11-13 Thread Dave
Just rebuild the index. Pretty sure they’re gone if they aren’t in your vm backup, and solr isn’t a document storage tool, it’s a place to index the data from your document store, so it’s understood more or less that it can always be rebuilt when needed > On Nov 13, 2020, at 9:52 PM, Alex

Re: Need help to resolve Apache Solr vulnerability

2020-11-12 Thread Dave
Solr isn’t meant to be public facing. Not sure how anyone would send these commands since it can’t be reached from the outside world > On Nov 12, 2020, at 7:12 AM, Sheikh, Wasim A. > wrote: > > Hi Team, > > Currently we are facing the below vulnerability for Apache Solr tool. So can > you

Re: Avoiding single digit and single charcater ONLY query by putting them in stopwords list

2020-10-27 Thread Dave
Agreed. Just a JavaScript check on the input box would work fine for 99% of cases, unless something automatic is running them in which case just server side redirect back to the form. > On Oct 27, 2020, at 11:54 AM, Mark Robinson wrote: > > Hi Konstantinos , > > Thanks for the reply. > I

Re: Solr endpoint on the public internet

2020-10-08 Thread Dave
#1. This is a HORRIBLE IDEA #2 If I was going to do this I would destroy the update request handler as well as the entire admin ui from the solr instance, set up a replication from a secure solr instance on an interval. This way no one could send an update /delete command, you could still

Re: solr startup

2020-08-08 Thread Dave
e.org > Subject: RE: solr startup > > suggester? what do i need to look for in the configs? > > Tony > > > > Sent from my Verizon, Samsung Galaxy smartphone > > > > Original message > From: Dave mailto:hastings.recurs...@gmail

Re: solr startup

2020-08-07 Thread Dave
It sounds like you have suggester indexes being built on startup. Without them they just come up in a second or so > On Aug 7, 2020, at 6:03 PM, Schwartz, Tony wrote: > > I have many collections. When I start solr, it takes 30 - 45 minutes to > start up and load all the collections. My

Re: sorting help

2020-07-15 Thread Dave
That’s a good place to start. The idea was to make sure titles that started with a date would not always be at the forefront and the actual title of the doc would be sorted. > On Jul 15, 2020, at 4:58 PM, Erick Erickson wrote: > > Yeah, it’s always a question “how much is enough/too much”.

Re: ***URGENT***Re: Questions about Solr Search

2020-07-03 Thread Dave
Seriously. Doug answered all of your questions. > On Jul 3, 2020, at 6:12 AM, Atri Sharma wrote: > > Please do not cross post. I believe your questions were already answered? > >> On Fri, Jul 3, 2020 at 3:08 PM Gautam K wrote: >> >> Since it's a bit of an urgent request so if could please

Re: Getting rid of zookeeper

2020-06-09 Thread Dave
Is it horrible that I’m already burnt out from just reading that? I’m going to stick to the classic solr master slave set up for the foreseeable future, at least that lets me focus more on the search theory rather than the back end system non stop. > On Jun 9, 2020, at 5:11 PM, Vincenzo

Re: How to determine why solr stops running?

2020-06-09 Thread Dave
I’ll add that whenever I’ve had a solr instance shut down, for me it’s been a hardware failure. Either the ram or the disk got a “glitch” and both of these are relatively fragile and wear and tear type parts of the machine, and should be expected to fail and be replaced from time to time. Solr

Re: Script to check if solr is running

2020-06-08 Thread Dave
A simple Perl script would be able to cover this, I have a cron job Perl script that does a search with an expected result, if the result isn’t there it fails over to a backup search server, sends me an email, and I fix what’s wrong. The backup search server is a direct clone of the live server
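The check described above (run a search with a known expected result, fail over to a backup if it is missing) could look roughly like this. It is a minimal sketch in Python rather than Perl, not the script from the message: the function names, the `id` field, and the injectable `fetch` hook are assumptions for illustration, and the email notification step is left out.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen


def search_ok(base_url, query, expected_id, fetch=None, timeout=5):
    """Return True if a known document id comes back for a known query.

    base_url is a core URL such as http://host:8983/solr/corename
    (hypothetical). fetch is an optional callable(url) -> str, used
    here so the check can be exercised without a live server.
    """
    url = base_url + "/select?" + urlencode({"q": query, "wt": "json", "rows": 10})
    try:
        if fetch is None:
            with urlopen(url, timeout=timeout) as resp:
                body = resp.read().decode("utf-8")
        else:
            body = fetch(url)
        docs = json.loads(body)["response"]["docs"]
    except Exception:
        # Any network or parse problem counts as "server is not healthy".
        return False
    return any(doc.get("id") == expected_id for doc in docs)


def pick_server(primary, backup, query, expected_id, fetch=None):
    """Use the primary if the expected result is there, otherwise the backup."""
    if search_ok(primary, query, expected_id, fetch):
        return primary
    return backup
```

Run from cron, `pick_server` would decide which server the front end should point at; sending the alert email would hang off the branch that returns the backup.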

Re: Upgrading Solrcloud indexes from 7.2 to 8.4.1

2020-03-06 Thread Dave
You’re best off doing a full reindex to a single solr cloud 8.x node and then when done start taking down 7.x nodes, upgrade them to 8.x and add them to the new cluster. upgrading indexes has so many potential issues, > On Mar 6, 2020, at 9:21 PM, lstusr 5u93n4 wrote: > > Hi Webster, > > When

Re: Clarity on Stable Release

2020-01-29 Thread Dave
But! If we don’t have people throwing a new release into production and finding real world problems we can’t trust that the current release problems will be exposed and then remedied, so it’s a double edged sword. I personally agree with staying a major version back, but that’s because it

Re: Solr cloud production set up

2020-01-18 Thread Dave
If you’re not getting values, don’t ask for the facet. Facets are expensive as hell, maybe you should think more about your queries than your infrastructure, solr cloud won’t help you at all especially if you’re asking for things you don’t need > On Jan 18, 2020, at 1:25 PM, Rajdeep Sahoo

Re: Solr cloud production set up

2020-01-18 Thread Dave
Agreed with the above. what’s your idea of “huge”? I have 600 ish gb in one core plus another 250x2 in two more on the same standalone solr instance and it runs more than fine > On Jan 18, 2020, at 11:31 AM, Shawn Heisey wrote: > > On 1/18/2020 1:05 AM, Rajdeep Sahoo wrote: >> Our Index size

Re: Failed to connect to server

2020-01-17 Thread Dave
It doesn’t need to be identical, just anything with a buildon reload statement > On Jan 17, 2020, at 12:17 PM, rhys J wrote: > > On Fri, Jan 17, 2020 at 12:10 PM David Hastings < > hastings.recurs...@gmail.com> wrote: > >> something like this in your solr config: >> >> autosuggest >

Re: Solr 7.5 speed up, accuracy details

2019-12-28 Thread Dave
There is no increase in speed, but features. Doc values add some but it’s hard to quantify, and some people think solr cloud has speed increases but I don’t think they exist when hardware cost is nonexistent and it adds too much complexity to something that should be simple. > On Dec 28,

Re: does copyFields increase indexe size ?

2019-12-25 Thread Dave
#1 merry Xmas thing #2 you initially said you were talking about 1k documents. That will not be a large enough sample size to see the index size differences with this new field, in any case the index size should never really matter. But if you go to a few million you will notice the size has

Re: Indexing strategies for user profiles

2019-12-10 Thread Dave
I would index the products a user purchased as well as the number of times purchased, then I would take a user, search their bought products boosted by how many times purchased, against other users, have a facet for products and filter out the top bought products that are not on the users

Re: How to add a new field to already an existing index in Solr 6.6 ?

2019-12-08 Thread Dave
Or just do it the lazy way and use a dynamic field. I’ve found little to no drawbacks with them aside from a complete lack of documentation of the field in the schema itself > On Dec 8, 2019, at 8:07 AM, David Barnett wrote: > > Also - look at adding fields using Solr admin, this will these

Re: xms/xmx choices

2019-12-06 Thread Dave
Actually at about that time the replication finished and added about 20-30gb to the index from the master. My current set up goes Indexing master -> indexer slave/production master (only replicated on command)-> three search slaves (replicate each 15 minutes) We added about 2.3m docs, then I

Re: Is it possible to have different Stop words depending on the value of a field?

2019-12-02 Thread Dave
.org If you can, please build on your explanation as It > sounds relevant. > -Original Message- > From: Dave > Sent: Monday, December 2, 2019 7:38 PM > To: solr-user@lucene.apache.org > Cc: jornfra...@gmail.com > Subject: Re: Is it possible to have different

Re: Is it possible to have different Stop words depending on the value of a field?

2019-12-02 Thread Dave
It clarifies yes. You need new fields. In this case something like Address_us Address_uk And index and search them accordingly with different stopword files used in different field types, hence the copy field from “address” into as many new fields as needed > On Dec 2, 2019, at 7:33 PM,

Re: A Last Message to the Solr Users

2019-11-30 Thread Dave
I’m young here I think, not even 40 and only been using solr since like 2008 or so, so like 1.4 give or take. But I know a really good therapist if you want to talk about it. > On Nov 30, 2019, at 6:56 PM, Mark Miller wrote: > > Now I have sacrificed to give you a new chance. A little for my

Re: Solr process takes several minutes before accepting commands after restart

2019-11-21 Thread Dave
https://doc.sitecore.com/developers/90/platform-administration-and-architecture/en/using-solr-auto-suggest.html If you need more references. Set all parameters yourself, don’t rely on defaults. > On Nov 21, 2019, at 3:41 PM, Dave wrote: > > https://lucidworks.com/post/solr-

Re: Solr process takes several minutes before accepting commands after restart

2019-11-21 Thread Dave
https://lucidworks.com/post/solr-suggester/ You must set buildOnStartup to false, the default is true. Try it > On Nov 21, 2019, at 3:21 PM, Koen De Groote > wrote: > > Erick: > > No suggesters. There is 1 spellchecker for > > text_general > > But no buildOnCommit or buildOnStartup setting

Re: Active directory integration in Solr

2019-11-20 Thread Dave
I guess I don’t understand why one wouldn’t simply make a basic front end for solr, it’s literally the easiest thing to throw together and then you control all authentication and filters per user. Even a basic one would be some w3 school tutorials with php+json+whatever authentication Mech you

Re: POS Tagger

2019-10-25 Thread Dave
teh query >> >> On Fri, Oct 25, 2019 at 12:11 PM Audrey Lorberfeld - >> audrey.lorberf...@ibm.com wrote: >> >>> So then you do run your POS tagger at query-time, Dave? >>> >>> -- >>> Audrey Lorberfeld >>> Data Scientist, w3 Search >

Re: Sample JWT Solr configuration

2019-09-19 Thread Dave
I know this has nothing to do with the issue at hand but if you have a public facing solr instance you have much bigger issues. > On Sep 19, 2019, at 10:16 PM, Tyrone Tse wrote: > > I finally got JWT Authentication working on Solr 8.1.1. > This is my security.json file contents > { >

Re: Need more info on MLT (More Like This) feature

2019-09-13 Thread Dave
As a side note, if you use shingles with the mlt handler I believe you will get better scores/relevant results. So “to be free” is indexed as “to_be” “to_be_free” and “be_free” but also as each word. It makes the index significantly larger but creates better “unique terms” in my opinion
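To make the shingle idea concrete, here is a small Python sketch that produces the same terms the message lists for “to be free”. The function name `shingles`, the maximum size of 3, and the `_` separator are assumptions for illustration; in Solr this would normally be done by a shingle filter in the field type's analysis chain, not by client code.

```python
def shingles(text, max_size=3, sep="_"):
    """Return each word plus every run of 2..max_size adjacent words."""
    words = text.split()
    terms = []
    for size in range(1, max_size + 1):
        # Slide a window of `size` words across the text.
        for start in range(len(words) - size + 1):
            terms.append(sep.join(words[start:start + size]))
    return terms


print(shingles("to be free"))
```

Three words turn into six terms here, which is why the index grows noticeably once shingles are enabled.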

Re: Solr 7.7.2 Autoscaling policy - Poor performance

2019-09-03 Thread Dave
You’re going to want to start by having more than 3gb for memory in my opinion but the rest of your set up is more complex than I’ve dealt with. On Sep 3, 2019, at 1:10 PM, Andrew Kettmann wrote: >> How many zookeepers do you have? How many collections? What is there size? >> How much CPU /

Re: Best way to retrieve parent documents with children using getBeans method?

2019-08-12 Thread Dave Durbin
Unsubscribe -- *P.S. We've launched a new blog to share the latest ideas and case studies from our team. Check it out here: product.canva.com . *** ** Empowering the world to design Also, we're hiring. Apply here!

Sort date stored in text field?

2019-06-10 Thread Dave Beckstrom
Hi Everyone, Running SOLR 7.3.1 I have a field called metatag.date that is field-type: org.apache.solr.schema.TextField. The field is being populated by NUTCH, which grabs the date from the html: and stores it in the metatag.date field in SOLR. I'm trying to sort by date (metatag.date

Re: Using Solr as a Database?

2019-06-02 Thread Dave
You *can use solr as a database, in the same sense that you *can use a chainsaw to remodel your bathroom. Is it the right tool for the job? No. Can you make it work? Yes. As for HA and cluster rdbms, Galera Cluster works great for MariaDB, and is ACID compliant. I’m sure any other database

Re: SOLR Text Field

2019-04-08 Thread Dave Beckstrom
purposes only. That would have been most helpful. Even a FAQ somewhere would have been helpful. Anyway, you're the best and thank you again Best, Dave Beckstrom -- *Fig Leaf Software, Inc.*  https://www.figleaf.com/ <https://www.figleaf.com/>   Full-Service Solutions Integrator

Re: SOLR Text Field

2019-04-06 Thread Dave
Wow. Ok dude relax and take a nap. It sounds like you don’t even have a core defined. Maybe you do and I’m reaching a bit, but start there. Solr is super simple and only gets complicated when you’re complicated. > On Apr 6, 2019, at 8:59 AM, Dave Beckstrom wrote: > > Hi Everyone,

SOLR Text Field

2019-04-06 Thread Dave Beckstrom
Hi Everyone, I'm really hating SOLR. All I want is to define a text field that data can be indexed into and which is searchable. Should be super simple. But I run into issue after issue. I'm running SOLR 7.3 because it's compatible with the version of NUTCH I'm running. The docs say that

Error on text field

2019-03-26 Thread Dave Beckstrom
Hi Everyone, I'm using Nutch to crawl and index some content. It failed on a SOLR field defined as a text field when it was trying to insert the following value for the field: 33011-54192-EWHServer1234-3BA9D1CA-05B6-42BA-9D88-BAD970CAEEC6 The field was defined in the schema.xml as: The

Solr 7.6 Shard name - possible issue?

2019-03-17 Thread Dave Durbin
-name-2018-10-30_shard1_2_replica_n1 I have a couple whose replica number exceeds a couple of hundred. collection-name-2018-10-30_shard2_1_replica_n213 Does this seem reasonable? Does it suggest a problem with these shard replicas or this shard in general ? Thanks Dave -- *P.S. We've launched

Boolean Searches?

2019-03-14 Thread Dave Beckstrom
Hi Everyone, I'm building a SOLR search application and the customer wants the search to work like google search. They want the user to be able to enter boolean searches like: train OR dragon. which would find any matches that has the word "train" or the word "dragon" in the title. I know

Re: MLT and facetting

2019-02-28 Thread Dave
I’m more curious what you’d expect to see, and what possible benefit you could get from it > On Feb 28, 2019, at 8:48 PM, Zheng Lin Edwin Yeo wrote: > > Hi Martin, > > I have no idea on this, as the case has not been active for almost 2 years. > Maybe I can try to follow up. > > Faceting by

Re: MLT and facetting

2019-02-25 Thread Dave
Use the mlt to get the queries to use for getting facets in a two search approach > On Feb 25, 2019, at 10:18 PM, Zheng Lin Edwin Yeo > wrote: > > Hi Martin, > > I think there are some pictures which are not being sent through in the > email. > > Do send your query that you are using, and

Re: edismax: sorting on numeric fields

2019-02-16 Thread Dave
Sounds like you need to use code and post process your results as it sounds too specific to your use case. Just my opinion, unless you want to get into spatial queries which is a whole different animal and something I don’t think many have experience with, including myself > On Feb 16, 2019,

Re: English Analyzer

2019-02-05 Thread Dave
This will tell you pretty much everything you need to get started https://lucene.apache.org/solr/guide/6_6/language-analysis.html > On Feb 5, 2019, at 4:55 AM, akash jayaweera wrote: > > Hello All, > > Can i get details how to use English analyzer with stemming, > lemmatizatiion, stopword removal

Re: Large Number of Collections takes down Solr 7.3

2019-01-22 Thread Dave
Do you mind if I ask why so many collections rather than a field in one collection that you can apply a filter query to each customer to restrict the result set, assuming you’re the one controlling the middle ware? > On Jan 22, 2019, at 4:43 PM, Monica Skidmore > wrote: > > We have been

Re: Solr Cloud configuration

2018-11-20 Thread Dave
But then I would lose the streaming expressions right? > On Nov 20, 2018, at 6:00 PM, Edward Ribeiro wrote: > > Hi David, > > Well, as a last resort you can resort to classic schema.xml if you are > using standalone Solr and don't bother to give up schema API. Then you are > back to manually

Re: Index optimization takes too long

2018-11-03 Thread Dave
On a side note, does adding docvalues to an already indexed field, and then optimizing, prevent the need to reindex to take advantage of docvalues? I was under the impression you had to reindex the content. > On Nov 3, 2018, at 4:41 AM, Deepak Goel wrote: > > I would start by monitoring the

7.3 to 7.5

2018-10-18 Thread Dave
Would a minor solr upgrade such as this require a reindexing in order to take advantage of the skg functionality, or would it work regardless? A full reindex is quite a large operation in my use case

cursorMark and sort order

2018-07-25 Thread Dave Durbin
with just : sort = asc and have Solr understand that the sort is only for tie break purposes? Thanks Dave -- *P.S. We've launched a new blog to share the latest ideas and case studies from our team. Check it out here: product.canva.com <http://product.canva.com/>. *** **

Re: Sorting on ip address

2018-06-18 Thread Dave
Store it as an atom rather than an IP address. > On Jun 18, 2018, at 12:14 PM, root23 wrote: > > Hi all, > is there a built in data type which i can use for ip address which can > provide me sorting ip address based on the class? if not then what is the > best way to sort based on ip address

Re: Scaling issue with Solr

2017-12-27 Thread Dave
You may find that buying some more memory will be your best bang for the buck in your set up. 32-64 gb isn’t expensive, > On Dec 27, 2017, at 6:57 PM, Suresh Pendap wrote: > > What is the downside of configuring ramBufferSizeMB to be equal to 5GB ? > Is it only that

Re: Migrating from Solr 6.X to Solr 7.X: "non legacy mode coreNodeName missing"

2017-10-30 Thread Dave Seltzer
Thanks Erick, I've looked over the documentation. Quick follow-up question: What are the consequences of running with legacyCloud=true? Would I need to point a new Solr cluster at a new Zookeeper instance to avoid this? Many thanks! -Dave On Mon, Oct 30, 2017 at 1:36 PM, Erick Erickson

Re: Migrating from Solr 6.X to Solr 7.X: "non legacy mode coreNodeName missing"

2017-10-30 Thread Dave Seltzer
indicate the state of the cluster. Does that mean I'm using the "zookeeper is the truth" system or the old system? Thanks! -Dave On Mon, Oct 30, 2017 at 11:55 AM, Erick Erickson <erickerick...@gmail.com> wrote: > You may have to set legacyCloud=true in your cluster propertie

Migrating from Solr 6.X to Solr 7.X: "non legacy mode coreNodeName missing"

2017-10-30 Thread Dave Seltzer
acy mode coreNodeName missing {collection.configName=content, shard=shard1, collection=content_collection_20171013} Is there something I have to do to prepare this collection for Solr 7.x? Thanks, -Dave [root@crompcoreph02 ~]# curl "http:// [solrclusterloadbalancer]/solr/admin/collection

Re: Semantic Knowledge Graph

2017-10-09 Thread Dave
ppose any one knows where i may be able to find them, or point me in >> a direction to get more information about this tool. >> >> Thanks - dave >>

Re: length of indexed value

2017-10-03 Thread Dave
I’d personally use your second option. Simple and straightforward if you can afford the time for a reindex > On Oct 3, 2017, at 6:23 PM, John Blythe wrote: > > hey all. > > was hoping to find a query function that would allow me to filter based on > the length of an

Re: Performance Test

2017-09-04 Thread Dave
Get the raw logs from normal use, script out something to replicate the searches and have it fork to as many cores as the solr server has is what I'd do. > On Sep 4, 2017, at 5:26 AM, Daniel Ortega wrote: > > I would recommend you Solrmeter cloud > > This fork

Re: query with wild card with AND taking lot of time

2017-09-03 Thread Dave
My other concern would be your p's and q's. If you start mixing in Boolean logic and solrs weak respect for it, it could be unpredictable > On Sep 3, 2017, at 5:43 PM, Phil Scadden wrote: > > 5 seems a reasonable limit to me. After that revert to slow. > > -Original

Re: Different order of docs between SOLR-4.10.4 to SOLR-6.5.1

2017-08-13 Thread Dave
Rebuild your index. It's just the safest way. On Aug 13, 2017, at 2:02 PM, SOLR4189 wrote: >> If you are changing things like WordDelimiterFilterFactory to the graph >> version, you'll definitely want to reindex > > What does it mean "*want to reindex*"? If I change >

Re: MongoDb vs Solr

2017-08-12 Thread Dave
Personally I say use a rdbms for data storage, it's what it's for. Solr is for search and retrieve at the expense of possible loss of all data, in which case you rebuild it. > On Aug 12, 2017, at 11:26 AM, Muwonge Ronald wrote: > > Hi Solr can use mongodb for storage and

Re: Fetch a binary field

2017-08-11 Thread Dave
Why didn't you set it to be indexed? Sure it would be a small dent in an index > On Aug 11, 2017, at 5:20 PM, Barbet Alain wrote: > > Re, > I take a look on the source code where this msg happen >

Re: Need help with query syntax

2017-08-10 Thread Dave
Eric you going to vegas next month? > On Aug 10, 2017, at 7:38 PM, Erick Erickson wrote: > > Omer: > > Solr does not implement pure boolean logic, see: > https://lucidworks.com/2011/12/28/why-not-and-or-and-not/. > > With appropriate parentheses it can give the same

Re: MongoDb vs Solr

2017-08-05 Thread Dave
to the mailing list that's supposed to serve as a source of help, which, you asked for. > On Aug 5, 2017, at 7:54 AM, Dave <hastings.recurs...@gmail.com> wrote: > > Also I wouldn't really recommend mongodb at all, it should only to be used as > a fast front end to an acid compliant

Re: MongoDb vs Solr

2017-08-05 Thread Dave
>> >>>> wunder >>>> Walter Underwood >>>> wun...@wunderwood.org >>>> http://observer.wunderwood.org/ (my blog) >>>> >>>> >>>>> On Aug 4, 2017, at 8:13 PM, David Hastings <dhasti...@wshein.com>

Re: MongoDb vs Solr

2017-08-05 Thread Dave
Also, I'd love to see an example of a many to many relationship in a nosql db >> as you described, since that's a rdbms concept. If it exists in a nosql >> environment I would like to learn how... >> >>> On Aug 4, 2017, at 10:56 PM, Dave <hastings.recurs...@gmail.com&

Re: MongoDb vs Solr

2017-08-04 Thread Dave
Uhm. Dude are you drinking? 1. Lucidworks would never say that. 2. Maria is not a json +MySQL. Maria is a fork of the last open source version of MySQL before oracle bought them 3. Walter is 100% correct. Solr is search. The only complex data structure it has is an array. Something like mongo

Re: MongoDb vs Solr

2017-08-04 Thread Dave
One's a search engine and the other is a nosql db. They're nothing alike and are completely different tools for completely different jobs. > On Aug 4, 2017, at 7:16 PM, Francesco Viscomi wrote: > > Hi all, > why i have to choose solr if mongoDb is easier to learn and to

Re: Move index directory to another partition

2017-08-01 Thread Dave
To add to this, not sure if solr cloud uses it, but you're going to want to destroy the write.lock file as well > On Aug 1, 2017, at 9:31 PM, Shawn Heisey wrote: > >> On 8/1/2017 7:09 PM, Erick Erickson wrote: >> WARNING: what I currently understand about the limitations

Re: solr cloud vs standalone solr

2017-07-29 Thread Dave
There is no solid rule. Honestly stand alone solr can handle quite a bit, I don't think there's a valid reason to go to cloud unless you are starting from scratch and want to use the newest buzz word, stand alone can handle well over half a terabyte index at sub second speeds all day long. >

Re: Network segmentation of replica

2017-07-06 Thread Dave
Sorry that should have read have not tested in solr cloud. > On Jul 6, 2017, at 6:37 PM, Dave <hastings.recurs...@gmail.com> wrote: > > I have tested that out in solr cloud, but for solr master slave replication > the config sets will not go without a reload,

Re: Network segmentation of replica

2017-07-06 Thread Dave
I have tested that out in solr cloud, but for solr master slave replication the config sets will not go without a reload, even if specified in the slave settings. > On Jul 6, 2017, at 5:56 PM, Erick Erickson wrote: > > I'm not entirely sure what happens if the

Re: Solr Web Crawler - Robots.txt

2017-06-01 Thread Dave
And I mean that in the context of stealing content from sites that explicitly declare they don't want to be crawled. Robots.txt is to be followed. > On Jun 1, 2017, at 5:31 PM, David Choi wrote: > > Hello, > > I was wondering if anyone could guide me on how to crawl

Re: Solr Web Crawler - Robots.txt

2017-06-01 Thread Dave
If you are not capable of even writing your own indexing code, let alone crawler, I would prefer that you just stop now. No one is going to help you with this request, at least I'd hope not. > On Jun 1, 2017, at 5:31 PM, David Choi wrote: > > Hello, > > I was

Re: Solr in NAS or Network Shared Drive

2017-05-26 Thread Dave
This could be useful in a space expensive situation, although the reason I wanted to try it is multiple solr instances in one server reading one index on the ssd. This use case where on the nfs still leads to a single point of failure situation on one of the most fragile parts of a server, the

Re: Best practices for backup & restore

2017-05-16 Thread Dave
I think it depends what you are backing up and restoring from. Hardware failure? Accidental delete? For my use case my master indexer stores the index on a San with daily snap shots for reliability, then my live searching master is on a San as well, my live slave searchers are all on SSD

Re: SOLR as nosql database store

2017-05-08 Thread Dave
You will want to have both solr and a sql/nosql data storage option. They serve different purposes > On May 8, 2017, at 10:43 PM, bharath.mvkumar > wrote: > > Hi All, > > We have a use case where we have mysql database which stores documents and > also some of

Re: Filter Facet Query

2017-04-17 Thread Dave
facet.mincount is what you're looking for to get non 0 facets > On Apr 17, 2017, at 6:51 PM, Furkan KAMACI wrote: > > My query: > > /select?facet.field=research=on=content:test > > Q1) Facet returns research values with 0 counts which has a research value > that is not from
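The parameter being referred to is most likely Solr's `facet.mincount`, which drops facet values whose count is below the given number. A small Python sketch of building such a request follows. The `research` field and the `content:test` query come from the quoted question; the host, the core name `mycore`, and the helper name `facet_query` are hypothetical.

```python
from urllib.parse import urlencode


def facet_query(base_url, q, facet_field, mincount=1):
    """Build a /select URL that only returns facet values with count >= mincount."""
    params = {
        "q": q,
        "facet": "on",
        "facet.field": facet_field,
        "facet.mincount": mincount,
    }
    return base_url + "/select?" + urlencode(params)


# Hypothetical host and core name, for illustration only.
print(facet_query("http://localhost:8983/solr/mycore", "content:test", "research"))
```

With `mincount=1` the zero-count `research` values from the question would no longer appear in the facet list.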

Re: Phrase Fields performance

2017-04-01 Thread Dave
Maybe commongrams could help this but it boils down to speed/quality/cheap. Choose two. Thanks > On Apr 1, 2017, at 10:28 AM, Shawn Heisey wrote: > >> On 3/31/2017 1:55 PM, David Hastings wrote: >> So I un-commented out the line, to enable it to go against 6 important >>

Re: Facet? Search problem

2017-03-13 Thread Dave
https://wiki.apache.org/solr/FieldCollapsing > On Mar 13, 2017, at 9:59 PM, Dave <hastings.recurs...@gmail.com> wrote: > > Perhaps look into grouping on that field. > >> On Mar 13, 2017, at 9:08 PM, Scott Smith <ssm...@mainstreamdata.com> wrote: >> >

Re: Facet? Search problem

2017-03-13 Thread Dave
Perhaps look into grouping on that field. > On Mar 13, 2017, at 9:08 PM, Scott Smith wrote: > > I'm trying to solve a search problem and wondering if facets (or something > else) might solve the problem. > > Let's assume I have a bunch of documents (100 million+).

Re: SOLR JOIN

2017-02-28 Thread Dave
That seems difficult if not impossible. The joins are just complex queries, with the same data set. > On Feb 28, 2017, at 11:37 PM, Nitin Kumar wrote: > > Hi, > > Can we use join query for more than 2 cores in solr. If yes, please provide > reference or example. >

Re: solr warning - filling logs

2017-02-26 Thread Dave
Can you please elaborate? > Sure, will try to move out to external zookeeper > >> On Sun, Feb 26, 2017 at 7:07 PM Dave <hastings.recurs...@gmail.com> wrote: >> >> You shouldn't use the embedded zookeeper with solr, it's just for >> development not anywher

Re: solr warning - filling logs

2017-02-26 Thread Dave
You shouldn't use the embedded zookeeper with solr, it's just for development not anywhere near worthy of being out in production. Otherwise it looks like you may have a port scanner running. In any case don't use the zk that comes with solr > On Feb 26, 2017, at 6:52 PM, Satya Marivada

Re: Question about best way to architect a Solr application with many data sources

2017-02-21 Thread Dave
; you could re-read and send to Solr. > > Best, > Erick > >> On Tue, Feb 21, 2017 at 5:17 PM, Dave <hastings.recurs...@gmail.com> wrote: >> B is a better option long term. Solr is meant for retrieving flat data, >> fast, not hierarchical. That's what a database

Re: Question about best way to architect a Solr application with many data sources

2017-02-21 Thread Dave
B is a better option long term. Solr is meant for retrieving flat data, fast, not hierarchical. That's what a database is for and trust me you would rather have a real database on the end point. Each tool has a purpose, solr can never replace a relational database, and a relational database

Re: Issues with Solr Morphline reading RFC822 files

2017-02-13 Thread Dave
Can't see what's color coded in the email. > On Feb 13, 2017, at 5:35 PM, Anatharaman, Srinatha (Contractor) > wrote: > > Hi, > > I am loading email files which are in RFC822 format into SolrCloud using Flume > But some meta data of the emails is not

Re: Solr Data Import Handler

2017-02-12 Thread Dave
That sounds pretty much like a hack. So if two imports happen at the same time they have to wait for each other? > On Feb 12, 2017, at 4:01 PM, Shawn Heisey wrote: > >> On 2/12/2017 10:30 AM, Minh wrote: >> Hi everyone, >> How can i run multithreads of DIH in a cluster for

Re: Configuring Solr for Maximum Concurrency

2016-12-29 Thread Dave Seltzer
fundamental issues in Solr's performance Or maybe I missed something stupid at the OS level. Sigh. Many thanks for all the help! -Dave On Wed, Dec 28, 2016 at 7:11 PM, Erick Erickson <erickerick...@gmail.com> wrote: > You'll see some lines with three different times in them, &q

Re: Configuring Solr for Maximum Concurrency

2016-12-28 Thread Dave Seltzer
cs] [Times: user=4.60 sys=0.00, real=1.24 secs] Is there something I should be grepping for in this enormous file? Many thanks! -Dave On Wed, Dec 28, 2016 at 12:44 PM, Erick Erickson <erickerick...@gmail.com> wrote: > Threads are usually a container parameter I think. True, Solr wa

Re: Configuring Solr for Maximum Concurrency

2016-12-28 Thread Dave Seltzer
, 2016 at 12:42 PM, Pablo Anzorena <anzorena.f...@gmail.com> wrote: > Dave, > > there is something similar like MAX_CONNECTIONS and > MAX_CONNECTIONS_PER_HOST which control the number of connections. > > Are you leaving open the connection to zookeeper after you estab

Re: Configuring Solr for Maximum Concurrency

2016-12-28 Thread Dave Seltzer
are dead based on the fact that responses are so very sluggish. You've mentioned lots of timeouts, but are there any settings which control the number of available threads? Or is this something which is largely handled automagically? Many thanks! -Dave On Wed, Dec 28, 2016 at 11:56 AM, Erick Erickson

Configuring Solr for Maximum Concurrency

2016-12-28 Thread Dave Seltzer
! -Dave

Re: Cloud Behavior when using numShards=1

2016-12-27 Thread Dave Seltzer
Thanks Erick, That's pretty much where I'd landed on the issue. To me Solr Cloud is clearly the preferable option here - especially when it comes to indexing and cluster management. I'll give "preferLocalShards" a try and see what happens. Many thanks for your in-depth analysis! -

Re: Cloud Behavior when using numShards=1

2016-12-27 Thread Dave Seltzer
likely to be distributed in this fashion? -Dave q=_query_:"{!edismax mm=5}hashTable_0:359079936 hashTable_1:440999735 hashTable_2:1376147226 hashTable_3:35668745 hashTable_4:671810129 hashTable_5:536885545 hashTable_6:453337089 hashTable_7:1279281410 hashTable_8:772478009 hashTable_9:8060

Cloud Behavior when using numShards=1

2016-12-27 Thread Dave Seltzer
ERVER1 would be proxying requests to SERVER3 in a situation where the sf_fingerprints index is completely present on the local system. Is this a situation where I should be using generic replication rather than Cloud? Many thanks! -Dave

Cloud Behavior when using

2016-12-27 Thread Dave Seltzer
rather than Cloud? Dave Seltzer <dselt...@tveyes.com> Chief Systems Architect TVEyes (203) 254-3600 x222

Re: Poor Solr Cloud Query Performance against a Small Dataset

2016-11-03 Thread Dave Seltzer
Good tip Rick, I'll dig in and make sure everything is set up correctly. Thanks! -D Dave Seltzer <dselt...@tveyes.com> Chief Systems Architect TVEyes (203) 254-3600 x222 On Wed, Nov 2, 2016 at 9:05 PM, Rick Leir <rl...@leirtech.com> wrote: > Here is a wild guess. Whenever I

Poor Solr Cloud Query Performance against a Small Dataset

2016-11-01 Thread Dave Seltzer
this: subFingerprintId I've included some sample output below. I wasn't sure if this was a matter of changing the routing key in the collections system, or if this is a more fundamental problem with the way Term Frequencies are counted in a Solr Cloud environment. Many thanks! -Dave

RE: Performance of facet contain search in 5.2.1

2015-07-22 Thread Lo Dave
Yes. I am going to provide autocomplete with facet count as rank.i.e. when yours input owe a duty, the system will suggest xxx owe a duty yyy with highest count. Thanks. Dave Date: Wed, 22 Jul 2015 14:35:40 +0100 Subject: Re: Performance of facet contain search in 5.2.1 From: benedetti.ale
