Re: SolrCloud exclusive features

2019-02-15 Thread David Hastings
>streaming expressions are only available in SolrCloud mode and not in Solr master-slave mode? yes, and its annoying as there are features of solr cloud I do not like. as far as a comprehensive list, that I do not know but would be interested in one as well On Thu, Feb 14, 2019 at 5:07 PM Arnold

Re: edismax: sorting on numeric fields

2019-02-14 Thread David Hastings
Not clearly understanding your question here. if your query is q=kind:animal weight:50 you will get no results, as nothing matches (assuming a q.op of AND) On Thu, Feb 14, 2019 at 4:06 PM Nicolas Paris wrote: > Hi > > I have a numeric field (say "weight") and I d'like to be able to get > resul

Re: Solr Index Size after reindex

2019-02-14 Thread David Hastings
The other thing I would be curious about is in your reindexing process, do you clear out the entire index before hand? if so perhaps there is content missing/moved On Thu, Feb 14, 2019 at 11:07 AM Erick Erickson wrote: > Basically, this is not possible ;). Therefore there's something I > don't

Re: Solr relevancy score different on replicated nodes

2019-01-29 Thread David Hastings
Maybe instead of using the solr score in your metrics, find a way to use the documents location in the results? you can never trust the score to be consistent, its constantly changing as the indexes changes On Tue, Jan 29, 2019 at 1:29 PM Ashish Bisht wrote: > Hi Erick, > > Our business wanted

Re: [SPAM] Re: Per-field slop param in eDisMax

2019-01-24 Thread David Hastings
Also the order matters, it would be a different result set than "a tnf"~2 On Thu, Jan 24, 2019 at 10:53 AM David Hastings < hastings.recurs...@gmail.com> wrote: > it allows two words or less to be matched in a phrase in-between "tnf" and > "a" > so it

Re: [SPAM] Re: Per-field slop param in eDisMax

2019-01-24 Thread David Hastings
it allows two words or less to be matched in a phrase in-between "tnf" and "a" so it will match "tnf a" "tnf aword1 a" "tnf aword1 aword2 a" On Thu, Jan 24, 2019 at 10:45 AM Danilo Tomasoni wrote: > And what does > > q: f2:"tnf α"~2 > > f.f2.qf: titles study_brief_title > > > means with
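A rough way to picture the slop behaviour described above is a toy in-order matcher. This is only a sketch under simplifying assumptions: the name `matches_in_order` is hypothetical, tokens are split on whitespace, and only in-order matches are checked. Lucene's real sloppy phrase matching works on position moves, so this is not how Solr implements it.

```python
def matches_in_order(text, first, second, slop):
    """Toy model: True if `second` follows `first` with at most
    `slop` other tokens in between (whitespace tokenization only)."""
    tokens = text.lower().split()
    for i, tok in enumerate(tokens):
        if tok != first:
            continue
        for j in range(i + 1, len(tokens)):
            # j - i - 1 is the number of tokens sitting between the two terms
            if tokens[j] == second and (j - i - 1) <= slop:
                return True
    return False


if __name__ == "__main__":
    for sample in ("tnf a", "tnf aword1 a", "tnf aword1 aword2 a", "tnf w1 w2 w3 a"):
        print(sample, "->", matches_in_order(sample, "tnf", "a", 2))
```

With a slop of 2, the first three sample strings match and the last one (three words in between) does not.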

Re: PC hang while running Solr cloud instance?

2018-12-30 Thread David Hastings
1. Each pc? How many are you talking about? 2. Why are you using shards? On Dec 30, 2018, at 4:11 PM, John Milton mailto:johnmilton@gmail.com>> wrote: Wish you happy new year to you all. Hi, I had run my Solr cloud instance 7.5 on my Windows OS. It has 100 shards with 4 replication. My PC

shingles + stop words

2018-12-07 Thread David Hastings
Hey there, I have a field type defined as such: but whats happening is the shingles being returned are often times " nonstopword" with the space being defined as the filter token. I was hoping that the ManagedStopFilterFactory would have removed the stop words

Re: Case insensitive query for fetching facets

2018-12-07 Thread David Hastings
you could change your indexer to index the values to a dynamic field *_ci for each of the facets. ie, your facet is organization. index to the string field, and also index it to the dynamic organization_ci field but there will not be a short cut way of doing this in the schema itself On Fri, Dec
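The indexer-side approach suggested above might look roughly like this. It is a minimal sketch, assuming documents are plain dicts before they are sent to Solr and that a `*_ci` dynamic field exists in the schema; the helper name `add_ci_fields` is hypothetical.

```python
def add_ci_fields(doc, facet_fields):
    """Return a copy of `doc` with a lowercased `<name>_ci` twin
    for every facet field that is present in the document."""
    out = dict(doc)
    for name in facet_fields:
        if name not in doc:
            continue
        value = doc[name]
        if isinstance(value, list):
            # multi-valued field: lowercase each value
            out[name + "_ci"] = [str(v).lower() for v in value]
        else:
            out[name + "_ci"] = str(value).lower()
    return out


if __name__ == "__main__":
    print(add_ci_fields({"id": "1", "organization": "ACME Corp"}, ["organization"]))
```

Faceting would then run against `organization_ci`, while the original `organization` string field keeps the display casing.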

Re: solr crashes

2018-12-04 Thread David Hastings
you can set the -Xms value lower on startup but your still going to run into this issue. Really you just need to go buy more ram, hardware is cheap so you may as well max out the number of sockets for memory and get a couple TB sized SSD's. On Tue, Dec 4, 2018 at 10:47 AM Danilo Tomasoni wrote:

Re: Moving Solr index from Staging to Production

2018-11-28 Thread David Hastings
you just set up the solr install on the production server as a slave to your current install and hit the replicate button from the admin interface on the production server On Wed, Nov 28, 2018 at 1:34 PM Arunan Sugunakumar wrote: > Hi, > > I have deployed Solr 7.2 in a staging server in standalo

Re: Solr Cloud configuration

2018-11-20 Thread David Hastings
You're not the first one who had this idea ;). > > Whether those security issues are still valid is another question I > suppose. > > Best, > Erick > On Tue, Nov 20, 2018 at 11:01 AM David Hastings > wrote: > > > > Thanks, researching that now, but this

Re: Solr Cloud configuration

2018-11-20 Thread David Hastings
terstr. 14 > 48341 Altenberge > GERMANY > Tel.: (+49) 25 71 - 99 20 170 > Fax: (+49) 25 71 - 99 20 171 > > Umsatzsteuer ID DE259181123 > > Informieren Sie sich über unser gesamtes Leistungsspektrum unter > www.pure-host.de > Get our whole services at www.pure-host.de >

Solr Cloud configuration

2018-11-20 Thread David Hastings
I cant seem to find the documentation on how to actually edit the schema file myself, everything seems to lead me to using an API to add fields and stop words etc. this is more or less obnoxious, and the admin api for adding fields/field types is not exactly functional. is there a guide or someth

Re: Sort index by size

2018-11-19 Thread David Hastings
Also a full import, assuming the documents were already indexed, will just double your index size until a merge/optimize is run, since you are just marking a document as deleted, not taking back any space, and then adding another completely new document on top of it. On Mon, Nov 19, 2018 at 10:36 A

Re: Extracting important multi term phrases from the text

2018-11-16 Thread David Hastings
; > @David I am using SKG through the plugin. So it is a POST request with > query in body. I haven't yet upgraded to version 7.5. > > Thank you all for the help! > > Regards, > Pratik > > On Fri, Nov 16, 2018 at 8:36 AM David Hastings < > hastings.recurs...@

Re: Extracting important multi term phrases from the text

2018-11-16 Thread David Hastings
Which function of the SKG are you using? significantTerms? On Thu, Nov 15, 2018 at 7:09 PM Alexandre Rafalovitch wrote: > I think the underscore actually comes from the Shingles (parameter > fillerToken). Have you tried setting it to empty string? > > Regards, >Alex. > On Thu, 15 Nov 2018 a

Re: 3 Solr instances different ports

2018-11-15 Thread David Hastings
To add to the concerns above, running on the same machine, using the same disk, is going to be really detrimental to performance..but for a prototype its fine On Wed, Nov 14, 2018 at 4:10 PM Shawn Heisey wrote: > On 11/14/2018 7:58 AM, cristian.tiu...@gmail.com wrote: > > I want to have 3 differ

index size, stored vs indexed

2018-11-14 Thread David Hastings
Was wondering if anyone has an idea of the size ratio of indexed-only vs. stored-and-indexed fields in Solr 7.x. I was going to run some testing myself later today but was curious what others have seen in this regard. Thanks, David

Re: Indexing vs Search node

2018-11-09 Thread David Hastings
I personally like standalone solr for this reason, i can tune the indexing "master" for doing nothing but taking in documents and that way the slaves dont battle for resources in the process. On Fri, Nov 9, 2018 at 3:10 PM Erick Erickson wrote: > Fernando: > > I'd phrase it more strongly than Sh

Re: SolrCloud scaling/optimization for high request rate

2018-10-26 Thread David Hastings
Would adding the docValues in the schema, but not reindexing, cause errors? IE, only apply the doc values after the next reindex, but in the meantime keep functioning as there were none until then? On Fri, Oct 26, 2018 at 2:15 PM Toke Eskildsen wrote: > Sofiya Strochyk wrote: > > 5. Yes, docVa

Re: Score relevancy

2018-10-25 Thread David Hastings
Is this RANK value stored as a float/integer, and what's the range? One idea is you could use edismax and have a possibly really long boost query: RANK:[1 TO 2]^10 OR RANK:[3 TO 4]^9 but this isn't actually a great idea and gets sloppy fast. You could apply boost at index time, or a function query

Re: Solr 7.5/skg

2018-10-25 Thread David Hastings
Yup, thats the one. Thanks. On Thu, Oct 25, 2018 at 11:54 AM Alexandre Rafalovitch wrote: > Probably this one: https://issues.apache.org/jira/browse/SOLR-9418 > > I am not sure if that's documented yet. > > Regards, >Alex. > On Thu, 25 Oct 2018 at 11:0

Re: Solr 7.5/skg

2018-10-25 Thread David Hastings
Although another of Treys examples, the semantic query parser, Doesn't seem to have documentation unless im missing something? On Thu, Oct 25, 2018 at 10:41 AM David Hastings < hastings.recurs...@gmail.com> wrote: > Wow, thanks for that. Will do some research and come back with th

Re: Solr 7.5/skg

2018-10-25 Thread David Hastings
, slides 19+ > > But it is not a fully-supported usage, due to > https://issues.apache.org/jira/browse/SOLR-12569 . > > So, at your own risk. > > Regards, >Alex. > On Thu, 25 Oct 2018 at 10:32, David Hastings > wrote: > > > > Another skg question. t

Re: Solr 7.5/skg

2018-10-25 Thread David Hastings
wrote: > That's being worked on as well. We've migrated the documentation from > Confluence to standalone setup, so not all the pieces are in place > yet. > > Regards, >Alex. > On Thu, 25 Oct 2018 at 10:12, David Hastings > wrote: > > > > Than

Re: Solr 7.5/skg

2018-10-25 Thread David Hastings
s > > Or, as a second option, > > http://lucene.apache.org/solr/guide/7_5/stream-source-reference.html#significantterms > > Regards, >Alex. > On Thu, 25 Oct 2018 at 08:47, David Hastings > wrote: > > > > Hey all, I was going throught the Solr 7.5 docum

Solr 7.5/skg

2018-10-25 Thread David Hastings
Hey all, I was going through the Solr 7.5 documentation: http://lucene.apache.org/solr/guide/7_5/index.html and it appears to be incomplete. Last week Trey Grainger gave a presentation about the skg plugin, and said it was now included in the 7.5 distribution. There are no references to using i

Re: More Like This Query problems

2018-10-18 Thread David Hastings
Make sure your query has an “AND NOT id:your doc id” Also be certain there are other documents that will meet your criteria for a test case. Remember it’s unique words in your core/collection On Oct 18, 2018, at 2:43 PM, John Bickerstaff mailto:j...@johnbickerstaff.com>> wrote: All, I am havi

Re: Multiple Queries per request

2018-10-02 Thread David Hastings
perhaps you could do an OR query with the two requirements, and sort by an identifier that makes each result set unique from the other On Tue, Oct 2, 2018 at 11:05 AM Greenhorn Techie wrote: > Shamik, > > Wondering how to get this working? As I mentioned, my data is different for > each of the w

Re: to cloud or not to cloud

2018-09-26 Thread David Hastings
Agree with Walter. I personally really like the master slave set up for my use cases. David J. Hastings | Lead Developer dhasti...@wshein.com | 716.882.2600 x 176 William S. Hein & Co., Inc. 2350 North Forest Road | Getzville, NY 14068 www.wshein.com/contact-us

Re: solr, multiple ports

2018-09-12 Thread David Hastings
modified to use both servers On Sep 12, 2018, at 12:15 PM, Christopher Schultz mailto:ch...@christopherschultz.net>> wrote: -BEGIN PGP SIGNED MESSAGE- Hash: SHA256 David, On 9/12/18 11:03 AM, David Hastings wrote: is there a way to start the default solr installation on more th

solr, multiple ports

2018-09-12 Thread David Hastings
is there a way to start the default solr installation on more than one port? Only thing I could find was adding another connector to Jetty, via https://stackoverflow.com/questions/6905098/how-to-configure-jetty-to-listen-to-multiple-ports however the default solr start command takes the -p parame

Re: Null Pointer Exception without details on Update in schemaless 7.4

2018-09-05 Thread David Hastings
Are you trying to push a java hash to solr? On Sep 5, 2018, at 10:32 PM, deniz mailto:denizdurmu...@gmail.com>> wrote: I have set up a schemaless solr (cloud) and have been trying to test the updates. as DIH is not going through field guessing, I have wrote a small piece of code to query data fr

edismax and booleans

2018-08-22 Thread David Hastings
having an issue where if i use edismax and search: (creator:(Michael Carrier)) it goes to the default operator of AND so gets results where creator has both those words in it. however when a boolean is present: *((pharmacy)) AND (creator:(Michael Carrier))* *it seems to revert to OR and do a *

Re: Section symbol, ignore in some queries but not others?

2018-07-25 Thread David Hastings
se but with a token gap, right? Like an eDisMax slop? > http://lucene.apache.org/solr/guide/7_4/the-extended-dismax- > query-parser.html > > Regards, > Alex. > > On 25 July 2018 at 14:47, David Hastings > wrote: > > Hey all. have a situation that seems pretty

Section symbol, ignore in some queries but not others?

2018-07-25 Thread David Hastings
Hey all. have a situation that seems pretty rough. currently in our data we have a lot of sentences like this: elements comprise the "stuff" of the tax. 3 Reg. § 1.901-2(a)(2). 4 Only non-Saudis are subject to the

Re: Solr 7 replication speed cap?

2018-07-23 Thread David Hastings
Actually this could be ignored, I think solr 5 used Mb in the admin interface and solr 7 is using MB, correct? On Mon, Jul 23, 2018 at 9:33 AM, David Hastings < hastings.recurs...@gmail.com> wrote: > Hey all, just set up a tradition solr slave to my indexing master > alongside a sol

Solr 7 replication speed cap?

2018-07-23 Thread David Hastings
Hey all, just set up a traditional Solr slave to my indexing master alongside a Solr 5 instance. On Solr 5 we were getting about 100 MB/sec over our interface, and it would divide accordingly for how many slaves were replicating, i.e. 50 MB each if two slaves were replicating, 33 for three, so on a so f

Re: Document Count Difference Between Solr Versions 4.7 and 7.3

2018-07-19 Thread David Hastings
monitor the logging on the admin interface while indexing. also make sure to add a commit when done to get the docs in the collection before comparing the document counts On Thu, Jul 19, 2018 at 10:30 AM, THADC wrote: > Hi, > > I performed a bulk reindex against one of our larger databases for

Re: Sorting on ip address

2018-06-18 Thread David Hastings
Sorry, I mean to store an IP address as a numeric value, example in MySQL: https://dev.mysql.com/doc/refman/8.0/en/miscellaneous-functions.html#function_inet-aton On Mon, Jun 18, 2018 at 12:35 PM, root23 wrote: > I am sorry i am not sure what you mean by store as atom. Is that an > fieldType > in solr
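The same conversion that MySQL's `INET_ATON` performs can be done in the indexing code before the value is sent to a numeric Solr field. A minimal sketch, assuming IPv4 addresses only; the helper name `ip_to_int` is hypothetical.

```python
import ipaddress


def ip_to_int(ip):
    """Convert a dotted-quad IPv4 string to its integer value,
    which sorts numerically the way the addresses should."""
    return int(ipaddress.IPv4Address(ip))


if __name__ == "__main__":
    ips = ["10.0.0.10", "10.0.0.9", "9.255.255.255"]
    print(sorted(ips))                 # plain string sort puts 10.0.0.10 before 10.0.0.9
    print(sorted(ips, key=ip_to_int))  # numeric sort gives the expected order
```

Indexing the integer into a long field (alongside the original string for display) would let Solr sort on it directly.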

Re: Solr for Content Management

2018-06-07 Thread David Hastings
When you are sending updates you are adjusting the segments which take them out of memory and the index becomes "cold" until it gets enough searches to cache the various aspects of the index. On Thu, Jun 7, 2018 at 2:10 PM, Moenieb Davids wrote: > Hi All, > > Background: > I am currently testing

Re: Different solr score between stand alone vs cloud mode solr

2018-06-07 Thread David Hastings
Also the score is a fluid number, you shouldnt use the score for any real reason aside from seeing that the documents are in the right order in relation to the scores from the other documents in the result set. or the occasional condition where two results switch in place from one to the other bec

Re: change in the Standard Query Parser behavior when migrating from Solr 5 to 7.

2018-05-09 Thread David Hastings
sow=true made 7 mimic 5. On Wed, May 9, 2018 at 3:57 PM, Shawn Heisey wrote: > On 5/9/2018 1:25 PM, David Hastings wrote: > > https://pastebin.com/0QUseqrN > > > > here is mine for an example with the exact same behavior > > Can you try the query in the Analysis

Re: change in the Standard Query Parser behavior when migrating from Solr 5 to 7.

2018-05-09 Thread David Hastings
id rather not at least on my part, but in both cases i have: and text as my default field, changed from text_general On Wed, May 9, 2018 at 3:43 PM, Shawn Heisey wrote: > On 5/9/2018 1:25 PM, David Hastings wrote: > > https://pastebin.com/0QUseqrN > > Can you provide th

Re: change in the Standard Query Parser behavior when migrating from Solr 5 to 7.

2018-05-09 Thread David Hastings
https://pastebin.com/0QUseqrN here is mine for an example with the exact same behavior On Wed, May 9, 2018 at 3:14 PM, Shawn Heisey wrote: > On 5/9/2018 12:39 PM, Piyush Kumar Nayak wrote: > > we have recently upgraded from Solr5 to Solr7. I'm running into a change > of behavior that I cannot f

Re: change in the Standard Query Parser behavior when migrating from Solr 5 to 7.

2018-05-09 Thread David Hastings
Strange, I have the exact same results. What's more interesting is the analyzer shows identical output for both 5 and 7, so it's definitely a change in the LuceneQParser On Wed, May 9, 2018 at 2:39 PM, Piyush Kumar Nayak wrote: > we have recently upgraded from Solr5 to Solr7. I'm running into a change > o

Re: Solr OpenNLP named entity extraction

2018-04-17 Thread David Hastings
Did you send a commit after you sent the document? On Tue, Apr 17, 2018 at 8:23 AM, Alexey Ponomarenko wrote: > Hi once more I am trying to implement named entities extraction using this > manual > https://lucene.apache.org/solr/7_3_0//solr-analysis- > extras/org/apache/solr/update/processor/Ope

Re: How to use Tika (Solr Cell) to extract content from HTML document instead of Solr's MostlyPassthroughHtmlMapper ?

2018-04-10 Thread David Hastings
I actually used solr 5.x, the more like this features, and a subset of human tagged data (about 10%) to apply subject coding with around a 95% accuracy rate to over 2 million documents, so it is definitely doable On Tue, Apr 10, 2018 at 10:40 AM, Alexandre Rafalovitch wrote: > I know it was a jo

Re: solr 5.2->7.2, suggester failure

2018-04-03 Thread David Hastings
stack. > > https://issues.apache.org/jira/browse/LUCENE-7914 > > The commit being: > https://github.com/apache/lucene-solr/commit/ > 7dde798473d1a8640edafb41f28ad25d17f25a2d > > Kevin Risden > > On Tue, Apr 3, 2018 at 1:45 PM, David Hastings < > hastings.recurs..

Re: solr 5.2->7.2, suggester failure

2018-04-03 Thread David Hastings
ake many minutes. > > Best, > Erick > > On Tue, Apr 3, 2018 at 11:28 AM, David Hastings > wrote: > > Hey all, I recently got a 7.2 instance up and running, and it seems to be > > going well however, I have ran into this when creating one of my indexes, > > and was w

solr 5.2->7.2, suggester failure

2018-04-03 Thread David Hastings
Hey all, I recently got a 7.2 instance up and running, and it seems to be going well however, I have ran into this when creating one of my indexes, and was wondering if anyone had a quick idea right off the top of their head. solrconfig: fixspell FuzzyLookupFactory string

Re: What is creating certain fields?

2018-03-07 Thread David Hastings
those are dynamic fields. On Wed, Mar 7, 2018 at 12:43 AM, Keith Dopson wrote: > My default query produces this: > > | { > "id":"44419", > "date":["11/13/17 13:18"], > "url":["http://www.someurl.com";], > "title":["some title"], > "content":["some in

Re: [poll] which loadbalancer are you using for SolrCloud

2018-03-02 Thread David Hastings
Ill have to take a look at HAProxy. How much faster than nginx is it? To answer the question, I personally use nginx for load balancing/failovers and its been good, use the same nginx servers to load balance a Galera cluster as well. On Fri, Mar 2, 2018 at 11:09 AM, Shawn Heisey wrote: > On 3/

Re: storing large text fields in a database? (instead of inside index)

2018-02-20 Thread David Hastings
Really depends on what you consider too large, and why the size is a big issue, since most replication will go at about 100 MB/second give or take, and replicating a 300GB index is only an hour or two. What I do for this purpose is store my text in a separate index altogether, and call on that core
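As a back-of-the-envelope check on that estimate, here is a small sketch. It assumes the rate in the message means megabytes per second, uses decimal units (1 GB = 1000 MB), and ignores any overhead; the function name `transfer_seconds` is hypothetical.

```python
def transfer_seconds(size_gb, rate_mb_per_s):
    """Raw transfer time in seconds, assuming 1 GB = 1000 MB
    and a constant sustained rate."""
    return size_gb * 1000 / rate_mb_per_s


if __name__ == "__main__":
    for rate in (100, 50):
        minutes = transfer_seconds(300, rate) / 60
        print(f"300 GB at {rate} MB/s: about {minutes:.0f} minutes")
```

At a sustained 100 MB/s the raw copy of 300 GB is roughly 50 minutes, and at half that rate roughly 100 minutes, which is in line with the "hour or two" figure.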

Re: Index data from mysql DB to Solr - From Scratch

2018-02-17 Thread David Hastings
Your first step is to denormalize your data into a flat data structure. Then index that into your solr instance. Then you’re done On Feb 17, 2018, at 12:16 PM, @Nandan@ mailto:nandanpriyadarshi...@gmail.com>> wrote: Hi Team, I am working on one e-commerce project in which my data is storing int

Re: Index size increases disproportionately to size of added field when indexed=false

2018-02-13 Thread David Hastings
To piggy back on this, what would be the right scenarios to use docvalues='true'? On Tue, Feb 13, 2018 at 1:10 PM, Chris Hostetter wrote: > > : We are using Solr 7.1.0 to index a database of addresses. We have found > : that our index size increases massively when we add one extra field to > :

Re: Solr Replication being flaky (6.2.0)

2018-01-19 Thread David Hastings
This happens to me quite often as well. Generally on the replication admin screen it will say it's downloading a file, but be at 0 or a VERY small KB/sec. Then after a restart of the slave it's back to downloading at 30 to 100 MB/sec. Would be curious if there actually is a solution to this aside

Re: Deliver static html content via solr

2018-01-04 Thread David Hastings
It's really easy, I find, for people to start going down this road. Have to always remind myself of the hammer and nail analogy. Use each tool for its purpose. On Thu, Jan 4, 2018 at 11:27 AM, Walter Underwood wrote: > Why would you even consider putting static HTML in a search engine? You > don

Re: OOM spreads to other replica's/HA when OOM

2017-12-19 Thread David Hastings
We put nginx servers in front of our three standalone Solr servers and three-node Galera cluster. It works very well and the amount of control it gives you is really helpful. On Tue, Dec 19, 2017 at 10:58 AM, Walter Underwood wrote: > > On Dec 19, 2017, at 7:38 AM, Toke Eskildsen wrote: > > >

Re: legacy replication

2017-12-15 Thread David Hastings
s been used for "full sync" in > > SolrCloud when peer sync can't be done. > > > > 2> The new TLOG and PULL replica types are a marriage of old-style > > master/slave and SolrCloud. In particular a PULL replica is > > essentially an old-style slave. A TL

legacy replication

2017-12-15 Thread David Hastings
So i dont step on the other thread, I want to be assured whether or not legacy master/slave/repeater replication will continue to be supported in future solr versions. our infrastructure is set up for this and all the HA redundancies that solrcloud provides we have already spend a lot of time and

Re: Any Insights SOLR Rank tuning tool

2017-12-13 Thread David Hastings
lucidworks fusion is not open source David J. Hastings | Lead Developer dhasti...@wshein.com | 716.882.2600 x 176 William S. Hein & Co., Inc. 2350 North Forest Road | Getzville, NY 14068 www.wshein.com/contact-us From: Doug Turnbull Sent: Wednesday, De

Re: highlight separator

2017-11-22 Thread David Hastings
Thanks, I kind of figured that was the case. On Wed, Nov 22, 2017 at 12:24 PM, Erick Erickson wrote: > I think that's only for the Unified Highlighter, which was introduced > to Lucene in 6.3 and Solr in 6.4. See: SOLR-9708 > > Best, > Erick > > On Wed, Nov 22, 2017

highlight separator

2017-11-22 Thread David Hastings
Im on solr 5.x at the moment, and am trying to get the highlighter to display complete sentences containing the match. setting: 'hl.method' => 'fastVector', 'hl.bs.type' =>'SENTENCE', hasnt been proving to work. is there a way for me to do it in the query itself? thanks -Dave

Semantic Knowledge Graph

2017-11-10 Thread David Hastings
Im looking through the slides from 2016 as well as the presentation again from 2017 and in them there is a user interface for this project, that i dont see as being available so im assuming it was created as a different project, would be nice to have access to that. also, all of the examples in t

App Studio

2017-11-01 Thread David Hastings
Hey all, at the conference it was mentioned that lucidworks would release app studio as its own and free project. is that still the case?

Re: Querying fields that don't exist in every collection

2017-10-18 Thread David Hastings
be more intuitive than returning results but with > different search logic. > > Currently we add placeholder fields to other collections in an alias to > get around this if required, but it's messy. > > -Original Message- > From: David Hastings [mailto:hastings.rec

Re: Querying fields that don't exist in every collection

2017-10-18 Thread David Hastings
I may be wrong here, but what i think is happening is the edismax parser sees a field that doesn't exist, and therefore "believes" all logic you entered into the query is a complete mistake and negates it as such. so NOT becomes the word not and * becomes whitespace. On Wed, Oct 18, 2017 at 3:15

Re: Semantic Knowledge Graph

2017-10-09 Thread David Hastings
s the one you're looking for : > > https://www.slideshare.net/treygrainger/leveraging- > lucenesolr-as-a-knowledge-graph-and-intent-engine > > -Atita > > On Mon, Oct 9, 2017 at 7:44 PM, David Hastings < > hastings.recurs...@gmail.com > > wrote: > > > Hey

Semantic Knowledge Graph

2017-10-09 Thread David Hastings
Hey All, slides from the 2017 Lucene Revolution were put up recently, but unfortunately, the one I have the most interest in, the semantic knowledge graph, has not been put up: https://lucenesolrrevolution2017.sched.com/event/BAwX/the-apache-solr-semantic-knowledge-graph?iframe=no&w=100%&sidebar=

Re: Solr Streaming Question

2017-09-19 Thread David Hastings
I am also curious about this, specifically about indexed/non stored fields. On Tue, Sep 19, 2017 at 3:33 PM, Webster Homer wrote: > Is it possible to use the streaming API to stream documents from a > collection and load them into a new collection? I was thinking that this > would be a great way

Re: Solrcloud configuration

2017-09-19 Thread David Hastings
Did you read the documentation on the schema and the DIH? On Tue, Sep 19, 2017 at 3:04 PM, Shashi Roushan wrote: > Hi All, > > I need your help to configure solrcloud with shards. > I have created collection and shards using solr6 and Zookeeper. Its working > fine. > My problems are: > Where I p

Re: Using SOLR J 5.5.4 with SOLR 6.5

2017-09-19 Thread David Hastings
erwood > wun...@wunderwood.org > http://observer.wunderwood.org/ (my blog) > > > > On Sep 19, 2017, at 9:54 AM, David Hastings < > hastings.recurs...@gmail.com> wrote: > > > > Do you use HttpSolrClient then? > > > > On Tue, Sep 19, 2017 at 12:26 PM, Wal

Re: Using SOLR J 5.5.4 with SOLR 6.5

2017-09-19 Thread David Hastings
rwood > wun...@wunderwood.org > http://observer.wunderwood.org/ (my blog) > > > > On Sep 19, 2017, at 9:05 AM, David Hastings < > hastings.recurs...@gmail.com> wrote: > > > > What about the ConcurrentUpdateSolrServer for solrj? That is what almost

Re: Using SOLR J 5.5.4 with SOLR 6.5

2017-09-19 Thread David Hastings
What about the ConcurrentUpdateSolrServer for solrj? That is what almost all of my indexing code is using for solr 5.x, Its been a while since I experimented with upgrading but i seem to remember having to go to HttpSolrClient and couldnt get the code to compile, so i tabled the experiment for a w

Re: Consecutive calls to a query give different results

2017-09-07 Thread David Hastings
"I am concerned that the same search gives different results after each search. The top document seems to cycle between 3 different documents" if you do debug query on the search, are the scores for the top 3 documents the same or not? you can easily have three documents with the same score, so

Re: query with wild card with AND taking lot of time

2017-08-31 Thread David Hastings
A field:* query always takes a long time, and should be avoided if at all possible. Solr/Lucene is still going to try to rank the documents based on that, even though there's nothing to really rank. Every single document where that field is not empty will have the same score for that part of the

Re: query with wild card with AND taking lot of time

2017-08-31 Thread David Hastings
> > 2) Because all your clauses are more like filters and are ANDed together, > you'll likely get better performance by putting them _each_ in an fq > E.g. > fq=product_identifier_type:DOTCOM_OFFER > fq=abstract_or_primary_product_id:[* TO *] why is this the case? is it just better to have no lo

Re: Solr index getting replaced instead of merged

2017-08-31 Thread David Hastings
>Can anyone tell is it possible to paginate the data using Solr UI? use the start/rows input fields using standard array start as 0, ie start=0, rows=10 start=10, rows=10 start=20, rows=10 On Thu, Aug 31, 2017 at 8:21 AM, Agrawal, Harshal (GE Digital) < harshal.agra...@ge.com> wrote: > Hello A

Re: Index relational database

2017-08-31 Thread David Hastings
when indexing a relational database its generally always best to denormalize it in a view or in your indexing code On Thu, Aug 31, 2017 at 3:54 AM, Renuka Srishti wrote: > Thanks Erick, Walter > But I think join query will reduce the performance. Denormalization will be > the better way than joi

Re: [bulk]: Re: Optimizing Dataimport from Oracle; cursor sharing; changing oracle session parameters

2017-08-15 Thread David Hastings
If you don't want to use your own SolrJ code, why not try many concurrent indexers that index different data sets. So run seven indexers each getting 500,000 rows at the exact same time perhaps. It's a hack, if it works, but if you have the machinery to do it, why not. Or use the deltaquery, but i h

Re: Need help with query syntax

2017-08-10 Thread David Hastings
type:value AND (name:america^1+name:state^1+name:united^1) but in reality what you want to do is use the fq parameter with type:value On Thu, Aug 10, 2017 at 4:36 PM, OTH wrote: > Hello, > > I have the following use case: > > I have two fields (among others); one is 'name' and the other is 'typ
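The `fq` suggestion above can be sketched as a small query-string builder. This is illustrative only: the function name `build_select_query` is hypothetical, the `name` and `type` fields and the `q`/`fq` parameters come from the thread, and no particular Solr host or collection is assumed.

```python
from urllib.parse import urlencode


def build_select_query(name_terms, type_value):
    """Put the scoring part in q and the non-scoring restriction in fq."""
    q = " ".join(f"name:{term}" for term in name_terms)
    params = [("q", q), ("fq", f"type:{type_value}")]
    return "select?" + urlencode(params)


if __name__ == "__main__":
    print(build_select_query(["america", "state", "united"], "value"))
```

Keeping `type:value` in `fq` means it only restricts the result set and does not contribute to the score, and Solr can cache the filter separately from the main query.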

Re: MongoDb vs Solr

2017-08-04 Thread David Hastings
Also, id love to see an example of a many to many relationship in a nosql db as you described, since that's a rdbms concept. If it exists in a nosql environment I would like to learn how... > On Aug 4, 2017, at 10:56 PM, Dave wrote: > > Uhm. Dude are you drinking? > > 1. Lucidworks would neve

Re: mixed index with commongrams

2017-08-03 Thread David Hastings
erwood > wun...@wunderwood.org > http://observer.wunderwood.org/ (my blog) > > > > On Aug 3, 2017, at 8:48 AM, David Hastings > wrote: > > > > Thanks, thats what i kind of expected. still debating whether the space > > increase is worth it, right now Im at .7% of searches tak

Re: mixed index with commongrams

2017-08-03 Thread David Hastings
ction and then use, say, collection > aliasing to make the switch atomically. > > Best, > Erick > > On Thu, Aug 3, 2017 at 8:07 AM, David Hastings > wrote: > > Hey all, I have yet to run an experiment to test this but was wondering > if > > anyone knows the answe

mixed index with commongrams

2017-08-03 Thread David Hastings
Hey all, I have yet to run an experiment to test this but was wondering if anyone knows the answer ahead of time. If i have an index built with documents before implementing the commongrams filter, then enable it, and start adding documents that have the filter/tokenizer applied, will searches that

Re: Arabic words search in solr

2017-08-02 Thread David Hastings
perhaps change your default operator to AND instead of OR if thats what you are expecting for a result On Wed, Aug 2, 2017 at 8:57 AM, mohanmca01 wrote: > Hi Phil Scadden, > > Thank you for your reply, > > we tried your suggested solution by removing hyphen while indexing, but it > was getting

Re: Disadvantages of having many cores

2017-07-28 Thread David Hastings
You're better off just using one core. Perhaps think about pre-processing the logs to "summarize" them into less "documents" I do this and in my situation i summarize things like, user-hits-item, for example. so i find all the times a certain user had hits on a certain item in one day and put tha
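The pre-processing step described above could be sketched like this. It assumes each raw log line has already been parsed into a `(user, item, day)` tuple; the function name `summarize_hits` and the output field names are hypothetical.

```python
from collections import Counter


def summarize_hits(events):
    """Collapse raw (user, item, day) hit events into one summary
    document per distinct combination, with a hit count."""
    counts = Counter(events)
    return [
        {"user": user, "item": item, "day": day, "hits": hits}
        for (user, item, day), hits in sorted(counts.items())
    ]


if __name__ == "__main__":
    raw = [
        ("alice", "doc42", "2017-07-28"),
        ("alice", "doc42", "2017-07-28"),
        ("bob", "doc42", "2017-07-28"),
    ]
    for doc in summarize_hits(raw):
        print(doc)
```

Three raw events become two summary documents here; on real logs the reduction is what keeps a single core manageable.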

Re: Optimize stalls at the same point

2017-07-25 Thread David Hastings
> >> XX:+ParallelRefProcEnabled verbose:gc XX:+PrintHeapAtGC > XX:+PrintGCDetails > >> XX:+PrintGCDateStamps XX:+PrintGCTimeStamps > XX:+PrintTenuringDistribution > >> XX:+PrintGCApplicationStoppedTime Xloggc:/SSD2TB01/solr > >> 5.2.1/server/logs/solr_gc.log >

Re: Optimize stalls at the same point

2017-07-25 Thread David Hastings
; wun...@wunderwood.org > http://observer.wunderwood.org/ (my blog) > > > > On Jul 25, 2017, at 12:03 PM, David Hastings < > hastings.recurs...@gmail.com> wrote: > > > > I am trying to optimize a rather large index (417gb) because its sitting > at > > 28% deleti

Optimize stalls at the same point

2017-07-25 Thread David Hastings
I am trying to optimize a rather large index (417gb) because its sitting at 28% deletions. However when optimizing, it stops at exactly 492.24 GB every time. When I restart solr it will fall back down to 417 gb, and again, if i send an optimize command, the exact same 492.24 GB and it stops optim

Optimization/Merging space

2017-07-05 Thread David Hastings
Hi all, I am curious to know what happens when solr begins a merge/optimize operation, but then runs out of physical disk space. I havent had the chance to try this out yet but I was wondering if anyone knows what the underlying codes response to the situation would be if it happened. Thanks -Dav

Re: Not highlighting "and" and "or"?

2017-06-29 Thread David Hastings
Agreed. Stop words from the moment I started using them caused complaints and problems right off the bat. They may have been implemented less than a week before needing a re-index to fix all the problems they caused. On Thu, Jun 29, 2017 at 4:55 PM, Walter Underwood wrote: > Ultraseek (and Inf

Re: Number of occurrences in Solr Documents

2017-06-29 Thread David Hastings
I am using 5.2 and this works: select?q=*%3A*&wt=csv&indent=true&fl=totaltermfreq(text%2CWORDIWANTTOFIND)&rows=1 On Thu, Jun 29, 2017 at 11:52 AM, Kaushik wrote: > Thanks to Susheel and Shawn. Unfortunately the Solr version we have is Solr > 5.3 and it does not include the totaltermfrequency fea
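To avoid hand-encoding mistakes in a request like the one above, the query string can be generated. A minimal sketch using only the Python standard library; the function name `total_term_freq_query` is hypothetical, and the `text` field and search word are placeholders.

```python
from urllib.parse import urlencode


def total_term_freq_query(field, term):
    """Build a select query string whose fl asks for totaltermfreq(field,term)."""
    params = {
        "q": "*:*",
        "wt": "csv",
        "fl": f"totaltermfreq({field},{term})",
        "rows": 1,
    }
    return "select?" + urlencode(params)


if __name__ == "__main__":
    print(total_term_freq_query("text", "solr"))
```

The comma separating the field name from the term is percent-encoded as `%2C`, and the parentheses and `*:*` are encoded as well, which Solr decodes without trouble.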

Re: How are people using the ICUTokenizer?

2017-06-20 Thread David Hastings
Have you successfully used the shingles with the MoreLikeThis query? Really curious about if this would to return the "interesting Phrases" On Tue, Jun 20, 2017 at 12:01 PM, Davis, Daniel (NIH/NLM) [C] < daniel.da...@nih.gov> wrote: > Joel, > > I think the issue is doing word-breaking according t

Re: Swapping indexes on disk

2017-06-14 Thread David Hastings
I dont have an answer to why the folder got cleared, however i am wondering why you arent using basic replication to do this exact same thing, since solr will natively take care of all this for you with no interruption to the user and no stop/start routines etc. On Wed, Jun 14, 2017 at 2:26 PM, Mi

Re: Score higher if multiple terms match

2017-06-08 Thread David Hastings
> > > >> >> Thanks. > >> >> Both of these are working in my case: > >> >> name:"tv promotion" --> name:"tv promotion" > >> >> name:tv AND name:promotion --> name:tv AND name:promotion > >> >

Re: Score higher if multiple terms match

2017-06-07 Thread David Hastings
sorry, i meant debug query where you would get output like this: "debug": { "rawquerystring": "name:tv promotion", "querystring": "name:tv promotion", "parsedquery": "+name:tv +text:promotion", On Wed, Jun 7,

Re: Score higher if multiple terms match

2017-06-07 Thread David Hastings
well, short answer, use the analyzer to see whats happening. long answer theres a difference between name:tv promotion --> name:tv default_field:promotion name:"tv promotion" --> name:"tv promotion" name:tv AND name:promotion --> name:tv AND name:promotion since your default field most lik
