Re: Getting to grips with auto-scaling

2020-06-09 Thread Tom Evans
ike it might be manually setup and managed collections and aliases for now. Cheers Tom On Mon, Jun 8, 2020 at 12:43 PM Radu Gheorghe wrote: > > Hi Tom, > > To your last two questions, I'd like to vent an alternative design: have > dedicated "hot" and "warm"

Indexing error when using Category Routed Alias

2020-06-09 Thread Tom Evans
uKill.tryProduce(EatWhatYouKill.java:171) at org.eclipse.jetty.util.thread.strategy.EatWhatYouKill.produce(EatWhatYouKill.java:135) at org.eclipse.jetty.util.thread.QueuedThreadPool.runJob(QueuedThreadPool.java:806) at org.eclipse.jetty.util.thread.QueuedThreadPool$Runner.run(QueuedThreadPool.java:938) at java.base/java.lang.Thread.run(Unknown Source) 2020-06-09 02:12:58.507 INFO (qtp90045638-16) [c:products_20200609__CRA__NEW_CATEGORY_ROUTED_ALIAS_WAITING_FOR_DATA_TEMP s:shard1 r:core_node2 x:products_20200609__CRA__NEW_CATEGORY_ROUTED_ALIAS_WAITING_FOR_DATA_TEMP_shard1_replica_n1] o.a.s.c.S.Request [products_20200609__CRA__NEW_CATEGORY_ROUTED_ALIAS_WAITING_FOR_DATA_TEMP_shard1_replica_n1] webapp=/solr path=/update/json/docs params={} status=400 QTime=2422 Cheers Tom

Getting to grips with auto-scaling

2020-06-05 Thread Tom Evans
he colder shards? It seems to add a lot of complexity - should I just instead think that they aren't getting queried much, so won't be using up cache space that the hot shards will be using. Disk space is pretty cheap after all (total size for "items" + "lists" is under 60GB). Cheers Tom

Outdated information on JVM heap sizes in Solr 8.3 documentation?

2020-02-14 Thread Tom Burton-West
uction.html#memory-and-gc-settings; "...values between 10 and 20 gigabytes are not uncommon for production servers" Are "freeze the world" pauses still an issue with modern JVMs? Is it still advisable to avoid heap sizes over 2GB? Tom https://www.hathitrust.org/blogs/large-scale-search

mapping and tuning payloads in Solr 8

2020-02-12 Thread Burgmans, Tom
e the override the scorePayload method in WKSimilarity (it is removed from TFIDFSimilarity). I wonder what alternatives there are for mapping strings payload to floats and use them in a tunable formula for boosting. Thanks, Tom Burgmans

UAX29 URL Email Tokenizer not working as expected

2019-05-06 Thread Tom Van Cuyck
hen(s) are preserved." So I expect "ABC-123" to remain "ABC-123" However the term is split in 2 separate tokens "ABC" and "123". Same for "AB12-CD34" --> "AB12" and "CD34" etc... Is this behavior to be expected? Or is there a w

RE: Multiplicative Boosts broken since 7.3 (LUCENE-8099)

2019-02-13 Thread Burgmans, Tom
": "4.0 = product of:\n 1.0 = boost\n 4.0 = product of:\n 1.0 = *:*\n4.0 = sum(float(price)=0.0,const(4))\n" } EXPLAIN and score are not consistent. Best regards Tom -Original Message- From: Tobias Ibounig [mailto:t.ibou...@netconomy.net] Sent: Tuesday 22 January

Limit facet terms based on a substring using the JSON facet API

2019-01-29 Thread Tom Van Cuyck
can't find anything in the official documentation. Kind regards, Tom

Re: loadOnStartup=false doesn't appear to work for Solr 6.6

2018-08-17 Thread Tom Burton-West
". So the problem is operator error. I didn't think about how the core selector actually sends a query to the core to get stats, which of course starts the core. Tom On Fri, Aug 17, 2018 at 12:18 PM, Erick Erickson wrote: > Tom: > > That hasn't been _intentionally_ changed. H

loadOnStartup=false doesn't appear to work for Solr 6.6

2018-08-17 Thread Tom Burton-West
to do/set? Tom

Re: Can the export handler be used with the edismax or dismax query handler

2018-07-29 Thread Tom Burton-West
only a few seconds. Tom On Sat, Jul 28, 2018 at 4:25 AM, Mikhail Khludnev wrote: > Tom, > Do you say you don't need rank results or you don't need to export score? > If the former is true, you can just put edismax to fq. > Just a note: using cursor mark with the score may cau

Re: Can the export handler be used with the edismax or dismax query handler

2018-07-27 Thread Tom Burton-West
query matched. Should I try to write some code to rewrite the logic of the edismax query with a complex boolean query or would it make sense for me to look at possibly modifying the export handler for my use case? Tom "q= _query_:"{!edismax qf='ocr^5+allfieldsProper^2+allfields^1+t

Can the export handler be used with the edismax or dismax query handler

2018-07-26 Thread Tom Burton-West
Hello all, I am completely new to the export handler. Can the export handler be used with the edismax or dismax query handler? I tried using local params : q= _query_:"{!edismax qf='ocr^5+allfields^1+titleProper^50' mm='100%25' tie='0.9' } art" which does not seem to be working. Tom

ExternalFileField management strategy with SolrCloud

2018-04-26 Thread Tom Peters
Is there a recommended way of managing external files with SolrCloud? At first glance it appears that I would need to manually manage the placement of the external_.txt file in each shard's data directory. Is there a better way of managing this (Solr API, interface, etc.)? This message and any

Re: CDCR Bootstrap

2018-04-26 Thread Tom Peters
I'm not sure under what conditions it will be automatically triggered, but if you manually wanted to trigger a CDCR Bootstrap you need to issue the following query to the leader in your target data center. /solr//cdcr?action=BOOTSTRAP= The masterUrl will look something like (change the

Re: Does CDCR Bootstrap sync leaves replica's out of sync

2018-04-16 Thread Tom Peters
There are two ways I've gotten around this issue: 1. Add replicas in the target data center after CDCR bootstrapping has completed. -or- 2. After the bootstrapping has completed, restart the replica nodes one at a time in the target data center (restart, wait for replica to catch up, then

Re: CDCR performance issues

2018-03-23 Thread Tom Peters
Thanks for responding. My responses are inline. > On Mar 23, 2018, at 8:16 AM, Amrit Sarkar <sarkaramr...@gmail.com> wrote: > > Hey Tom, > > I'm also having issue with replicas in the target data center. It will go >> from recovering to down. And when on

Re: CDCR performance issues

2018-03-12 Thread Tom Peters
I'm also having issues with replicas in the target data center. They will go from recovering to down. And when one of my replicas goes down in the target data center, CDCR will no longer send updates from the source to the target. > On Mar 12, 2018, at 9:24 AM, Tom Peters <tpet...@synac

Re: CDCR performance issues

2018-03-12 Thread Tom Peters
? Or is there something else we can do? Thanks. > On Mar 9, 2018, at 3:59 PM, Tom Peters <tpet...@synacor.com> wrote: > > Thanks. This was helpful. I did some tcpdumps and I'm noticing that the > requests to the target data center are not batched in any way. Each update > comes

Re: CDCR performance issues

2018-03-09 Thread Tom Peters
checkout paper > "Latency performance of SOAP Implementations". Same distribution of skills > - I knew TCP well, but Apache Axis 1.1 not so well. I still improved > response time of Apache Axis 1.1 by 250ms per call with 1-line of code. > > -Original Message

Re: CDCR performance issues

2018-03-08 Thread Tom Peters
of updates (3805 over two hours). Thanks. > On Mar 7, 2018, at 6:19 PM, Tom Peters <tpet...@synacor.com> wrote: > > I'm having issues with the target collection staying up-to-date with indexing > from the source collection using CDCR. > > This is what I'm gett

CDCR performance issues

2018-03-07 Thread Tom Peters
I'm having issues with the target collection staying up-to-date with indexing from the source collection using CDCR. This is what I'm getting back in terms of OPS: curl -s 'solr2-a:8080/solr/mycollection/cdcr?action=OPS' | jq . { "responseHeader": { "status": 0,

Re: Issues with CDCR in Solr 7.1

2018-03-05 Thread Tom Peters
You can ignore this. I think I found the issue (I was missing a block of XML in the source config). I'm going to monitor it over the next day and see if it was resolved. > On Mar 5, 2018, at 4:29 PM, Tom Peters <tpet...@synacor.com> wrote: > > I'm trying to get Solr CDCR se

Issues with CDCR in Solr 7.1

2018-03-05 Thread Tom Peters
I'm trying to get Solr CDCR setup in Solr 7.1 and I'm having issues post-bootstrap. I have about 5,572,933 documents in the source cluster (index size is 3.77 GB). I'm enabling CDCR in the following manner: 1. Delete the existing cluster in the target data center

Re: /var/solr/data has lots of index* directories

2018-03-05 Thread Tom Peters
ou can look inside the index.properties. The directory name mentioned in > that properties file is the one being used actively. The rest are old > directories that should be cleaned up on Solr restart but you can delete > them yourself without any issues. > > On Mon, Mar 5, 2

/var/solr/data has lots of index* directories

2018-03-05 Thread Tom Peters
While trying to debug an issue with CDCR, I noticed that the /var/solr/data directories on my source cluster have wildly different sizes. % for i in solr2-{a..e}; do echo -n "$i: "; ssh -A $i du -sh /var/solr/data; done solr2-a: 9.5G /var/solr/data solr2-b: 29G /var/solr/data

Is there a way to sort by conditional function in the Solr 7.2 JSON API?

2018-03-02 Thread Tom Van Cuyck
ed for sorting the buckets. I would like the buckets with no value for the numerical property to be sorted last. Is there a way to e.g. use conditional sorting? E.g. sort: "if(gt(unique,0),avg,-9) desc" I can't get this to work, while in the old API this appears to be possible. Or is there another way to sort the buckets with a missing numeric value last? Kind regards, Tom

Re: Indexing timeout issues with SolrCloud 7.1

2018-03-01 Thread Tom Peters
b 24, 2018 at 1:37 AM, Deepak Goel <deic...@gmail.com> wrote: >> From the error list, i can see multiple errors: >> >> 1. Failure to recover replica >> 2. Peer sync error >> 3. Failure to download file >> >> On 24 Feb 2018 03:10, "Tom Peters"

Re: Indexing timeout issues with SolrCloud 7.1

2018-02-23 Thread Tom Peters
node > 'solr-2d' > > On 23 Feb 2018 09:42, "Tom Peters" <tpet...@synacor.com> wrote: > > I'm trying to debug why indexing in SolrCloud 7.1 is having so many issues. > It will hang most of the time, and timeout the rest. > > Here's an example: > >tim

Indexing timeout issues with SolrCloud 7.1

2018-02-22 Thread Tom Peters
I'm trying to debug why indexing in SolrCloud 7.1 is having so many issues. It will hang most of the time, and timeout the rest. Here's an example: time curl -s 'myhost:8080/solr/mycollection/update/json/docs' -d '{"solr_id":"test_001", "data_type":"test"}'|jq . {

Issues with refine parameter when subfaceting in a range facet

2018-01-24 Thread Tom Van Cuyck
l", "count": 4076 }, { "val": "Male", "count": 37 }, { "val": "Female", "count": 13 } ], "missing": { "count": 0 } } }, ... There is a factor 2 difference for each count in each bucket. If I perform the same queries with a larger range gap, e.g. \"start\":0.0, \"end\":55000.0, \"gap\":5000.0, there is no difference between the response with and without refine: true. Is this a known issue, or is there something we are overlooking? And is there information on whether or not this behavior will be the same in Solr 7? Kind regards, Tom

Re: Issue with CDCR bootstrapping in Solr 7.1

2017-12-04 Thread Tom Peters
, they will not receive the initial index. > On Dec 1, 2017, at 12:52 AM, Amrit Sarkar <sarkaramr...@gmail.com> wrote: > > Tom, > > (and take care not to restart the leader node otherwise it will replicate >> from one of the replicas which is missing the index). > > How is t

Re: Issue with CDCR bootstrapping in Solr 7.1

2017-11-30 Thread Tom Peters
the leader node otherwise it will replicate from one of the replicas which is missing the index). > On Nov 30, 2017, at 12:16 PM, Amrit Sarkar <sarkaramr...@gmail.com> wrote: > > Tom, > > This is very useful: > >> I found a way to get the follower replicas to receive th

Re: Issue with CDCR bootstrapping in Solr 7.1

2017-11-30 Thread Tom Peters
if this information helps at all. > On Nov 30, 2017, at 11:22 AM, Amrit Sarkar <sarkaramr...@gmail.com> wrote: > > Hi Tom, > > I see what you are saying and I too think this is a bug, but I will confirm > once on the code. Bootstrapping should happen on all the nodes of th

Issue with CDCR bootstrapping in Solr 7.1

2017-11-30 Thread Tom Peters
I'm running into an issue with the initial CDCR bootstrapping of an existing index. In short, after turning on CDCR only the leader replica in the target data center will have the documents replicated and it will not exist in any of the follower replicas in the target data center. All

Re: Data inconsistencies and updates in solrcloud

2017-11-21 Thread Tom Barber
Thanks Erick! As I said, user error! ;) Tom On 21/11/17 22:41, Erick Erickson wrote: I think you're confusing shards with replicas. numShards is 2, each with one replica. Therefore half of your docs will wind up on one replica and half on the other. If you're adding a single doc

Data inconsistencies and updates in solrcloud

2017-11-21 Thread Tom Barber
core and not the second. What are we likely to be doing wrong in our config or update to prevent the replication? Thanks Tom

Re: A problem of tracking the commits of Lucene using SHA num

2017-11-20 Thread TOM
Dear Shawn and Chris, Thanks very much for your replies and helps. And so sorry for my mistakes of first-time use of Mailing Lists. On 11/9/2017 5:13 PM, Shawn wrote: > Where did this information originate? My SHA data come from the paper On the Naturalness of Buggy Code(Baishakhi Ray, et al.

A problem of tracking the commits of Lucene using SHA num

2017-11-16 Thread TOM
Thanks for your patience and help. Recently, I acquired a batch of commits' SHA data of Lucene, of which the time span is from 2010 to 2015. In order to get original info, I tried to use these SHA data to track commits. First, I cloned Lucene repository to my local host, using the cmd

A problem of tracking the commits of Lucene using SHA num

2017-11-09 Thread TOM
Thanks for your patience and help. Recently, I acquired a batch of commits' SHA data of Lucene, of which the time span is from 2010 to 2015. In order to get original info, I tried to use these SHA data to track commits. First, I cloned Lucene repository to my local host, using the cmd

Re: Provide suggestion on indexing performance

2017-09-13 Thread Tom Evans
nly way to answer performance questions for your schema and data is to try it out. Generate 10 million docs, store them in a doc (eg as CSV), and then use the post tool to try different schema and query options. Cheers Tom

Re: Solr returning same object in different page

2017-09-13 Thread Tom Evans
imply by score)? Cheers Tom

Re: Get results in multiple orders (multiple boosts)

2017-08-18 Thread Tom Evans
alled usersortorder(), to which you would provide the users preferred sort ordering (which you would retrieve from wherever you store such information) and the field that you want sorted. It would look something like this: usersortorder("category_id", "3,5,1,7,2,12,14,58") DESC, usersortorder("source_id", "5,2,1,4,3") DESC, date DESC, title DESC Cheers Tom

Error in Solr 6.6 Example schemas re: DocValues for StrField type must be single-valued?

2017-08-15 Thread Tom Burton-West
/DocValuesType.html Is the comment in the example schema file completely wrong, or is there some issue with using a docValues with a multivalued StrField? Tom Burton-West https://www.hathitrust.org/blogs/large-scale-search

Re: setup solrcloud from scratch vie web-ui

2017-05-17 Thread Tom Evans
roblem, in your solrconfig.xml you have: data It should be ${solr.data.dir:} Which is still in your config, you've just got it commented out :) Cheers Tom

Re: to handle expired documents: collection alias or delete by id query

2017-03-24 Thread Tom Evans
te, you can expand the collection by adding replicas of that shard on other nodes - perhaps even removing it from the node that did the indexing. We have a node that solely does indexing, before the collection is queried for anything it is added to the querying nodes. You can do this manually, or you can automate it using the collections API. Cheers Tom

Re: Simulating group.facet for JSON facets, high mem usage w/ sorting on aggregation...

2017-02-10 Thread Tom Evans
better performance. Cheers Tom On Thu, Feb 9, 2017 at 11:58 AM, Bryant, Michael <michael.bry...@kcl.ac.uk> wrote: > Hi all, > > I'm converting my legacy facets to JSON facets and am seeing much better > performance, especially with high cardinality facet fields. However, the on

Re: Interval Facets with JSON

2017-02-10 Thread Tom Evans
On Wed, Feb 8, 2017 at 11:26 PM, deniz <denizdurmu...@gmail.com> wrote: > Tom Evans-2 wrote >> I don't think there is such a thing as an interval JSON facet. >> Whereabouts in the documentation are you seeing an "interval" as JSON >> facet type? >> >

Re: Interval Facets with JSON

2017-02-08 Thread Tom Evans
gap is fixed size. You can actually do your example however: json.facet={hieght_facet:{type:range, gap:20, start:160, end:190, hardend:True, field:height}} If you do require arbitrary bucket sizes, you will need to do it by specifying query facets instead, I believe. Cheers Tom
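A minimal sketch of the two approaches mentioned above, assuming the JSON Facet API's "range" and "query" facet types. The helper names and the bucket boundaries are hypothetical and only illustrate how the json.facet parameter could be assembled on the client:

```python
import json


def range_facet(field, start, end, gap, hardend=True):
    """Fixed-size buckets: a single range facet with a constant gap."""
    return {
        "type": "range",
        "field": field,
        "start": start,
        "end": end,
        "gap": gap,
        "hardend": hardend,
    }


def query_facets(field, bounds):
    """Arbitrary buckets: one query facet per (lo, hi) pair.

    Each bucket includes lo and excludes hi, so adjacent buckets do not overlap.
    """
    return {
        f"{field}_{lo}_{hi}": {"type": "query", "q": f"{field}:[{lo} TO {hi}}}"}
        for lo, hi in bounds
    }


# Fixed gap of 20, as in the example above (the facet label is illustrative).
fixed_param = json.dumps({"height_facet": range_facet("height", 160, 190, 20)})

# Hypothetical uneven buckets, expressed as query facets.
custom_param = json.dumps(query_facets("height", [(160, 175), (175, 180), (180, 190)]))
```

Either string would be sent as the value of the json.facet request parameter.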

Re: Upgrade SOLR version - facets perfomance regression

2017-01-31 Thread Tom Evans
on.facet={name_of_facet_in_output:{type:terms, field:name_of_field}} It is documented in confluence: https://cwiki.apache.org/confluence/display/solr/Faceted+Search Also by yonik: http://yonik.com/json-facet-api/ Cheers Tom

Re: Trouble boosting a field -solved-

2017-01-18 Thread Tom Chiverton
I 'solved' this by removing some of the 'AND' from my full query. AND should be optional but have no effect if there, right ? But for me it was forcing the score to 0. Which might be the same as saying nothing matched ? Tom On 13/01/17 15:10, Tom Chiverton wrote: I have a few hundred

Re: Concat Fields in JSON Facet

2017-01-17 Thread Tom Evans
e the match between the ID and Name. I don't understand what you mean. If you have these three documents in your index, what data do you want in the facet? [ {itemId: 1, itemName: "Apple"}, {itemId: 2, itemName: "Android"}, {itemId: 3, itemName: "Android"}, ] Cheers Tom

Re: Trouble boosting a field

2017-01-16 Thread Tom Chiverton
13 Jan 2017, at 16:35, Tom Chiverton <t...@extravision.com> wrote: Well, I've tried much larger values than 8, and it still doesn't seem to do the job ? For now, assume my users are searching for exact sub strings of a real title. Tom On 13/01/17 16:22, Walter Underwood wrote: I use

Re: Trouble boosting a field

2017-01-13 Thread Tom Chiverton
Well, I've tried much larger values than 8, and it still doesn't seem to do the job ? For now, assume my users are searching for exact sub strings of a real title. Tom On 13/01/17 16:22, Walter Underwood wrote: I use a boost of 8 for title with no boost on the content. Both Infoseek

Trouble boosting a field

2017-01-13 Thread Tom Chiverton
"defType": "dismax", "indent": "true", "qf": "title^2000 content", "pf": "pf=title^4000 content^2", "sort": "score desc", "wt": "json", but that was not better. if I remove con

Re: Has anyone used linode.com to run Solr | ??Best way to deliver PHP/Apache clients with Solr question

2016-12-15 Thread Tom Evans
ut empty, ready for you to assign new replicas to it using the Collections API. You can also use what are called "snitches" to define rules for how you want replicas/shards allocated amongst the nodes, eg to avoid placing all the replicas for a shard in the same rack. Cheers Tom [1] https://github.com/django-haystack/pysolr/commit/366f14d75d2de33884334ff7d00f6b19e04e8bbf

Re: Using DIH FileListEntityProcessor with SolrCloud

2016-12-06 Thread Tom Evans
> > > > > > This same script worked as expected on a single solr node (i.e. not in > SolrCloud mode). > > Thanks, > Chris > Hey Chris We hit the same problem moving from non-cloud to cloud, we had a collection that loaded its DIH config from various XML files listing the DB queries to run. We wrote a simple DataSource plugin function to load the config from Zookeeper instead of local disk to avoid having to distribute those config files around the cluster. https://issues.apache.org/jira/browse/SOLR-8557 Cheers Tom

Re: insert lat/lon from jpeg into solr

2016-12-01 Thread Tom Evans
n latlon , composite-latlon Cheers Tom

Re: Import from S3

2016-11-25 Thread Tom Evans
st tool: https://cwiki.apache.org/confluence/display/solr/Post+Tool Or by doing it manually however you wish: https://cwiki.apache.org/confluence/display/solr/Uploading+Data+with+Index+Handlers#UploadingDatawithIndexHandlers-CSVFormattedIndexUpdates Cheers Tom

Re: Query formulation help

2016-10-26 Thread Tom Evans
lear - do the maths before constructing the query! You might be able to do this with function queries, but why bother? If the number is fixed, then fix it in the query, if it varies then there must be some code executing on your client that can be used to do a simple addition. Cheers Tom
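As a rough illustration of "doing the maths" on the client: the condition 0 < cost - given_number < 500 is equivalent to given_number < cost < given_number + 500, which maps onto an exclusive range query. The function name below is hypothetical; the field name "cost" and the width of 500 come from the question.

```python
def cost_window_query(given_number, width=500, field="cost"):
    """Build a range query for given_number < field < given_number + width.

    Curly braces make both bounds exclusive, matching the strict inequalities.
    """
    lower = given_number
    upper = given_number + width
    return f"{field}:{{{lower} TO {upper}}}"


# Example: the client computes the bounds, Solr only sees a plain range query.
query = cost_window_query(100)
```

The resulting string can be used directly as q or fq, so no function query is needed.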

Re: Query formulation help

2016-10-26 Thread Tom Evans
On Wed, Oct 26, 2016 at 8:03 AM, Prasanna S. Dhakephalkar wrote: > Hi, > > > > May be very rudimentary question > > > > There is a integer field in a core : "cost" > > Need to build a query that will return documents where 0 < > "cost"-given_number < 500 >

Re: OOM Error

2016-10-26 Thread Tom Evans
e balancer. The user gets fed up at no response, so reloads the page, re-submitting the analysis and bringing down the next server in the cluster. Lather, rinse, repeat - and then you get to have a meeting to discuss why we invest so much in HA infrastructure that can be made non-HA by one user with a complex query. In those meetings it is much harder to justify not restarting. Cheers Tom

Re: indexing - offline

2016-10-20 Thread Tom Evans
or "foo" to "foo_2" 9) Remove "foo_1" collection once happy This avoids indexing overwhelming the performance of the cluster (or any nodes in the cluster that receive queries), and can be performed with zero downtime or config changes on the clients. Cheers Tom
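The alias switch in the steps above could be scripted along these lines. This is only a sketch: it assumes the Collections API CREATEALIAS action is used to repoint the alias, and the base URL is a placeholder for wherever your cluster is reachable.

```python
from urllib.parse import urlencode


def create_alias_url(base_url, alias, collection):
    """Build a Collections API URL that points `alias` at `collection`.

    Issuing CREATEALIAS for an alias name that already exists repoints it.
    """
    params = urlencode(
        {"action": "CREATEALIAS", "name": alias, "collections": collection}
    )
    return f"{base_url}/admin/collections?{params}"


# Placeholder host; "foo" and "foo_2" are the names used in the steps above.
url = create_alias_url("http://localhost:8983/solr", "foo", "foo_2")
```

Clients keep querying "foo" throughout, which is what allows the switch without client-side config changes.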

Re: How to update from Solr Cloud 5.4.1 to 5.5.1

2016-08-29 Thread Tom Devel
, Tom On Sat, Aug 27, 2016 at 12:23 PM, Shawn Heisey <apa...@elyograg.org> wrote: > On 8/26/2016 10:22 AM, D'agostino Victor wrote: > > Do you know in which version index format changes and if I should > > update to a higher version ? > > In version 6.0, and again in

min()/max() on date fields using JSON facets

2016-07-25 Thread Tom Evans
s milliseconds since epoch? In UTC? Is there any way to control the output format or TZ? Is there any benefit in using JSON facets to determine this, or should I just continue using stats? Cheers Tom

RE: Reference to SolrCore from SearchComponent

2016-07-21 Thread Ellis, Tom (Financial Markets IT)
interface you can implement, which will provide access to the SolrCore. From there you can add a closeHook to the core. Joel Bernstein http://joelsolr.blogspot.com/ On Thu, Jul 21, 2016 at 2:34 PM, Ellis, Tom (Financial Markets IT) < tom.el...@lloydsbanking.com.invalid> wrote: > Hi There

Reference to SolrCore from SearchComponent

2016-07-21 Thread Ellis, Tom (Financial Markets IT)
the SearchComponent is instantiated in and adding a CloseHook or similar? Is this possible? Cheers, Tom

Re: Node not recovering, leader elections not occuring

2016-07-19 Thread Tom Evans
) This is with the "leader that is not the leader" shut down. Issuing a FORCELEADER via collections API doesn't in fact force a leader election to occur. Is there any other way to prompt Solr to have an election? Cheers Tom On Tue, Jul 19, 2016 at 5:10 PM, Tom Evans <tevans...@googlema

Re: Node not recovering, leader elections not occuring

2016-07-19 Thread Tom Evans
eader that is not the leader" server about 15-20 minutes ago, but we still have not had a leader election. Cheers Tom On Tue, Jul 19, 2016 at 4:30 PM, Erick Erickson <erickerick...@gmail.com> wrote: > How many replicas per Solr JVM? And do you > see any OOM errors when you bounce a ser

Node not recovering, leader elections not occuring

2016-07-19 Thread Tom Evans
"0" (and no other message) and kept the down node as the leader (!) Deleting the failed collection from the failed node and re-adding it has the same "Leader said I'm not the leader" error message. Any other ideas? Cheers Tom

Matching all terms in a multiValued field

2016-07-01 Thread Ellis, Tom (Financial Markets IT)
be able to see this document, but Bob should not. So if I am creating a query for Bob, how can I write it so that he can't see Document 1? I.e. how do I create a query that checks the multiValued field for 'confidential' but excludes documents that have anything else? Cheers, Tom Ellis

Strange highlighting on search

2016-06-16 Thread Tom Evans
TO *] AND -ingredient_tag_id:(35223) Is there any way I can make the query and highlighting work as expected as part of q? Is there any downside to putting the exclusion part in the fq in terms of performance? We don't use score at all for our results, we always order by other parameters. Cheers Tom Query

Re: result grouping in sharded index

2016-06-15 Thread Tom Evans
Do you have to group, or can you collapse instead? https://cwiki.apache.org/confluence/display/solr/Collapse+and+Expand+Results Cheers Tom On Tue, Jun 14, 2016 at 4:57 PM, Jay Potharaju <jspothar...@gmail.com> wrote: > Any suggestions on how to handle result grouping in shar

Re: Import html data in mysql and map schemas using onlySolrCELL+TIKA+DIH [scottchu]

2016-05-24 Thread Tom Evans
ly informative. Start from the top page and browse away! https://cwiki.apache.org/confluence/display/solr/Apache+Solr+Reference+Guide It is useful to keep the glossary handy for any terms that you don't recognise: https://cwiki.apache.org/confluence/display/solr/Solr+Glossary Cheers Tom

Re: SolrCloud increase replication factor

2016-05-23 Thread Tom Evans
the rules on where replicas are created. The snitch is specified at collection creation time, or you can use MODIFYCOLLECTION to set it after the fact. See this wiki page for details: https://cwiki.apache.org/confluence/display/solr/Rule-based+Replica+Placement Cheers Tom

Re: Creating a collection with 1 shard gives a weird range

2016-05-17 Thread Tom Evans
s is as designed, see this email from Shawn: https://mail-archives.apache.org/mod_mbox/lucene-solr-user/201604.mbox/%3c570d0a03.5010...@elyograg.org%3E Cheers Tom

Re: changing web context and port for SolrCloud Zookeeper

2016-05-11 Thread Tom Gullo
That helps. I ended up updating the solr.in.sh file in /etc/default and that was getting picked up. Thanks > On May 11, 2016, at 2:05 PM, Tom Gullo <tomgu...@gmail.com> wrote: > > My Solr installation is running on Tomcat on port 8080 with a web context > name that

Re: changing web context and port for SolrCloud Zookeeper

2016-05-11 Thread Tom Gullo
tting for the port > number (In Solr I mean) > > Am I on the right track or are you asking something other than how to get > Solr on host:8983/solr ? > > On Wed, May 11, 2016 at 11:56 AM, Tom Gullo <tomgu...@gmail.com> wrote: > >> I need to change the web context and t

changing web context and port for SolrCloud Zookeeper

2016-05-11 Thread Tom Gullo
I need to change the web context and the port for a SolrCloud installation. Example, change: host:8080/some-api-here/ to this: host:8983/solr/ Does anyone know how to do this with SolrCloud? There are values stored in clusterstate.json and /leader/elect and I could change them but that

Re: Indexing 700 docs per second

2016-04-19 Thread Tom Evans
oc) of the changed > records the better approach in this case. > > Could some one please share their views/ experience? Try it and see - everyone's data/schemas are different and can affect indexing speed. It certainly sounds achievable enough - presumably you can at least produce the documents at that rate? Cheers Tom

Re: Solr Support for BM25F

2016-04-18 Thread Tom Burton-West
0 different ones, even with different properties). - the same issue applies to length normalization, lucene has a "field length" but really no concept of document length." Tom On Thu, Apr 14, 2016 at 12:41 PM, David Cawley <david.cawl...@mail.dcu.ie> wrote: > Hello, &

Re: Verifying - SOLR Cloud replaces load balancer?

2016-04-18 Thread Tom Evans
ributed shards. Depending on your shard/cluster topology, this can increase performance if you are returning large amounts of data - many or large fields or many documents. Cheers Tom

Re: Anticipated Solr 5.5.1 release date

2016-04-15 Thread Tom Evans
Awesome, thanks :) On Fri, Apr 15, 2016 at 4:19 PM, Anshum Gupta <ans...@anshumgupta.net> wrote: > Hi Tom, > > I plan on getting a release candidate out for vote by Monday. If all goes > well, it'd be about a week from then for the official release. > > On Fri, Apr 15, 20

Anticipated Solr 5.5.1 release date

2016-04-15 Thread Tom Evans
to Solr 6, as we have only just finished validating 5.5.0 with our original queries! Cheers Tom

SolrCloud no leader for collection

2016-04-05 Thread Tom Evans
e and forcing a leader election also has no effect. Any ideas? The only viable option I see is to create a new collection, index it and then remove the old collection and alias it in. Cheers Tom

Re: Creating new cluster with existing config in zookeeper

2016-03-23 Thread Tom Evans
f the same cluster. Of course, you could think of a set of servers within a cluster as a "logical" cluster if it just serves particular collection, but "cluster" to me would be all of the servers within the same zookeeper tree, because that is where cluster state is maintained. Cheers Tom

Re: Re: Paging and cursorMark

2016-03-23 Thread Tom Evans
document which sorts higher than the supplied mark appears. Seems more complex, but maybe I'm not understanding the internals correctly. Fortunately for us, 90% of our users prefer infinite scroll, and 97% of them never go beyond page 3. Cheers Tom

Paging and cursorMark

2016-03-22 Thread Tom Evans
query for q=id:""=1=*. This seems to work, but means an extra Solr query for no real reason. Is there any other problem to doing this? Is there some other simple trick I am missing that we can use to get both the page of results we want and a nextCursorMark for the subsequent page? Cheers Tom

Re: Ping handler in SolrCloud mode

2016-03-19 Thread Tom Evans
On Wed, Mar 16, 2016 at 4:10 PM, Shawn Heisey <apa...@elyograg.org> wrote: > On 3/16/2016 8:14 AM, Tom Evans wrote: >> The problem occurs when we attempt to query a node to see if products >> or items is active on that node. The balancer (haproxy) requests the >> ping

Re: Ping handler in SolrCloud mode

2016-03-19 Thread Tom Evans
On Wed, Mar 16, 2016 at 2:14 PM, Tom Evans <tevans...@googlemail.com> wrote: > Hi all > > [ .. ] > > The option I'm trying now is to make two ping handler for skus that > join to one of items/products, which should fail on the servers which > do not sup

Ping handler in SolrCloud mode

2016-03-19 Thread Tom Evans
a little heavyweight for a status check to see whether we can direct requests at this server or not. Cheers Tom

mergeFactor/maxMergeDocs is deprecated

2016-03-03 Thread Tom Evans
sections of our solrconfig.xml files, and mergeFactor is not mentioned at all. > $ ack -B 1 -A 1 '<mergeFactor' lookups/conf/solrconfig.xml 210- > $ ack --all maxMergeDocs > $ Any ideas? Cheers Tom

Re: Separating cores from Solr home

2016-03-03 Thread Tom Evans
Hmm, I've worked around this by setting the directory where the indexes should live to be the actual solr home, and symlink the files from the current release in to that directory, but it feels icky. Any better ideas? Cheers Tom On Thu, Mar 3, 2016 at 11:12 AM, Tom Evans <tev

Separating cores from Solr home

2016-03-03 Thread Tom Evans
, as the core.properties for each shard is created inside the solr home. This is obviously no good, as when releasing a new version of the solr home, they will no longer be in the current solr home. Cheers Tom

Re: docValues error

2016-02-29 Thread Tom Evans
it is omitted. Cheers Tom

Re: Display entire string containing query string

2016-02-18 Thread Tom Running
"name = T" or maybe "name: T". Ultimately by searching for the string "name" I am trying to find the value of name. Thanks for your time. I appreciate your help -T On Feb 18, 2016 1:18 AM, "Binoy Dalal" <binoydala...@gmail.com> wrote: > Append = > &

Display entire string containing query string

2016-02-17 Thread Tom Running
Hello, I am working on a project using Solr to search data retrieved from Nutch. I have successfully integrated Nutch with Solr, and Solr is able to search Nutch's data. However I am having a bit of a problem. If I query Solr, it will bring back the numfound and which document the query

Solr and Nutch integration

2016-02-16 Thread Tom Running
I am having a problem configuring Solr to read Nutch data or integrate with Nutch. Has anyone been able to get Solr 5.4.x to work with Nutch? I went through a lot of Google's articles and am still not able to get Solr 5.4.1 to search Nutch contents. Any howto or working configuration sample that you can

Re: Json faceting, aggregate numeric field by day?

2016-02-11 Thread Tom Evans
On Wed, Feb 10, 2016 at 12:13 PM, Markus Jelsma <markus.jel...@openindex.io> wrote: > Hi Tom - thanks. But judging from the article and SOLR-6348 faceting stats > over ranges is not yet supported. More specifically, SOLR-6352 is what we > would need. > > [1]: https://i

Re: Json faceting, aggregate numeric field by day?

2016-02-10 Thread Tom Evans
an facet by day, and use the stats component to calculate the mean average. This blog post explains it: https://lucidworks.com/blog/2015/01/29/you-got-stats-in-my-facets/ Cheers Tom
