Re: Removing words like "FONT-SIZE: 9pt; FONT-FAMILY: arial" from content

2019-01-11 Thread Zheng Lin Edwin Yeo
Thanks for your reply. What I have found is that the EML file has two Content-Type parts: one is text/html, and the other is text/plain. The text/html part contains strings like "*FONT-SIZE: 9pt; FONT-FAMILY: arial*" in the content, but the text/plain part has no such strings, and its content is clean

Re: SOLR v7 Security Issues Caused Denial of Use - Sonatype Application Composition Report

2019-01-11 Thread Bob Hathaway
Hi Shawn, Thanks for the great answers. Thanks also to Jörn Franke and Gus Heck for responses. The images were sent for convenience of the issues listed below them. We are working to get infosec approval. It would be helpful to put the security links prominently on the solr splash and

Re: what are the best client interface ?

2019-01-11 Thread markus kalkbrenner
The latest module versions of Drupal and Typo3 now both use the solarium library. I think solarium is the most used PHP library for Solr and it is the most active project. But as one of the maintainers of the Drupal integration and the solarium library itself, my opinion might not be totally

Re: 6.3 -> 6.4 Sorting responseWriter renamed

2019-01-11 Thread Raveendra Yerraguntla
Hi Joel, Thanks for the quick response. Our current usage is below. Could you guide me in using the new class and write method?
public class customSearchHandler extends SearchHandler {
    @Override
    public void inform(SolrCore core) {
        super.inform(core);
        ….

Log4j Configuration

2019-01-11 Thread deathbycaramel
Hi, I'm running Solr v6.6.5 using a pretty generic log4j properties file:
# Default Solr log4j config
# rootLogger log level may be programmatically overridden by -Dsolr.log.level
solr.log=${solr.log.dir}
log4j.rootLogger=INFO, file, CONSOLE
# Console appender will be programmatically disabled

Re: 6.3 -> 6.4 Sorting responseWriter renamed

2019-01-11 Thread Joel Bernstein
The functionality should be exactly the same. The config files though need to be changed. I would recommend adding any custom configs that you have to the new configs following the ExportWriter changes. Joel Bernstein http://joelsolr.blogspot.com/ On Thu, Jan 10, 2019 at 11:21 AM Raveendra

RE: what are the best client interface ?

2019-01-11 Thread Davis, Daniel (NIH/NLM) [C]
WordPress and Drupal both have ways to interface with Solr through plugins/modules. Not sure that describes your PHP website. I like Ruby on Rails "projectblacklight" for an easy and usable discovery layer. We are a Python/Django shop - we've had good luck with Django-haystack and pysolr.

what are the best client interface ?

2019-01-11 Thread said
I want to integrate my *Solr* search engine with my *PHP* website, and I am unsure whether to build the interface with *Velocity UI* or with *Solarium*. What do you think? Thank you for your help. -- Sent from: http://lucene.472066.n3.nabble.com/Solr-User-f472068.html

Re: Schema.xml, copyField, Slash, ignoreCase ?

2019-01-11 Thread Steve Rowe
Hi Bruno, ignoreCase: Looks like you already have achieved this? auto truncation: This is caused by inclusion of PorterStemFilterFactory in your "text_en" field type. If you don't want its effects (i.e. treating different forms of the same word interchangeably), remove the filter. process

Re: REBALANCELEADERS is not reliable

2019-01-11 Thread Erick Erickson
bq: You have to check if the cores, participating in leadership election, are _really_ in sync. And this must be done before starting any rebalance. Sounds ugly... :-( This _should_ not be necessary. I'll add parenthetically that leader election has been extensively re-worked in Solr 7.3+ though

Re: Delayed/waiting requests

2019-01-11 Thread Erick Erickson
Jimi's comment is one of the very common culprits. Autowarming is another. Are you indexing at the same time? If so it could well be you aren't autowarming and the spikes are caused by using a new IndexSearcher that has to read much of the index off disk when commits happen. The "smoking gun"

Re: Schema.xml, copyField, Slash, ignoreCase ?

2019-01-11 Thread Erick Erickson
The admin UI>>(select a core)>>analysis page is your friend here. It'll show you exactly what each filter in your analysis chain does and from there you'll need to mix and match filters, your tokenizer and the like to support the use-cases you need. My guess is that the field type you're using

Re: Single query to get the count for all individual collections

2019-01-11 Thread Zheng Lin Edwin Yeo
Thanks for the reply. I tried adding a new field that contains the collection id and used a JSON facet query to get the count. This is working. Regards, Edwin On Thu, 10 Jan 2019 at 23:33, Hullegård, Jimi < jimi.hulleg...@svensktnaringsliv.se> wrote: > Unless someone else has a clever
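The approach Edwin describes could look roughly like the sketch below. It is only an illustration: the field name `collection_id`, the facet label `per_collection`, and both helper functions are assumptions, since the snippet does not show the actual request. The body follows the general shape of Solr's JSON Facet API (a `terms` facet on the new field), and the parser reads the `buckets` list that such a facet returns.

```python
import json


def build_count_request(field="collection_id"):
    """Build a JSON request body with one terms-facet bucket per collection.

    `field` is a hypothetical name for the field that stores the collection id.
    """
    return {
        "query": "*:*",
        "limit": 0,  # only the facet counts are needed, not the documents
        "facet": {
            "per_collection": {"type": "terms", "field": field, "limit": -1}
        },
    }


def counts_from_response(response):
    """Turn the facet buckets of a response dict into {collection: count}."""
    facets = response.get("facets", {})
    buckets = facets.get("per_collection", {}).get("buckets", [])
    return {bucket["val"]: bucket["count"] for bucket in buckets}


if __name__ == "__main__":
    # Body to POST to a /query or /select handler (no request is sent here).
    print(json.dumps(build_count_request(), indent=2))

    # Hand-written sample in the shape of a JSON facet response.
    sample = {
        "facets": {
            "count": 5,
            "per_collection": {
                "buckets": [
                    {"val": "collectionA", "count": 3},
                    {"val": "collectionB", "count": 2},
                ]
            },
        }
    }
    print(counts_from_response(sample))
```

The sample response is hand-written for illustration, not output captured from a real cluster.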

Re: Solr relevancy score different on replicated nodes

2019-01-11 Thread Erick Erickson
What Elizabeth said. Really, this is an intractable problem. Even in the TLOG and PULL replica case, an index getting updates will still fire their replication requests at different wall-clock times. Even if that were coordinated, the vagaries of networks etc. would _still_ mean the various

Need help on Solr authorization

2019-01-11 Thread sathish kumar
Hi, We have a two-node Solr setup (version 7.2.1) with embedded ZooKeeper running in Solr Server 1. We have recently enabled SSL and also enabled basic authentication and RuleBasedAuthorizationPlugin. As part of testing, we created a new user with the admin role and assigned the permissions

Re: Solr relevancy score different on replicated nodes

2019-01-11 Thread Elizabeth Haubert
Hello, To a certain extent, I agree with Erick, that this isn't a problem, but looks like one. The nature of TF*IDF is such that you will see different scores for the same query over time on the same replica, or different replicas for the same query with most replication schemes. This is mildly

Bugs with Re-ranking/LtR and ExplainAugmenterFactory

2019-01-11 Thread Sambhav Kothari (BLOOMBERG/ LONDON)
Hello, Currently, if we use the ExplainAugmenterFactory with LtR, instead of using the model's/re-ranker's explain method, it uses the default query explain (tf-idf explanation). This happens because the BasicResultContext doesn't wrap the

Schema.xml, copyField, Slash, ignoreCase ?

2019-01-11 Thread Bruno Mannina
Hello, I’m facing a problem concerning the default field “text” (SOLR 5.4) and queries which contain / (slash). I need to have the default “text” field with: - ignoreCase, - no auto truncation, - process slash char. I would like to query only the field “text”. Queries can
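One possible factor with slashes, independent of the analysis chain: since Solr 4.0 the standard (Lucene) query parser treats `/` as a regular-expression delimiter, so a literal slash in a query term has to be backslash-escaped or the term quoted. Whether this is what affects Bruno's queries cannot be told from the snippet. The sketch below is a hypothetical client-side helper (the name `escape_query_term` is an assumption, not a Solr API) that escapes the query parser's special characters one by one.

```python
import re

# Characters with special meaning in the standard query parser;
# each one is escaped individually with a backslash.
_SPECIAL_CHARS = re.compile(r'[+\-&|!(){}\[\]^"~*?:\\/]')


def escape_query_term(term):
    """Backslash-escape query-parser special characters in a single term."""
    return _SPECIAL_CHARS.sub(lambda match: "\\" + match.group(0), term)


if __name__ == "__main__":
    # Hypothetical term containing a slash.
    print(escape_query_term("EP1234/A1"))
```

Escaping only controls how the query string is parsed; whether the slash survives indexing still depends on the tokenizer and filters configured for the field type.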

RE: Delayed/waiting requests

2019-01-11 Thread Gael Jourdan-Weil
Interesting indeed, we did not see anything with VisualVM, but having a look at the GC logs could give us more info, especially on the pauses. I will collect data over the weekend and look at it. Thanks From: Hullegård, Jimi Sent: Friday, 11 January 2019

Re: REBALANCELEADERS is not reliable

2019-01-11 Thread Bernd Fehling
Hi Erick, yes, I would be happy to test any patches. Good news, I got rebalance working. After running the rebalance about 50 times with a debugger and watching the behavior of my problem shard and its core_nodes within my test cloud, I came to the point of failure. I solved it and now it works.

Re: Solr relevancy score different on replicated nodes

2019-01-11 Thread Ashish Bisht
Hi Erick, Your statement "*At best, I've seen UIs where they display, say, 1 to 5 stars that are just showing the percentile that the particular doc had _relative to the max score*" is something we are trying to achieve, but we are dealing in percentages rather than stars (ratings). Change in MaxScore
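The percentage display discussed here amounts to dividing each document's score by the maximum score of the same result set. A minimal sketch, assuming the scores are already available as a list of floats (the function name and the rounding are illustrative choices, not something taken from the thread):

```python
def score_percentages(scores, max_score=None):
    """Express each score as a percentage of the result set's maximum score.

    If `max_score` is not given, the largest value in `scores` is used.
    """
    if not scores:
        return []
    top = max_score if max_score is not None else max(scores)
    if top <= 0:
        # Avoid dividing by zero when no document has a positive score.
        return [0.0 for _ in scores]
    return [round(100.0 * score / top, 1) for score in scores]


if __name__ == "__main__":
    print(score_percentages([4.0, 2.0, 1.0]))
```

Because the value is relative to the current maximum, the same document gets a different percentage whenever the maximum score of the result set changes, so the numbers are only comparable within a single response.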