DrillSideWaysSearch on faceting

2017-03-07 Thread Chitra
Hi, I am new to Solr. Recently we have been looking into drill sideways search (for faceting purposes) in Lucene. Do Solr facets support drill sideways search like Lucene does? If yes, kindly suggest the API or an article on how to use it. Any help is much appreciated. Thanks, Chitra

Re: Managed schema vs schema.xml

2017-03-07 Thread Erick Erickson
I suggest we make additional comments on SOLR-10241. I created it as a result of this discussion and anyone who takes it on would benefit from the comments being made there. Anyone can make comments there, there's no special karma required although you do have to create a login. From the

RE: https

2017-03-07 Thread Phil Scadden
> The first advice is NOT to expose your Solr directly to the public. > Anyone that can hit /search, can also hit /update and wipe out your index. I would second that too. We have never exposed Solr and I also sanitise queries in the proxy.

Re: Managed schema vs schema.xml

2017-03-07 Thread Alexandre Rafalovitch
Actually, the main cross-references are from the solrconfig.xml, and primarily from the Update Request Handler chain that creates the "schemaless" effect. Then, I think you also have highlighters, etc. I did that full analysis as a presentation at the last Solr Revolution:

Re: Managed schema vs schema.xml

2017-03-07 Thread Shawn Heisey
On 3/7/2017 1:32 PM, Phil Scadden wrote: I would have to say the "basic-config" seems distinctly more than basic. It is still a huge file. I thought perhaps I could delete every unused field type, but worried there were some "system" dependencies. This is definitely true. Solr example

Re: https

2017-03-07 Thread Shawn Heisey
On 3/7/2017 1:45 PM, pubdiverses wrote: I would like to access my solr instance with https://domain.com/solr. How to do this? The reference guide covers this. https://cwiki.apache.org/confluence/display/solr/Enabling+SSL If you want to change the port to 443 so it will work without a port

Re: https

2017-03-07 Thread Alexandre Rafalovitch
The first advice is NOT to expose your Solr directly to the public. Anyone that can hit /search, can also hit /update and wipe out your index. Unless you run a proper proxy that secures URLs and sanitizes the parameters (in GET, in POST, escaped, etc). And if you are doing that, you can set up

Re: Managed schema vs schema.xml

2017-03-07 Thread Walter Underwood
Maybe this is expert stuff, but we keep our schema, solrconfig, and everything else checked into source control. I wrote a Python thingy to hit the cluster through the load balancer, get the zkHost string from status, upload the files to zookeeper (kazoo is a nice library), link the config,

Re: Getting an error: was indexed without position data; cannot run PhraseQuery

2017-03-07 Thread Erick Erickson
OK, you can do kind of the same thing with the core admin API "SWAP" command. And in stand-alone it's much simpler. Just index your data somewhere (I don't particularly care where, your workstation, a spare machine lying around, whatever) and copy the result to the index directory for prod. I'd
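
The SWAP approach mentioned above can be sketched as a single CoreAdmin request. This is a minimal sketch, not the poster's actual procedure: the base URL and the core names `prod` and `prod_rebuild` are hypothetical, and the function only builds the URL rather than sending it.

```python
from urllib.parse import urlencode


def build_swap_url(base_url, live_core, staging_core):
    """Return the CoreAdmin URL that swaps two cores on a stand-alone node."""
    # action=SWAP exchanges the names of `core` and `other`, so queries sent
    # to the live core name start hitting the freshly built index.
    params = {"action": "SWAP", "core": live_core, "other": staging_core}
    return f"{base_url}/admin/cores?{urlencode(params)}"


if __name__ == "__main__":
    # Hypothetical host and core names, for illustration only.
    print(build_swap_url("http://localhost:8983/solr", "prod", "prod_rebuild"))
```

Building the URL separately from sending it makes it easy to review the request before running it against a production node.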

Re: Managed schema vs schema.xml

2017-03-07 Thread OTH
In the reference guide, in the chapter named "The Well Configured Solr Instance", it says (I'm copying+pasting from the PDF version) : Switching from Managed Schema to Manually Edited schema.xml > If you have started Solr with managed schema enabled and you would like to > switch to manually

https

2017-03-07 Thread pubdiverses
Hello, I would like to access my Solr instance with https://domain.com/solr. How can I do this?

Re: Managed schema vs schema.xml

2017-03-07 Thread Erick Erickson
See SOLR-10241 I just opened for discussion. My first impulse (well actually second) is to _not_ encourage anyone to hand-edit managed schema, and especially not put that in the ref guide. But perhaps put the classic schema factory in a comment in basic_configs and direct people there (and maybe

RE: Managed schema vs schema.xml

2017-03-07 Thread Phil Scadden
I would second that the guide could be clearer on that. I read and reread several times trying to get my head around the schema.xml/managed-schema bit. I came away from first cursory reading with the idea that managed-schema was mostly for schema-less mode and only after some stuff ups and puzzling

Re: Managed schema vs schema.xml

2017-03-07 Thread Alexandre Rafalovitch
On 7 March 2017 at 15:02, OTH wrote: > Specifically, that 'managed-schema' could indeed be modified by hand, or > even that what the HTTP API is doing is actually modifying this file. Thank you for the specific feedback. That is something we should fold into the Guide as

Re: Managed schema vs schema.xml

2017-03-07 Thread OTH
Hi, Thanks, I should've consulted this guide more thoroughly. I actually had encountered this section when reading the guide, but somehow forgot about it when asking this question. I think it doesn't clarify some things very well, which could leave a beginner a bit confused. Specifically,

Re: query rewriting

2017-03-07 Thread Tim Casey
Hendrik, I would recommend attempting to stick to the query syntax, as it is in lucene, as close as possible. However, if you do your own query parse build up, you can use a Lucene Query object. I don't know where this bolts into solr, exactly. But I have done this extensively with lucene.

Re: I want to contribute custom made NLP based solr filters but dont know how.

2017-03-07 Thread Erik Hatcher
Nice use of the VelocityResponseWriter :) (and looks like, at quick glance, several other goodies under there too) Erik > On Mar 5, 2017, at 7:40 AM, Avtar Singh Mehra wrote: > > Hello everyone, > I have developed project called WiseOwl which is basically a fact

RE: Solrcloud after restore collection, when index new documents into restored collection, leader not write to index.

2017-03-07 Thread Marquiss, John
Thanks, I have done that... for those following this on the mail list or coming across this in the archives the JIRA is SOLR-10242 https://issues.apache.org/jira/browse/SOLR-10242 Cores created by Solr RESTORE end up with stale searches after indexing. Also, we do not see any warnings or

[ANNOUNCE] Apache Solr 6.4.2 released

2017-03-07 Thread Ishan Chattopadhyaya
7 March 2017, Apache Solr 6.4.2 available Solr is the popular, blazing fast, open source NoSQL search platform from the Apache Lucene project. Its major features include powerful full-text search, hit highlighting, faceted search and analytics, rich document parsing, geospatial search, extensive

Re: question related to solr LTR plugin

2017-03-07 Thread Michael Nilsson
Hey Saurabh, So there are a few things you can do with the LTR plugin and your Solr collection to solve different degrees of the kind of personalization you might want. I'll start with the simplest, which isn't exactly what you're looking for but is very quick to implement and play around

Re: Managed schema vs schema.xml

2017-03-07 Thread Alexandre Rafalovitch
Yes, it has been asked many times and has been answered both on the list and in the - awesome - Reference Guide. I'd recommend reading that and then coming back again with a more specific question: https://cwiki.apache.org/confluence/display/solr/Overview+of+Documents%2C+Fields%2C+and+Schema+Design

Re: Managed schema vs schema.xml

2017-03-07 Thread OTH
Hi, Thanks, that sufficiently answers the question. It's especially good to know now that hand-editing is fine, as long as it's separated from API calls with restarts in between. Thanks On Tue, Mar 7, 2017 at 9:57 PM, Shawn Heisey wrote: > On 3/7/2017 9:41 AM, OTH wrote:
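
For context on the "API calls" mentioned here: the Schema API accepts JSON commands POSTed to a collection's `/schema` endpoint. A minimal sketch of an `add-field` command follows; the field name `title` and the type `text_general` are illustrative assumptions, and the function only builds the request body.

```python
import json


def build_add_field_command(name, field_type, stored=True, indexed=True):
    """Return the JSON body for a Schema API 'add-field' command."""
    command = {
        "add-field": {
            "name": name,
            "type": field_type,
            "stored": stored,
            "indexed": indexed,
        }
    }
    return json.dumps(command)


if __name__ == "__main__":
    # Hypothetical field; the body would be POSTed to
    # http://<host>:<port>/solr/<collection>/schema
    print(build_add_field_command("title", "text_general"))
```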

Re: Managed schema vs schema.xml

2017-03-07 Thread Ivan Bianchi
Hi OTH, I personally prefer to use the classic *schema.xml* file as I feel it's better for core creation with the desired fields than dealing with api calls. You can use it specifying the schemaFactory class as ClassicIndexSchemaFactory as follows: <schemaFactory class="ClassicIndexSchemaFactory"/> Best regards, Ivan 2017-03-07 17:41

Re: Tokenized querying

2017-03-07 Thread OTH
Hi, Thanks a lot for the help. Adding 'score' to 'fl' worked. I had been using Lucene for some time (though not at an expert level), and I was usually pretty satisfied with the scoring; so I'm assuming Solr should work fine for me too. For the time being I'm just trying to get a handle on how

RE: Getting an error: was indexed without position data; cannot run PhraseQuery

2017-03-07 Thread Pouliot, Scott
We are NOT using SOLRCloud yet. I'm still trying to figure out how to get SOLRCloud running. We're using old school master/slave replication still. So sounds like it can be done if I get to that point. I've got a few non SOLR tasks to get done today, so hoping to dig into this later in the

Re: Managed schema vs schema.xml

2017-03-07 Thread Shawn Heisey
On 3/7/2017 9:41 AM, OTH wrote: > I understand that managed-schema is not supposed to be edited by hand but > only via the "API". All I understand about this "API" however, is that it > may be referring to the "Schema" page in the Solr browser-based Admin. > > However, in this "Schema" page, it

Re: [Migration Solr5 to Solr6] Unwanted deleted files references

2017-03-07 Thread Elodie Sannier
Thank you Alex for your answer. The references to deleted files are only to index files (with .fdt, .doc, .dvd, ... extensions). sudo lsof | grep DEL java 1366 kookel DEL REG 253,8 15360013 /opt/kookel/data/searchSolrNode/solrindex/fr1_green/index/_2508z.cfs java

Re: Tokenized querying

2017-03-07 Thread Alexandre Rafalovitch
Try adding "score" as a pseudo-field in the 'fl' parameter: https://cwiki.apache.org/confluence/display/solr/Common+Query+Parameters#CommonQueryParameters-Thefl(FieldList)Parameter You can also enable debug and debug.explain.structured, if you want to go all inception on figuring the scores out:
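
The suggestion above can be sketched as a query URL. This is a minimal sketch under stated assumptions: the host, the collection name `mycollection`, and the query are hypothetical; `fl`, `debug`, and `debug.explain.structured` are the parameters named in the message.

```python
from urllib.parse import urlencode


def build_scored_query_url(base_url, collection, query, explain=False):
    """Return a /select URL that asks Solr to include each document's score."""
    # "*,score" requests all stored fields plus the score pseudo-field.
    params = [("q", query), ("fl", "*,score"), ("wt", "json")]
    if explain:
        # Ask for the scoring explanation as nested structures, not plain text.
        params += [("debug", "true"), ("debug.explain.structured", "true")]
    return f"{base_url}/{collection}/select?{urlencode(params)}"


if __name__ == "__main__":
    # Hypothetical host, collection and query, for illustration only.
    print(build_scored_query_url(
        "http://localhost:8983/solr", "mycollection", "name:sancho", explain=True))
```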

Re: [Migration Solr5 to Solr6] Unwanted deleted files references

2017-03-07 Thread Elodie Sannier
Thank you Erick for your answer. The files are deleted even without JVM restart but they are still seen as DELETED by the kernel. We have custom code and for the migration to Solr 6.4.0 we have added new code with req.getSearcher() but without "close". We will decrement the reference count

Managed schema vs schema.xml

2017-03-07 Thread OTH
Hello I'm sure this has been asked many times but I'm having some confusion here. I understand that managed-schema is not supposed to be edited by hand but only via the "API". All I understand about this "API" however, is that it may be referring to the "Schema" page in the Solr browser-based

Re: Tokenized querying

2017-03-07 Thread OTH
Hello, Thanks for your response; it turned out the fields were indeed of 'string' type, and when I changed them to 'text_general', it started to work as I wanted. However, I'm still not sure how to extract the scores? I don't seem to be getting that in the response. Much thanks On Tue, Mar 7,

Re: Solrcloud after restore collection, when index new documents into restored collection, leader not write to index.

2017-03-07 Thread Erick Erickson
John: Just skimming, but this certainly seems like it merits a JIRA, please feel free to create one (you may have to create your own logon first). Please include the steps for the test you did where new replicas "see" the restored index. And this last where you hand edited things is important.

Re: Getting an error: was indexed without position data; cannot run PhraseQuery

2017-03-07 Thread Erick Erickson
First, it's not clear whether you're using SolrCloud or not, so there may be some irrelevant info in here. bq: ...could I do it on another instance running the same SOLR version (4.8.0) and then copy the database into place instead In a word "yes", if you're careful. Assuming you have more than

RE: Solrcloud after restore collection, when index new documents into restored collection, leader not write to index.

2017-03-07 Thread Marquiss, John
Just another bit of information supporting the thought that this has to do with recycling the searcher when there is a change to the index directory that is named something other than "index". Running our tests again, this time after restoring the content I shut down solr and renamed the two

Re: [Migration Solr5 to Solr6] Unwanted deleted files references

2017-03-07 Thread Alexandre Rafalovitch
More sanity checks: what are the extensions/types of the files that are not deleted? If they are index files, optimize command (even if no longer recommended for production) should really blow all the old ones away. So, are they other kinds of files? Regards, Alex.

Re: Solr Update If Record Exists ?

2017-03-07 Thread Alexandre Rafalovitch
Try adding: _version_:1, as per the Optimistic Concurrency feature: https://cwiki.apache.org/confluence/display/solr/Updating+Parts+of+Documents#UpdatingPartsofDocuments-OptimisticConcurrency This should work for a single document update. If you are trying to update multiple documents, this is
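
A minimal sketch of the payload this suggestion implies, assuming the `id` and `ONLINE` field names from the original question. With `_version_` set to 1, Solr's optimistic concurrency check requires the document to already exist, so the update should be rejected instead of creating a new record. The function only builds the JSON body for the `/update` handler.

```python
import json


def build_update_if_exists(doc_id, field, value):
    """Return a JSON atomic-update body that only applies to an existing doc."""
    doc = {
        "id": doc_id,
        field: {"set": value},  # atomic update: replace just this field
        "_version_": 1,         # 1 means "the document must already exist"
    }
    return json.dumps([doc])


if __name__ == "__main__":
    # Field names taken from the question; the values are illustrative.
    print(build_update_if_exists("1", "ONLINE", "1"))
```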

Re: Tokenized querying

2017-03-07 Thread Alexandre Rafalovitch
The default text field definition (text_general) tokenizes on spaces, so - if I understand the question correctly - it should just work. Are you by any chance searching against a name field that is defined as String (and is not tokenized)? If you do the Solr tutorial, you search on "ipod", which seems

RE: Getting an error: was indexed without position data; cannot run PhraseQuery

2017-03-07 Thread Pouliot, Scott
Welcome to IT right? We're always in some sort of pickle ;-) I'm going to play with settings on one of our internal environments and see if I can replicate the issue and go from there with some test fixes. Here's a question though... If I need to re-index... could I do it on another

Re: [Migration Solr5 to Solr6] Unwanted deleted files references

2017-03-07 Thread Erick Erickson
Just as a sanity check, if you restart the Solr JVM, do the files disappear from disk? Do you have any custom code anywhere in this chain? If so, do you open any searchers but fail to close them? Although why 6.4 would manifest the problem but other code wouldn't is a mystery, just another sanity

[Migration Solr5 to Solr6] Unwanted deleted files references

2017-03-07 Thread Elodie Sannier
Hello, We have migrated from Solr 5.4.1 to Solr 6.4.0 and the disk usage has increased. We found hundreds of references to deleted index files being held by solr. Before the migration, we had 15-30% of disk space used, after the migration we have 60-90% of disk space used. We are using Solr

Tokenized querying

2017-03-07 Thread OTH
Hello, I am new to Solr. I am using v. 6.4.1. I have what is probably a pretty simple question. Let's say I have these documents with the following values in a single field (let's call it "name"): sando...@company.example.com sandb...@company.example.com sa...@company.example.com Sancho

Re: I want to contribute custom made NLP based solr filters but dont know how.

2017-03-07 Thread Joel Bernstein
Yes, I think Apache OpenNLP should be fine. Joel Bernstein http://joelsolr.blogspot.com/ On Tue, Mar 7, 2017 at 8:09 AM, Avtar Singh Mehra wrote: > Well, I have created some filters using Apache OpenNLP. Will it work? > > On 6 March 2017 at 00:30, Joel Bernstein

Re: I want to contribute custom made NLP based solr filters but dont know how.

2017-03-07 Thread Avtar Singh Mehra
Well, I have created some filters using Apache OpenNLP. Will it work? On 6 March 2017 at 00:30, Joel Bernstein wrote: > I believe StanfordCore is licensed under the GPL which means it will be > incompatible with the Apache License. Would it be possible to port to a >

Re: Learning to rank - Bad Request

2017-03-07 Thread Vincent
Update: solved, I missed some config steps. Thanks for the help, Vincent On 07-03-17 12:20, Vincent wrote: Hi Christine, Thanks for the reply! I suppose something in our config doesn't comply with the LTR plugin. If I browse to http://[HOST]:[PORT]/solr/[COLLECTION]/schema/feature-store,

Solr Update If Record Exists ?

2017-03-07 Thread ~$alpha`
SOLR_URL/update -d '[ {"id" : "1", "ONLINE" : {"set":"1"} } ]' I am using Solr 6.3. The above command works fine as it updates the online flag to 1 for id=1. But the issue is that if the record is not present, it adds a record with id=1 and online=1, which is not desired. So question is, is it

How to enable Gzip compression in Solr v6.1.0 with Jetty 9.3.8.v20160314

2017-03-07 Thread Gul Imran
Hi, I am trying to upgrade Solr from v5.3 to v6.1.0 which comes with Jetty 9.3.8.v20160314. However, after the upgrade we seem to have lost Gzip compression capability since we still have the old configuration. When I send the following request with the appropriate headers, I do not get a

Re: Learning to rank - Bad Request

2017-03-07 Thread Vincent
Hi Christine, Thanks for the reply! I suppose something in our config doesn't comply with the LTR plugin. If I browse to http://[HOST]:[PORT]/solr/[COLLECTION]/schema/feature-store, where I upload the features to, the browser can't find the page: *Not Found* No REST managed resource

Re: Solr Query Suggestion

2017-03-07 Thread vrindavda
Hi Emir, Grouping is exactly what I wanted to achieve. Thanks! Thank you, Vrinda Davda