Re: Upgrading from 4.x to 5.x

2015-11-19 Thread Upayavira
On Thu, Nov 19, 2015, at 10:03 AM, Jan Høydahl wrote: > > Looking under solr5/server/solr I see configsets with the three default > > choices. What "feels" right is to make a new folder in there for my app > > (dovecot) and then copy my solr4/example/solr/collection1/conf folder. I'm > >

How to config security.json?

2015-11-19 Thread Byzen Ma
Hi, I'm not quite familiar with security.json. I want to achieve the following. (1) Anyone who wants to perform a read/select/query action should be required to supply a username and password (that is, authentication), whether from the Admin UI or SolrJ, and especially from the Admin UI! I need to restrict strangers to

Parallel SQL / calcite adapter

2015-11-19 Thread Kai Gülzau
We are currently evaluating Calcite as a SQL facade for different Data Sources - JDBC - REST >SOLR - ... I didn't find a "native" Calcite adapter for Solr (http://calcite.apache.org/docs/adapter.html). Is it a good idea to use the parallel sql feature

Re: Upgrading from 4.x to 5.x

2015-11-19 Thread Daniel Miller
Not quite but I'm improving. Or something... Looking under solr5/server/solr I see configsets with the three default choices. What "feels" right is to make a new folder in there for my app (dovecot) and then copy my solr4/example/solr/collection1/conf folder. I'm hoping I'm on the right track

Re: Upgrading from 4.x to 5.x

2015-11-19 Thread Daniel Miller
Thank you - but I still don't understand where to install/copy/modify config files or schema to point at my current index. My 4.x schema.xml was fairly well optimized, and I believe I removed any deprecated usage, so I assume it would be compatible with the 5.x server. Daniel On November

Re: Upgrading from 4.x to 5.x

2015-11-19 Thread Muhammad Zahid Iqbal
Hi Daniel, You need to update your config/schema files at a path like '...\solr-dir\server\solr'. When you are done, you can update your index path in solrconfig.xml. I hope that helps. Best, Zahid On Thu, Nov 19, 2015 at 1:58 PM, Daniel Miller wrote: > Thank you -

Re: adding document with nested document require to set id

2015-11-19 Thread CrazyDiamond
If I add a document without nesting, then the id is generated automatically (I use uuid), and this was working perfectly until I tried to add nesting. I want the same behaviour for nested documents as for non-nested ones.

Re: Upgrading from 4.x to 5.x

2015-11-19 Thread Jan Høydahl
> Looking under solr5/server/solr I see configsets with the three default > choices. What "feels" right is to make a new folder in there for my app > (dovecot) and then copy my solr4/example/solr/collection1/conf folder. I'm > hoping I'm on the right track - maybe working too hard. If you have

Re: Generating Index offline and loading into solrcloud

2015-11-19 Thread KNitin
Great. Thanks! On Thu, Nov 19, 2015 at 11:24 AM, Sameer Maggon wrote: > If you are trying to create a large index and want speedups there, you > could use the MapReduceTool - > https://github.com/cloudera/search/tree/cdh5-1.0.0_5.2.1/search-mr. At a > high level, it

Re: adding document with nested document require to set id

2015-11-19 Thread Mikhail Khludnev
It should be explained at http://wiki.apache.org/solr/UpdateRequestProcessor On Thu, Nov 19, 2015 at 9:27 PM, CrazyDiamond wrote: > How to do this?

RE: Expand Component Fields Response

2015-11-19 Thread Sanders, Marshall (AT - Atlanta)
Joel, Thanks for the response. I updated the JIRA with 2 new patches. One for trunk, and one for branches/branch_5x. It would be great if it could get reviewed and make it in before 5.4 if it meets approval. https://issues.apache.org/jira/browse/SOLR-8306 Thanks, Marshall Sanders

Re: Error in log after upgrading Solr

2015-11-19 Thread Shawn Heisey
On 11/18/2015 3:29 PM, Shawn Heisey wrote: > I'll see if I can put together a minimal configuration to reproduce. The really obvious idea here was to start with the example server, built from the same source I used for the real thing, and create a core based on sample_techproducts_configs.

Re: replica recovery

2015-11-19 Thread Brian Scholl
Hey Erick, Thanks for the reply. I plan on rebuilding my cluster soon with more nodes so that the index size (including tlogs) is under 50% of the available disk at a minimum, ideally we will shoot for under 33% budget permitting. I think I now understand the problem that managing this

Re: adding document with nested document require to set id

2015-11-19 Thread CrazyDiamond
How to do this?

Generating Index offline and loading into solrcloud

2015-11-19 Thread KNitin
Hi, I was wondering if there are existing tools that will generate solr index offline (in solrcloud mode) that can be later on loaded into solrcloud, before I decide to implement my own. I found some tools that do only solr based index loading (non-zk mode). Is there one with zk mode enabled?

RE: Re:Re: Implementing security.json is breaking ADDREPLICA

2015-11-19 Thread Oakley, Craig (NIH/NLM/NCBI) [C]
I note that the thread called "Security Problems" (most recent post by Noble Paul) seems like it may help with much of what I'm trying to do. I will see to what extent that may help.

Re: adding document with nested document require to set id

2015-11-19 Thread Mikhail Khludnev
Hi, Perhaps you want UUIDUpdateProcessorFactory to loop through SolrInputDocument.getChildDocuments() and assign a generated value. You need to implement your own update processor (by extending one of the existing ones). On Thu, Nov 19, 2015 at 7:41 PM, CrazyDiamond wrote: > How exactly
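
The traversal suggested here can be illustrated outside Solr. The following is only a minimal sketch of the idea (walk the parent and its children, fill in a UUID where the id is missing), written against plain Python dicts rather than Solr's update processor API. The `assign_ids` helper and the sample fields are hypothetical; `_childDocuments_` is assumed to be the key that holds nested documents, as in Solr's JSON update format.

```python
import uuid

# Assumption: nested documents live under this key, as in Solr's JSON update format.
CHILD_KEY = "_childDocuments_"


def assign_ids(doc, id_field="id"):
    """Give doc and every nested child a UUID in id_field unless one is already set."""
    if not doc.get(id_field):
        doc[id_field] = str(uuid.uuid4())
    for child in doc.get(CHILD_KEY, []):
        assign_ids(child, id_field)
    return doc


# Hypothetical parent with two pages; the second page already has an id.
parent = {
    "title": "book",
    CHILD_KEY: [
        {"page_url": "http://example.com/p1"},
        {"id": "page-2", "page_url": "http://example.com/p2"},
    ],
}
assign_ids(parent)
```

A real update processor would perform the same walk server-side; doing it in the client before sending the documents is a simpler alternative when changing the Solr configuration is not an option.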

Re: replica recovery

2015-11-19 Thread Erick Erickson
bq: I would still like to increase the number of transaction logs retained so that shard recovery (outside of long term failures) is faster than replicating the entire shard from the leader That's legitimate, but (you knew that was coming!) nodes having to recover _should_ be a rare event. Is

RE: Boost non stemmed keywords (KStem filter)

2015-11-19 Thread Markus Jelsma
Hello Jan - i have no code i can show but we are using it to power our search servers. You are correct, you need to deal with payloads at query time as well. This means you need a custom similarity but also customize your query parser to rewrite queries to payload supported types. This is also

Large multivalued field and overseer problem

2015-11-19 Thread Olivier
Hi, We have a Solrcloud cluster with 3 nodes (4 processors, 24 Gb RAM per node). We have 3 shards per node and the replication factor is 3. We host 3 collections, the biggest is about 40K documents only. The most important thing is a multivalued field with about 200K to 300K values per document

Re: Parallel SQL / calcite adapter

2015-11-19 Thread Joel Bernstein
It's an interesting question. The JDBC driver is still very basic. It would depend on how much of the JDBC spec needs to be implemented to connect to Calcite/Drill. Joel Bernstein http://joelsolr.blogspot.com/ On Thu, Nov 19, 2015 at 3:28 AM, Kai Gülzau wrote: > > We are

Re: replica recovery

2015-11-19 Thread Jeff Wartes
I completely agree with the other comments on this thread with regard to needing more disk space asap, but I thought I’d add a few comments regarding the specific questions here. If your goal is to prevent full recovery requests, you only need to cover the duration you expect a replica to be

Number of fields in qf & fq

2015-11-19 Thread Steven White
Hi everyone What is considered too many fields for qf and fq? On average I will have 1500 fields in qf and 100 in fq (all of which are OR'ed). Assuming I can (I have to check with the design) for qf, if I cut it down to 1 field, will I see noticeable performance improvement? It will take a lot

Re: Number of fields in qf & fq

2015-11-19 Thread Steven White
Thanks Walter. I see your point. Does this apply to fq as well? Also, how does one go about debugging performance issues in Solr to find out where time is mostly spent? Steve On Thu, Nov 19, 2015 at 6:54 PM, Walter Underwood wrote: > With one field in qf for a

Re: Generating Index offline and loading into solrcloud

2015-11-19 Thread KNitin
Ah, got it. Another generic question: is there much of a difference between generating files in MapReduce and loading them into SolrCloud versus using the Solr NRT API? Has anyone run any tests of that sort? Thanks a ton, Nitin On Thu, Nov 19, 2015 at 3:00 PM, Erick Erickson

Re: Number of fields in qf & fq

2015-11-19 Thread Walter Underwood
With one field in qf for a single-term query, Solr is fetching one posting list. With 1500 fields, it is fetching 1500 posting lists. It could easily be 1500 times slower. It might be even slower than that, because we can’t guarantee that: a) every algorithm in Solr is linear, b) that all

Re: Number of fields in qf & fq

2015-11-19 Thread Walter Underwood
The implementation for fq has changed from 4.x to 5.x, so I’ll let someone else answer that in detail. In 4.x, the result of each filter query can be cached. After that, they are quite fast. wunder Walter Underwood wun...@wunderwood.org http://observer.wunderwood.org/ (my blog) > On Nov 19,

Re: Re:Re: Implementing security.json is breaking ADDREPLICA

2015-11-19 Thread Anshum Gupta
I'll try out what you did later in the day, as soon as I get time, but why exactly are you creating cores manually? It seems like you manually create a core and then try to add a replica. Can you try using the Collections API to create a collection? Starting with Solr 5.0, the only supported way to create

Re: replica recovery

2015-11-19 Thread Brian Scholl
Primarily our outages are caused by Java crashes or really long GC pauses, in short not all of our developers have a good sense of what types of queries are unsafe if abused (for example, cursorMark or start=). Honestly, stability of the JVM is another task I have coming up. I agree that

Re: Generating Index offline and loading into solrcloud

2015-11-19 Thread Sameer Maggon
If you are trying to create a large index and want speedups there, you could use the MapReduceTool - https://github.com/cloudera/search/tree/cdh5-1.0.0_5.2.1/search-mr. At a high level, it takes your files (csv, json, etc) as input and can create either a single or a sharded index that you can either

Re: Error in log after upgrading Solr

2015-11-19 Thread Chris Hostetter
: on sample_techproducts_configs. Because removing newSearcher and : firstSearcher fixed the problem for me, the next step was to configure : similar newSearcher and firstSearcher queries to what I used to have in : my config, and try indexing docs. : : I did this, and the problem did not

Re: Large multivalued field and overseer problem

2015-11-19 Thread Anshum Gupta
Hi Olivier, A few things that you should know: 1. The Overseer is at a per cluster level and not at a per-collection level. 2. Also, documents/fields/etc. should have zero impact on the Overseer itself. So, while the upgrade to a more recent Solr version comes with a lot of good stuff, the

Re: Generating Index offline and loading into solrcloud

2015-11-19 Thread Erick Erickson
Note two things: 1> this is running on Hadoop 2> it is part of the standard Solr release as MapReduceIndexerTool, look in the contribs... If you're trying to do this yourself, you must be very careful to index docs to the correct shard then merge the correct shards. MRIT does this all

RE: Re:Re: Implementing security.json is breaking ADDREPLICA

2015-11-19 Thread Oakley, Craig (NIH/NLM/NCBI) [C]
I tried again with the following security.json, but the results were the same: { "authentication":{ "class":"solr.BasicAuthPlugin", "credentials":{ "solr":"IV0EHq1OnNrj6gvRCwvFwTrZ1+z1oBbnQdiVC3otuq0= Ndd7LKvVBAaZIF0QAVi1ekCfAJXr1GGfLtRUXhgrF8c=",

Re: Generating Index offline and loading into solrcloud

2015-11-19 Thread KNitin
Thanks, Erick. Looks like MRIT uses embedded Solr running per mapper/reducer and uses that to index documents. Is that the recommended model? Can we use raw Lucene libraries to generate the index and then load it into SolrCloud? (Barring the complexities of indexing into the right shard and merging

Re: Boost non stemmed keywords (KStem filter)

2015-11-19 Thread Ahmet Arslan
Hi, I wonder about using two fields (text_stem and text_no_stem) and applying a query-time boost: text_stem^0.3 text_no_stem^0.6. What is the advantage of the keyword repeat/payload approach compared with this one? Ahmet On Thursday, November 19, 2015 10:24 PM, Markus Jelsma
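
For illustration, the two-field boost above could be sent as a `qf` parameter. This is only a sketch: the field names and boost values come from the message, while the `boosted_qf` helper, the sample query text, and the choice of `defType=edismax` are assumptions made for the example.

```python
from urllib.parse import urlencode


def boosted_qf(boosts):
    """Render a {field: boost} mapping as a qf value such as 'a^0.3 b^0.6'."""
    return " ".join(f"{field}^{boost}" for field, boost in boosts.items())


params = {
    "q": "running shoes",   # hypothetical user query
    "defType": "edismax",   # assumption: a dismax-style parser that reads qf
    "qf": boosted_qf({"text_stem": 0.3, "text_no_stem": 0.6}),
}

# URL-encoded form, ready to append to a /select request.
query_string = urlencode(params)
print(query_string)
```

With this setup a match on the unstemmed field contributes more to the score than a match that only appears after stemming.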

Re: Generating Index offline and loading into solrcloud

2015-11-19 Thread Erick Erickson
Sure, you can use Lucene to create indexes for shards if (and only if) you deal with the routing issues. About updates: I'm not talking about atomic updates at all. The usual model for Solr is if you have a unique key defined, new versions of documents replace old versions of documents based

Re: Error in log after upgrading Solr

2015-11-19 Thread Shawn Heisey
On 11/19/2015 2:10 PM, Chris Hostetter wrote: > when you indexed docs into this test config, did you use waitSearcher=true > like in your original logs? > > I think that + newSearcher QuerySendListener is the key to triggering the > error logging. I don't recall ever configuring anything with

Re: Error in log after upgrading Solr

2015-11-19 Thread Shawn Heisey
On 11/19/2015 3:02 PM, Shawn Heisey wrote: > On 11/19/2015 2:10 PM, Chris Hostetter wrote: >> when you indexed docs into this test config, did you use waitSearcher=true >> like in your original logs? >> >> I think that + newSearcher QuerySendListener is the key to triggering the >> error

Re: replica recovery

2015-11-19 Thread Erick Erickson
Right, I've managed to double the memory required by Solr by varying the _query_. Siiih. There are some JIRAs out there (don't have them readily available, sorry) that short-circuit queries that take "too long", and there are some others to short circuit "expensive" queries. I believe this

Re: RealTimeGetHandler doesn't retrieve documents

2015-11-19 Thread Jack Krupansky
Do the failing IDs have any special characters that might need to be escaped? Can you find the documents using a normal query on the unique key field? -- Jack Krupansky On Thu, Nov 19, 2015 at 10:27 AM, Jérémie MONSINJON < jeremie.monsin...@gmail.com> wrote: > Hello everyone ! > > I'm using

Re: Large multivalued field and overseer problem

2015-11-19 Thread Erick Erickson
In addition to Anshum's excellent points: bq: And after a short period of time, all the cluster is unavailable (out of memory JVM error). This is where I'd focus my efforts. I suspect you're memory-bound and are actually seeing OOM errors about the time this problem manifests itself. Or you're

Re: Boost non stemmed keywords (KStem filter)

2015-11-19 Thread Walter Underwood
That is the approach I’ve been using for years. Simple and effective. It probably makes the index bigger. Make sure that only one of the fields is stored, because the stored text will be exactly the same in both. wunder Walter Underwood wun...@wunderwood.org http://observer.wunderwood.org/ (my

solr indexing warning

2015-11-19 Thread Midas A
Getting the following log on Solr: PERFORMANCE WARNING: Overlapping onDeckSearchers=2

Json Facet api on nested doc

2015-11-19 Thread xavi jmlucjav
Hi, I am trying to get some faceting with the JSON Facet API on nested docs, but I am having issues. Solr 5.3.1. This query gets the bucket numbers OK: curl http://shost:8983/solr/collection1/query -d 'q=*:*=0& json.facet={ yearly-salaries : {

Re: adding document with nested document require to set id

2015-11-19 Thread Mikhail Khludnev
Hello, On Thu, Nov 19, 2015 at 12:48 PM, CrazyDiamond wrote: > id is generated automatically(i use > uuid) > How exactly you are doing that? i tryed to add nesting. i want > the same behaviour for nested documents as it was for not nested. > How exactly you want it to

Re: Security Problems

2015-11-19 Thread Jan Høydahl
Would it not be less surprising if ALL requests to Solr required authentication once an AuthenticationPlugin was enabled? Then, if no AuthorizationPlugin was active, all authenticated users could do anything. But if AuthorizationPlugin was configured, you could only do what your role allows you

Re: Upgrading from 4.x to 5.x

2015-11-19 Thread Muhammad Zahid Iqbal
Daniel, You are close. Delete the *configsets* folder, paste your *collection1* folder, and run the server. It will do the trick. On Thu, Nov 19, 2015 at 2:54 PM, Daniel Miller wrote: > Not quite but I'm improving. Or something... > > Looking under solr5/server/solr I see

Re: Number of fields in qf & fq

2015-11-19 Thread Erick Erickson
An fq is still a single entry in your filterCache so from that perspective it's the same. And to create that entry, you're still using all the underlying fields to search, so they have to be loaded just like they would be in a q clause. But really, the fundamental question here is why your

Re: Security Problems

2015-11-19 Thread Byzen Ma
Thanks for the reply. The two smallest rules 1) "name":"all-admin", "collection": null, "path":"/*", "role":"somerole" 2) all core handlers "name":"all-core-handlers", "path":"/*", "role":"somerole" do work after I reset my security.json. But another strange thing happened. After I accidentally
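
The two rules quoted in this message are written as loose key/value pairs. As a hedged illustration only, they can be built as data and serialized so that the quoting and commas come out as valid JSON. The enclosing `permissions` key is an assumption for the example; where the rules sit within the rest of security.json is not shown in this message.

```python
import json

# The two rules from the message; None serializes to JSON null.
permissions = [
    {"name": "all-admin", "collection": None, "path": "/*", "role": "somerole"},
    {"name": "all-core-handlers", "path": "/*", "role": "somerole"},
]

# Assumption: the rules are wrapped in a "permissions" list for display.
rendered = json.dumps({"permissions": permissions}, indent=2)
print(rendered)
```

Generating the fragment this way avoids hand-typed mistakes such as a missing quote around a key.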

Re: replica recovery

2015-11-19 Thread Erick Erickson
First, every time you autocommit there _should_ be a new tlog created. A hard commit truncates the tlog by design. My guess (not based on knowing the code) is that Real Time Get needs file handle open to the tlog files and you'll have a bunch of them. Lots and lots and lots. Thus the too many

RE: Re:Re: Implementing security.json is breaking ADDREPLICA

2015-11-19 Thread Oakley, Craig (NIH/NLM/NCBI) [C]
Thank you for the reply. What we are attempting is to require a password for practically everything, so that even were a hacker to get within the firewall, they would have limited access to the various services (the Security team even complained that, for Solr 4.5 servers, attempts to access

Re: Generating Index offline and loading into solrcloud

2015-11-19 Thread Erick Erickson
Apples/Oranges question: They're different beasts. The NRT stuff (spark-solr for example, Cloudera's Flume sink as well, custom SolrJ clients, whatever) is constrained by the number of Solr servers you have running, more specifically the number of shards. When you're feeding docs fast enough that

Re: solr indexing warning

2015-11-19 Thread Midas A
Thanks Emir, So what do we need to do to resolve this issue? This is my Solr configuration. What changes should I make to avoid the warning? ~abhishek On Thu, Nov 19, 2015 at 6:37 PM, Emir Arnautovic < emir.arnauto...@sematext.com> wrote: > This means that one searcher is still warming

Re: Security Problems

2015-11-19 Thread Byzen Ma
Apologies, I didn't read the thread "replica recovery" carefully. It may be a different problem. But the thread "Implementing security.json is breaking ADDREPLICA" describes the same problem as mine. -Original Message- From: solr-user-return-118173-mabaizhang=126@lucene.apache.org

Re: solr indexing warning

2015-11-19 Thread Shawn Heisey
On 11/19/2015 11:06 PM, Midas A wrote: > autowarmCount="1000"/> size="1000" initialSize="1000" autowarmCount="1000"/> ="1000" autowarmCount="1000"/> Your caches are quite large. More importantly, your autowarmCount is very large. How many documents are in each of your cores? If you check

Re: solr indexing warning

2015-11-19 Thread Midas A
Thanks Shawn, As we are using this server as a master server, there are no queries running on it. In that case, should I remove this configuration from the config file? Total Docs: 40 0 Stats # Document cache : lookups:823 hits:4 hitratio:0.00 inserts:820 evictions:0 size:820 warmupTime:0

RealTimeGetHandler doesn't retrieve documents

2015-11-19 Thread Jérémie MONSINJON
Hello everyone! I'm using Solr 5.3.1 with solrj.SolrClient. My index is split into 3 shards, each on a different server. (No replica on dev platform) It has been up to date for a few days... I'm trying to use the RealTimeGetHandler to get documents by their Id. In our

Re: Solr Keyword query on a specific field.

2015-11-19 Thread Aaron Gibbons
I apologize for the long delay in response. I was able to get it to work tho! Thank you!! The local parameters were confusing to me at first. I'm using SolrNet to build the search which has LocalParams that I am already specifying, but those are not applied to the title portion. What I ended up

Re: adding document with nested document require to set id

2015-11-19 Thread CrazyDiamond
How exactly you are doing that? Doing what? This is from the schema: id ... This is from the config. I want to store multiple values in a nested document that should be grouped together, like page ids and page urls.

Re: Stem Words Highlighted - Keyword Not Highlighted

2015-11-19 Thread Ann B
Thank you Jack. The field I was passing to Solr actually uses the following: Tokenizer: StandardTokenizerFactory Filters: StopFilterFactory LengthFilterFactory LowerCaseFilterFactory RemoveDuplicatesTokenFilterFactory Once I passed in the correct field that uses the white space tokenizer and

Re: Boost non stemmed keywords (KStem filter)

2015-11-19 Thread Jan Høydahl
Do you have a concept code for this? Don’t you also have to hack your query parser, e.g. dismax, to use other Query objects supporting payloads? -- Jan Høydahl, search solution architect Cominvent AS - www.cominvent.com > 18. nov. 2015 kl. 22.24 skrev Markus Jelsma :

Re: solr indexing warning

2015-11-19 Thread Emir Arnautovic
This means that one searcher is still warming when another searcher is created by a commit with openSearcher=true. This can be due to frequent commits or to searcher warmup taking too long. Emir -- Monitoring * Alerting * Anomaly Detection * Centralized Log Management Solr & Elasticsearch Support *
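
The condition described here can be pictured with a toy model. The sketch below is not Solr code: it assumes each commit opens a searcher that warms for a fixed number of seconds, and counts how many are warming at once. The function name, commit times, and warmup durations are all made up for illustration.

```python
def max_overlapping_searchers(commit_times, warmup_seconds):
    """Return the peak number of searchers warming at once.

    commit_times: ascending times (seconds) at which a commit opens a new searcher.
    warmup_seconds: assumed fixed warm-up duration for every searcher.
    """
    peak = 0
    for i, start in enumerate(commit_times):
        # Count this searcher plus earlier ones whose warm-up has not finished yet.
        warming = sum(
            1 for earlier in commit_times[: i + 1] if earlier + warmup_seconds > start
        )
        peak = max(peak, warming)
    return peak


# Commits every 10 s with a 15 s warm-up overlap; commits every 60 s do not.
frequent = max_overlapping_searchers([0, 10, 20, 30], warmup_seconds=15)
spaced = max_overlapping_searchers([0, 60, 120], warmup_seconds=15)
print(frequent, spaced)
```

In this model the overlap disappears either by spacing commits further apart or by shortening the warm-up, which matches the two causes named above.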

Re: replica recovery

2015-11-19 Thread Brian Scholl
I have opted to modify the number and size of transaction logs that I keep to resolve the original issue I described. In so doing I think I have created a new problem, feedback is appreciated. Here are the new updateLog settings: ${solr.ulog.dir:}

Re:Re: Implementing security.json is breaking ADDREPLICA

2015-11-19 Thread 马柏樟
Hi Anshum, I encountered the same problem after I configured my security.json like this: { "authentication":{ "class":"solr.BasicAuthPlugin", "credentials":{"solr":"IV0EHq1OnNrj6gvRCwvFwTrZ1+z1oBbnQdiVC3otuq0= Ndd7LKvVBAaZIF0QAVi1ekCfAJXr1GGfLtRUXhgrF8c="}}, "authorization":{

Re: Security Problems

2015-11-19 Thread Noble Paul
What is the smallest possible security.json required currently to protect all possible paths (except those served by Jetty)? You would need 2 rules 1) "name":"all-admin", "collection": null, "path":"/*", "role":"somerole" 2) all core handlers "name":"all-core-handlers", "path":"/*"