Re: Best practice to support multi-tenant with Solr

2014-03-15 Thread Lajos
Hi Shushuai, Just a few thoughts. I would guess that most people would argue for implementing multi-tenancy within your core (via some unique filter ID) or collection (via document routing) because of the headache of managing individual cores at the scale you are talking about. There are
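A minimal sketch of the "unique filter ID" approach mentioned above, assuming every document carries a tenant field. The field name `tenant_id`, the helper name, and the sample values are hypothetical and do not come from the thread; the idea is only that each request is restricted to one tenant through an `fq` parameter.

```python
from urllib.parse import urlencode


def tenant_query_params(user_query, tenant_id, tenant_field="tenant_id"):
    """Build Solr request parameters restricted to a single tenant.

    tenant_field is a hypothetical field name; use whatever field the
    schema actually defines for the tenant identifier.
    """
    # Escape backslashes and double quotes so the tenant ID stays inside
    # the quoted phrase of the filter query.
    escaped = tenant_id.replace("\\", "\\\\").replace('"', '\\"')
    return {"q": user_query, "fq": f'{tenant_field}:"{escaped}"'}


params = tenant_query_params("title:solr", "acme")
query_string = urlencode(params)  # ready to append to a /select URL
```

Keeping the tenant restriction in `fq` rather than in `q` means it does not affect scoring and can be cached independently of the user's query.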

Re: Best practice to support multi-tenant with Solr

2014-03-15 Thread shushuai zhu
Lajos, Thanks a lot for your insightful thoughts. - The most important thing you have to lock down is whether there is a need to customize the schema/solrconfig for each tenant. If there is, then having individual cores per tenant is going to be a stronger

Re: How to write nested schema.xml in solr

2014-03-15 Thread Jack Krupansky
That's for nested documents, not nested fields. Nested documents is a technique for grouping a block of related documents along with a parent document and then being able to join the parent and a child. Solr has no support for nested fields - you need to flatten your fields, such as

Solr Japanese support

2014-03-15 Thread Bala Iyer
Hi, I am new to Solr Japanese. I added support for Japanese in schema.xml. How can I insert Japanese text into that field, either with a Solr client (Java / PHP / Ruby) or with curl? schema.xml: field name=username type=string indexed=true stored=true

example schema now stores most field values

2014-03-15 Thread Michael Sokolov
While upgrading from 4.2.1 to 4.6.1 I noticed that many of the fields defined in the example schema.xml that used to be indexed and not stored are now defined as indexed and stored. Is there anything behind this change other than the idea that it would be more convenient to have all the

Re: example schema now stores most field values

2014-03-15 Thread Yonik Seeley
Perhaps so atomic updates work? -Yonik http://heliosearch.org - solve Solr GC pauses with off-heap filters and fieldcache On Sat, Mar 15, 2014 at 1:02 PM, Michael Sokolov msoko...@safaribooksonline.com wrote: While upgrading from 4.2.1 to 4.6.1 I noticed that many of the fields defined in the

Re: Best practice to support multi-tenant with Solr

2014-03-15 Thread Lajos
Hi Shushuai, --- Finally, I would (in general) argue for cloud-based implementations to give you data redundancy ... --- Do you mean using multi-sharding to have multiple replicas of cores (corresponding to tenants) across nodes? Shushuai

[solr 4.7.0] analysis page: issue with HTMLStripCharFilterFactory

2014-03-15 Thread Dmitry Kan
Hello, The following type does not get analyzed properly on the solr 4.7.0 analysis page: fieldType name=text_en_splitting class=solr.TextField positionIncrementGap=100 autoGeneratePhraseQueries=true analyzer type=index charFilter class=solr.HTMLStripCharFilterFactory/ !--

RE: [solr 4.7.0] analysis page: issue with HTMLStripCharFilterFactory

2014-03-15 Thread Doug Turnbull
The char filter is not broken. There's a bug in 4.7 in the analysis UI: https://issues.apache.org/jira/browse/SOLR-5800 It was unclear to me if it would be part of a 4.7.1 release. I hope so, as it'll probably save people a lot of time from thinking their analyzers are broken. Sent from my

Re: Best practice to support multi-tenant with Solr

2014-03-15 Thread shushuai zhu
Hi Lajos, thanks again.   Your suggestion is to support multi-tenant via collection in a Solr Cloud: putting small tenants in one collection and big tenants in their own collections.   My original question was to find out which approach is better: supporting multi-tenant at collection level

Re: example schema now stores most field values

2014-03-15 Thread Michael Sokolov
Yes, that (https://wiki.apache.org/solr/Atomic_Updates) would make sense; thanks for the insight. -Mike On 3/15/2014 1:07 PM, Yonik Seeley wrote: Perhaps so atomic updates work? -Yonik http://heliosearch.org - solve Solr GC pauses with off-heap filters and fieldcache On Sat, Mar 15, 2014

Re: [solr 4.7.0] analysis page: issue with HTMLStripCharFilterFactory

2014-03-15 Thread Dmitry Kan
Hi Doug, I have tried the patch and it seems to have corrected the issue. Thanks for pointing to the jira. Dmitry On Sat, Mar 15, 2014 at 8:05 PM, Doug Turnbull dturnb...@opensourceconnections.com wrote: The char filter is not broken. There's a bug in 4.7 in the analysis UI:

Re: example schema now stores most field values

2014-03-15 Thread Jack Krupansky
Could it be that you had dropped a pre-4.2.1 schema into 4.2.1? I mean, I just exhaustively examined all schema.xml changes between 4.2.1 and 4.6.1 (all 6 of them) and saw no wholesale change to stored=true. Maybe somebody on your end removed a lot of fields from the 4.2.1 release of

Re: Best practice to support multi-tenant with Solr

2014-03-15 Thread Lajos
Hi Shushuai, Yes, as Robi noted, you have to be careful with terminology: core generally refers to the traditional Solr configuration of a single index + configuration on a single node (optionally replicated to others). A collection is a distributed index that is associated with a

RE: Best practice to support multi-tenant with Solr

2014-03-15 Thread Petersen, Robert
Hi Overall I think you are mixing up your terminology. What used to be called a 'core' is now called a 'collection' in solr cloud. In the old master slave setup, you made separate cores and replicated them to all slaves. Now they want you to think of them as collections and let the cloud

Re: Problem adding fields when indexing a pdf (add-on)

2014-03-15 Thread Erick Erickson
Thanks for bringing closure to this, it may very well help the _next_ person to have the same problem. Erick On Thu, Mar 13, 2014 at 6:51 AM, Croci Francesco Luigi (ID SWS) fcr...@id.ethz.ch wrote: Ok. Maybe I found the problem: in the solrconfig.xml I have str name=lowernamestrue/str I

Re: Partial Counts in SOLR

2014-03-15 Thread Erick Erickson
What are your complex queries? You say that your app will very rarely see the same query, thus you aren't using caches... But if you can move some of your clauses to fq clauses, then the filterCache might well be used to good effect. On Thu, Mar 13, 2014 at 7:22 AM, Salman Akram
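To illustrate the fq suggestion above, here is a rough sketch that keeps the rarely repeated part of a query in `q` and moves the clauses that recur across requests into separate `fq` parameters. The field names and clause values are made up for illustration; which clauses are worth moving depends on the actual queries, which the thread does not show.

```python
from urllib.parse import urlencode


def split_query(main_clause, filter_clauses):
    """Return Solr parameters with one q and one fq per filter clause.

    Each fq is a separate parameter, so Solr can cache each filter on
    its own instead of caching the combination.
    """
    params = [("q", main_clause)]
    params.extend(("fq", clause) for clause in filter_clauses)
    return params


# Hypothetical clauses: the free-text part changes per request,
# the category and language restrictions repeat across requests.
params = split_query("text:engine", ["category:manuals", "lang:en"])
query_string = urlencode(params)
```

Even when the full query is almost never repeated, the individual filters often are, and that is where the filterCache can help.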

Re: single node causing cluster-wide outage

2014-03-15 Thread Erick Erickson
Right, after an OOM error, the state of the machine may be wonky. So the obvious thing is to try to get rid of the OOM error. How many unique values do you have in the field you're faceting on? Not too helpful, but the best I can do. Erick On Thu, Mar 13, 2014 at 3:15 PM, Avishai Ish-Shalom

Re: solr securing index files

2014-03-15 Thread Erick Erickson
This is like allowing many users to access the disk your database is on. Don't do it. If by many users on a server, you mean many users having shell access, well, you have many more problems than securing the Solr index. If you mean you have many users accessing an app that lives on the server,

Master Not Available - Replication

2014-03-15 Thread Newallo, Dexter - DOT
System Config -- Web App Server: WebSphere 7.0 SOLR 4.7.0 Windows 2008R2 solrconfig.xml (replication section on slave) lst name=slave str name=masterUrlhttp://master-IP:9081/solr/#/collection1/str str name=httpBasicAuthUserID/str str

Re: CollapsingQParserPlugin returning different result set

2014-03-15 Thread Joel Bernstein
Hi Shamik, You should be seeing the same result counts with grouping and the CollapsingQParserPlugin, unless there are null values in the collapse field. Let's see if we can figure out what the issue is. Can you post the schema.xml field type definition for the ADSKDedup field? Also can you

Re: [ANNOUNCE] Heliosearch 0.04

2014-03-15 Thread Yonik Seeley
On Fri, Mar 14, 2014 at 1:11 PM, Yonik Seeley yo...@heliosearch.com wrote: I've been meaning to create a heliosearch 4x branch to get us a big step closer to cutting a stable release. I should probably do that sooner rather than later... Done. I've created branch helio which is based on

Re: Master Not Available - Replication

2014-03-15 Thread Erick Erickson
It's almost certain that you don't need the hash (#). That's just a bit in the admin URL that isn't relevant to regular URLs. Best, Erick On Sat, Mar 15, 2014 at 6:54 PM, Newallo, Dexter - DOT dexter.newa...@dot.wi.gov wrote: System Config -- Web App Server: WebSphere 7.0
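As a small illustration of the point about the hash, the sketch below turns an admin-UI style URL into the plain URL form by dropping the `#` segment. The helper name is hypothetical, and whether the resulting URL is exactly what the replication configuration needs in this setup is an assumption, since the rest of the configuration is not shown in the thread.

```python
def admin_url_to_core_url(admin_url):
    """Drop the '#' fragment marker that the admin UI puts in its URLs.

    Only the first '/#/' is replaced; a URL without it is returned as-is.
    """
    return admin_url.replace("/#/", "/", 1)


# The host and port are the placeholders used in the original post.
core_url = admin_url_to_core_url("http://master-IP:9081/solr/#/collection1")
```

With the value from the original post, `core_url` comes out as `http://master-IP:9081/solr/collection1`, which is the form to try in `masterUrl`.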

Re: Best practice to support multi-tenant with Solr

2014-03-15 Thread shushuai zhu
Lajos/Robi, thanks for the answers. For others' convenience, I copied Robi's reply below so this thread contains all discussions. Based on Lajos' detailed comments, there seems to be no single answer to this question. There are trade-offs between collection level and core level, and one needs to