Authentication and distributed search in 7.2.1

2018-02-28 Thread Peter Sturge
Hi, In 7.2.1 there's the authentication module and associated security.json file, which works well for single cores. (Note: standalone mode, no SolrCloud) It doesn't appear to work with distributed searches, including multi-shard local searches, e.g.

Re: security authentication API via solrj?

2018-02-26 Thread Peter Sturge
/Configset etc. Thanks, Peter On Mon, Feb 26, 2018 at 1:13 AM, Shawn Heisey <elyog...@elyograg.org> wrote: > On 2/25/2018 1:28 PM, Peter Sturge wrote: > >> I was wondering if 7.2.1 solrj had native support for the >> security/authentication endpoint? I couldn't find anyth

security authentication API via solrj?

2018-02-25 Thread Peter Sturge
Hi, I was wondering if 7.2.1 solrj had native support for the security/authentication endpoint? I couldn't find anything in the docs about it, but maybe someone has some experience with it? Note: This is about adding/deleting users on a solr instance using solrj, not authenticating (that is well

Re: q.op in 7.2.1 solconfig.xml

2018-02-21 Thread Peter Sturge
ning - http://sematext.com/ > > > > > On 21 Feb 2018, at 23:27, Peter Sturge <peter.stu...@gmail.com> wrote: > > > > Hi, > > I'm going through a major upgrade from 4.6 to 7.2.1 and I can see the > > defaultOperator has now been removed. > > >

q.op in 7.2.1 solconfig.xml

2018-02-21 Thread Peter Sturge
Hi, I'm going through a major upgrade from 4.6 to 7.2.1 and I can see the defaultOperator has now been removed. The docs mention it's possible to set a default value for the new q.op directive in solrconfig.xml, but it doesn't say how or where. Does anyone have an example of specifying a default

Re: Java profiler?

2017-12-06 Thread Peter Sturge
Hi, We've been using JProfiler (www.ej-technologies.com) for years now. Without a doubt, the most comprehensive and useful profiler for Java. Works very well, supports remote profiling and includes some very neat heap walking/gc profiling. Peter On Tue, Dec 5, 2017 at 3:21 PM, Walter Underwood

Re: MongoDb vs Solr

2017-08-05 Thread Peter Sturge
*And insults are not something I'd like to see in this mailing list, at all* +1 Everyone is entitled to their opinion. Solr can and does work extremely well as a database - it depends on your db requirements. For distributed/replicated search via REST API that is read heavy, Solr is a great

Re: Grouping facets: Possible to get facet results for each Group?

2015-10-15 Thread Peter Sturge
yonik.com/solr-subfacets/ > > On 14 October 2015 at 22:12, Peter Sturge <peter.stu...@gmail.com> wrote: > > > Yes, you are right about that - I've used pivots before and they do need > to > > be used judiciously. > > Fortunately, we only ever use single-value field

Re: Grouping facets: Possible to get facet results for each Group?

2015-10-14 Thread Peter Sturge
classic flat document structure, the sub > facet are working with any nested structure. > So be careful about pivot faceting in a flat document with multi valued > fields, because you lose the relation across the different fields value. > > Cheers > > On 13 October 2015

Re: Grouping facets: Possible to get facet results for each Group?

2015-10-13 Thread Peter Sturge
yntax? > > http://yonik.com/solr-subfacets/ > > > > Regards, > >Alex. > > > > Solr Analyzers, Tokenizers, Filters, URPs and even a newsletter: > > http://www.solr-start.com/ > > > > On 11 October 2015 at 09:51, Peter Sturge <peter.stu...@g

Fwd: Grouping facets: Possible to get facet results for each Group?

2015-10-12 Thread Peter Sturge
Hello Solr Forum, Been trying to coerce Group faceting to give some faceting back for each group, but maybe this use case isn't catered for in Grouping? So the Use Case is this: Let's say I do a grouped search that returns say, 9 distinct groups, and in these groups are various numbers of

Grouping facets: Possible to get facet results for each Group?

2015-10-11 Thread Peter Sturge
Hello Solr Forum, Been trying to coerce Group faceting to give some faceting back for each group, but maybe this use case isn't catered for in Grouping? So the Use Case is this: Let's say I do a grouped search that returns say, 9 distinct groups, and in these groups are various numbers of

Re: Basic Auth (again)

2015-07-23 Thread Peter Sturge
if anyone has setup Solr 5.2.1 or any 5.x with Basic Auth and got it working, I have not heard back. Either this feature is not tested or not in use. If it is not in use, how do folks secure their Solr instance? Thanks Steve On Thu, Jul 23, 2015 at 2:52 PM, Peter Sturge peter.stu

Re: Basic Auth (again)

2015-07-23 Thread Peter Sturge
on to debug this? Should I be posting this issue on the Jetty mailing list? Steve On Wed, Jul 22, 2015 at 10:34 AM, Peter Sturge peter.stu...@gmail.com wrote: Try adding the start call in your jetty.xml: <Set name="name">Realm Name</Set> <Set name

Re: Basic auth

2015-07-22 Thread Peter Sturge
if you're using Jetty you can use the standard realms mechanism for Basic Auth, and it works the same on Windows or UNIX. There's plenty of docs on the Jetty site about getting this working, although it does vary somewhat depending on the version of Jetty you're running (N.B. I would suggest using

Re: Basic auth

2015-07-22 Thread Peter Sturge
Try adding the start call in your jetty.xml: <Set name="name">Realm Name</Set> <Set name="config"><SystemProperty name="jetty.home" default="."/>/etc/realm.properties</Set> <Set name="refreshInterval">5</Set> <Call name="start"></Call> On Wed, Jul 22, 2015 at 2:53 PM, O. Klein

Re: How large is your solr index?

2015-01-07 Thread Peter Sturge
control of the document routing, but... that's quite tricky. You forever after have to send any _updates_ to the same shard you did the first time, whereas SPLITSHARD will do the right thing. On Tue, Jan 6, 2015 at 10:33 AM, Peter Sturge peter.stu...@gmail.com wrote: ++1

Re: How large is your solr index?

2015-01-06 Thread Peter Sturge
Yes, totally agree. We run 500m+ docs in a (non-cloud) Solr4, and it even performs reasonably well on commodity hardware with lots of faceting and concurrent indexing! Ok, you need a lot of RAM to keep faceting happy, but it works. ++1 for the automagic shard creator. We've been looking into

Re: Get matched Term in join query

2014-12-09 Thread Peter Sturge
)? On 09.12.2014 at 1:23, user Peter Sturge peter.stu...@gmail.com wrote: Hi Forum, Is it possible for a Solr query to return the term(s) that matched a particular field/query? For example, let's say there's a field like this: raw=This is a raw text field that happens to contain some text

Get matched Term in join query

2014-12-08 Thread Peter Sturge
Hi Forum, Is it possible for a Solr query to return the term(s) that matched a particular field/query? For example, let's say there's a field like this: raw=This is a raw text field that happens to contain some text that's also in the action field value... And another field in a different index

Handling intersection facets of many values

2014-11-19 Thread Peter Sturge
Hi Solr Group, Got an interesting use case (to me, at least), perhaps someone could give some insight on how best to achieve this? I've got a core that has about 7 million entries, with a field called 'addr'. By definition, every entry has a unique 'addr' value, so there are 7 million unique values

Re: Handling intersection facets of many values

2014-11-19 Thread Peter Sturge
them and returning the result: List 1: a b c d e f List 2: a a g z c c c e Resultant intersection: a (2) c (3) e (1) Thanks, Peter On Wed, Nov 19, 2014 at 7:16 PM, Toke Eskildsen t...@statsbiblioteket.dk wrote: Peter Sturge [peter.stu...@gmail.com] wrote: [addr 7M unique, dest 1K unique
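The intersection described in the snippet above (keep only the values from the second list that also occur in the first, with their occurrence counts) can be sketched outside Solr as follows. This is only an illustration of the expected result, not how Solr faceting computes it; the function name `intersect_counts` is made up for this example.

```python
from collections import Counter


def intersect_counts(unique_values, candidates):
    """Count how often each candidate occurs, keeping only values
    that are also present in unique_values."""
    allowed = set(unique_values)
    return Counter(value for value in candidates if value in allowed)


# The two lists from the message above.
list_1 = ["a", "b", "c", "d", "e", "f"]
list_2 = ["a", "a", "g", "z", "c", "c", "c", "e"]

result = intersect_counts(list_1, list_2)
for value, count in sorted(result.items()):
    print(f"{value} ({count})")
```

With 7 million unique values on one side, building the set once and streaming the smaller list through it keeps the lookup cost per candidate constant.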

Re: Handling intersection facets of many values

2014-11-19 Thread Peter Sturge
, Toke Eskildsen t...@statsbiblioteket.dk wrote: Peter Sturge [peter.stu...@gmail.com] wrote: I guess you mean take the 1k or so values and build a boolean query from them? Not really. Let me try again: 1) Perform a facet call with facet.limit=-1 on dest to get the relevant dest values

Re: Facet sort descending

2013-09-10 Thread Peter Sturge
Hi, This question could possibly be about rare idr facet counting - i.e. return the facet counts with the least values. I remember doing a patch for this years ago, but then it broke when some UninvertedField facet optimization came in around ~3.5 time. It's a neat idea though to have an option

Re: Facet sort descending

2013-09-10 Thread Peter Sturge
desc to the sort option like facet.sort=index,desc to get the following result <lst name="facet_fields"> <lst name="image_text"> <int name="c">200</int> <int name="b">23</int> <int name="a">12</int> </lst> </lst> Bests Sandro -Original Message- From: Peter Sturge [mailto:peter.stu

Re: Facet field display name

2013-08-12 Thread Peter Sturge
2c worth, We do lots of facet lookups to allow 'prettyprint' versions of facet names. We do this on the client-side, though. The reason is then the lookups can be different for different locations/users etc. - makes it easy for localization. It's also very easy to implement such a lookup, without

Re: Applying Sum on Field

2013-07-11 Thread Peter Sturge
Hi, If you mean adding up numeric values stored in fields - no, Solr doesn't do this by default. We had a similar requirement for this, and created a custom SearchComponent to handle sum, average, stats etc. There are a number of things you need to bear in mind, such as: * Handling errors when

Re: Two instances of solr - the same datadir?

2013-07-03 Thread Peter Sturge
, Peter Sturge peter.stu...@gmail.com wrote: The RO instance commit isn't (or shouldn't be) doing any real writing, just an empty commit to force new searchers, autowarm/refresh caches etc. Admittedly, we do all this on 3.6, so 4.0 could have different behaviour in this area. As long

Re: Two instances of solr - the same datadir?

2013-07-02 Thread Peter Sturge
be definitely cheaper... roman On Wed, Jun 5, 2013 at 4:03 AM, Peter Sturge peter.stu...@gmail.com wrote: Hi, We use this very same scenario to great effect - 2 instances using the same dataDir with many cores - 1 is a writer (no caching), the other

Re: Two instances of solr - the same datadir?

2013-07-02 Thread Peter Sturge
into index, so unless there is something deep inside, which automatically calls the commit, it should never happen. roman On Tue, Jul 2, 2013 at 2:54 PM, Peter Sturge peter.stu...@gmail.com wrote: Hmmm, single lock sounds dangerous. It probably works ok because you've been [un]lucky

Re: Improving performance to return 2000+ documents

2013-06-29 Thread Peter Sturge
Hello Utkarsh, This may or may not be relevant for your use-case, but the way we deal with this scenario is to retrieve the top N documents 5, 10, 20 or 100 at a time (user selectable). We can then page the results, changing the start parameter to return the next set. This allows us to 'retrieve'
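A minimal sketch of the paging approach mentioned above, using Solr's standard `start` and `rows` query parameters. The base URL and core name in the usage lines are placeholders, and the helper names are invented for this example.

```python
from urllib.parse import urlencode


def page_params(query, page, page_size):
    """Build the query parameters for one page of results.
    `page` is zero-based; `start` is the offset of the first document."""
    return {"q": query, "start": page * page_size, "rows": page_size}


def page_url(base_url, query, page, page_size):
    """Build a full select URL for the given page (base_url is an assumption)."""
    return f"{base_url}/select?{urlencode(page_params(query, page, page_size))}"


# Third page (zero-based index 2) with a user-selected page size of 20.
print(page_params("*:*", 2, 20))
print(page_url("http://localhost:8983/solr/core1", "id:1", 0, 10))
```

Only `rows` documents are transferred per request, so the cost of showing the first page does not grow with the total number of matches.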

Re: Two instances of solr - the same datadir?

2013-06-05 Thread Peter Sturge
Hi, We use this very same scenario to great effect - 2 instances using the same dataDir with many cores - 1 is a writer (no caching), the other is a searcher (lots of caching). To get the searcher to see the index changes from the writer, you need the searcher to do an empty commit - i.e. you
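As a rough sketch of the "empty commit" idea above: a client can post a bare `<commit/>` update message to the searcher instance so that it opens a new searcher over the files the writer produced. The core URL below is a placeholder and the helper names are invented; the example only builds the request, and actually sending it is left to the caller.

```python
import urllib.request


def empty_commit_request(core_url):
    """Build (but do not send) a POST that asks a Solr core to commit.
    No documents are included, so this is a commit of 0 documents."""
    return urllib.request.Request(
        url=f"{core_url}/update",
        data=b"<commit/>",
        headers={"Content-Type": "text/xml"},
        method="POST",
    )


def send(request, timeout=10.0):
    """Send the request and return the HTTP status code."""
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return response.status


# Placeholder URL for the read-only (searcher) instance.
request = empty_commit_request("http://localhost:8984/solr/core1")
print(request.get_method(), request.full_url)
```

A scheduler on the client side could call `send(request)` at whatever interval the searcher should pick up changes.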

Re: Sharing index data between two Solr instances

2013-05-10 Thread Peter Sturge
Hello Milen, We do something very similar to this, except we use separate processes on the same machine for the writer and reader. We do this so we can tune caches etc. to optimize for each, and still use the same index files. On MP machines, this works very well. If you've got 2 separate

Re: Scaling Solr on VMWare

2013-04-17 Thread Peter Sturge
Hi, We have run solr in VM environments extensively (3.6 not Cloud, but the issues will be similar). There are some significant things to be aware of when running Solr in a virtualized environment (these can be equally true with Hyper-V and Xen as well): If you're doing heavy indexing, the

Re: Selective field level security

2012-09-17 Thread Peter Sturge
Hi, Solr doesn't have any built-in mechanism for document/field level security - basically it's delegated to the container to provide security, but this of course won't apply to specific documents and/or fields. There are a lot of ways to skin this cat, some bits of which have been covered by

Re: solr 1872

2012-07-31 Thread Peter Sturge
: Renamed to zip and worked fine,thanks Regards Sujatha On Tue, Jul 31, 2012 at 9:15 AM, Sujatha Arun suja.a...@gmail.com wrote: thanks ,was looking to the rar file for instructions on set up . Regards Sujatha On Tue, Jul 31, 2012 at 1:07 AM, Peter Sturge peter.stu

Re: solr 1872

2012-07-30 Thread Peter Sturge
I can access the rar fine with WinRAR, so should be ok, but yes, it might be in zip format. In any case, better to use the slightly later version -- SolrACLSecurity.java 26kb 12 Apr 2010 10:35 Thanks, Peter On Mon, Jul 30, 2012 at 7:50 PM, Sujatha Arun suja.a...@gmail.com wrote: I am unable

Re: Determining which shard is failing using partialResults / some other technique?

2012-01-15 Thread Peter Sturge
Hi, There are a couple ways of handling this. One is to do it from the 'client' side - i.e. do a Solr ping to each shard beforehand to find out which/if any shards are unavailable. This may not always work if you use forwarders/proxies etc. What we do is add the name of all failed shards to the
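The client-side option above (ping each shard before the distributed query) might look roughly like this. The shard strings follow the `host:port/solr/core` form used in the `shards` parameter, the `/admin/ping` path assumes the ping handler is enabled on each core, and the function names are invented for this sketch.

```python
import urllib.request


def http_ping(shard, timeout=2.0):
    """Return True if the shard's ping handler answers with HTTP 200.
    Connection errors, HTTP errors and timeouts all count as a failure."""
    url = f"http://{shard}/admin/ping"
    try:
        with urllib.request.urlopen(url, timeout=timeout) as response:
            return response.status == 200
    except OSError:
        return False


def find_failed_shards(shards, ping=http_ping):
    """Return the shards for which the ping check fails, in input order."""
    return [shard for shard in shards if not ping(shard)]


def available_shards_param(shards, ping=http_ping):
    """Build a `shards` parameter value that skips unavailable shards."""
    failed = set(find_failed_shards(shards, ping))
    return ",".join(shard for shard in shards if shard not in failed)
```

The `ping` argument is injectable so the check can be swapped out, for example where forwarders or proxies make a direct ping unreliable, as the message notes.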

Re: Faceting Question

2012-01-15 Thread Peter Sturge
Hi, It's quite coincidental that I was just about to ask this very question to the forum experts. I think this is the same sort of thing Jamie was asking about. (the only difference in my question is that the values won't be known at query time) Is it possible to create a request that will

Highlighting and regex

2011-11-17 Thread Peter Sturge
Hi, Been wrestling with a question on highlighting (or not) - perhaps someone can help? The question is this: Is it possible, using highlighting or perhaps another more suited component, to return words/tokens from a stored field based on a regular expression's capture groups? What I was kind

Re: SSD experience

2011-08-23 Thread Peter Sturge
Just to add a few cents worth regarding SSD... We use Vertex SSD drives for storing indexes, and wow, they really scream compared to SATA/SAS/SAN. As we do some heavy commits, it's the commit times where we see the biggest performance boost. In tests, we found that locally attached 15k SAS drives

Re: SSD experience

2011-08-23 Thread Peter Sturge
alternative as well. Peter On Tue, Aug 23, 2011 at 3:29 PM, Gerard Roos l...@gerardroos.nl wrote: Interesting. Do you make a symlink to the indexes or is the whole Solr directory on SSD? thanks, Gerard On 23 Aug 2011, at 12:53, Peter Sturge wrote: Just to add

Re: SSD experience

2011-08-23 Thread Peter Sturge
Grinovero sanne.grinov...@gmail.com wrote: Indeed I would never actually use it, but symlinks do exist on Windows. http://en.wikipedia.org/wiki/NTFS_symbolic_link Sanne 2011/8/23 Peter Sturge peter.stu...@gmail.com: The Solr index directory lives directly on the SSD (running on Windows - where

Re: exceeded limit of maxWarmingSearchers ERROR

2011-08-14 Thread Peter Sturge
It's worth noting that the fast commit rate is only an indirect part of the issue you're seeing. As the error comes from cache warming - a consequence of committing, it's not the fault of committing directly. It's well worth having a good close look at exactly what your caches are doing when they

Re: LockObtainFailedException

2011-08-11 Thread Peter Sturge
Hi, When you get this exception with no other error or explanation in the logs, this is almost always because the JVM has run out of memory. Have you checked/profiled your mem usage/GC during the stream operation? On Thu, Aug 11, 2011 at 3:18 AM, Naveen Gupta nkgiit...@gmail.com wrote: Hi,

Re: LockObtainFailedException

2011-08-11 Thread Peter Sturge
, after deleting the index data, it is taking 9 secs What would be approach to have better indexing performance as well as index size should also at the same time. The index size was around 4.5 GB Thanks Naveen On Thu, Aug 11, 2011 at 3:47 PM, Peter Sturge peter.stu...@gmail.com wrote: Hi

Re: Document Level Security (SOLR-1872 ,SOLR,SOLR-1834)

2011-06-17 Thread Peter Sturge
You'll need to be a bit careful using joins, as the performance hit can be significant if you have lots of cross-referencing to do, which I believe you would given your scenario. Your table could be setup to use the username as the key (for fast lookup), then map these to your own data class or

Re: Document Level Security (SOLR-1872 ,SOLR,SOLR-1834)

2011-06-14 Thread Peter Sturge
Hi, SOLR-1834 is good when the original documents' ACL is accessible. SOLR-1872 is good where the usernames are persistent - neither of these really fit your use case. It sounds like you need more of an 'in-memory', transient access control mechanism. Does the access have to exist beyond the

Re: Document Level Security (SOLR-1872 ,SOLR,SOLR-1834)

2011-06-14 Thread Peter Sturge
, Peter Sturge peter.stu...@gmail.com wrote: Hi, SOLR-1834 is good when the original documents' ACL is accessible. SOLR-1872 is good where the usernames are persistent - neither of these really fit your use case. It sounds like you need more of an 'in-memory', transient access control mechanism

Re: [POLL] How do you (like to) do logging with Solr

2011-05-16 Thread Peter Sturge
[X] I always use the JDK logging as bundled in solr.war, that's perfect [ ] I sometimes use log4j or another framework and am happy with re-packaging solr.war [ ] Give me solr.war WITHOUT an slf4j logger binding, so I can choose at deploy time [ ] Let me choose whether to bundle a binding

Re: DIH for e-mails

2011-05-05 Thread Peter Sturge
The best way to add your own fields is to create a custom Transformer sub-class. See: http://www.lucidimagination.com/search/out?u=http%3A%2F%2Fwiki.apache.org%2Fsolr%2FDataImportHandler This will guide you through the steps. Peter 2011/5/5 方振鹏 michong900...@xmu.edu.cn: I’m using Data

Re: Trying to Post. Emails rejected as spam.

2011-04-07 Thread Peter Sturge
This happens almost always because you're sending from a 'free' mail account (gmail, yahoo, hotmail, etc), and your message contains words that spam filters don't like. For me, it was the use of the word 'remplica' (deliberately mis-spelled so this mail gets sent). It can also happen from

Re: Exception on distributed date facet SOLR-1709

2011-03-18 Thread Peter Sturge
Hi Viswa, This patch was originally built for the 3x branch, and I don't see any ported patch revision or testing for trunk. A lot has changed in faceting from 3x to trunk, so it will likely need a bit of adjusting to cater for these changes (e.g. deprecation of date range in favour of range).

Re: problem using dataimporthandler

2011-03-15 Thread Peter Sturge
Could possibly be your original xml file was in unicode (with a BOM header - FFFE or FEFF) - xml will see it as content if the underlying file system doesn't handle it. On Tue, Mar 15, 2011 at 10:00 PM, sivaram yogendra.bopp...@gmail.com wrote: I got rid of the problem by just copying the other

Re: Math-generated fields during query

2011-03-10 Thread Peter Sturge
at 10:06 PM, Peter Sturge peter.stu...@gmail.com wrote: Hi, I was wondering if it is possible during a query to create a returned field 'on the fly' (like function query, but for concrete values, not score). For example, if I input this query: q=_val_:product(15,3)&fl=*,score For every

Re: Help -DIH (mail)

2011-03-09 Thread Peter Sturge
-import</str> <str name="status">idle</str> <str name="importResponse"/> <lst name="statusMessages"/> <str name="WARNING">This response format is experimental. It is likely to change in the future.</str> </response> Thank you for your help. Matias. 2011/3/4 Peter Sturge peter.stu...@gmail.com Can

Re: Help -DIH (mail)

2011-03-09 Thread Peter Sturge
11:54:58 org.apache.solr.core.SolrCore execute INFO: [mail] webapp=/solr path=/dataimport params={command=status} status=0 QTime=0 Thks, Matias. 2011/3/9 Peter Sturge peter.stu...@gmail.com Hi, You've included some output in your message, so I presume something *did* happen when

Math-generated fields during query

2011-03-09 Thread Peter Sturge
Hi, I was wondering if it is possible during a query to create a returned field 'on the fly' (like function query, but for concrete values, not score). For example, if I input this query: q=_val_:product(15,3)&fl=*,score For every returned document, I get score = 45. If I change it slightly

Solr chained exclusion query

2011-03-04 Thread Peter Sturge
Hello, I've been wrestling with a query use case, perhaps someone has done this already? Is it possible to write a query that excludes results based on another query? Scenario: I have an index that holds: 'customer' (textgen) 'product' (textgen) 'saledate' (date) I'm looking to

Re: Solr chained exclusion query

2011-03-04 Thread Peter Sturge
and so your query would be something like: q=products:Dog AND saledate:[* TO 2011-02-04T00:00:00Z] On 4 March 2011 11:40, Peter Sturge peter.stu...@gmail.com wrote: Hello, I've been wrestling with a query use case, perhaps someone has done this already? Is it possible to write a query

Re: Help -DIH (mail)

2011-03-04 Thread Peter Sturge
Hi, You need to put your password in as well. You should use protocol=imap unless your gmail is set for imaps (I don't believe the free gmail gives you this). <entity name="email" user="u...@mydomain.com" password="userpwd" host="imap.mydomain.com" include="" exclude=""

Re: Help -DIH (mail)

2011-03-04 Thread Peter Sturge
message I posted. Matias. 2011/3/4 Peter Sturge peter.stu...@gmail.com Hi, You need to put your password in as well. You should use protocol=imap unless your gmail is set for imaps (I don't believe the free gmail gives you this). <entity name="email" user="u

Re: Help -DIH (mail)

2011-03-04 Thread Peter Sturge
="MailEntityProcessor" protocol="imaps" /> </document> </dataConfig> 2011/3/4 Peter Sturge peter.stu...@gmail.com Hi Matias, Can you post your data-config.xml? (with disguised names/credentials) Thanks, Peter On Fri, Mar 4, 2011 at 5:13 PM, Matias Alonso matiasgalo...@gmail.com

Re: Help -DIH (mail)

2011-03-04 Thread Peter Sturge
includeOtherUserFolders="false" includeSharedFolders="false" batchSize="100" processor="MailEntityProcessor" protocol="imaps" /> </document> </dataConfig> 2011/3/4 Peter Sturge peter.stu...@gmail.com Hi Matias, Can you post your data-config.xml? (with disguised names

Re: Help -DIH (mail)

2011-03-04 Thread Peter Sturge
execute http://localhost:8983/solr/mail/dataimport?command=full-import; but nothing happens; no index; no errors. thks... Matias. 2011/3/4 Peter Sturge peter.stu...@gmail.com Hi Matias, http://localhost:8983/solr/mail/admin/dataimport.jsp?handler=/dataimport accesses

Re: Help -DIH (mail)

2011-03-04 Thread Peter Sturge
). (you won't see any errors unless you run the status command - that's where they're stored) HTH Peter On Sat, Mar 5, 2011 at 12:46 AM, Matias Alonso matiasgalo...@gmail.com wrote: I'm using the trunk. Thanks Peter for your preoccupation! Matias. 2011/3/4 Peter Sturge peter.stu

Re: Separating Index Reader and Writer

2011-02-06 Thread Peter Sturge
Hi, We use this scenario in production where we have one write-only Solr instance and 1 read-only, pointing to the same data. We do this so we can optimize caching/etc. for each instance for write/read. The main performance gain is in cache warming and associated parameters. For your Index W,

Re: Document level security

2011-01-20 Thread Peter Sturge
Hi, One of the things about Document Security is that it never involves just one thing. There are a lot of things to consider, and unfortunately, they're generally non-trivial. Deciding how to store/hold/retrieve permissions is certainly one of those things, and you're right, you should avoid

Re: How to implement and a system based on IMAP auth

2010-12-13 Thread Peter Sturge
imap has no intrinsic functionality for logging in as a user then 'impersonating' someone else. What you can do is setup your email server so that your administrator account or similar has access to other users via shared folders (this is supported in imap2 servers - e.g. Exchange). This is done

Re: SOLR Thesaurus

2010-12-10 Thread Peter Sturge
Hi Lee, Perhaps Solr's clustering component might be helpful for your use case? http://wiki.apache.org/solr/ClusteringComponent On Fri, Dec 10, 2010 at 9:17 AM, lee carroll lee.a.carr...@googlemail.com wrote: Hi Chris, Its all a bit early in the morning for this mined :-) The question

Re: How badly does NTFS file fragmentation impact search performance? 1.1X? 10X? 100X?

2010-12-08 Thread Peter Sturge
There are, as you would expect, a lot of factors that impact the amount of fragmentation that occurs: commit rate, mergeFactor, updates/deletes vs 'new' data etc. Having run reasonably large indexes on NTFS (25GB), we've not found fragmentation to be much of a hindrance. I don't have any

Re: Preventing index segment corruption when windows crashes

2010-12-02 Thread Peter Sturge
On Thu, Dec 2, 2010 at 4:07 AM, Lance Norskog goks...@gmail.com wrote: Is there any way that Windows 7 and disk drivers are not honoring the fsync() calls? That would cause files and/or blocks to get saved out of order. On Tue, Nov 30, 2010 at 3:24 PM, Peter Sturge peter.stu...@gmail.com wrote

Re: Preventing index segment corruption when windows crashes

2010-12-02 Thread Peter Sturge
. Server 2008 or Win7)? Mike, are there any diagnostics/config etc. that I could try to help isolate the problem? Many thanks, Peter On Thu, Dec 2, 2010 at 9:28 AM, Michael McCandless luc...@mikemccandless.com wrote: On Thu, Dec 2, 2010 at 4:10 AM, Peter Sturge peter.stu...@gmail.com wrote

Re: Tuning Solr caches with high commit rates (NRT)

2010-12-02 Thread Peter Sturge
In order for the 'read-only' instance to see any new/updated documents, it needs to do a commit (since it's read-only, it is a commit of 0 documents). You can do this via a client service that issues periodic commits, or use autorefresh from within solrconfig.xml. Be careful that you don't do

Re: Preventing index segment corruption when windows crashes

2010-11-30 Thread Peter Sturge
...@lucidimagination.com wrote: On Mon, Nov 29, 2010 at 10:46 AM, Peter Sturge peter.stu...@gmail.com wrote: If a Solr index is running at the time of a system halt, this can often corrupt a segments file, requiring the index to be -fix'ed by rewriting the offending file. Really?  That shouldn't

Re: SOLR for Log analysis feasibility

2010-11-30 Thread Peter Sturge
We do a lot of precisely this sort of thing. Ours is a commercial product (Honeycomb Lexicon) that extracts behavioural information from logs, events and network data (don't worry, I'm not pushing this on you!) - only to say that there are a lot of considerations beyond base Solr when it comes to

Re: Preventing index segment corruption when windows crashes

2010-11-30 Thread Peter Sturge
this seem right? I don't remember seeing so many corruptions in the index - maybe it is the world of Win7 dodgy drivers, but it would be worth investigating if there's something amiss in Solr/Lucene when things go down unexpectedly... Thanks, Peter On Tue, Nov 30, 2010 at 9:19 AM, Peter Sturge peter.stu

Preventing index segment corruption when windows crashes

2010-11-29 Thread Peter Sturge
Hi, With the advent of new windows versions, there are increasing instances of system blue-screens, crashes, freezes and ad-hoc failures. If a Solr index is running at the time of a system halt, this can often corrupt a segments file, requiring the index to be -fix'ed by rewriting the offending

Re: SOLR and secure content

2010-11-23 Thread Peter Sturge
Yes, as mentioned in the above link, there's SOLR-1872 for maintaining your own document-level access control. Also, if you have access to the file system documents and want to use their existing ACL, have a look at SOLR-1834. Document-level access control can be a real 'can of worms', and it can be

RE: DataImportHandlerException for custom DIH Transformer

2010-11-19 Thread Peter Sturge
Hi, This problem is usually because your custom Transformer is in the solr/lib folder, when it needs to be in the webapps .war file (under WEB-INF/lib of course). Place your custom Transformer in a .jar in your .war and you should be good to go. Thanks, Peter Subject: RE:

Re: Possibilities of (near) real time search with solr

2010-11-18 Thread Peter Sturge
Maybe I didn't fully understood what you explained: but doesn't this mean that you'll have one index per day? Or are you overwriting, via replicating, every shard and the number of shard is fixed? And why are you replicating from the local replica to the next shard? (why not directly from

Re: Possibilities of (near) real time search with solr

2010-11-18 Thread Peter Sturge
no, I only thought you use one day :-) so you don't or do you have 31 shards? No, we use 1 shard per month - e.g. 7 shards will hold 7 months of data. It can be set to 1 day, but you would need to have a huge amount of data in a single day to warrant doing that. On Thu, Nov 18, 2010 at

Re: Possibilities of (near) real time search with solr

2010-11-17 Thread Peter Sturge
* I believe the NRT patches are included in the 4.x trunk. I don't think there's any support as yet in 3x (uses features in Lucene 3.0). * For merging, I'm talking about commits/writes. If you merge while commits are going on, things can get a bit messy (maybe on source cores this is ok, but I

Re: Tuning Solr caches with high commit rates (NRT)

2010-11-16 Thread Peter Sturge
Many thanks, Peter K. for posting up on the wiki - great! Yes, fc = field cache. Field Collapsing is something very nice indeed, but is entirely different. As Erik mentions in the wiki post, using per-segment faceting can be a huge boon to performance. It does require the latest Solr trunk build

Re: Possibilities of (near) real time search with solr

2010-11-16 Thread Peter Sturge
Hi Peter, First off, many thanks for putting together the NRT Wiki page! This may have changed recently, but the NRT stuff - e.g. per-segment commits etc. is for the latest Solr 4 trunk only. If your setup uses the 3x Solr code branch, then there's a bit of work to do to move to the new version.

Re: Modelling Access Control

2010-10-24 Thread Peter Sturge
Hi, See SOLR-1872 for a way of providing access control, whilst placing the ACL configuration itself outside of Solr, which is generally a good idea. http://www.lucidimagination.com/search/out?u=http://issues.apache.org/jira/browse/SOLR-1872 There are a number of ways to approach Access

Spanning an index across multiple volumes

2010-10-17 Thread Peter Sturge
Is it possible to get an index to span multiple disk volumes - i.e. when its 'primary' volume fills up (or optimize needs more room), tell Solr/Lucene to use a secondary/tertiary/quaternary et al volume? I've not seen any configuration that would allow this, but maybe others have a use case for

Re: Experience running Solr on ISCSI

2010-10-08 Thread Peter Sturge
Hi, We've used iSCSI SANs with 6x1TB 15k SAS drives RAID10 in production environments, and this works very well for both reads and writes. We also have FibreChannel environments, and this is faster as you would expect. It's also a lot more expensive. The performance bottleneck will have more to

Re: Question Related to sorting on Date

2010-09-27 Thread Peter Sturge
Hi Ahson, You'll really want to store an additional date field (make it a TrieDateField type) that has only the date, and in the reverse order from how you've shown it. You can still keep the one you've got, just use it only for 'human viewing' rather than sorting. Something like: 20080205 if
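To illustrate why a year-first value such as 20080205 sorts correctly where a day-first one does not, here is a small sketch. The original date layout is not visible in the snippet, so the day/month/year input format used here is purely an assumption, and `to_sortable` is a made-up helper name.

```python
from datetime import datetime


def to_sortable(display_date, display_format="%d/%m/%Y"):
    """Convert a human-readable date into a year-month-day string
    whose plain string order matches chronological order."""
    return datetime.strptime(display_date, display_format).strftime("%Y%m%d")


# Hypothetical day-first display values.
display_dates = ["05/02/2008", "01/03/2007", "20/01/2008"]

print(sorted(display_dates))                           # string order, not chronological
print(sorted(to_sortable(d) for d in display_dates))   # chronological
```

The display value can still be stored for viewing, with the year-first value kept in a separate field that is used only for sorting.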

Re: Solr Reporting

2010-09-23 Thread Peter Sturge
Hi, Are you going to generate a report with 3 records in it? That will be a very large report - will anyone really want to read through that? If you want/need 'summary' reports - i.e. stats on the 30k records, it is much more efficient to setup faceting and/or server-side analysis to do

Re: Solr Reporting

2010-09-23 Thread Peter Sturge
data transfer but its the full data dump reports that I am trying to figure out the best way to handle. Thanks for your help Adeel On Thu, Sep 23, 2010 at 11:43 AM, Peter Sturge peter.stu...@gmail.com wrote: Hi, Are you going to generate a report with 3 records

Re: Tuning Solr caches with high commit rates (NRT)

2010-09-17 Thread Peter Sturge
. This is a prudent move -- thanks Chris for bringing this up! All the best, Peter On Tue, Sep 14, 2010 at 2:00 PM, Peter Karich peat...@yahoo.de wrote: Peter Sturge, this was a nice hint, thanks again! If you are here in Germany anytime I can invite you to a beer or an apfelschorle ! :-) I only

Re: Tuning Solr caches with high commit rates (NRT)

2010-09-17 Thread Peter Sturge
, what is NRT? Dennis Gearon Signature Warning EARTH has a Right To Life,   otherwise we all die. Read 'Hot, Flat, and Crowded' Laugh at http://www.yert.com/film.php --- On Fri, 9/17/10, Peter Sturge peter.stu...@gmail.com wrote: From: Peter Sturge

Re: Tuning Solr caches with high commit rates (NRT)

2010-09-13 Thread Peter Sturge
, so you can do larger controlled merges. Peter Sturge wrote: Hi, Below are some notes regarding Solr cache tuning that should prove useful for anyone who uses Solr with frequent commits (e.g. 5min). Environment: Solr 1.4.1 or branch_3x trunk. Note the 4.x trunk has lots of neat new features

Re: Tuning Solr caches with high commit rates (NRT)

2010-09-13 Thread Peter Sturge
1. You can run multiple Solr instances in separate JVMs, with both having their solr.xml configured to use the same index folder. You need to be careful that one and only one of these instances will ever update the index at a time. The best way to ensure this is to use one for writing only, and

Re: Tuning Solr caches with high commit rates (NRT)

2010-09-13 Thread Peter Sturge
:26 PM, Peter Sturge peter.stu...@gmail.com wrote: Hi, Below are some notes regarding Solr cache tuning that should prove useful for anyone who uses Solr with frequent commits (e.g. 5min). Environment: Solr 1.4.1 or branch_3x trunk. Note the 4.x trunk has lots of neat new features, so

Re: Tuning Solr caches with high commit rates (NRT)

2010-09-13 Thread Peter Sturge
faceting theoretically can easily be applied to multiple segments, however the way it's written for me is a challenge to untangle and apply successfully to a working patch.  Also I don't have this as an itch to scratch at the moment. On Sun, Sep 12, 2010 at 7:18 PM, Peter Sturge peter.stu

Re: Invalid version or the data in not in 'javabin' format

2010-09-12 Thread Peter Sturge
Could be a solrj .jar version compat issue. Check that the client and server's solrj version jars match up. Peter On Sun, Sep 12, 2010 at 1:16 PM, h00kpub...@gmail.com h00kpub...@googlemail.com wrote:  hi... currently i am integrating nutch (release 1.2) into solr (trunk). if i indexing to

Tuning Solr caches with high commit rates (NRT)

2010-09-12 Thread Peter Sturge
Hi, Below are some notes regarding Solr cache tuning that should prove useful for anyone who uses Solr with frequent commits (e.g. 5min). Environment: Solr 1.4.1 or branch_3x trunk. Note the 4.x trunk has lots of neat new features, so the notes here are likely less relevant to the 4.x

Re: Tuning Solr caches with high commit rates (NRT)

2010-09-12 Thread Peter Sturge
, eg, SOLR-1617?  That could help your situation. On Sun, Sep 12, 2010 at 12:26 PM, Peter Sturge peter.stu...@gmail.com wrote: Hi, Below are some notes regarding Solr cache tuning that should prove useful for anyone who uses Solr with frequent commits (e.g. 5min). Environment: Solr 1.4.1
