Unicode bug in python client code

2008-02-01 Thread James Brady
Hi all, I was passing Python unicode objects to solr.add and got these sorts of errors: ... File /Users/jamesbrady/Documents/workspace/YelServer/yel/solr.py, line 152, in add self.__add(lst,fields) File /Users/jamesbrady/Documents/workspace/YelServer/yel/solr.py, line 146,

Performance help for heavy indexing workload

2008-02-11 Thread James Brady
Hello, I'm looking for some configuration guidance to help improve performance of my application, which tends to do a lot more indexing than searching. At present, it needs to index around two documents / sec - a document being the stripped content of a webpage. However, performance was

Fwd: Performance help for heavy indexing workload

2008-02-12 Thread James Brady
. This seems to be the case for every sort option except score asc and score desc. Please tell me Solr doesn't sort all matching documents before applying boolean filters? James Begin forwarded message: From: James Brady [EMAIL PROTECTED] Date: 11 February 2008 23:38:16 GMT-08:00 To: solr-user

Re: Performance help for heavy indexing workload

2008-02-12 Thread James Brady
strategy in general, and has anyone got advice on the specific points I raise above? Thanks, James On 12 Feb 2008, at 11:45, Mike Klaas wrote: On 11-Feb-08, at 11:38 PM, James Brady wrote: Hello, I'm looking for some configuration guidance to help improve performance of my application, which

Bug fix for Solr Python bindings

2008-02-19 Thread James Brady
Hi, Currently, the solr.py Python binding casts all key and value arguments blindly to strings. The following changes deal with Unicode properly and respect multi-valued parameters passed in as lists: 131a132,142 def __makeField(self, lst, f, v): if not isinstance(f, basestring):
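The idea behind this fix can be sketched in a few lines. This is a minimal illustration written for Python 3, not the patch itself: the helper name make_fields is hypothetical and is not part of solr.py. It keeps text as text instead of byte-casting it, and emits one field element per value when a list is passed in.

```python
from xml.sax.saxutils import escape, quoteattr


def make_fields(name, value):
    """Return <field> XML strings for one field of a Solr <add> document.

    Hypothetical helper for illustration only (not the solr.py API).
    A list or tuple produces one element per value, which is how a
    multi-valued field is expressed; any other value produces one element.
    """
    values = value if isinstance(value, (list, tuple)) else [value]
    return [
        # quoteattr() adds the surrounding quotes and escapes the attribute;
        # escape() handles &, < and > in the element text.
        "<field name=%s>%s</field>" % (quoteattr(str(name)), escape(str(v)))
        for v in values
    ]
```

For example, make_fields("tag", ["a", "b"]) gives two field elements, while a single non-ASCII string passes through unchanged apart from XML escaping.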

Re: will hardlinks work across partitions?

2008-02-24 Thread James Brady
Unfortunately, you cannot hard link across mount points. Snapshooter uses cp -lr, which, on my Linux machine at least, fails with: cp: cannot create link `/mnt2/myuser/linktest': Invalid cross-device link James On 23 Feb 2008, at 14:34, Brian Whitman wrote: Will the hardlink snapshot
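The same failure can be seen from Python, where os.link raises OSError with errno EXDEV when source and destination are on different filesystems. The sketch below is only an illustration of detecting that case; the function name link_or_copy and the fall-back to a full copy are assumptions made here, not something snapshooter does.

```python
import errno
import os
import shutil


def link_or_copy(src, dst):
    """Hard-link src to dst, copying instead if they are on different devices.

    Hypothetical helper for illustration. Returns "linked" or "copied".
    """
    try:
        os.link(src, dst)
        return "linked"
    except OSError as exc:
        # EXDEV is the "Invalid cross-device link" error reported by cp -l.
        if exc.errno != errno.EXDEV:
            raise
        shutil.copy2(src, dst)
        return "copied"
```

Note that a copy uses real disk space and time, which is exactly what the hard-link snapshot approach is meant to avoid.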

Strategy for handling large (and growing) index: horizontal partitioning?

2008-02-27 Thread James Brady
Hi all, Our current setup is a master and slave pair on a single machine, with an index size of ~50GB. Query and update times are still respectable, but commits are taking ~20% of time on the master, while our daily index optimise can take up to 4 hours... Here's the most relevant part of

Re: Strategy for handling large (and growing) index: horizontal partitioning?

2008-02-28 Thread James Brady
be faster than that, since it must read every byte and write a whole new set. Disc speed may be your bottleneck. You could also look at disc access rates in a monitoring tool. Is there read contention between the master and slave for the same disc? wunder On 2/27/08 7:08 PM, James Brady [EMAIL

Re: Strategy for handling large (and growing) index: horizontal partitioning?

2008-02-28 Thread James Brady
From: James Brady [EMAIL PROTECTED] To: solr-user@lucene.apache.org Sent: Wednesday, February 27, 2008 10:08:02 PM Subject: Strategy for handling large (and growing) index: horizontal partitioning? Hi all, Our current setup is a master and slave pair on a single machine, with an index size

Re: Strategy for handling large (and growing) index: horizontal partitioning?

2008-03-03 Thread James Brady
find them. Otis -- Sematext -- http://sematext.com/ -- Lucene - Solr - Nutch - Original Message From: James Brady [EMAIL PROTECTED] To: solr-user@lucene.apache.org Sent: Friday, February 29, 2008 1:11:07 AM Subject: Re: Strategy for handling large (and growing) index

Favouring recent matches

2008-03-08 Thread James Brady
Hello all, In Lucene in Action (replicated here: http://www.theserverside.com/tt/articles/article.tss?l=ILoveLucene), the theserverside.com team say "The date boost has been really important for us." I'm looking for some advice on the best way to actually implement this - the only way I can

Fwd: Favouring recent matches

2008-03-08 Thread James Brady
be great. James Begin forwarded message: From: James Brady [EMAIL PROTECTED] Date: 8 March 2008 19:41:56 PST To: solr-user@lucene.apache.org Subject: Favouring recent matches Hello all, In Lucene in Action, (replicated here: http://www.theserverside.com/tt/articles/article.tss?l=ILoveLucene

Default core in multi-core

2008-04-21 Thread James Brady
Hi all, In the latest trunk version, default='true' doesn't have the effect I would have expected running in multi core mode. The example multicore.xml has: <core name="core0" instanceDir="core0" default="true"/> <core name="core1" instanceDir="core1"/> But queries such as /solr/select?q=*:* and

Master / slave setup with multicore

2008-04-29 Thread James Brady
Hi all, I'm aiming to use the new multicore features in development versions of Solr. My ideal setup would be to have master / slave servers on the same machine, snapshotting across from the 'write' to the 'read' server at intervals. This was all fine with Solr 1.2, but the rsync

Re: Queuing adds and commits

2008-04-29 Thread James Brady
Depending on your application, it might be useful to take control of the queueing yourself: it was for me! I needed quick turnarounds for submitting a document to be indexed, which Solr can't guarantee right now. To address it, I wrote a persistent queueing server, accessed by XML-RPC,

Re: Master / slave setup with multicore

2008-05-02 Thread James Brady
? Anything in the snapinstaller log? Bill On Thu, May 1, 2008 at 8:35 PM, James Brady [EMAIL PROTECTED] wrote: Hi Ryan, thanks for that! I have one outstanding question: when I take a snapshot on the master, snappull and snapinstall on the slave, the new index is not being used

IOException: Mark invalid while analyzing HTML

2008-05-04 Thread James Brady
Hi, I'm seeing a problem mentioned in Solr-42, Highlighting problems with HTMLStripWhitespaceTokenizerFactory: https://issues.apache.org/jira/browse/SOLR-42 I'm indexing HTML documents, and am getting reams of Mark invalid IOExceptions: SEVERE: java.io.IOException: Mark invalid at

Re: Solr feasibility with terabyte-scale data

2008-05-09 Thread James Brady
Hi, we have an index of ~300GB, which is at least approaching the ballpark you're in. Luckily for us, to coin a phrase, we have an 'embarrassingly partitionable' index so we can just scale out horizontally across commodity hardware with no problems at all. We're also using the multicore

Re: Solr feasibility with terabyte-scale data

2008-05-09 Thread James Brady
getting it right but this is what I'm good at; to never stop trying :) However it is nice to start playing at least on the right side of the football field so a little push in the back would be really helpful. Kindly //Marcus On Fri, May 9, 2008 at 9:36 AM, James Brady [EMAIL PROTECTED

Multicore capability: dynamically creating 1000s of cores?

2008-05-16 Thread James Brady
Hi, there was some talk on JIRA about whether Multicore would be able to manage tens of thousands of cores, and dynamically create hundreds every day: https://issues.apache.org/jira/browse/SOLR-350?focusedCommentId=12571282#action_12571282 The issue of multicore configuration was left open

Strategy for presenting fresh data

2008-06-11 Thread James Brady
Hi, The product I'm working on requires new documents to be searchable very quickly (inside 60 seconds is my goal). The corpus is also going to grow very large, although it is perfectly partitionable by user. The approach I tried first was to have write-only masters and read-only slaves

Disk usage after document deletion

2009-01-25 Thread James Brady
Hi, I have a number of indices that are supposed to maintain windows of indexed content - the last month's worth of data, for example. At the moment, I'm cleaning out old documents with a simple cron job making requests like: <delete><query>date_added:[* TO NOW-30DAYS]</query></delete> I was expecting
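A delete-by-query request of this kind is a small XML message sent to Solr's update handler. As a rough sketch, the payload for the cron job could be built like this; the function name delete_by_query_xml is hypothetical, and sending the message (plus the commit that makes the deletion visible) is left out.

```python
from xml.sax.saxutils import escape


def delete_by_query_xml(query):
    """Build the XML body of a Solr delete-by-query update message.

    Hypothetical helper for illustration; the query text is XML-escaped
    so characters such as & and < cannot break the message.
    """
    return "<delete><query>%s</query></delete>" % escape(query)
```

Calling delete_by_query_xml("date_added:[* TO NOW-30DAYS]") produces the message for a 30-day window, assuming the field is named date_added as in the message above.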

Re: Disk usage after document deletion

2009-01-25 Thread James Brady
with the superset of all possible terms in the end. However, index size growth probably continues at roughly half the speed of its growth during the filling up period. 2009/1/26 Ryan McKinley ryan...@gmail.com On Jan 25, 2009, at 6:06 PM, James Brady wrote: Hi, I have a number of indices that are supposed

Separate error logs

2009-01-30 Thread James Brady
Hi all, What's the best way for me to split Solr/Lucene error messages off to a separate log? Thanks James

Re: Separate error logs

2009-01-30 Thread James Brady
Oh... I should really have found that myself :/ Thank you! 2009/1/30 Ryan McKinley ryan...@gmail.com check: http://wiki.apache.org/solr/SolrLogging You configure whatever flavor logger to write errors to a separate log On Jan 30, 2009, at 4:36 PM, James Brady wrote: Hi all, What's

Re: Recent document boosting with dismax

2009-02-02 Thread James Brady
Hi, no, the date_added field was one per document. 2009/2/1 Erik Hatcher e...@ehatchersolutions.com Is your date_added field multiValued and you've assigned multiple to some documents? Erik On Jan 31, 2009, at 4:12 PM, James Brady wrote: Hi, I'm following the recipe here: http

Re: Recent document boosting with dismax

2009-02-03 Thread James Brady
Great, thanks for that, Chris! 2009/2/3 Chris Hostetter hossman_luc...@fucit.org : Hi, no, the date_added field was one per document. i would suggest adding multiValued=false to your date fieldType so that Solr can enforce that for you -- otherwise we can't be 100% sure. if it really is

Fwd: Separate error logs

2009-02-06 Thread James Brady
production usage? Thanks! James -- Forwarded message -- From: James Brady james.colin.br...@gmail.com Date: 2009/1/30 Subject: Re: Separate error logs To: solr-user@lucene.apache.org Oh... I should really have found that myself :/ Thank you! 2009/1/30 Ryan McKinley ryan

Persistent, seemingly unfixable corrupt indices

2009-02-22 Thread James Brady
Hi, My indices sometimes become corrupted - normally when Solr has to be KILLed - these are not normally too much of a problem, as Lucene's CheckIndex tool can normally detect missing / broken segments and fix them. However, I now have a few indices throwing errors like this: INFO: [core4]

Re: Persistent, seemingly unfixable corrupt indices

2009-02-24 Thread James Brady
and across segments, so now I'm at a loss as to why it's not catching your case. Any of these indexes small enough to post somewhere I could access? Mike James Brady wrote: Hi, My indices sometimes become corrupted - normally when Solr has to be KILLed - these are not normally too much

Last modified time for cores, taking into account uncommitted changes

2009-04-30 Thread James Brady
Hi, The lastModified field in the Solr status seems to only be updated when a commit/optimize operation takes place. Is there any way to determine when a core has been changed, including any uncommitted add operations? Thanks, James

Truncated XML responses from CoreAdminHandler

2009-07-18 Thread James Brady
The Solr application I'm working on has many concurrently active cores - of the order of 1000s at a time. The management application depends on being able to query Solr for the current set of live cores, a requirement I've been satisfying using the STATUS core admin handler method. However, once

Re: Truncated XML responses from CoreAdminHandler

2009-07-31 Thread James Brady
On Sat, Jul 18, 2009 at 9:02 PM, James Brady james.colin.br...@gmail.com wrote: The Solr application I'm working on has many concurrently active cores - of the order of 1000s at a time. The management application depends on being able to query Solr for the current set of live cores

ClassCastException from custom request handler

2009-08-03 Thread James Brady
Hi, I'm creating a custom request handler to return a list of live cores in Solr. On startup, I get this exception for each core: Jul 31, 2009 5:20:39 PM org.apache.solr.common.SolrException log SEVERE: java.lang.ClassCastException: LiveCoresHandler at

Re: ClassCastException from custom request handler

2009-08-03 Thread James Brady
name - com.foo.path.to.LiveCoresHandler instead. Moreover, I am damn sure that you did not forget to drop your jar into solr.home/lib. Checking once again might not be a bad idea :) Cheers Avlesh On Mon, Aug 3, 2009 at 9:11 PM, James Brady james.colin.br...@gmail.com wrote: Hi, I'm

Re: ClassCastException from custom request handler

2009-08-04 Thread James Brady
have created? Cheers Avlesh On Mon, Aug 3, 2009 at 10:51 PM, James Brady james.colin.br...@gmail.com wrote: Hi, Thanks for your suggestions! I'm sure I have the class name right - changing it to something patently incorrect results in the expected

Re: ClassCastException from custom request handler

2009-08-04 Thread James Brady
, Aug 3, 2009 at 10:51 PM, James Brady james.colin.br...@gmail.com wrote: Hi, Thanks for your suggestions! I'm sure I have the class name right - changing it to something patently incorrect results in the expected org.apache.solr.common.SolrException: Error loading class

Re: ClassCastException from custom request handler

2009-08-04 Thread James Brady
ClassCastException. You are right about that, James. Which Solr version are you using? Can you please paste the relevant pieces in your solrconfig.xml and the request handler class you have created? Cheers Avlesh On Mon, Aug 3, 2009 at 10:51 PM, James Brady james.colin.br

Re: ClassCastException from custom request handler

2009-08-04 Thread James Brady
James! James Brady schrieb: There is *something* strange going on with classloaders; when I put my .class files in the right place in WEB-INF/lib in a repackaged solr.war file, it's not found by the plugin loader (Error loading class). So the plugin classloader isn't seeing stuff inside WEB

Re: ClassCastException from custom request handler

2009-08-04 Thread James Brady
Yeah I was thinking T would be SolrRequestHandler too. Eclipse's debugger can't tell me... Lots of other handlers are created with no problem before my plugin falls over, so I don't think it's a problem with T not being what we expected. Do you know of any working examples of plugins I can

Re: ClassCastException from custom request handler

2009-08-05 Thread James Brady
Ackermann chantal.ackerm...@btelligent.de James Brady schrieb: Yeah I was thinking T would be SolrRequestHandler too. Eclipse's debugger can't tell me... You could try disassembling. Or Eclipse opens classes in a very rudimentary format when there is no source code attached. Maybe it shows