Out of memory in analysis

2008-03-06 Thread Benson Margulies
I pasted a modest blob of text into the analysis debug slot on the admin app, and am rewarded with this, even with -Xmx1g. java.lang.OutOfMemoryError: Java heap space at java.util.ArrayList.ensureCapacity(ArrayList.java:169) at java.util.ArrayList.add(ArrayList.java:351)

run-jetty-run

2008-03-06 Thread Benson Margulies
A helpful correspondent responded to a JIRA of mine suggesting run-jetty-run to debug solr plugins from eclipse. Can someone provide a bit more detail? Do you link your examples dir into your Eclipse project? Make a separate Eclipse project that points to the examples dir?

Re: run-jetty-run

2008-03-06 Thread Benson Margulies
Got it, thanks. On Thu, Mar 6, 2008 at 3:45 PM, Ryan McKinley [EMAIL PROTECTED] wrote: Benson Margulies wrote: A helpful correspondent responded to a JIRA of mine suggesting run-jetty-run to debug solr plugins from eclipse. Can someone provide a bit more detail? Do you link your

Re: Admin ping

2008-03-07 Thread Benson Margulies
Suggestion, another ant target that creates an example dir outside of the tree? I was a little bit surprised by the following scenario: 1) svn co 2) ant example 3) edit schema.xml 4) svn st In the future, I'll run cp -r before I start messing with the example. I'm +1 for the work directory.

Re: Out of memory in analysis

2008-03-12 Thread Benson Margulies
This turned out to be a side-effect of the since-fixed use of GET in analysis.jsp, coupled with a mistake in one of my filters. On Tue, Mar 11, 2008 at 8:31 PM, Chris Hostetter [EMAIL PROTECTED] wrote: : I pasted a modest blob of text into the analysis debug slot on the admin : app, and am

Re: Language support

2008-03-20 Thread Benson Margulies
Unless you can come up with language-neutral tokenization and stemming, you need to: a) know the language of each document. b) run a different analyzer depending on the language. c) force the user to tell you the language of the query. d) run the query through the same analyzer. On Thu, Mar
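The four requirements listed in this message could be sketched roughly as follows: one analyzer per language, chosen by a language code, and applied identically to documents and to queries. This is only an illustration under stated assumptions. The class `LanguageRouter` and its toy suffix-stripping "analyzers" are hypothetical stand-ins, not Lucene or Solr APIs.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Locale;
import java.util.Map;
import java.util.function.Function;
import java.util.stream.Collectors;

// Hypothetical sketch: route text to a per-language analyzer by language code.
public class LanguageRouter {

    // Toy analyzers keyed by language code; a real system would plug in
    // proper per-language tokenizers and stemmers here.
    private static final Map<String, Function<String, List<String>>> ANALYZERS = Map.of(
        "en", text -> tokenize(text, "s"),
        "de", text -> tokenize(text, "en"));

    // Lowercase, split on whitespace, and strip one suffix from longer tokens.
    private static List<String> tokenize(String text, String suffix) {
        return Arrays.stream(text.toLowerCase(Locale.ROOT).split("\\s+"))
            .filter(t -> !t.isEmpty())
            .map(t -> t.length() > suffix.length() + 2 && t.endsWith(suffix)
                ? t.substring(0, t.length() - suffix.length())
                : t)
            .collect(Collectors.toList());
    }

    // Used for both documents and queries, so both sides see the same terms.
    public static List<String> analyze(String language, String text) {
        Function<String, List<String>> analyzer = ANALYZERS.get(language);
        if (analyzer == null) {
            throw new IllegalArgumentException("No analyzer for language: " + language);
        }
        return analyzer.apply(text);
    }
}
```

Because the same `analyze` call serves index time and query time, a query for "printer" tagged `en` produces the same term as a document containing "printers" tagged `en`.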

Re: Language support

2008-03-20 Thread Benson Margulies
You can store in one field if you manage to hide a language code with the text. XML is overkill but effective for this. At one point, we'd investigated how to allow a Lucene analyzer to see more than one field (the language code as well as the text) but I don't think we came up with anything. On

Re: Language support

2008-03-20 Thread Benson Margulies
or en/Laserjet. wunder On 3/20/08 9:20 AM, Benson Margulies [EMAIL PROTECTED] wrote: Unless you can come up with language-neutral tokenization and stemming, you need to: a) know the language of each document. b) run a different analyzer depending on the language. c) force the user

Re: Language support

2008-03-20 Thread Benson Margulies
and the benefit didn't justify that. wunder == Walter Underwood Former Ultraseek Architect Current Entire Netflix Search Department On 3/20/08 9:45 AM, Benson Margulies [EMAIL PROTECTED] wrote: Token/by/token seems a bit extreme. Are you concerned with macaronic documents? On Thu, Mar 20, 2008

Asking for help?

2008-05-23 Thread Benson Margulies
Dear CXF users, There are many versions of CXF sloshing around. We've got 2.0.6 and 2.1, and many people have picked up earlier versions. If the early universe underwent 'inflation,' CXF could perhaps be described as having experienced 'deflation', in the sense that we worked on and resolved

Rescoring queries

2012-03-21 Thread Benson Margulies
I confess that I did only minimal googling before composing the below. I would like to be able to insert my own code into Solr to rescore query results. To be more specific, I'd like to send Solr a query in which some additional information is attached to the query. After Solr has done the usual

RequestHandler versus SearchComponent

2012-03-22 Thread Benson Margulies
I'm looking at the following. I want to (1) map some query fields to some other query fields and add some things to FL, and then (2) rescore. I can see how to do it as a RequestHandler that makes a parser to get the fields, or I could see making a SearchComponent that was stuck into the list just

Re: reproducibility of query results

2012-04-01 Thread Benson Margulies
i make a new index each iteration. if I insert the same docs in the same order, should I expect the same query results? Note that I shut down entirely after the adds, then in a new process run the queries. On Apr 1, 2012, at 11:37 AM, Ahmet Arslan iori...@yahoo.com wrote: I appear to be

Re: reproducibility of query results

2012-04-01 Thread Benson Margulies
, if you cut off hits at some fixed threshold, you could see different entries at the low-scoring end of the hit list. - Steve Thanks. -Original Message- From: Benson Margulies [mailto:bimargul...@gmail.com] Sent: Sunday, April 01, 2012 12:09 PM To: solr-user@lucene.apache.org

A little mild abuse of SearchHandler

2012-04-02 Thread Benson Margulies
I've got a prototype of a RequestHandler that embeds, within itself, a SearchHandler. Yes, I read the previous advice to be a query component, but I found it a lot easier to chart my course. I'm having some trouble with sorting. I came up with the following. 'args' is the usual Map<String,

Re: A little mild abuse of SearchHandler

2012-04-02 Thread Benson Margulies
I've answered my own question, but it left me with a lot of curiosity. Why is the convention to build strings joined with commas (e.g. in SolrQuery.addValueToParam) rather than to use the array option? All these params are Map<String, String[]>, so why cram multiples into the first slot with commas
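The two conventions contrasted in this message can be shown side by side on a plain `Map<String, String[]>`. This is a minimal sketch on ordinary Java collections. The class `ParamStyles` and both method names are hypothetical, and nothing here is taken from the SolrQuery source.

```java
import java.util.Arrays;
import java.util.Map;

// Hypothetical sketch of two ways to add a second value to a multi-valued parameter.
public class ParamStyles {

    // Style 1: keep a single slot and join additional values with commas.
    public static void addCommaJoined(Map<String, String[]> params, String name, String value) {
        String[] existing = params.get(name);
        if (existing == null || existing.length == 0) {
            params.put(name, new String[] { value });
        } else {
            String[] copy = existing.clone();
            copy[0] = copy[0] + "," + value;
            params.put(name, copy);
        }
    }

    // Style 2: use the array itself and append each value as a new element.
    public static void addArrayElement(Map<String, String[]> params, String name, String value) {
        String[] existing = params.get(name);
        if (existing == null) {
            params.put(name, new String[] { value });
        } else {
            String[] grown = Arrays.copyOf(existing, existing.length + 1);
            grown[existing.length] = value;
            params.put(name, grown);
        }
    }
}
```

With the first style a consumer has to split the value on commas again; with the second it can iterate the array directly.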

A curious request about a curious request handler

2012-04-03 Thread Benson Margulies
I've made a RequestHandler class that acts as follows: 1. At its initialization, it creates a StandardRequestHandler and hangs onto it. 2. When a query comes to it (I configure it to a custom qt value), it: a. creates a new query based on the query that arrived b. creates a

Re: A curious request about a curious request handler

2012-04-03 Thread Benson Margulies
On Tue, Apr 3, 2012 at 12:27 PM, Grant Ingersoll gsing...@apache.org wrote: On Apr 3, 2012, at 9:43 AM, Benson Margulies wrote: I've made a RequestHandler class that acts as follows: 1. At its initialization, it creates a StandardRequestHandler and hangs onto it. 2. When a query comes

Re: A curious request about a curious request handler

2012-04-03 Thread Benson Margulies
Grant, let me see if I can expand this, as it were: {!benson f1:v1 f2:v2 f3:v3} (or do I mean {!query defType='benson' ...}?) I see how that could expand to be anything else I like. However, the Function side has me a little more puzzled. The information from the fields inside my {! ... } gets

Cloud-aware request processing?

2012-04-09 Thread Benson Margulies
I'm working on a prototype of a scheme that uses SolrCloud to, in effect, distribute a computation by running it inside of a request processor. If there are N shards and M operations, I want each node to perform M/N operations. That, of course, implies that I know N. Is that fact available
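The M/N split described in this message can be sketched with simple round-robin arithmetic, assuming each node already knows N and its own zero-based shard index. How a node learns either value is exactly the open question here, so both are plain parameters. The class `ShardPartition` and its method names are hypothetical.

```java
// Hypothetical sketch: divide M operations across N shards by round-robin.
public class ShardPartition {

    // Number of operations a given shard performs; the first (M mod N) shards get one extra.
    public static int operationsForShard(int totalOperations, int shardCount, int shardIndex) {
        checkShard(shardCount, shardIndex);
        if (totalOperations < 0) {
            throw new IllegalArgumentException("totalOperations must be non-negative");
        }
        int base = totalOperations / shardCount;
        int remainder = totalOperations % shardCount;
        return base + (shardIndex < remainder ? 1 : 0);
    }

    // Whether this shard is responsible for the operation with the given index.
    public static boolean ownsOperation(int operationIndex, int shardCount, int shardIndex) {
        checkShard(shardCount, shardIndex);
        return Math.floorMod(operationIndex, shardCount) == shardIndex;
    }

    private static void checkShard(int shardCount, int shardIndex) {
        if (shardCount <= 0 || shardIndex < 0 || shardIndex >= shardCount) {
            throw new IllegalArgumentException("invalid shard index " + shardIndex + " of " + shardCount);
        }
    }
}
```

Each node can filter the operation list with `ownsOperation` without talking to the others, and the per-shard counts always sum to M.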

'No JSP support' error in embedded Jetty for solrCloud as of apache-solr-4.0-2012-04-02_11-54-55

2012-04-09 Thread Benson Margulies
Starting the leader with: java -Dbootstrap_confdir=./solr/conf -Dcollection.configName=rnicloud -DzkRun -DnumShards=3 -Djetty.port=9167 -jar start.jar and browsing to http://localhost:9167/solr/rnicloud/admin/zookeeper.jsp I get: HTTP ERROR 500 Problem accessing

Re: Cloud-aware request processing?

2012-04-09 Thread Benson Margulies
or another framework for distributed computation, see e.g. http://java.dzone.com/articles/comparison-gridcloud-computing -- Jan Høydahl, search solution architect Cominvent AS - www.cominvent.com Solr Training - www.solrtraining.com On 9. apr. 2012, at 13:41, Benson Margulies wrote: I'm

Is http://wiki.apache.org/solr/SolrCloud#Example_A:_Simple_two_shard_cluster up to date?

2012-04-09 Thread Benson Margulies
I specify -Dcollection.configName=rnicloud, but the admin gui tells me that I have a collection named 'collection1'. And, as reported in a prior email, the admin UI URL in there seems wrong.

Re: Re: Cloud-aware request processing?

2012-04-09 Thread Benson Margulies
-write the query for each shard? Seems unnecessary. For reasons described in previous email that I won't repeat here. --- Original Message --- On 4/9/2012  08:45 AM Benson Margulies wrote: Jan Høydahl, My problem is intimately connected to Solr. it is not a batch job

Stumped on using a custom update request processor with SolrCloud

2012-04-09 Thread Benson Margulies
If you would be so kind as to look at https://issues.apache.org/jira/browse/SOLR-3342, you will see that I tried to use a working configuration for a URP of mine with SolrCloud, and received in return an NPE. Somehow or another, by default, the XmlUpdateRequestHandler ends up using (I think) the

SolrCloud versus a SearchComponent that rescores

2012-04-09 Thread Benson Margulies
Those of you insomniacs who have read my messages here over the last few weeks might recall that I've been working on a request handler that wraps the SearchHandler to rewrite queries and then reorder results. (I haven't quite worked out how to apply Grant's alternative suggestions without losing

Re: SolrCloud versus a SearchComponent that rescores

2012-04-09 Thread Benson Margulies
, Benson Margulies wrote: Those of you insomniacs who have read my messages here over the last few weeks might recall that I've been working on a request handler that wraps the SearchHandler to rewrite queries and then reorder results. (I haven't quite worked out how to apply Grant's alternative

Re: SolrCloud versus a SearchComponent that rescores

2012-04-09 Thread Benson Margulies
Um, maybe I've hit a quirk? In my solrconfig.xml, my special SearchComponents are installed only for a specific QT. So, it looks to me as if that QT is not propagated into the request out to the shards, and so they run the ordinary request handler without my components in it. Is this intended

Re: SolrCloud versus a SearchComponent that rescores

2012-04-10 Thread Benson Margulies
. Sent from my iPhone On Apr 9, 2012, at 9:26 PM, Benson Margulies bimargul...@gmail.com wrote: Um, maybe I've hit a quirk? In my solrconfig.xml, my special SearchComponents are installed only for a specific QT. So, it looks to me as if that QT is not propagated into the request out to the shards

Re: SolrCloud versus a SearchComponent that rescores

2012-04-10 Thread Benson Margulies
Another thought: currently I'm using qt=ME to indicate this process. I could, in theory, use some ME=true and make my components check for it to avoid this process, but it seems kind of peculiar from an end-user standpoint.

Re: SolrCloud versus a SearchComponent that rescores

2012-04-10 Thread Benson Margulies
I've updated the doc with my findings. Thanks for the pointer.

URP's versus Cloud

2012-04-10 Thread Benson Margulies
How are URP's managed with respect to cloud deployment? Given some solrconfig.xml like the below, do I expect it to be in the chain on the leader, the shards, or both? <updateRequestProcessorChain name="RNI"> <!-- some day, add parameters when we have some --> <processor

Re: URP's versus Cloud

2012-04-10 Thread Benson Margulies
field. That seems to imply that 'before' processors run both on the leader and on the shards. Where do the afters run? Just on the leader or just on the shards? On Tue, 10 Apr 2012 12:43:36 -0400, Benson Margulies bimargul...@gmail.com wrote: How are URP's managed with respect to cloud

Default qt on SolrCloud

2012-04-10 Thread Benson Margulies
After I load documents into my cloud instance, a URL like: http://localhost:PORT/solr/query?q=*:* finds nothing. http://localhost:PORT/solr/query?q=*:*&qt=standard finds everything. My custom request handlers have 'default=false'. What have I done?

I've broken delete in SolrCloud and I'm a bit clueless as to how

2012-04-10 Thread Benson Margulies
In my cloud configuration, if I push <delete><query>*:*</query></delete> followed by: <commit/> I get no errors, the log looks happy enough, but the documents remain in the index, visible to /query. Here's what seems my relevant bit of solrconfig.xml. My URP only implements processAdd.

Re: I've broken delete in SolrCloud and I'm a bit clueless as to how

2012-04-11 Thread Benson Margulies
forward from there or back out your configs and plugins until it works again. On Tue, 2012-04-10 at 17:15 -0400, Benson Margulies wrote: In my cloud configuration, if I push <delete><query>*:*</query></delete> followed by: <commit/> I get no errors, the log looks happy enough, but the documents

Re: I've broken delete in SolrCloud and I'm a bit clueless as to how

2012-04-11 Thread Benson Margulies
it works again. On Tue, 2012-04-10 at 17:15 -0400, Benson Margulies wrote: In my cloud configuration, if I push <delete><query>*:*</query></delete> followed by: <commit/> I get no errors, the log looks happy enough, but the documents remain in the index, visible to /query. Here's what

Re: Default qt on SolrCloud

2012-04-11 Thread Benson Margulies
presumably you've defined in solrconfig.xml... What does debugQuery=on show? It turned out that I had left an extra(eous) declaration for /query with my custom RT, and when I removed it all was well. thanks,benson Best Erick On Tue, Apr 10, 2012 at 12:31 PM, Benson Margulies bimargul

Re: I've broken delete in SolrCloud and I'm a bit clueless as to how

2012-04-12 Thread Benson Margulies
to be configurable just like the uniqueKey in the schema. schema.xml You must have a _version_ field defined: <field name="_version_" type="long" indexed="true" stored="true"/> On Apr 11, 2012, at 9:10 AM, Benson Margulies wrote: I didn't have a _version_ field, since nothing in the schema says

Re: I've broken delete in SolrCloud and I'm a bit clueless as to how

2012-04-12 Thread Benson Margulies
I'm probably confused, but it seems to me that the case I hit does not meet any of Yonik's criteria. I have no replicas. I'm running SolrCloud in the simple mode where each doc ends up in exactly one place. I think that it's just a bug that the code refuses to do the local deletion when there's

Re: I've broken delete in SolrCloud and I'm a bit clueless as to how

2012-04-12 Thread Benson Margulies
On Thu, Apr 12, 2012 at 2:14 PM, Mark Miller markrmil...@gmail.com wrote: google must not have found it - i put that in a month or so ago I believe - at least weeks. As you can see, there is still a bit to fill in, but it covers the high level. I'd like to add example snippets for the rest

Realtime /get versus SearchHandler

2012-04-13 Thread Benson Margulies
A discussion over on the dev list led me to expect that the by-id field retrievals in a SolrCloud query would come through the get handler. In fact, I've seen them turn up in my search component in the search handler that is configured with my custom QT. (I have a 'prepare' method that sets

Re: Can I discover what part of a score is attributable to a subquery?

2012-04-13 Thread Benson Margulies
On Fri, Apr 13, 2012 at 6:43 PM, John Chee johnc...@mylife.com wrote: On Fri, Apr 13, 2012 at 2:40 PM, Benson Margulies bimargul...@gmail.com wrote: Given a query including a subquery, is there any way for me to learn that subquery's contribution to the overall document score? I need

Re: Can I discover what part of a score is attributable to a subquery?

2012-04-13 Thread Benson Margulies
On Fri, Apr 13, 2012 at 7:07 PM, Chris Hostetter hossman_luc...@fucit.org wrote: : Given a query including a subquery, is there any way for me to learn : that subquery's contribution to the overall document score? You have to just execute the subquery itself ... doc collection and score

Re: Can I discover what part of a score is attributable to a subquery?

2012-04-14 Thread Benson Margulies
dig. Paul -- Envoyé de mon téléphone Android avec K-9 Mail. Excusez la brièveté. Benson Margulies bimargul...@gmail.com a écrit : Given a query including a subquery, is there any way for me to learn that subquery's contribution to the overall document score? I can provide 'why on earth

Re: Can I discover what part of a score is attributable to a subquery?

2012-04-14 Thread Benson Margulies
, should be pretty speedy for a mere 200 items. Maybe I'm missing some even easier way, given a DocList and a query, to obtain scores for those docs for that query? paul Le 14 avr. 2012 à 15:34, Benson Margulies a écrit : yes please On Apr 14, 2012, at 2:40 AM, Paul Libbrecht p...@hoplahup.net

Questions about the query function

2012-04-15 Thread Benson Margulies
I've been pestering you all with a series of questions about disassembling and partially rescoring queries. Every helpful response (thanks) has led me to further reading, and this leads to more questions. If I haven't before, I'll apologize now for the high level of ignorance at which I'm

Re: Questions about the query function

2012-04-15 Thread Benson Margulies
? Yup. _val_ would work too, or of course using that function as a parameter to (e)dismay's bf, or dismay's boost params.        Erik On Apr 15, 2012, at 08:43 , Benson Margulies wrote: I've been pestering you all with a series of questions about disassembling and partially rescoring

Re: Questions about the query function

2012-04-15 Thread Benson Margulies
Since I ended up with 'fund' instead of 'func' we're even. I made the edit. I'd make some more if you answered more of my questions :-) On Sun, Apr 15, 2012 at 9:42 AM, Erik Hatcher erik.hatc...@gmail.com wrote: _val_ would work too, or of course using that function as a parameter to

It's hard to google on _val_

2012-04-15 Thread Benson Margulies
So, I've been experimenting to learn how the _val_ participates in scores. It seems to me that http://wiki.apache.org/solr/FunctionQuery should explain the *effect* of including an _val_ term in an ordinary query, starting with a constant.

Re: It's hard to google on _val_

2012-04-15 Thread Benson Margulies
On Sun, Apr 15, 2012 at 12:14 PM, Yonik Seeley yo...@lucidimagination.com wrote: On Sun, Apr 15, 2012 at 11:34 AM, Benson Margulies bimargul...@gmail.com wrote: So, I've been experimenting to learn how the _val_ participates in scores. It seems to me that http://wiki.apache.org/solr

Is there such a thing as FQ on a subquery?

2012-04-16 Thread Benson Margulies
I found myself wanting to write ... OR _query_:{!lucene fq=\a:b\}c:d And then I started looking at query trees in the debugger, and found myself thinking that there's no possible representation for this -- a subquery with a filter, since the filters are part of the RequestBuilder, not

Re: Query parsing VS marshalling/unmarshalling

2012-04-24 Thread Benson Margulies
2012/4/24 Mindaugas Žakšauskas min...@gmail.com: Hi, I maintain a distributed system which Solr is part of. The data which is kept in Solr is permissioned and permissions are currently implemented by taking the original user query, adding certain bits to it which would make it return less

Re: Unsubscribe does not appear to be working

2012-04-27 Thread Benson Margulies
There is no such thing as a 'solr forum' or a 'solr forum account.' If you are subscribed to this list, an email to the unsubscribe address will unsubscribe you. If some intermediary or third party is forwarding email from this list to you, no one here can help you. On Fri, Apr 27, 2012 at 12:09

Latest solr4 snapshot seems to be giving me a lot of unhappy logging about 'Log4j', should I be concerned?

2012-05-01 Thread Benson Margulies
CoreContainer.java, in the method 'load', finds itself calling loader.newInstance with an 'fname' of Log4j if the slf4j backend is 'Log4j'. e.g.: 2012-05-01 10:40:32,367 org.apache.solr.core.CoreContainer - Unable to load LogWatcher org.apache.solr.common.SolrException: Error loading class

Re: Latest solr4 snapshot seems to be giving me a lot of unhappy logging about 'Log4j', should I be concerned?

2012-05-01 Thread Benson Margulies
logging as a by-product. Don't remember the issue # offhand. I think there was a dispute about what should be done with it. On May 1, 2012, at 11:14 AM, Benson Margulies wrote: CoreContainer.java, in the method 'load', finds itself calling loader.newInstance with an 'fname' of Log4j

Re: Latest solr4 snapshot seems to be giving me a lot of unhappy logging about 'Log4j', should I be concerned?

2012-05-01 Thread Benson Margulies
) at org.apache.solr.core.CoreContainer$Initializer.initialize(CoreContainer.java:304) at org.apache.solr.servlet.SolrDispatchFilter.init(SolrDispatchFilter.java:101) On Tue, May 1, 2012 at 9:25 AM, Benson Margulies bimargul...@gmail.com wrote: On Tue, May 1, 2012 at 12:16 PM, Mark Miller markrmil

Why would solr norms come up different from Lucene norms?

2012-05-04 Thread Benson Margulies
So, I've got some code that stores the same documents in a Lucene 3.5.0 index and a Solr 3.5.0 instance. It's only five documents. For a particular field, the Solr norm is always 0.625, while the Lucene norm is .5. I've watched the code in NormsWriterPerField in both cases. In Solr we've got

Re: Why would solr norms come up different from Lucene norms?

2012-05-05 Thread Benson Margulies
. On Fri, May 4, 2012 at 6:30 AM, Benson Margulies bimargul...@gmail.com wrote: So, I've got some code that stores the same documents in a Lucene 3.5.0 index and a Solr 3.5.0 instance. It's only five documents. For a particular field, the Solr norm is always 0.625, while the Lucene norm is .5

Re: Updating fields in an existing document

2011-07-25 Thread Benson Margulies
As in http://wiki.apache.org/solr/UpdateXmlMessages? On Mon, Jul 25, 2011 at 4:10 PM, Chris Hostetter hossman_luc...@fucit.org wrote: : A followup. The wiki has a whole discussion of the 'update' XML : message. But solrj has nothing like it. Does that really exist? Is : there a reason to use

Extensibility of compressed='true'

2010-12-24 Thread Benson Margulies
I'd like to have a field that transparently uses FastInfoset to store XML compactly. Ideally, I could supply the XML already in FIS format to solrj, but have application retrieve the field and get the XML 'reconstituted'. Obviously, I'm writing code here, but what? The field would be

Controlling webapp startup

2011-05-05 Thread Benson Margulies
There are two ways to characterize what I'd like to do. 1) use the EmbeddedSolrServer to launch Solr, and subsequently enable the HTTP GET/json servlet. I can provide the 'servlet' wiring, I just need to be able to hand an HttpServletRequest to something and retrieve in return the same json that

pagination and groups

2011-07-01 Thread Benson Margulies
I'm a bit puzzled while trying to adapt some pagination code in javascript to a grouped query. I'm using: 'group' : 'true', 'group.limit' : 5, // something to show ... 'group.field' : [ 'bt.nearDupCluster', 'bt.nearStoryCluster' ] and displaying each field's worth in a tab. how do I work

Re: pagination and groups

2011-07-01 Thread Benson Margulies
) you can show the total number of groups. group.limit tells Solr how many (max) documents you want to see for each group. On Fri, Jul 1, 2011 at 2:56 PM, Benson Margulies bimargul...@gmail.com wrote: I'm a bit puzzled while trying to adapt some pagination code in javascript to a grouped query

Re: pagination and groups

2011-07-01 Thread Benson Margulies
of groups that matched the query. NOTE: All this is in trunk, I'm not sure if it is on 3.3 On Fri, Jul 1, 2011 at 3:53 PM, Benson Margulies bimargul...@gmail.com wrote: What takes the place of response.response.numFound? 2011/7/1 Tomás Fernández Löbbe tomasflo...@gmail.com: I'm not sure

Re: pagination and groups

2011-07-02 Thread Benson Margulies
Hey, I don't suppose you could easily tell me the rev in which ngroups arrived? Also, how does ngroups compare to the 'matches' value inside each group? On Sat, Jul 2, 2011 at 3:06 PM, Yonik Seeley yo...@lucidimagination.com wrote: 2011/7/1 Tomás Fernández Löbbe tomasflo...@gmail.com: I'm
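For the pagination question in this thread, the page arithmetic could be sketched as below. It assumes that the grouped response reports a total group count (the `ngroups` value discussed here) and that `start` and `rows` count groups rather than documents. The class `GroupPager` and its methods are hypothetical helpers, not part of SolrJ.

```java
// Hypothetical sketch: page arithmetic over groups rather than documents.
public class GroupPager {

    // Number of pages needed to show every group, rounding up.
    public static int pageCount(int ngroups, int groupsPerPage) {
        if (ngroups < 0 || groupsPerPage <= 0) {
            throw new IllegalArgumentException("ngroups must be >= 0 and groupsPerPage > 0");
        }
        return (ngroups + groupsPerPage - 1) / groupsPerPage;
    }

    // Value to send as the 'start' parameter for a zero-based page number.
    public static int startOffset(int page, int groupsPerPage) {
        if (page < 0 || groupsPerPage <= 0) {
            throw new IllegalArgumentException("page must be >= 0 and groupsPerPage > 0");
        }
        return page * groupsPerPage;
    }
}
```

When several `group.field` values are requested, each field's result would need its own page count, since the group totals can differ per field.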

Nightly builds

2011-07-05 Thread Benson Margulies
The solr download link does not point to or mention nightly builds. Are they out there?

Re: Nightly builds

2011-07-05 Thread Benson Margulies
signature. On Tue, Jul 5, 2011 at 10:19 AM, Tom Gross itconse...@gmail.com wrote: On 07/05/2011 04:08 PM, Benson Margulies wrote: The solr download link does not point to or mention nightly builds. Are they out there? http://lmgtfy.com/?q=%2Bsolr+%2Bnightlybuilds&l=1 -- Author of the book

ClassCastException launching recent snapshot

2011-07-06 Thread Benson Margulies
Launching solr-4.0-20110705.223601-1.war, I get a class cast exception org.apache.lucene.index.DirectoryReader cannot be cast to org.apache.solr.search.SolrIndexReader with the following backtrace. I'm launching solr-as-a-webapp via an embedded copy of tomcat 7. The location of the index is set

Re: Nightly builds

2011-07-06 Thread Benson Margulies
On Wed, Jul 6, 2011 at 3:43 PM, Chris Hostetter hossman_luc...@fucit.org wrote: : The reason for the email is not that I can't find them, but because : the project, I claim, should be advertising them more prominently on : the web site than buried in a wiki. : : Actually they are linked

Re: ClassCastException launching recent snapshot

2011-07-07 Thread Benson Margulies
erickerick...@gmail.com wrote: Then I would guess that you have other (older) jars in your classpath somewhere. Does the example Solr installation work? Best Erick On Wed, Jul 6, 2011 at 10:21 PM, Benson Margulies bimargul...@gmail.com wrote: Launching solr-4.0-20110705.223601-1.war, I get

Re: ClassCastException launching recent snapshot

2011-07-07 Thread Benson Margulies
classpath somewhere. Does the example Solr installation work? Best Erick On Wed, Jul 6, 2011 at 10:21 PM, Benson Margulies bimargul...@gmail.com wrote: Launching solr-4.0-20110705.223601-1.war, I get a class cast exception org.apache.lucene.index.DirectoryReader cannot be cast

Looking for big groups ...

2011-07-07 Thread Benson Margulies
I've got an index set up where there is a field that denotes membership in a document cluster. By using a grouped query, I can get a result grouped by cluster membership. Gosh, I wish I could add one more thing to the top of this pile: sort by group size. I'd like to have the ability to demand sort

Updating fields in an existing document

2011-07-20 Thread Benson Margulies
We find ourselves in the following quandary: At initial index time, we store a value in a field, and we use it for faceting. So it, seemingly, has to be there as a field. However, from time to time, something happens that causes us to want to change this value. As far as we know, this requires

Re: Updating fields in an existing document

2011-07-21 Thread Benson Margulies
a single document ought not to be slow, although if you have many of them at once it could be, or if you end up needing to very frequently commit to an index it can indeed cause problems. From: Benson Margulies [bimargul...@gmail.com] Sent: Wednesday, July 20

Solr1.4 and threads ....

2012-06-13 Thread Benson Margulies
We've got a tokenizer which is quite explicitly coded on the assumption that it will only be called from one thread at a time. After all, what would it mean for two threads to make interleaved calls to the hasNext() function? Yet, a customer of ours with a gigantic instance of Solr 1.4 reports

Re: Preparing the ground for a real multilang index

2009-07-07 Thread Benson Margulies
There is an alternative to knowing the language at query: multiply-process for stems or lemmas of all the possible languages. This may well be a cure much worse than the disease. Yes, I can sell you our lemma-production capability. --benson margulies basis technology On Tue, Jul 7, 2009

Re: Lemmatisation support in Solr

2009-07-21 Thread Benson Margulies
There are for-money solutions to this. On Tue, Jul 21, 2009 at 10:04 AM, Grant Ingersoll gsing...@apache.org wrote: Sounds like you need a TokenFilter that does lemmatisation.  I don't know of any open ones off hand, but I haven't looked all that hard. On Jul 21, 2009, at 4:25 AM, prerna07

A little discovery about the solr classpath and jetty

2009-09-22 Thread Benson Margulies
On (at least) two occasions, I've opened JIRAs due to my getting tangled up with eclipse, jetty, and solr/lib. Well, it occurs to me that a recent idea might be of general use to others in this regard. This fragment is offered for illustration. The idea here is that you can configure the jetty

A request handler that manipulated the index

2013-04-02 Thread Benson Margulies
I am thinking about trying to structure a problem as a Solr plugin. The nature of the plugin is that it would need to read and write the lucene index to do its work. It could not be cleanly split into URP 'over here' and a Search Component 'over there'. Are there invariants of Solr that would

wiki versus downloads versus archives

2013-05-16 Thread Benson Margulies
http://wiki.apache.org/solr/Solr3.1 claims that Solr3.1 is available in a place where it is not, and I can't find a link on the front page to the archive for old releases.

Re: wiki versus downloads versus archives

2013-05-16 Thread Benson Margulies
Thanks. On Thu, May 16, 2013 at 4:28 PM, Shawn Heisey s...@elyograg.org wrote: On 5/16/2013 2:21 PM, Benson Margulies wrote: http://wiki.apache.org/solr/Solr3.1 claims that Solr3.1 is available in a place where it is not, and I can't find a link

solr.xml or its successor in the wiki

2013-05-19 Thread Benson Margulies
http://wiki.apache.org/solr/ConfiguringSolr does not point to any information on solr.xml. Given https://issues.apache.org/jira/browse/SOLR-4791, I'm a bit confused, and I need to set up a sharedLib directory for 4.3.0. I would do some writing or linking if I had some raw material ...

Re: solr.xml or its successor in the wiki

2013-05-19 Thread Benson Margulies
I found http://wiki.apache.org/solr/Solr.xml%204.3%20and%20beyond, but it doesn't mention the successor to sharedLib. On Sun, May 19, 2013 at 12:02 PM, Benson Margulies bimargul...@gmail.com wrote: http://wiki.apache.org/solr/ConfiguringSolr does not point to any information on solr.xml

Re: solr.xml or its successor in the wiki

2013-05-19 Thread Benson Margulies
OK, I found the successor. On Sun, May 19, 2013 at 12:40 PM, Benson Margulies bimargul...@gmail.com wrote: I found http://wiki.apache.org/solr/Solr.xml%204.3%20and%20beyond, but it doesn't mention the successor to sharedLib. On Sun, May 19, 2013 at 12:02 PM, Benson Margulies bimargul

Re: solr.xml or its successor in the wiki

2013-05-19 Thread Benson Margulies
on a fork between 4791 and this. On Sun, May 19, 2013 at 12:52 PM, Benson Margulies bimargul...@gmail.com wrote: OK, I found the successor. On Sun, May 19, 2013 at 12:40 PM, Benson Margulies bimargul...@gmail.com wrote: I found http://wiki.apache.org/solr/Solr.xml%204.3%20and%20beyond

Re: solr.xml or its successor in the wiki

2013-05-19 Thread Benson Margulies
Shawn, thanks. need any more jiras on this? On May 19, 2013, at 6:37 PM, Shawn Heisey s...@elyograg.org wrote: On 5/19/2013 11:27 AM, Benson Margulies wrote: Starting with the shipped solr.xml, I added a new-style str child to configure a shared lib, and i was rewarded with: Caused

Re: solr.xml or its successor in the wiki

2013-05-19 Thread Benson Margulies
One point of confusion: Is the compatibility code I hit trying to prohibit the 'str' form when it sees old-fangled cores? Or when the current running version is pre-5.0? I hope it's the former. On Sun, May 19, 2013 at 6:47 PM, Shawn Heisey s...@elyograg.org wrote: On 5/19/2013 4:38 PM, Benson

Re: solr.xml or its successor in the wiki

2013-05-20 Thread Benson Margulies
it is completely correct, mind you) is that the presence of a cores tag defines which checks are performed. Errors are thrown on old-style constructs when no cores tag is present and vice-versa. Best Erick On Sun, May 19, 2013 at 7:20 PM, Benson Margulies bimargul...@gmail.com wrote: One point

Benchmarking Solr

2013-05-26 Thread Benson Margulies
I'd like to run a repeatable test of having Solr ingest a corpus of docs on disk, to measure the speed of some alternative things plugged in. Anyone have some advice to share? One approach would be a quick SolrJ program that pushed the entire stack as one giant collection with a commit at the
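A minimal sketch of the kind of repeatable timing harness described above, offered under stated assumptions rather than as anyone's actual benchmark. It uses only the JDK: it walks a corpus directory in sorted order (so every run feeds the same documents in the same sequence) and times how long a pluggable sink takes to consume them. The class name `IngestTimer`, the `Result` type, and the `sink` parameter are all hypothetical; in a real test the sink would be whatever SolrJ call adds a document, which is deliberately left out here.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.List;
import java.util.function.Consumer;
import java.util.stream.Collectors;
import java.util.stream.Stream;

public class IngestTimer {
    /** Result of one run: how many files were fed and how long the feed took. */
    public static final class Result {
        public final int docs;
        public final long nanos;

        Result(int docs, long nanos) {
            this.docs = docs;
            this.nanos = nanos;
        }
    }

    /**
     * Feeds every regular file under corpusDir to the sink and times the loop.
     * The sink is a placeholder (an assumption of this sketch) for the code
     * that would actually send the document to Solr.
     */
    public static Result run(Path corpusDir, Consumer<String> sink) throws IOException {
        List<Path> files;
        try (Stream<Path> walk = Files.walk(corpusDir)) {
            // Sort so that repeated runs see the documents in the same order.
            files = walk.filter(Files::isRegularFile).sorted().collect(Collectors.toList());
        }
        long start = System.nanoTime();
        for (Path file : files) {
            // Note: the measured time includes reading each file from disk.
            sink.accept(new String(Files.readAllBytes(file), StandardCharsets.UTF_8));
        }
        return new Result(files.size(), System.nanoTime() - start);
    }
}
```

Swapping the sink while keeping the corpus fixed gives comparable timings for each alternative; a final commit, if wanted, would go after the loop and inside the timed region.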

Not so concurrent concurrency

2013-05-28 Thread Benson Margulies
I can't quite apply SolrMeter to my problem, so I did something of my own. The brains of the operation are the function here. This feeds a ConcurrentUpdateSolrServer about 95 documents, each about 10mb, and 'threads' is six. Yet Solr just barely uses more than one core. private long

How can a Tokenizer be CoreAware?

2013-05-29 Thread Benson Margulies
I am currently testing some things with Solr 4.0.0. I tried to make a tokenizer CoreAware, and was rewarded with: Caused by: org.apache.solr.common.SolrException: Invalid 'Aware' object: com.basistech.rlp.solr.RLPTokenizerFactory@19336006 -- org.apache.solr.util.plugin.SolrCoreAware must be an

Seeming bug in ConcurrentUpdateSolrServer

2013-05-29 Thread Benson Margulies
The comment here is clearly wrong, since there is no division by two. I think that the code is wrong, because this results in not starting runners when it should start runners. Am I misanalyzing? if (runners.isEmpty() || (queue.remainingCapacity() < queue.size() // queue // is //
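The queue check quoted above can be tried in isolation. The sketch below assumes the comparison reads `queue.remainingCapacity() < queue.size()`; for a bounded queue, remaining capacity is capacity minus size, so the test is true exactly when size exceeds half the capacity, with no explicit division. The class `HalfFullCheck` and the helper `queueWith` are hypothetical names for illustration and are not part of SolrJ.

```java
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;

public class HalfFullCheck {
    /**
     * True when fewer free slots remain than elements are queued.
     * Since remainingCapacity == capacity - size for a bounded queue,
     * this is equivalent to size > capacity / 2.
     */
    public static boolean moreThanHalfFull(BlockingQueue<?> queue) {
        return queue.remainingCapacity() < queue.size();
    }

    /** Hypothetical helper: a bounded queue holding `filled` placeholder items. */
    public static BlockingQueue<Integer> queueWith(int capacity, int filled) {
        BlockingQueue<Integer> q = new LinkedBlockingQueue<>(capacity);
        for (int i = 0; i < filled; i++) {
            q.add(i);
        }
        return q;
    }

    public static void main(String[] args) {
        System.out.println(moreThanHalfFull(queueWith(10, 5)));    // false: exactly half
        System.out.println(moreThanHalfFull(queueWith(10, 6)));    // true: 4 free < 6 queued
        System.out.println(moreThanHalfFull(queueWith(1000, 95))); // false: 905 free > 95 queued
    }
}
```

The last case only illustrates how the comparison behaves with a deep queue and few items; it is not offered as a diagnosis of the behaviour reported in this thread.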

Re: Seeming bug in ConcurrentUpdateSolrServer

2013-05-29 Thread Benson Margulies
Ah. So now I have to find some other explanation of why it never creates more than one thread, even when I make a very deep queue and specify 6 threads. On Wed, May 29, 2013 at 2:25 PM, Shalin Shekhar Mangar shalinman...@gmail.com wrote: On Wed, May 29, 2013 at 11:29 PM, Benson Margulies

Re: Seeming bug in ConcurrentUpdateSolrServer

2013-05-29 Thread Benson Margulies
. If the idea is that we want to pile up 'a lot' (1/2-of-a-q) of work before sending any of it, why start that first runner? On Wed, May 29, 2013 at 2:45 PM, Benson Margulies bimargul...@gmail.com wrote: Ah. So now I have to find some other explanation of why it never creates more than one

SOLR-4872 and LUCENE-2145 (or, how to clean up a Tokenizer)

2013-06-12 Thread Benson Margulies
Could I have some help on the combination of these two? Right now, it appears that I'm stuck with a finalizer to chase after native resources in a Tokenizer. Am I missing something?

Re: Solr Patent

2013-09-15 Thread Benson Margulies
I am not a lawyer. The Apache Software Foundation cannot 'protect Solr developers.' Patent infringement is a claim made against someone who derived economic benefit from an invention, not someone who writes code. The patent clause in the Apache License requires people who contribute code to

TokenizerFactory from 4.2.0 to 4.3.0

2013-09-16 Thread Benson Margulies
TokenizerFactory changed, incompatibly with subclasses, from 4.2.0 to 4.3.0. Subclasses must now implement a different overload of create, and may not implement the old one. Has anyone got any devious strategies other than multiple copies of code to deal with this when supporting multiple

Tracking down the input that hits an analysis chain bug

2014-01-03 Thread Benson Margulies
Using Solr Cloud with 4.3.1. We've got a problem with a tokenizer that manifests as calling OffsetAtt.setOffsets() with invalid inputs. OK, so, we want to figure out what input provokes our code into getting into this pickle. The problem happens on SolrCloud nodes. The problem manifests as this

Re: Tracking down the input that hits an analysis chain bug

2014-01-03 Thread Benson Margulies
at 1:56 PM, Benson Margulies ben...@basistech.com wrote: Using Solr Cloud with 4.3.1. We've got a problem with a tokenizer that manifests as calling OffsetAtt.setOffsets() with invalid inputs. OK, so, we want to figure out what input provokes our code into getting into this pickle. The problem
