Stable release, trunk release - same Tomcat instance

2009-06-12 Thread Jeff Rodenburg
If I want to run the stable 1.3 release and the nightly build under the same
Tomcat instance, should that be configured as multiple solr applications, or
is there a different configuration to follow?


Re: Stable release, trunk release - same Tomcat instance

2009-06-12 Thread Jeff Rodenburg
Um, yes this works.

On Fri, Jun 12, 2009 at 11:12 AM, Jeff Rodenburg
jeff.rodenb...@gmail.com wrote:

 If I want to run the stable 1.3 release and the nightly build under the
 same Tomcat instance, should that be configured as multiple solr
 applications, or is there a different configuration to follow?



Re: Getting SolrSharp to work, Part 2

2008-01-25 Thread Jeff Rodenburg
Great, thanks Peter.  And yes, I think it would be good to concentrate the
conversation over on CodePlex.  I know the Solr team has no problem with
solrsharp conversations here on the solr mailing list, but the conversation
here is highly focused on the server.  Putting the solrsharp conversation on
CodePlex would keep those messages from getting drowned out on this list.

I'll check out the patch when I get a chance, thanks for the contribution.
Hope things are working better for you now.  :-/

-- j

On Jan 25, 2008 4:53 AM, Peter Thygesen [EMAIL PROTECTED] wrote:

 Ups. Forgot to tell that the patch was uploaded on CodePlex
 http://www.codeplex.com/solrsharp/SourceControl/PatchList.aspx

 \peter


 -Original Message-
 From: Peter Thygesen
 Sent: 25. januar 2008 13:17
 To: solr-user@lucene.apache.org
 Subject: RE: Getting SolrSharp to work, Part 2

 This patch covers the issues I wrote about in my previous mails "How to
 get SolrSharp to work" and "How to get SolrSharp to work, part 2".

 By the way, should I post on this thread or on CodePlex when the topic
 is SolrSharp?  I don't mind adding a few more comments to the
 discussion I already started on CodePlex.

 \peter

 -Original Message-
 From: Jeff Rodenburg [mailto:[EMAIL PROTECTED]
 Sent: 24. januar 2008 20:59
 To: solr-user@lucene.apache.org
 Subject: Re: Getting SolrSharp to work, Part 2

 Hey Peter - if you could submit your changes as an svn patch, we could
 apply the update much faster.

 thanks,
 jeff



 On Jan 23, 2008 2:42 AM, Peter Thygesen [EMAIL PROTECTED] wrote:

  I wrote a small client in .Net which queries Solr and dumps the result
  on screen.. fantastic low-tech.. ;)
 
  However I ran into new SolrSharp problems. My schema allows a particular
  field to be multiValued, but if it only has one value, it will cause
  SolrSharp to fail at line 88 of the IndexFieldAttribute class.
 
  My SearchRecord property is an array (List) and line 88 tries to set my
  property as if it were a string. The code should be corrected by checking
  whether the property is an array, not whether it has one value or more.
  E.g. change line 85 to:
  085    if(!this.PropertyInfo.PropertyType.IsArray)
 
  Original code (from class IndexFieldAttribute):
  082 public void SetValue(SearchRecord searchRecord)
  083 {
  084   XmlNodeList xnlvalues =
          searchRecord.XNodeRecord.SelectNodes(this.XnodeExpression);
  085   if (xnlvalues.Count == 1)   //single value
  086   {
  087     XmlNode xnodevalue = xnlvalues[0];
  088     this.PropertyInfo.SetValue(searchRecord,
            Convert.ChangeType(xnodevalue.InnerText,
              this.PropertyInfo.PropertyType), null);
  089   }
  090   else if (xnlvalues.Count > 1)   //array
  091   {
  092     Type basetype = this.PropertyInfo.PropertyType.GetElementType();
  093     Array valueArray = Array.CreateInstance(basetype, xnlvalues.Count);
  094     for (int i = 0; i < xnlvalues.Count; i++)
  095     {
  096       valueArray.SetValue(Convert.ChangeType(xnlvalues[i].InnerText,
              basetype), i);
  097     }
  098     this.PropertyInfo.SetValue(searchRecord, valueArray, null);
  099   }
  100 }
 
  My code (replace):
  085if(!this.PropertyInfo.PropertyType.IsArray) // single value
  090else // array
 
  Cheers,
  Peter Thygesen
 
  -- hope to see you all at ApacheCon in Amsterdam :)
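Putting Peter's two replacement lines into the quoted method, the proposed fix would look roughly like this. This is a sketch only: it branches on whether the target property is an array instead of on how many XML nodes came back, and it assumes the surrounding IndexFieldAttribute members (PropertyInfo, XnodeExpression) shown in the quoted code.

```csharp
// Sketch of the proposed fix, intended to live inside IndexFieldAttribute.
public void SetValue(SearchRecord searchRecord)
{
    XmlNodeList xnlvalues =
        searchRecord.XNodeRecord.SelectNodes(this.XnodeExpression);

    if (!this.PropertyInfo.PropertyType.IsArray)   // single value
    {
        XmlNode xnodevalue = xnlvalues[0];
        this.PropertyInfo.SetValue(searchRecord,
            Convert.ChangeType(xnodevalue.InnerText,
                this.PropertyInfo.PropertyType), null);
    }
    else   // array property, even when only one node was returned
    {
        Type basetype = this.PropertyInfo.PropertyType.GetElementType();
        Array valueArray = Array.CreateInstance(basetype, xnlvalues.Count);
        for (int i = 0; i < xnlvalues.Count; i++)
        {
            valueArray.SetValue(
                Convert.ChangeType(xnlvalues[i].InnerText, basetype), i);
        }
        this.PropertyInfo.SetValue(searchRecord, valueArray, null);
    }
}
```

With this shape, a multiValued field that happens to hold a single value still populates the array property correctly.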
 
 
 







Re: Updating and Appending

2008-01-24 Thread Jeff Rodenburg
On Jan 23, 2008 1:29 PM, Chris Harris [EMAIL PROTECTED] wrote:


  And then if you're using
 a client such as solrsharp, there's the question of whether *it* will
 slurp the whole stream into memory.


Solrsharp's reads of the XML stream from Solr use standard .NET Framework
XML objects, which by default read the entire stream into memory
before returning control to your code.  The .NET Framework does provide
facilities for reading XML data in chunks rather than as a full
stream, but solrsharp at present uses the framework defaults.
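As an illustration of the difference (not SolrSharp code; the URL is a placeholder): DOM-style APIs such as XmlDocument buffer the whole response, while a forward-only XmlReader walks the stream node by node.

```csharp
using System.Net;
using System.Xml;

class StreamingExample
{
    static void Main()
    {
        // Placeholder URL for a Solr select response.
        var request = WebRequest.Create("http://localhost:8080/solr/select?q=*:*");
        using (var response = request.GetResponse())
        using (var reader = XmlReader.Create(response.GetResponseStream()))
        {
            // Forward-only, chunked read: only the current node is held
            // in memory, unlike XmlDocument.Load(), which buffers it all.
            while (reader.Read())
            {
                if (reader.NodeType == XmlNodeType.Element && reader.Name == "doc")
                    System.Console.WriteLine("found a result document");
            }
        }
    }
}
```

For very large responses this keeps peak memory roughly constant instead of proportional to the response size.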

-- jeff


Re: Solr, operating systems and globalization

2007-10-18 Thread Jeff Rodenburg
OK, this simplifies things greatly.  For C#, the proper culture setting for
interaction with Solr should be Invariant.

Basically, the primary requirement for Solrsharp is to be
culturally-consistent with the targeted Solr server to ensure proper
data-type formatting.  Since Solr is culturally-agnostic, Solrsharp should
be so as well.
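Concretely, being culture-agnostic on the client means formatting and parsing with CultureInfo.InvariantCulture regardless of the machine's regional settings. A minimal sketch, not SolrSharp code:

```csharp
using System;
using System.Globalization;

class InvariantExample
{
    static void Main()
    {
        float weight = 1.234f;

        // On a French locale, weight.ToString() would yield "1,234",
        // which Solr's float fields reject. Invariant formatting is stable
        // everywhere.
        string wire = weight.ToString(CultureInfo.InvariantCulture);
        Console.WriteLine(wire);   // 1.234

        // Parsing the canonical form back is the mirror operation.
        float roundTrip = float.Parse(wire, CultureInfo.InvariantCulture);
        Console.WriteLine(roundTrip == weight);   // True
    }
}
```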

Thanks for the clarification.

On 10/17/07, Chris Hostetter [EMAIL PROTECTED] wrote:


 : This is exactly the scenario.  Ideally what I'd like to achieve is for
 : Solrsharp to discover the culture settings from the targeted Solr
 instance
 : and set the client in appropriate position.

 well ... my point is there shouldn't be any cultural settings on the
 targeted Solr server that the client needs to know about.

 the communication between the server and any clients should always be in a
 fixed format independent of culture.  Any (hypothetical) culture-specific
 settings the server has might affect the functionality, but
 shouldn't affect the communication (ie: for the purposes of date
 rounding/faceting the Solr server might be configured to know what
 timezone to use for rounding to the nearest day, or what Locale to use
 to compute the first day of the week, but when returning that info
 to clients it should still be stringified in an absolute format (UTC))



 : multi-lingual systems across different JVM and OS platforms.  If it *were*
 : the case that different underlying system stacks affected solr in such a
 : way, Solrsharp should follow the server's lead.

 if that were the case, the server would be buggy and should be fixed :)

 i don't know much about C#, but i can't really think of a lot of cases
 where client APIs really need to be very multi-cultural aware ...
 typically culture/locale type settings relate to parsing and formatting
 of datatypes (ie: how to stringify a number, how to convert a date to/from
 a string, etc...).  when client code is taking input and sending it to
 solr it's dealing with native objects and stringifying them into the
 canonical format Solr wants -- independent of culture.  when client code
 is reading data back from Solr and returning it, it needs to parse those
 strings from the canonical form and return them as native objects.
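Dates are a good example of this round trip: Solr's canonical form is an absolute UTC string, so a client formats and parses it invariantly. A sketch, with the pattern matching Solr's ISO-8601-style date format:

```csharp
using System;
using System.Globalization;

class SolrDateExample
{
    // Solr's canonical date form, e.g. 1995-12-31T23:59:59Z
    const string SolrDateFormat = "yyyy-MM-dd'T'HH:mm:ss'Z'";

    static void Main()
    {
        // Stringify a native DateTime into the absolute UTC form.
        var when = new DateTime(2007, 10, 18, 9, 30, 0, DateTimeKind.Utc);
        string wire = when.ToString(SolrDateFormat, CultureInfo.InvariantCulture);
        Console.WriteLine(wire);   // 2007-10-18T09:30:00Z

        // Parse it back into a native object on the way out.
        DateTime parsed = DateTime.ParseExact(wire, SolrDateFormat,
            CultureInfo.InvariantCulture, DateTimeStyles.AdjustToUniversal);
        Console.WriteLine(parsed == when);
    }
}
```

Neither direction consults the machine's locale, which is exactly the fixed-format communication described above.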

 The only culture that SolrSharp should need to worry about is the
 InvariantCulture you described ... right?



 -Hoss




Re: Solr, operating systems and globalization

2007-10-17 Thread Jeff Rodenburg
Thanks for the comments Hoss.  More notes embedded below...

On 10/17/07, Chris Hostetter [EMAIL PROTECTED] wrote:


 : However, SolrSharp culture settings should be reflective of and
 : consistent with the solr server instance's culture.  This leads to my
 : question: does Solr control its culture & language settings through the
 : various language components that can be incorporated, or does the
 : underlying OS have a say in how that data is treated?

 As a general rule:
   1) Solr (the server) should operate as culturally and locally agnostic
 as possible.
   2) Solr Clients that want to act culturally appropriate should
  explicitly translate from local formats to the absolute concepts that
  they send to the server.  (ala: the absolute unambiguous date format)

 Ideally you should be able to take a Solr install from one box, move it to
 another JVM on a different OS in a different timezone with different
 Locale settings and everything will keep working the same.


I fully understand that approach.  Going back to C#/Windows, this is known
as an Invariant culture setting, which we're incorporating into Solrsharp
(along with configurable culture settings as appropriate.)

(I think once upon a time i argued that Solr should assume the
 charencoding of the local JVM, and wiser people than me pointed out that
 was bad).

 There may be exceptions to this -- but those exceptions should be in cases
 where: a) the person configuring Solr is completely in control; and b) the
 exception is prudent because doing the work in the client would require
 more complexity.  Analysis is a good example of this: we don't make the
 clients analyze the text according to the native language customs -- we
 let the person creating the schema.xml specify what the Analysis should
 be.

 As I recall, the issue that prompted this email had to do with C# and the
 various cultural ways to specify a floating point number: "1,234" vs "1.234"
 (comma vs period).  this is the kind of thing that should be translated in
 clients to the canonical floating point representation. ... by which i
 mean: the one the solr server uses :)


This is exactly the scenario.  Ideally what I'd like to achieve is for
Solrsharp to discover the culture settings from the targeted Solr instance
and set the client in appropriate position.

 *IF* Solr has the behavior where setting the JVM locale to something random
 makes Solr assume floats should be in the comma format, then i would
 consider that a Bug in Solr ... Solr should always be consistent.


This would be an interesting discovery exercise for those who deal with
multi-lingual systems across different JVM and OS platforms.  If it *were*
the case that different underlying system stacks affected solr in such a
way, Solrsharp should follow the server's lead.

-Hoss




Solr, operating systems and globalization

2007-10-12 Thread Jeff Rodenburg
We discovered and verified an issue in SolrSharp whereby indexing and
searching can be disrupted without taking Windows globalization & culture
settings into consideration.  For example, European cultures format numeric
and date values differently from US/English cultures.  The resolution for
this type of issue is to explicitly control the culture settings so that
index data formatting works correctly.

However, SolrSharp culture settings should be reflective of and consistent
with the solr server instance's culture.  This leads to my question: does
Solr control its culture & language settings through the various language
components that can be incorporated, or does the underlying OS have a say in
how that data is treated?

Some education on this would be greatly appreciated.

cheers,
jeff r.


Re: WebException (ServerProtocolViolation) with SolrSharp

2007-10-11 Thread Jeff Rodenburg
Good to know.  I think this needs to be a configurable value in the library
(overridable, at a minimum).

What's outstanding for me on this is understanding the Solr side of the
equation, and whether culture variance comes into play.  What makes this
even more interesting/confusing is how culture scenarios may differ across
platforms.  I do most of my production work against a solr farm running on
RHEL4, but often do side development work against Win2K3.

Thanks for confirming the culture issue, this will make its way into the
source as a fix in the future.

cheers,
jeff



On 10/11/07, Filipe Correia [EMAIL PROTECTED] wrote:

 Jeff,

 Thanks! Your suggestion worked.  Instead of invoking ToString() on
 float values I've used ToString's other signature, which takes an
 IFormatProvider:

 CultureInfo MyCulture = CultureInfo.InvariantCulture;
 this.Add(new IndexFieldValue("weight",
 weight.ToString(MyCulture.NumberFormat)));
 this.Add(new IndexFieldValue("price",
 price.ToString(MyCulture.NumberFormat)));

 This made me think on a related issue though. In this case it was the
 client that was using a non-invariant number format, but can this also
 happen on Solr's side? If so, I guess I may need to configure it
 somewhere...

 Cheers,
 Filipe Correia

 On 10/10/07, Jeff Rodenburg [EMAIL PROTECTED] wrote:
  Hi Filipe -
 
  The issue you're encountering is a problem with the data format being
  passed to the solr server.  If you follow the stack trace that you
  posted, you'll notice that the solr field is looking for a value that's
  a float, but the passed value is "1,234".
 
  I'm guessing this is caused by one of two possibilities:

  (1) there's a typo in your example code, where "1,234" should actually
  be "1.234", or
  (2) there's a culture settings difference on your server that's
  converting "1.234" to "1,234"
 
  Assuming it's the latter, add this line in the ExampleIndexDocument
  constructor:

  CultureInfo MyCulture = new CultureInfo("en-US");
 
  Please let me know if this fixes the issue, I've been looking at this
  previously and would like to confirm it.
 
  thanks,
  jeff r.
 
 
  On 10/10/07, Filipe Correia [EMAIL PROTECTED] wrote:
  
   Hello,
  
   I am trying to run SolrSharp's example application but am getting a
   WebException with a ServerProtocolViolation status message.
  
   After some debugging I found out this is happening with a call to:
   http://localhost:8080/solr/update/
  
   And using fiddler[1] found out that solr is actually throwing the
   following exception:
   org.apache.solr.core.SolrException: Error while creating field
  
 'weight{type=sfloat,properties=indexed,stored,omitNorms,sortMissingLast}'
   from value '1,234'
   at org.apache.solr.schema.FieldType.createField(FieldType.java
   :173)
   at org.apache.solr.schema.SchemaField.createField(
 SchemaField.java
   :94)
   at org.apache.solr.update.DocumentBuilder.addSingleField(
   DocumentBuilder.java:57)
   at org.apache.solr.update.DocumentBuilder.addField(
   DocumentBuilder.java:73)
   at org.apache.solr.update.DocumentBuilder.addField(
   DocumentBuilder.java:83)
   at org.apache.solr.update.DocumentBuilder.addField(
   DocumentBuilder.java:77)
   at org.apache.solr.handler.XmlUpdateRequestHandler.readDoc(
   XmlUpdateRequestHandler.java:339)
   at org.apache.solr.handler.XmlUpdateRequestHandler.update(
   XmlUpdateRequestHandler.java:162)
   at
   org.apache.solr.handler.XmlUpdateRequestHandler.handleRequestBody(
   XmlUpdateRequestHandler.java:84)
   at org.apache.solr.handler.RequestHandlerBase.handleRequest(
   RequestHandlerBase.java:77)
   at org.apache.solr.core.SolrCore.execute(SolrCore.java:658)
   at org.apache.solr.servlet.SolrDispatchFilter.execute(
   SolrDispatchFilter.java:191)
   at org.apache.solr.servlet.SolrDispatchFilter.doFilter(
   SolrDispatchFilter.java:159)
   at
   org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(
   ApplicationFilterChain.java:235)
   at org.apache.catalina.core.ApplicationFilterChain.doFilter(
   ApplicationFilterChain.java:206)
   at org.apache.catalina.core.StandardWrapperValve.invoke(
   StandardWrapperValve.java:233)
   at org.apache.catalina.core.StandardContextValve.invoke(
   StandardContextValve.java:175)
   at org.apache.catalina.core.StandardHostValve.invoke(
   StandardHostValve.java:128)
   at org.apache.catalina.valves.ErrorReportValve.invoke(
   ErrorReportValve.java:102)
   at org.apache.catalina.core.StandardEngineValve.invoke(
   StandardEngineValve.java:109)
   at org.apache.catalina.connector.CoyoteAdapter.service(
   CoyoteAdapter.java:263)
   at org.apache.coyote.http11.Http11Processor.process(
   Http11Processor.java:844)
   at
  
 org.apache.coyote.http11.Http11Protocol$Http11ConnectionHandler.process(
   Http11Protocol.java:584

Re: WebException (ServerProtocolViolation) with SolrSharp

2007-10-10 Thread Jeff Rodenburg
Hi Filipe -

The issue you're encountering is a problem with the data format being passed
to the solr server.  If you follow the stack trace that you posted, you'll
notice that the solr field is looking for a value that's a float, but the
passed value is "1,234".

I'm guessing this is caused by one of two possibilities:

(1) there's a typo in your example code, where "1,234" should actually be
"1.234", or
(2) there's a culture settings difference on your server that's converting
"1.234" to "1,234"

Assuming it's the latter, add this line in the ExampleIndexDocument
constructor:

CultureInfo MyCulture = new CultureInfo("en-US");

Please let me know if this fixes the issue, I've been looking at this
previously and would like to confirm it.

thanks,
jeff r.


On 10/10/07, Filipe Correia [EMAIL PROTECTED] wrote:

 Hello,

 I am trying to run SolrSharp's example application but am getting a
 WebException with a ServerProtocolViolation status message.

 After some debugging I found out this is happening with a call to:
 http://localhost:8080/solr/update/

 And using fiddler[1] found out that solr is actually throwing the
 following exception:
 org.apache.solr.core.SolrException: Error while creating field
 'weight{type=sfloat,properties=indexed,stored,omitNorms,sortMissingLast}'
 from value '1,234'
 at org.apache.solr.schema.FieldType.createField(FieldType.java
 :173)
 at org.apache.solr.schema.SchemaField.createField(SchemaField.java
 :94)
 at org.apache.solr.update.DocumentBuilder.addSingleField(
 DocumentBuilder.java:57)
 at org.apache.solr.update.DocumentBuilder.addField(
 DocumentBuilder.java:73)
 at org.apache.solr.update.DocumentBuilder.addField(
 DocumentBuilder.java:83)
 at org.apache.solr.update.DocumentBuilder.addField(
 DocumentBuilder.java:77)
 at org.apache.solr.handler.XmlUpdateRequestHandler.readDoc(
 XmlUpdateRequestHandler.java:339)
 at org.apache.solr.handler.XmlUpdateRequestHandler.update(
 XmlUpdateRequestHandler.java:162)
 at
 org.apache.solr.handler.XmlUpdateRequestHandler.handleRequestBody(
 XmlUpdateRequestHandler.java:84)
 at org.apache.solr.handler.RequestHandlerBase.handleRequest(
 RequestHandlerBase.java:77)
 at org.apache.solr.core.SolrCore.execute(SolrCore.java:658)
 at org.apache.solr.servlet.SolrDispatchFilter.execute(
 SolrDispatchFilter.java:191)
 at org.apache.solr.servlet.SolrDispatchFilter.doFilter(
 SolrDispatchFilter.java:159)
 at
 org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(
 ApplicationFilterChain.java:235)
 at org.apache.catalina.core.ApplicationFilterChain.doFilter(
 ApplicationFilterChain.java:206)
 at org.apache.catalina.core.StandardWrapperValve.invoke(
 StandardWrapperValve.java:233)
 at org.apache.catalina.core.StandardContextValve.invoke(
 StandardContextValve.java:175)
 at org.apache.catalina.core.StandardHostValve.invoke(
 StandardHostValve.java:128)
 at org.apache.catalina.valves.ErrorReportValve.invoke(
 ErrorReportValve.java:102)
 at org.apache.catalina.core.StandardEngineValve.invoke(
 StandardEngineValve.java:109)
 at org.apache.catalina.connector.CoyoteAdapter.service(
 CoyoteAdapter.java:263)
 at org.apache.coyote.http11.Http11Processor.process(
 Http11Processor.java:844)
 at
 org.apache.coyote.http11.Http11Protocol$Http11ConnectionHandler.process(
 Http11Protocol.java:584)
 at org.apache.tomcat.util.net.JIoEndpoint$Worker.run(
 JIoEndpoint.java:447)
 at java.lang.Thread.run(Unknown Source)
 Caused by: java.lang.NumberFormatException: For input string:
 "1,234"
 at sun.misc.FloatingDecimal.readJavaFormatString(Unknown Source)
 at java.lang.Float.parseFloat(Unknown Source)
 at org.apache.solr.util.NumberUtils.float2sortableStr(
 NumberUtils.java:80)
 at org.apache.solr.schema.SortableFloatField.toInternal(
 SortableFloatField.java:50)
 at org.apache.solr.schema.FieldType.createField(FieldType.java
 :171)
 ... 24 more
 type Status report
 message Error while creating field
 'weight{type=sfloat,properties=indexed,stored,omitNorms,sortMissingLast}'
 from value '1,234'

 I am just starting to try Solr, and might be missing some
 configurations, but I have no clue where to begin to investigate this
 further without digging into Solr's source, which I would really like
 to avoid for now. Any thoughts?

 thank you in advance,
 Filipe Correia

 [1] http://www.fiddlertool.com/



Re: Solrsharp culture problems

2007-09-24 Thread Jeff Rodenburg
Yes, that would be the right solution.  I'm not sure whether, in order to use
French culture settings on XP, you would need corresponding changes in
culture settings for the Solr instance.
Hope this helps.

-- j

On 9/24/07, JP Genty - LibertySurf [EMAIL PROTECTED] wrote:


 I use solrsharp on a French XP and I have problems with the float
 conversion to text.

 I modified the ExampleIndexDocument constructor to force the "en-US"
 culture.

 CultureInfo MyCulture = new CultureInfo("en-US");
 .
 .
 this.Add(new IndexFieldValue("weight", weight.ToString(MyCulture)));
 this.Add(new IndexFieldValue("price", price.ToString(MyCulture)));

And I modified the IndexFieldAttribute SetValue method:

CultureInfo MyCulture = new CultureInfo("en-US");

this.PropertyInfo.SetValue(searchRecord,
 Convert.ChangeType(xnodevalue.InnerText,
 this.PropertyInfo.PropertyType, MyCulture), null);

 valueArray.SetValue(Convert.ChangeType(xnlvalues[i].InnerText, basetype,
 MyCulture), i);


  Now the example runs smoothly on a French Windows XP.


 Is it the right solution?

Thanks
 Jean-Paul







Dilbert (off-topic)

2007-09-07 Thread Jeff Rodenburg
It may be off-topic, but it's Friday and I thought all the java coders would
appreciate today's Dilbert.  (I'm not primarily a java dev, but I know the
feeling.)

http://www.dilbert.com/comics/dilbert/archive/dilbert-20070907.html

cheers,
jeff r.


Solrsharp now supports debugQuery

2007-08-31 Thread Jeff Rodenburg
Solrsharp now supports query debugging.  This is enabled through the
debugQuery and explainOther parameters.

A DebugResults object is referenced by a SearchResults instance and provides
all the debugging information that is available through these parameters,
such as:

   - QueryString and ParsedQuery string values
   - Array of ExplanationRecord objects
   - OtherQuery value (if provided)
   - Array of ExplanationRecord objects supporting the OtherQuery value

The ExplanationRecord object provides the details of the debug results,
specifically including the ExplainInfo string (the debug analysis payload)
and a reference to the UniqueRecordKey of the evaluated record.  The
UniqueRecordKey, though returned as a string, could then be cast
appropriately to reference the matching SearchRecord referenced by the same
SearchResults instance.
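A rough usage sketch based on the objects named above. Only DebugResults, ExplanationRecord, QueryString, ParsedQuery, OtherQuery, ExplainInfo, and UniqueRecordKey come from the description; how the results and collections are obtained here is an assumption, not the library's confirmed API.

```csharp
// Hypothetical usage of the debugging objects described above; the
// searcher/queryBuilder construction and collection names are assumptions.
SearchResults results = searcher.Search(queryBuilder);

DebugResults debug = results.DebugResults;
Console.WriteLine(debug.QueryString);
Console.WriteLine(debug.ParsedQuery);

foreach (ExplanationRecord explanation in debug.ExplanationRecords)
{
    // ExplainInfo carries the debug analysis payload; UniqueRecordKey
    // ties it back to the matching SearchRecord in the same results.
    Console.WriteLine(explanation.UniqueRecordKey);
    Console.WriteLine(explanation.ExplainInfo);
}
```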

The example program with the source code has been updated to show how to
make use of these properties.  If any issues are found, please log them to
JIRA and associate them with the C# client component.

cheers,
jeff r.


Major update to Solrsharp

2007-08-22 Thread Jeff Rodenburg
A big update was just posted to the Solrsharp project.  This update now
provides for first-class support for highlighting in the library.

The implementation is really robust and provides the following features:

   - Structured highlight parameter assignment based on the SolrField
   object
   - Full access for all highlight parameters, on both an aggregate and
   per-field basis
   - Incorporation of highlighted values into the base search result
   records

All of the supplied documentation has been updated, as has the example
application, to show use of the highlighting classes.

Please report any issues through JIRA.  Be sure to associate any issues with
the C# client component.

cheers,
jeff r.


Re: Solrsharp highlighting

2007-08-15 Thread Jeff Rodenburg
I've been working on the highlighting component, and it's a little odd how
it works.  For myself, if I want terms highlighted, I'd like those in the
return results.  Solr, on the other hand, returns a separate xml node that
represents the portions of the results that are highlighted.  I know that
it's incorporated that way for other reasons, but it makes patching the
highlighted portions together with the doc results in Solrsharp an
out-of-band experience.

Nonetheless, the approach I'm trying is one where the highlighted nodes are
associated with the SearchResults object, and will have their highlighted
text bits incorporated into the associated SearchRecord objects.

At least that's what I'm initially trying to accomplish.

-- j

On 8/15/07, Charlie Jackson [EMAIL PROTECTED] wrote:

 Thanks for adding in those facet examples. That should help me out a
 great deal.

 As for the highlighting, did you have any ideas about a good way to go
 about it? I was thinking about taking a stab at it, but I want to get
 your input first.


 Thanks,
 Charlie


 -Original Message-
 From: Jeff Rodenburg [mailto:[EMAIL PROTECTED]
 Sent: Tuesday, August 14, 2007 1:08 AM
 To: solr-user@lucene.apache.org
 Subject: Re: Solrsharp highlighting

 Pull down the latest example code from
 http://solrstuff.org/svn/solrsharp, which includes adding facets to
 search results.  It's really short and simple to add facets; the example
 application implements one form of it.
 The nice thing about the facet support is that it utilizes generics to
 allow
 you to have strongly typed name/value pairs for the fieldname/count
 data.

 Hope this helps.

 -- jeff r.

 On 8/10/07, Charlie Jackson [EMAIL PROTECTED] wrote:
 
  Also, are there any examples out there of how to use Solrsharp's
  faceting capabilities?
 
  
  Charlie Jackson
  312-873-6537
  [EMAIL PROTECTED]
  -Original Message-
  From: Charlie Jackson [mailto:[EMAIL PROTECTED]
  Sent: Friday, August 10, 2007 3:51 PM
  To: solr-user@lucene.apache.org
  Subject: Solrsharp highlighting
 
  Trying to use Solrsharp (which is a great tool, BTW) to get some
 results
  in a C# application. I see the HighlightFields method of the
  QueryBuilder object and I've set it to my highlight field, but how do
 I
  get at the results? I don't see anything in the SearchResults code
 that
  does anything with the highlight results XML. Did I miss something?
 
 
 
 
 
  Thanks,
 
  Charlie
 
 



Re: Solrsharp highlighting

2007-08-14 Thread Jeff Rodenburg
Pull down the latest example code from
http://solrstuff.org/svn/solrsharp, which includes adding facets to
search results.  It's really short and simple to add facets; the example
application implements one form of it.
The nice thing about the facet support is that it utilizes generics to allow
you to have strongly typed name/value pairs for the fieldname/count data.

Hope this helps.

-- jeff r.

On 8/10/07, Charlie Jackson [EMAIL PROTECTED] wrote:

 Also, are there any examples out there of how to use Solrsharp's
 faceting capabilities?

 
 Charlie Jackson
 312-873-6537
 [EMAIL PROTECTED]
 -Original Message-
 From: Charlie Jackson [mailto:[EMAIL PROTECTED]
 Sent: Friday, August 10, 2007 3:51 PM
 To: solr-user@lucene.apache.org
 Subject: Solrsharp highlighting

 Trying to use Solrsharp (which is a great tool, BTW) to get some results
 in a C# application. I see the HighlightFields method of the
 QueryBuilder object and I've set it to my highlight field, but how do I
 get at the results? I don't see anything in the SearchResults code that
 does anything with the highlight results XML. Did I miss something?





 Thanks,

 Charlie




Re: Solrsharp highlighting

2007-08-13 Thread Jeff Rodenburg
Thanks for the comments, Charlie.

No, you didn't miss anything with the highlight results.  It hasn't been
implemented yet.  :-/
The first implementation was quite janky, and was consequently removed.  I'm
adding an issue in JIRA about implementing highlighted fields.  (
https://issues.apache.org/jira/browse/SOLR-338)



On 8/10/07, Charlie Jackson [EMAIL PROTECTED] wrote:

 Trying to use Solrsharp (which is a great tool, BTW) to get some results
 in a C# application. I see the HighlightFields method of the
 QueryBuilder object and I've set it to my highlight field, but how do I
 get at the results? I don't see anything in the SearchResults code that
 does anything with the highlight results XML. Did I miss something?





 Thanks,

 Charlie




Re: Please help! Solr 1.1 HTTP server stops responding

2007-07-31 Thread Jeff Rodenburg
Not sure if this would help you, but we encountered java heap OOM issues
with 1.1 earlier this year.  We patched solr with the latest bits at the
time, which included a lucene memory fix for java heap OOM issues.  (
http://issues.apache.org/jira/browse/LUCENE-754)

Different servlet container (Tomcat 5.5) and we're running JRE 5 v9.

After applying the update to the solr bits that included the patch mentioned
above, OOM has never re-appeared.

-- j

On 7/30/07, Mike Klaas [EMAIL PROTECTED] wrote:

 On 30-Jul-07, at 11:35 AM, David Whalen wrote:

  Hi Yonik!
 
  I'm glad to finally get to talk to you.  We're all very impressed
  with solr and when it's running it's really great.
 
  We increased the heap size to 1500M and that didn't seem to help.
  In fact, the crashes seem to occur more now than ever.  We're
  constantly restarting solr just to get a response.

 How much memory is on the system, and is anything else running?  How
 large is the resulting index?

 If you're willing for some queries to take longer after a commit,
 reducing/eliminating the autoWarmCount for your queryCache and
 facetCache should decrease the peak memory usage (as Solr has two
 copies of the cache open at that point).  Setting it to zero could
 halve the peak memory usage (at the cost of a loss of performance
 after commits).

 As yonik suggested, check for PERFORMANCE warnings too--you may have
 more than two Searchers open at once.

 -Mike





Acceptable schema def?

2007-07-23 Thread Jeff Rodenburg

As an example, consider the following:

<dynamicField name="*_field" type="text_ws" indexed="true" stored="true"/>
<copyField source="yadayada_field" dest="all_fields"/>
<field name="all_fields" type="text_ws" indexed="true" stored="false"/>

Two questions:
1) Is a source attribute for a copyField node that matches a dynamicField
definition valid?
2) Is the dest attribute for a copyField node required to be implemented as
a field node?  Could it be a dynamic field?  For example, could the dest
attribute in the above example be set to "mega_field" (since that would
match the dynamicField definition)?

I'll test these myself later, but don't have access to a solr instance to
play with this stuff right now.

thanks,
j


Re: Acceptable schema def?

2007-07-23 Thread Jeff Rodenburg

As an example, consider the following:

<dynamicField name="*_field" type="text_ws" indexed="true" stored="true"/>

<copyField source="yadayada_field" dest="all_fields"/>
<field name="all_fields" type="text_ws" indexed="true" stored="false"/>

Two questions:
1) Is a source attribute for a copyField node that matches a dynamicField
definition valid?
2) Is the dest attribute for a copyField node required to be implemented
as a field node?  Could it be a dynamic field?  For example, could the
dest attribute in the above example be set to "mega_field" (since that would
match the dynamicField definition)?

I'll test these myself later, but don't have access to a solr instance to
play with this stuff right now.



Another funky example to ponder:

<dynamicField name="*_field" type="string" indexed="true" stored="true"/>
<copyField source="*_field" dest="all_fields"/>
<field name="all_fields" type="text_ws" indexed="true" stored="false"/>


Solrsharp: direction

2007-07-08 Thread Jeff Rodenburg

I've been asked a few questions of late that all have a familiar theme:
what's going on with solrsharp development?  Well, I've been working on the
next iteration of the Solrsharp client library, attempting to bring it more
in line with the capabilities of Solr, at least as of the 1.2 release.  The
goal of the Solrsharp project is to enable C# applications to take full
advantage of Solr.

Here's what's happening: the main feature in development right now is the
creation of RequestHandler objects.  Solrsharp uses default handlers for
queries and updates (/select and /update); the RequestHandler objects will
make solr request handlers assignable to any query.  While assigning a
request handler for a specific query is an active step, loading the
solr-configured request handlers will be passive.  The default handlers will
still apply, in case you don't require any different handlers.  If anyone
has suggestions or comments around this, please pass them along.

Ideally, we would begin thinking about Solr 1.3 features and how Solrsharp
would be extended to utilize those as well.  Any comments about future
capabilities and what clients need to do to take advantage of those are
welcome.

cheers,
jeff r.


Re: SolrSharp boost - int vs. float

2007-07-05 Thread Jeff Rodenburg

Nope, other than just oversight.

I just modified the QueryParameter class to change the _boost and Boost
variable  property to type float, and all works well.  I'll log an issue in
JIRA and update the source.

thanks otis,
jeff


On 7/5/07, Otis Gospodnetic [EMAIL PROTECTED] wrote:


Hi,



Here is a quick one for Jeff R. about his SolrSharp client.  Looking at

http://solrstuff.org/svn/solrsharp/src/Query/Parameters/QueryParameter.cs, I 
see boost defined as an int(eger):


private int _boost = 1;



Lucene's boosts are floats (see
http://lucene.apache.org/java/2_2_0/api/org/apache/lucene/search/Query.html#getBoost())



Is there a reason boosts are ints in SolrSharp?



Thanks,

Otis







Re: solrsharp thoughts

2007-07-05 Thread Jeff Rodenburg

Thanks Ryan.  Comments below.

On 7/5/07, Ryan McKinley [EMAIL PROTECTED] wrote:


I just took a quick look at solrsharp.  I don't really have to use it
yet, so this is not an in depth review.

I like the templated SearchResults -- that seems useful.



That has proven to be extremely useful in our implementation.  The template
gives you the base stuff, and the implementation allows us to strongly type
our results which makes programmatic usage easier to deal with.


I don't quite follow the need to parse the SolrSchema on the client

side?  Is that to know what fields are available?  Could the same thing
be achieved reading the response from the luke request handler?  I only
worry about it is as something to keep in sync with the java impl.



There's no real need to parse SolrSchema in order to execute searches or to
add/update docs to the search index.  The SolrSchema object was just a way
of gathering schema definition and using it for whatever purpose it might
make sense.

It would be good to be able to change the request paths.  While /select

and /update will usually work, it is possible to put stuff elsewhere.



This is a TODO item.  The original library was constructed around the
default paths, prior to the 1.1 release.  The TODO item is actually one
where named request handlers should be accessible objects that can be
assigned to both searches and index updates.

nitpick: FacetParameter.Zeros - should use facet.mincount instead.

(facet.mincount=0 is the same behavior)



Yep, another TODO item.  I actually have this in place in development,
pending a review of all facet parameters against the 1.2 release for
accuracy.

ryan




SolrRequestHandler question

2007-06-28 Thread Jeff Rodenburg

I have a search use case that requires that I use the results of a search from
IndexA and apply them as a query component of a second search to IndexB.
(The nature of the data doesn't allow me to combine these indexes).
At present, this is handled at the client level: search one index, get the
results, apply them to a search against another index.

I can't change the two-query fundamentals, but I'd like to hide the
implementation from the client.  If I wanted to concentrate this logic at
the server, should I be considering a custom request handler?

The request handler would:
- accept the query parameters
- use a subset of parameters to build a query against another search index
- execute that query, gather the results
- use those results as new parameters in another query
- execute the second query

I'm sure this isn't atypical, how are others accomplishing this?
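As a sketch of the client-side workaround being described (hypothetical helper name and plain Lucene query syntax; the real parameter plumbing would go through your Solr client), the result of the first search can be folded into the second query like this:

```python
def build_second_query(user_query, ids_from_index_a, id_field="id"):
    """Fold IDs returned by a first search (IndexA) into the query for IndexB."""
    if not ids_from_index_a:
        # No matches in the first index: emit a clause that matches nothing.
        return f"({user_query}) AND {id_field}:__none__"
    id_clause = " OR ".join(f"{id_field}:{i}" for i in ids_from_index_a)
    return f"({user_query}) AND ({id_clause})"

print(build_second_query("solr", [3, 7]))  # (solr) AND (id:3 OR id:7)
```

A custom request handler would do the same composition server-side, hiding the two-query dance from callers.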

thanks,
j


Re: Recent updates to Solrsharp

2007-06-21 Thread Jeff Rodenburg

great, thanks Yonik.

On 6/20/07, Yonik Seeley [EMAIL PROTECTED] wrote:


On 6/21/07, Jeff Rodenburg [EMAIL PROTECTED] wrote:
 As an aside, it would be nice to record these issues more granularly in
 JIRA.  Could we get a component created for our client library, similar
to
 java/php/ruby?

Done.

-Yonik



Re: SolrSharp example

2007-06-20 Thread Jeff Rodenburg

Hi Michael -

Moving this conversations to the general solr mailing list...



 1. SolrSharp example solution works with schema.xml from

apache-solr-1.1.0-incubating. If I'm using schema.xml from
apache-solr-1.2.0, the example program doesn't update the index...

I didn't realize the solr 1.2 release code sample schema.xml was different
from the solr 1.1 version.  In my implementation, I had solr 1.1 already
installed and upgraded to 1.2 by replacing the war file (per the
instructions in solr.)  So, the example code is geared to go against
the 1.1 schema.

For the example code, adding the timestamp field in the
ExampleIndexDocument public constructor such as:

   this.Add(new IndexFieldValue("timestamp", DateTime.Now.ToString("s") + "Z"));

will take care of the solr 1.2 schema invalidation issue.

The addition of the @default attribute on this field in the schema is not
presently accommodated in the validation routine.  If I'm not mistaken, the
default attribute value will be applied for all documents without that field
present in the xml payload.  This would imply that any field with a default
attribute is not required for any implemented UpdateIndexDocument.  I'll
look into this further.
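The rule being proposed — a schema field with a default attribute should not be required in a posted document — can be modeled generically. This is a sketch of the idea, not SolrSharp's actual validation code:

```python
def is_valid_update_document(schema_fields, doc_fields):
    """Valid if every schema field is present in the doc, defaulted, or optional.

    schema_fields maps a field name to {"required": bool, "default": value or None}.
    """
    for name, spec in schema_fields.items():
        if name in doc_fields:
            continue
        if spec.get("default") is not None:
            continue  # the server fills it in, e.g. timestamp default="NOW"
        if spec.get("required"):
            return False
    return True

schema = {"id": {"required": True}, "timestamp": {"required": True, "default": "NOW"}}
print(is_valid_update_document(schema, {"id": "101"}))  # True
```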



2. When I run example with schema.xml from

apache-solr-1.1.0-incubating, the program
throws an Exception

Hmmm, can't really help you with this one.  It sounds as if solr is
incurring an error when the xml is posted to the server.  Try the standard
step-through troubleshooting routines to see what messages are being passed
back from the server.



-- j







On 6/19/07, Michael Plax [EMAIL PROTECTED] wrote:


 Hello Jeff,

thank you again for updating files.
I just run with some  problems. I don't know what is the best way to
report them solr maillist/solrsharp jira.

1.
SolrSharp example solution works with schema.xml from
apache-solr-1.1.0-incubating.
   If I'm using schema.xml from apache-solr-1.2.0, the example program doesn't
update the index because:

   line 33: if (solrSearcher.SolrSchema.IsValidUpdateIndexDocument(iDoc))
return false.
   update falls because of configuration file

schema.xml file:

line 265: <field name="word" type="string" indexed="true"
stored="true"/>
...
line 279: <field name="timestamp" type="date" indexed="true"
stored="true" default="NOW" multiValued="false"/>

those fields word, timestamp don't pass validation in SolrSchema.cs line 217.

2.
When I run the example with schema.xml from apache-solr-1.1.0-incubating,
the program throws an Exception

System.Exception was unhandled
  Message="Http error in request/response to
http://localhost:8983/solr/update/"
  Source=SolrSharp
  StackTrace:
   at org.apache.solr.SolrSharp.Configuration.SolrSearcher.WebPost(String
url, Byte[] bytesToPost, String statusDescription) in
E:\SOLR-CSharp\src\Configuration\SolrSearcher.cs:line 229
   at org.apache.solr.SolrSharp.Update.SolrUpdater.PostToIndex(IndexDocument
oDoc, Boolean bCommit) in E:\SOLR-CSharp\src\Update\SolrUpdater.cs:line 70
   at SolrSharpExample.Program.Main(String[] args) in
E:\SOLR-CSharp\example\Program.cs:line 35
   at System.AppDomain.nExecuteAssembly(Assembly assembly, String[]
args)
   at System.AppDomain.ExecuteAssembly(String assemblyFile, Evidence
assemblySecurity, String[] args)
   at Microsoft.VisualStudio.HostingProcess.HostProc.RunUsersAssembly
()
   at System.Threading.ThreadHelper.ThreadStart_Context(Object state)
   at System.Threading.ExecutionContext.Run(ExecutionContext
executionContext, ContextCallback callback, Object state)
   at System.Threading.ThreadHelper.ThreadStart()

xmlstring value from oDoc.SerializeToString()

<?xml version=\"1.0\" encoding=\"utf-8\"?><add
xmlns:xsi=\"http://www.w3.org/2001/XMLSchema-instance\"
xmlns:xsd=\"http://www.w3.org/2001/XMLSchema\"><doc><field name=\"id\">101</field><field
name=\"name\">One oh one</field><field name=\"manu\">Sony</field><field
name=\"cat\">Electronics</field><field name=\"cat\">Computer</field><field
name=\"features\">Good</field><field name=\"features\">Fast</field><field
name=\"features\">Cheap</field><field name=\"includes\">USB cable</field><field
name=\"weight\">1.234</field><field name=\"price\">99.99</field><field
name=\"popularity\">1</field><field name=\"inStock\">True</field></doc></add>

I checked all the features from the Solr tutorial; they are working. I'm running
solr on Windows XP Pro without firewall.

Do you know how to solve those problems? Do you recommend to handle all
communication by maillist/jira ?

Regards
Michael




Re: SolrSharp example

2007-06-20 Thread Jeff Rodenburg

On 6/20/07, Yonik Seeley [EMAIL PROTECTED] wrote:


On 6/20/07, Michael Plax [EMAIL PROTECTED] wrote:
 This is a log that I got after running the SolrSharp example. I think
example
 program posts not properly formatted xml.
 I'm running Solr on Windows XP, Java 1.5. Are those settings could be
the
 problem?

Solr1.2 is pickier about the Content-type in the HTTP headers.
I bet it's being set incorrectly.




Ahh, good point.  Within SolrSearcher.cs, the WebPost method contains this
setting:

oRequest.ContentType = "application/x-www-form-urlencoded";

Looking through the CHANGES.txt file in the 1.2 tagged release on svn:

9. The example solrconfig.xml maps /update to XmlUpdateRequestHandler using
the new request dispatcher (SOLR-104). This requires posted content to have
a valid contentType: curl -H 'Content-type:text/xml; charset=utf-8'.  The
response format matches that of /select and returns standard error codes. To
enable solr1.1 style /update, do not map /update to any handler in
solrconfig.xml (ryan)

For SolrSearcher.cs, it sounds as though changing the ContentType setting to
text/xml may fix this issue.

I don't have a 1.2 instance to test this against available to me right now,
but can check this later.  Michael, try updating your SolrSearcher.cs file
for this content-type setting to see if that resolves your issue.
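To make the shape of the fix concrete (shown in Python rather than C#, and assuming the stock local example URL), the update POST just needs an XML content type:

```python
import urllib.request

# Build (but don't send) an update POST the way Solr 1.2 expects it:
# the body is <add>...</add> XML and the Content-Type must be text/xml.
doc_xml = b'<add><doc><field name="id">101</field></doc></add>'
req = urllib.request.Request(
    "http://localhost:8983/solr/update",  # assumed local example instance
    data=doc_xml,
    headers={"Content-Type": "text/xml; charset=utf-8"},
    method="POST",
)
print(req.get_header("Content-type"))  # text/xml; charset=utf-8
```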


thanks,
jeff r.


Re: SolrSharp example

2007-06-20 Thread Jeff Rodenburg

Thanks for checking, Michael -- great find.  I'm in process of readying this
same fix for inclusion in the source code (I'm verifying against a
full 1.2 install.)

The SolrField class is now also being extended to incorporate an IsDefaulted
property, which will permit the SolrSchema.IsValidUpdateIndexDocument to
yield true when default value fields aren't present in the update request.

thanks,
jeff r.



On 6/20/07, Michael Plax [EMAIL PROTECTED] wrote:


Hello,

Yonik and Jeff thank you for your help.
You are right this was content-type issue.

in order to run example  following things need to be done:

1.Code (SolrSharp) should be changed
from:
src\Configuration\SolrSearcher.cs(217): oRequest.ContentType =
"application/x-www-form-urlencoded";
to:
src\Configuration\SolrSearcher.cs(217): oRequest.ContentType =
"text/xml";

2. In order take care of the solr 1.2 schema invalidation issue:
schema.xml
comment line: 265
<!-- <field name="word" type="string" indexed="true" stored="true"/> -->
comment line: 279
<!-- <field name="timestamp" type="date" indexed="true" stored="true"
default="NOW" multiValued="false"/> -->
or as Jeff suggested:
For the example code, adding the timestamp field in the
ExampleIndexDocument public constructor such as:
this.Add(new IndexFieldValue("timestamp",
DateTime.Now.ToString("s") + "Z"));

Regards
Michael




- Original Message -
From: Jeff Rodenburg [EMAIL PROTECTED]
To: solr-user@lucene.apache.org
Sent: Wednesday, June 20, 2007 1:56 PM
Subject: Re: SolrSharp example


 On 6/20/07, Yonik Seeley [EMAIL PROTECTED] wrote:

 On 6/20/07, Michael Plax [EMAIL PROTECTED] wrote:
  This is a log that I got after running the SolrSharp example. I think
 example
  program posts not properly formatted xml.
  I'm running Solr on Windows XP, Java 1.5. Are those settings could be
 the
  problem?

 Solr1.2 is pickier about the Content-type in the HTTP headers.
 I bet it's being set incorrectly.



 Ahh, good point.  Within SolrSearcher.cs, the WebPost method contains
this
 setting:

 oRequest.ContentType = "application/x-www-form-urlencoded";

 Looking through the CHANGES.txt file in the 1.2 tagged release on svn:

 9. The example solrconfig.xml maps /update to XmlUpdateRequestHandler
 using
 the new request dispatcher (SOLR-104). This requires posted content to
 have
 a valid contentType: curl -H 'Content-type:text/xml;
charset=utf-8'.  The
 response format matches that of /select and returns standard error
codes.
 To
 enable solr1.1 style /update, do not map /update to any handler in
 solrconfig.xml (ryan)

 For SolrSearcher.cs, it sounds as though changing the ContentType
setting
 to
 text/xml may fix this issue.

 I don't have a 1.2 instance to test this against available to me right
 now,
 but can check this later.  Michael, try updating your SolrSearcher.cs file
 for this content-type setting to see if that resolves your issue.


 thanks,
 jeff r.





Recent updates to Solrsharp

2007-06-20 Thread Jeff Rodenburg

Thanks to Yonik, Michael, Ryan, (and others) for some recent help on various
issues discovered with Solrsharp.  We were able to discover a few issues
with the library relative to the Solr 1.2 release.  Those issues have been
remedied and have been pushed into source control.

The Solrsharp source code can be obtained at:
http://solrstuff.org/svn/solrsharp.

Recent fixes include:
- Fix for broken DeleteIndexDocument xml serialization
- Update to correct document posting content-type to solr 1.2 instance
- Identifying schema fields with new IsDefaulted property
- Updates to the example application to incorporate these fixes and the solr
1.2 sample schema
- Updated documentation consistent with these changes

As an aside, it would be nice to record these issues more granularly in
JIRA.  Could we get a component created for our client library, similar to
java/php/ruby?

cheers,
j


Update to SolrSharp

2007-06-13 Thread Jeff Rodenburg

Solrsharp has been validated against the Solr 1.2 release.  Validation was
made using the example application that's available with the Solrsharp code
against a default example index with the Solr 1.2 released bits.

- The source code for Solrsharp is now accessible via subversion.  Many
thanks to Ryan McKinley for hosting the codebase.  You can find it at:

   http://solrstuff.org/svn/solrsharp

- A new folder has been added: docs/api.  We have MSDN-style documentation
to help explain the full library.  When you update from the repository, just
point your browser to the local file at /docs/api/index.html.

As always, send your praise or complaints this direction.

cheers,
jeff r.


Re: solr+hadoop = next solr

2007-06-08 Thread Jeff Rodenburg

On 6/7/07, Rafael Rossini [EMAIL PROTECTED] wrote:


Hi, Jeff and Mike.

   Would you mind telling us about the architecture of your solutions a
little bit? Mike, you said that you implemented a highly-distributed
search
engine using Solr as indexing nodes. What does that mean? You guys
implemented a master, multi-slave solution for replication? Or the whole
index shards for high availability and fail over?



Our solution doesn't use solr, but goes directly to lucene.  It's built on
windows, so the interop communication service is built on .net remoting (tcp
based).  Microsoft has deprecated ongoing development with .net remoting, in
favor of other more standard mechanisms, i.e. http.  So, we're looking to
migrate our solution to a more community-supported model.

The underlying structure sounds similar to what others have done: index
shards distributed to various servers, each responsible for a subset of the
index.  A merging server handles coordination of concurrent thread requests
and synchronizes the results as they're returned.  The thread coordination
and search results interleaving process is functional but not really
scalable.  It works for our user model, where users tend not to page deeply
through results.  We want to change that so we can use solr as our primary
data source read mechanism for our site.
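The interleaving step a merging server performs can be sketched as a k-way merge over per-shard hit lists already sorted by descending score; the tuple shape here is hypothetical, not our actual wire format:

```python
import heapq

def merge_shard_results(shard_results, rows):
    """k-way merge of per-shard hit lists, each pre-sorted by descending score."""
    # heapq.merge yields ascending order, so key on the negated score.
    merged = heapq.merge(*shard_results, key=lambda hit: -hit[0])
    return list(merged)[:rows]

shard_a = [(0.9, "doc1"), (0.4, "doc5")]
shard_b = [(0.8, "doc2"), (0.7, "doc3")]
print(merge_shard_results([shard_a, shard_b], 3))
# [(0.9, 'doc1'), (0.8, 'doc2'), (0.7, 'doc3')]
```

Because each shard's list is already sorted, the merge is linear in the number of hits pulled — but deep paging still forces every shard to return start+rows hits, which is why it degrades for users who page far into results.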

-- j


Re: solr+hadoop = next solr

2007-06-07 Thread Jeff Rodenburg

Mike - thanks for the comments.  Some responses added below.

On 6/7/07, Mike Klaas [EMAIL PROTECTED] wrote:



I've implemented a highly-distributed search engine using Solr (200m
docs and growing, 60+ servers).   It is not a Solr-based solution in
the vein of FederatedSearch--it is a higher-level architecture that
uses Solr as indexing nodes.  I'll note that it is a lot of work and
would be even more work to develop in the generic extensible
philosophy that Solr espouses.



Yeah, we've done the same thing in the .Net world, and it's a tough slog.
We're in the same situation -- making our solution generically extensible is
pretty much a non-starter.


In terms of the FederatedSearch wiki entry (updated last year), has
 there
 been any progress made this year on this topic, at least something
 worthy of
 being added or updated to the wiki page?  Not to splinter efforts
 here, but
 maybe a working group that was focused on that topic could help to
 move
 things forward a bit.

I don't believe that absence of organization has been the cause of
lack of forward progress on this issue, but simply that there has
been no-one sufficiently interested and committed to prioritizing
this huge task to work on it.  There is no need to form a working
group (not when there are only a handful of active committers to
begin with)--all interested people could just use solr-dev@ for
discussion.



That makes sense, just didn't want to bombard the list with the subject if
it was a detractor from the core project, i.e. keep lucene messages on
lucene, solr messages on solr, etc.  The good-community-participant
approach, if you will.

Solr is an open-source project, so huge features will get implemented

when there is a person or group of people devoted to leading the
charge on the issue.  If you're interested in being that person,
that's great!



Glad to jump in, not sure I qualify as such for that, but certainly a big
cheerleader nonetheless.


Re: solr+hadoop = next solr

2007-06-06 Thread Jeff Rodenburg

I've been exploring distributed search, as of late.  I don't know about the
next solr but I could certainly see a distributed solr grow out of such
an expansion.

In terms of the FederatedSearch wiki entry (updated last year), has there
been any progress made this year on this topic, at least something worthy of
being added or updated to the wiki page?  Not to splinter efforts here, but
maybe a working group that was focused on that topic could help to move
things forward a bit.

- j

On 6/6/07, Yonik Seeley [EMAIL PROTECTED] wrote:


On 6/6/07, James liu [EMAIL PROTECTED] wrote:
 anyone agree?

No ;-)

At least not if you mean using map-reduce for queries.

When I started looking at distributed search, I immediately went and
read the map-reduce paper (easier concept than it first appeared), and
realized it's really more for the indexing side of things (big batch
jobs, making data from data, etc).  Nutch uses map reduce for
crawling/indexing, but not for querying.

-Yonik



Re: distributed search

2007-06-03 Thread Jeff Rodenburg

David -

It depends on what distributed means in your question.

If you're looking for high availability, that can be accomplished through
typical load balancing schemes for the servlet container that's running
solr.  Solr helps out in this respect with a replication scheme using rsync
that keeps the indexes updated on all load-balanced nodes.

If you're looking for support for bigger indexes that don't fit inside one
solr instance (multiple solr instances = one search index), it's presently
not available (as far as I know.)  Work has progressed in the area of
federated search (http://wiki.apache.org/solr/FederatedSearch).  There are
many challenges to accomplishing this; the wiki outlines where progress has
been made.

-- j



On 6/3/07, David Xiao [EMAIL PROTECTED] wrote:


Hello all,



Is there distributed support in Solr search engine? For example install
solr instance on different server and have them load balanced.

Anyway, any suggestion/experience about Solr distributed search topic is
appreciated.



Regards,

David




Re: read only indexes?

2007-05-25 Thread Jeff Rodenburg

We're controlling this with Tomcat configuration on our end.  I'm not a
servlet-container guru, but I would imagine similar capabilities exist on
Jetty, et al.

-- j

On 5/24/07, Ryan McKinley [EMAIL PROTECTED] wrote:


Is there a good way to force an index to be read-only?

I could configure a dummy handler to sit on top of /update and throw an
error, but i'd like a stronger assurance that nothing can call
UpdateHandler.addDoc()




Solrsharp feedback

2007-04-24 Thread Jeff Rodenburg

I sent a few messages to the list about Solrsharp, the C# library for
working with Solr, a couple of weeks ago.  This was the first iteration of
the library and something I expected to see modified as others got a chance
to review it.  I've not heard any feedback since then, though.

For those that have checked out the code, is it working for you?  Does it
make sense?

thanks,
jeff r.


Re: Requests per second/minute monitor?

2007-04-18 Thread Jeff Rodenburg

Not yet from us, but I'm thinking about a nagios plugin for Solr.  It would
be tomcat-based for the http stuff, however.

On 4/18/07, Walter Underwood [EMAIL PROTECTED] wrote:


Is there a good spot to track request rate in Solr? Has anyone
built a monitor?

wunder
--
Search Guru
Netflix




Re: SolrSharp - a C# client API for Solr

2007-04-10 Thread Jeff Rodenburg

It will be extremely helpful to get this in the hands of others.  Like most
packages, this was built out of need.  As we get more eyes on it, I hope to
see it improve at the same rate as change in Solr.

I promised a few other additions to this set.  Here's what I'm working on:

- More content within the documentation about how to use the api.  It's
strongly object-oriented and usage requires you to put together your own set
of classes that inherit from abstract classes in the library.  The example
code does it, but it's not clear how or why you do it, so some guidance is
needed.  I should probably add a wiki entry on the Solr site as well.
- Nunit tests need to be added.  These always get complex when involving
distributed systems, but such is life.

-- jeff



On 4/10/07, JimS [EMAIL PROTECTED] wrote:


Thanx for the great contribution Jeff!  A hand clap to the Solr team too.


I am looking forward to using Solr and Solr# in the coming months.  Your
client is going to be a great help.

regards,
-jim


On 4/9/07, Jeff Rodenburg [EMAIL PROTECTED] wrote:

 All -

 I'm proud to announce a release to a new client API for Solr --
SolrSharp.
 SolrSharp is a C# library that abstracts the interoperation of a solr
 search
 server.  This is an initial release that covers the basics of working
with
 Solr.  The library is very fleshed out, but the example has only
 implemented
 simple keyword search.  I really like the library (I'm a dogfood user,
for
 sure) because I can strongly type different types of objects to search
 results.

 There's more forthcoming, i.e. more examples, but the basics are in
place.
 Feedback always appreciated, suggestions for improvement are nice, and
 helping hands are the best.

 Until there's a better home for it, you can download the bits from JIRA
 at:
 https://issues.apache.org/jira/browse/SOLR-205

 cheers,
 jeff r.




Re: Question about code contribution

2007-04-09 Thread Jeff Rodenburg

Perfect, thanks Otis.  Nice to hear from you, btw.

cheers,
j



On 4/6/07, Otis Gospodnetic [EMAIL PROTECTED] wrote:


Yes, each file needs to contain the license.  Look at any .java file to
see what should go there and where.

Otis
. . . . . . . . . . . . . . . . . . . . . . . . . . . . . .
Simpy -- http://www.simpy.com/  -  Tag  -  Search  -  Share

- Original Message 
From: Jeff Rodenburg [EMAIL PROTECTED]
To: solr-user@lucene.apache.org
Sent: Friday, April 6, 2007 11:16:28 AM
Subject: Re: Question about code contribution

Whoops, typo: ...do the source code files need to contain the boilerplate
Apache license.



On 4/6/07, Jeff Rodenburg [EMAIL PROTECTED] wrote:

 If I'm contributing new source files (separate project entirely) through
 JIRA, so the source code files need to contain the boilerplate Apache
 license/disclaimers and the like?  This is new code and a new project
(C#),
 and the wiki page on contributions (
 http://wiki.apache.org/solr/HowToContribute) is mostly concerned with
core
 Solr code.

 If there's a checklist of items that should be included, please forward
or
 send me the link.

 cheers,
 j







SolrSharp - a C# client API for Solr

2007-04-09 Thread Jeff Rodenburg

All -

I'm proud to announce a release to a new client API for Solr -- SolrSharp.
SolrSharp is a C# library that abstracts the interoperation of a solr search
server.  This is an initial release that covers the basics of working with
Solr.  The library is very fleshed out, but the example has only implemented
simple keyword search.  I really like the library (I'm a dogfood user, for
sure) because I can strongly type different types of objects to search
results.

There's more forthcoming, i.e. more examples, but the basics are in place.
Feedback always appreciated, suggestions for improvement are nice, and
helping hands are the best.

Until there's a better home for it, you can download the bits from JIRA at:
https://issues.apache.org/jira/browse/SOLR-205

cheers,
jeff r.


Question about code contribution

2007-04-06 Thread Jeff Rodenburg

If I'm contributing new source files (separate project entirely) through
JIRA, so the source code files need to contain the boilerplate Apache
license/disclaimers and the like?  This is new code and a new project (C#),
and the wiki page on contributions (
http://wiki.apache.org/solr/HowToContribute) is mostly concerned with core
Solr code.

If there's a checklist of items that should be included, please forward or
send me the link.

cheers,
j


Re: Question about code contribution

2007-04-06 Thread Jeff Rodenburg

Whoops, typo: ...do the source code files need to contain the boilerplate
Apache license.



On 4/6/07, Jeff Rodenburg [EMAIL PROTECTED] wrote:


If I'm contributing new source files (separate project entirely) through
JIRA, so the source code files need to contain the boilerplate Apache
license/disclaimers and the like?  This is new code and a new project (C#),
and the wiki page on contributions (
http://wiki.apache.org/solr/HowToContribute) is mostly concerned with core
Solr code.

If there's a checklist of items that should be included, please forward or
send me the link.

cheers,
j



Re: C# API for Solr

2007-04-05 Thread Jeff Rodenburg

I'm working on it right now.  The library is largely done, but I need to add
some documentation and a few examples for usage.

No promises, but I hope to have something available in the next few days.

-- j

On 4/5/07, Mike Austin [EMAIL PROTECTED] wrote:


I would be very interested in this. Any idea on when this will be
available?

Thanks

-Original Message-
From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED]
Sent: Monday, April 02, 2007 1:44 AM
To: solr-user@lucene.apache.org
Subject: Re: C# API for Solr


Well, i think there will be a lot of people who will be very happy with
this C# client.

grts,m




Jeff Rodenburg [EMAIL PROTECTED]
31/03/2007 18:00
Please respond to
solr-user@lucene.apache.org


To
solr-user@lucene.apache.org
cc

Subject
C# API for Solr






We built our first search system architecture around Lucene.Net back in
2005
and continued to make modifications through 2006.  We quickly learned that
search management is so much more than query algorithms and indexing
choices.  We were not readily prepared for the operational overhead that
our
Lucene-based search required: always-on availability, fast response times,
batch and real-time updates, etc.

Fast forward to 2007.  Our front-end is Microsoft-based, but we needed to
support parallel development on non-Microsoft architecture, and thus
needed
a cross-platform search system.  Hello Solr!  We've transitioned our
search
system to Solr with a Linux/Tomcat back-end, and it's been a champ.  We
now
use solr not only for standard keyword search, but also to drive queries
for
lots of different content sections on our site.  Solr has moved beyond
mission critical in our operation.

As we've proceeded, we've built out a nice C# client library to abstract
the
interaction from C# to Solr.  It's mostly generic and designed for

extensibility.  With a few modifications, this could be a stand-alone
library
that works for others.

I have clearance from the organization to contribute our library to the
community if there's interest.  I'd first like to gauge the interest of
everyone before doing so; please reply if you do.

cheers,
jeff r.





Re: problems finding negative values

2007-04-04 Thread Jeff Rodenburg

This one caught us as well.

Refer to
http://lucene.apache.org/java/docs/queryparsersyntax.html#Escaping%20Special%20Characters
for understanding what characters need to be escaped for your queries.
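As an illustration, a small helper that backslash-escapes the query-parser special characters listed on that page (check the exact character set against your Lucene version) would look like:

```python
# Special characters per the Lucene query-parser syntax page; single | and &
# are included conservatively even though Lucene only treats && and || as operators.
LUCENE_SPECIAL = '\\+-!():^[]"{}~*?|&'

def escape_query_term(term):
    """Backslash-escape Lucene query-parser special characters in a term."""
    return "".join("\\" + ch if ch in LUCENE_SPECIAL else ch for ch in term)

print(escape_query_term("-1861807411"))  # \-1861807411
```

Escaping the leading minus is exactly what the original question needs: unescaped, `-1861807411` is parsed as a prohibited clause rather than a negative number.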



On 4/4/07, galo [EMAIL PROTECTED] wrote:


Hi,

I have an index consisting on the following fields:

<field name="id" type="long" indexed="true" stored="true"/>
<field name="length" type="integer" indexed="true" stored="true"/>
<field name="key" type="integer" indexed="true" stored="true"
multiValued="true"/>

Each doc has a few key values, some of which are negative.

Ok, I know there's a document that has both 826606443 and -1861807411

If I search with


http://localhost:8080/solr/select/?stylesheet=&version=2.1&start=0&rows=50&indent=on&q=-1861807411&fl=id,length,key

I get no results, but if I do


http://localhost:8080/solr/select/?stylesheet=&version=2.1&start=0&rows=50&indent=on&q=826606443&fl=id,length,key

I get the document as expected.

Obviously the key field is configured as a search field, indexed, etc.
but somehow solr doesn't like negatives. I'm assuming this might have
something to do with analysers but can't tell how to fix it.. any ideas??

Thanks

galo



Re: org.apache.jasper.JasperException: Exception in JSP: /admin/_info.jsp:27

2007-04-03 Thread Jeff Rodenburg

Whenever I've encountered this, the cause has nearly always been starting
tomcat from the wrong current working directory.
I went through the example install a few weeks ago, line by line, from the
wiki page for Tomcat and it ran fine.  I'm running 5.5.17, and have done
this on both FC5 and FC6.

Other things of importance: proper chmod settings on /bin under
apache-tomcat.

Hope this helps.

-- j

On 4/3/07, Karen Loughran [EMAIL PROTECTED] wrote:




Hi all,

I'm trying to install Solr in a Tomcat 5.5.17 container on Linux Fedora
core 5.  I receive org.apache.jasper.JasperException: Exception in
JSP: /admin/_info.jsp:27.  Full error given below.

I'm following the instructions on the WIKI

I have copied the solr.war (from apache-solr-1.1.0) to
$CATALINA_HOME/webapps.
I have copied the example solr home example/solr as a template for my
solr home.
I then start tomcat from the same directory which contains this solr
directory as instructed in the wiki.


Any help would be much appreciated,
Thanks
Karen


Full Error:

type Exception report
INFO: Deploying web application archive solr.war
Apr 3, 2007 2:52:47 PM org.apache.solr.servlet.SolrServlet init
INFO: SolrServlet.init()
Apr 3, 2007 2:52:47 PM org.apache.solr.servlet.SolrServlet init
INFO: No /solr/home in JNDI
Apr 3, 2007 2:52:47 PM org.apache.solr.servlet.SolrServlet init


message

description The server encountered an internal error () that prevented
it from fulfilling this request.

exception

org.apache.jasper.JasperException: Exception in JSP: /admin/_info.jsp:27

24:
25: <%-- <jsp:include page="header.jsp"/> --%>
26: <%-- do a verbatim include so we can use the local vars --%>
27: <%@ include file="header.jsp" %>
28:
29: <br clear="all">
30: <table>


Stacktrace:
org.apache.jasper.servlet.JspServletWrapper.handleJspException(
JspServletWrapper.java:504)
org.apache.jasper.servlet.JspServletWrapper.service(
JspServletWrapper.java:375)
org.apache.jasper.servlet.JspServlet.serviceJspFile(
JspServlet.java:314)
org.apache.jasper.servlet.JspServlet.service(JspServlet.java:264)
javax.servlet.http.HttpServlet.service(HttpServlet.java:802)

root cause

javax.servlet.ServletException
org.apache.jasper.runtime.PageContextImpl.doHandlePageException(
PageContextImpl.java:858)
org.apache.jasper.runtime.PageContextImpl.handlePageException(
PageContextImpl.java:791)
org.apache.jsp.admin.index_jsp._jspService(index_jsp.java:313)
org.apache.jasper.runtime.HttpJspBase.service(HttpJspBase.java:97)
javax.servlet.http.HttpServlet.service(HttpServlet.java:802)
org.apache.jasper.servlet.JspServletWrapper.service(
JspServletWrapper.java:332)
org.apache.jasper.servlet.JspServlet.serviceJspFile(
JspServlet.java:314)
org.apache.jasper.servlet.JspServlet.service(JspServlet.java:264)
javax.servlet.http.HttpServlet.service(HttpServlet.java:802)

root cause

java.lang.NoClassDefFoundError
org.apache.jsp.admin.index_jsp._jspService(index_jsp.java:80)
org.apache.jasper.runtime.HttpJspBase.service(HttpJspBase.java:97)
javax.servlet.http.HttpServlet.service(HttpServlet.java:802)
org.apache.jasper.servlet.JspServletWrapper.service(
JspServletWrapper.java:332)
org.apache.jasper.servlet.JspServlet.serviceJspFile(
JspServlet.java:314)
org.apache.jasper.servlet.JspServlet.service(JspServlet.java:264)
javax.servlet.http.HttpServlet.service(HttpServlet.java:802)




Re: Troubleshooting java heap out-of-memory

2007-04-02 Thread Jeff Rodenburg

Hoping I can get a better response with a more directed question:

With facet queries and the fields used, what qualifies as a large number
of values?  The wiki uses U.S. states as an example, so the number of unique
values = 50.  More to the point, is there an algorithm that I can use to
estimate the cache consumption rate for facet queries?

-- j




On 4/1/07, Jeff Rodenburg [EMAIL PROTECTED] wrote:


I've read through the list entries here, the Lucene list, and the wiki
docs and am not resolving a major pain point  for us.  We've been trying to
determine what could possibly cause us to hit this in our given environment,
and am hoping more eyes on this issue can help.

Our scenario: 150MB index, ~144,000 documents, read/write servers in place
using standard replication.  Running Tomcat 5.5.17 on Redhat Enterprise
Linux 4.  Java configured to start with -Xmx1024m.  We encounter java heap
out-of-memory issues on the read server at staggered times, but usually once
every 48 hours.  Search request load is roughly 2 searches every 3 seconds,
with some spikes here or there.  We are using facets: 3 are based on type
integer, one is based on type string.  We are using sorts: 1 is based on
type sint, 2 are based on type date.  Caching is disabled.  Solr bits are
also from September 2006.

Is there anything in that configuration that we should interrogate?

thanks,
j



Re: Troubleshooting java heap out-of-memory

2007-04-02 Thread Jeff Rodenburg

On 4/2/07, Yonik Seeley [EMAIL PROTECTED] wrote:


On 4/1/07, Jeff Rodenburg [EMAIL PROTECTED] wrote:
 Our scenario: 150MB index, ~144,000 documents, read/write servers in place
 using standard replication.  Running Tomcat 5.5.17 on Redhat Enterprise
 Linux 4.  Java configured to start with -Xmx1024m.  We encounter java
heap
 out-of-memory issues on the read server at staggered times, but usually
once
 every 48 hours.

Could you do a grep through your server logs for WARNING, to
eliminate the possibility of multiple overlapping searchers causing
the OOM issue?



We're not seeing warnings for overlapping searchers prior to the oom
events.  Only SEVERE -- java.lang.OutOfMemoryError: Java heap space.

Are you doing incremental updates?  If so, try lowering your

mergeFactor for the index, or optimize more frequently.  As an index
is incrementally updated, old docs are marked as deleted and new docs
are added.  This leaves holes in the document id space which can
increase memory usage.  Both BitSet filters and FieldCache entry sizes
are proportionally related to maxDoc (the maximum internal docid in
the index).

You can see maxDoc from the statistics page... there might be a
correlation.



We are doing incremental updates, and we optimize quite a bit.  mergeFactor
presently set to 10.
maxDoc count = 144156
numDocs count = 144145
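[Editor's note: to make Yonik's point concrete — filter and FieldCache sizes track maxDoc, not numDocs, so the gap between the two (deleted-but-not-yet-merged docs) still costs memory until an optimize compacts the docid space. A rough Python illustration using the numbers above; the figures are approximations, not exact Lucene accounting:]

```python
max_doc, num_docs = 144156, 144145   # from the Solr statistics page

# BitSet filters cost roughly one bit per docid slot, i.e. maxDoc/8 bytes,
# whether or not the slot belongs to a deleted document.
bitset_bytes = max_doc // 8
deleted_slots = max_doc - num_docs

print(f"{bitset_bytes} bytes per BitSet filter, {deleted_slots} dead docids")
```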


Re: Troubleshooting java heap out-of-memory

2007-04-02 Thread Jeff Rodenburg

Thanks for the pointers, Mike.  I'm trying to determine the math to resolve
some strange numbers we're seeing.  Here's the top dozen lines from a jmap
analysis on a heap dump:

Size       Count     Class description
---------------------------------------------------
428246064  1792204   int[]
93175176   3213131   char[]
77195040   3216460   java.lang.String
67479112   3945      long[]
53073888   1658559   java.util.LinkedHashMap$Entry
39668352   1652848   org.apache.solr.search.HashDocSet
28195280   27131     byte[]
27165456   1697841   org.apache.lucene.index.Term
27024016   1689001   org.apache.lucene.search.TermQuery
22265920   695810    org.apache.lucene.document.Field
4931568    5974      java.lang.Object[]
4366768    77978     org.apache.lucene.store.FSIndexInput

I see the HashDocSet numbers (count=1.65 million), assume they have
references to the int arrays (count=1.79 million)  and wonder how I could
have so many of those in memory.  A few more data tidbits:

- Facet field Id1 = type int, unique values = 2710
- Facet field Id2 = type int, unique values = 65
- Facet field Id3 = type string, unique values = 15179

Thanks for the extra eyes on this, much appreciated.

-- j



On 4/2/07, Mike Klaas [EMAIL PROTECTED] wrote:


On 4/2/07, Jeff Rodenburg [EMAIL PROTECTED] wrote:
 With facet queries and the fields used, what qualifies as a large
number
 of values?  The wiki uses U.S. states as an example, so the number of
unique
 values = 50.  More to the point, is there an algorithm that I can use to
 estimate the cache consumption rate for facet queries?

The cache consumption rate is one entry per unique value in all
faceted fields, excluding fields that have faceting satisfied via
FieldCache (single-valued fields with exactly one token per document).

The size of each cached filter is num docs / 8 bytes, unless the
number of matching docs is less than the useHashSet threshold in
solrconfig.xml.

Sorting requires FieldCache population, which consists of an integer
per document plus the sum of the lengths of the unique values in the
field (less for pure int/float fields, but I'm not sure if Solr's sint
qualifies).

Both faceting and sorting shouldn't consume more memory after their
datastructures have been built, so it would be odd to see OOM after 48
hours if they were the cause.

-Mike
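[Editor's note: Mike's sizing rules can be turned into a quick back-of-envelope estimate. A hypothetical Python sketch of the worst case — every facet value cached as a full BitSet; real usage is lower because small sets become HashDocSets:]

```python
def facet_filter_upper_bound(max_doc, unique_value_counts):
    # One cached filter per unique value across all faceted fields;
    # worst case each filter is a BitSet of max_doc / 8 bytes.
    per_filter = max_doc / 8
    return sum(unique_value_counts.values()) * per_filter

# Jeff's numbers: maxDoc = 144156; facet fields with 2710, 65 and
# 15179 unique values.
est = facet_filter_upper_bound(144156, {"Id1": 2710, "Id2": 65, "Id3": 15179})
print(f"~{est / 1024 / 1024:.0f} MB worst case")
```

With these inputs the worst case lands in the low hundreds of megabytes, which is consistent with hitting OOM under a 1GB heap.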



Re: Troubleshooting java heap out-of-memory

2007-04-02 Thread Jeff Rodenburg

Sorry for the confusion.  We do have caching disabled.  I was asking the
question because I wasn't certain if the configurable cache settings applied
throughout, or if the FieldCache in lucene still came in play.

The two integer-based facets are single valued per document.  The
string-based facet is multiValued.



On 4/2/07, Chris Hostetter [EMAIL PROTECTED] wrote:



: values = 50.  More to the point, is there an algorithm that I can use to
: estimate the cache consumption rate for facet queries?

I'm confused ... i thought you said in your original mail that you had
all the caching disabled? (except for FieldCache which is so low level in
Lucene it's always used)

are the fields you are faceting on multiValued or single valued?


-Hoss




Re: Troubleshooting java heap out-of-memory

2007-04-02 Thread Jeff Rodenburg

Major version is 1.0.  The bits are from a nightly build from early
September 2006.

We do have plans to upgrade solr soon.

On 4/2/07, Yonik Seeley [EMAIL PROTECTED] wrote:


On 4/2/07, Jeff Rodenburg [EMAIL PROTECTED] wrote:
 We are doing incremental updates, and we optimize quite a
bit.  mergeFactor
 presently set to 10.
 maxDoc count = 144156
 numDocs count = 144145

What version of Solr are you using?  Another potential OOM (multiple
threads generating the same FieldCache entry) was fixed in later
versions of Lucene included with Solr.

-Yonik



Re: Troubleshooting java heap out-of-memory

2007-04-02 Thread Jeff Rodenburg

Yonik - is this the JIRA entry you're referring to?

http://issues.apache.org/jira/browse/LUCENE-754



On 4/2/07, Yonik Seeley [EMAIL PROTECTED] wrote:


On 4/2/07, Jeff Rodenburg [EMAIL PROTECTED] wrote:
 We are doing incremental updates, and we optimize quite a
bit.  mergeFactor
 presently set to 10.
 maxDoc count = 144156
 numDocs count = 144145

What version of Solr are you using?  Another potential OOM (multiple
threads generating the same FieldCache entry) was fixed in later
versions of Lucene included with Solr.

-Yonik



Re: C# API for Solr

2007-04-01 Thread Jeff Rodenburg

What would make things consistent for the client api's is a prescribed set
of implementations for a solr release.  For example, executing searches with
these parameters, support for facets requires those parameters, updates
should be called in this manner, etc.  For lack of a better term, a
loosely-coupled interface definition.  Those requirements could then be
versioned, and the various api's could advertise themselves as solr 1.0
compliant, solr 1.1 compliant, and so on.  The solr release dictates the
requirements for
compliance; the api maintainer is responsible for meeting those
requirements.  This would also be handy when certain features are
deprecated, i.e. when the /update url is changed.

Regarding C#, this would be easy enough to implement.  There are common
community methods for building/compilation, test libraries, and help
documentation, so doing things consistently with Erik and the solrb library
works for C# as well (and I assume most other languages.)

-- j


On 3/31/07, Chris Hostetter [EMAIL PROTECTED] wrote:



On a related note: We've still never really figured out how to deal with
integrating compilation or testing for client code into our main and build
system -- or for that matter how we should distribute them when we do our
next release, so if you have any suggestions regarding your C# client by
all means speak up ... in the mean time we can do the same thing Erik
started with solrb and flare: an isolated build system that makes sense to
the people who understand that language and rely on community to catch any
changes to Solr that might break clients.

-Hoss




Re: C# API for Solr

2007-04-01 Thread Jeff Rodenburg

Ryan - I'm working on cleanup to release this thing for the world to enjoy.

-- j

On 3/31/07, Ryan McKinley [EMAIL PROTECTED] wrote:


Yes yes!


On 3/31/07, Jeff Rodenburg [EMAIL PROTECTED] wrote:
 We built our first search system architecture around Lucene.Net back in
2005
 and continued to make modifications through 2006.  We quickly learned
that
 search management is so much more than query algorithms and indexing
 choices.  We were not readily prepared for the operational overhead that
our
 Lucene-based search required: always-on availability, fast response
times,
 batch and real-time updates, etc.

 Fast forward to 2007.  Our front-end is Microsoft-based, but we needed
to
 support parallel development on non-Microsoft architecture, and thus
needed
 a cross-platform search system.  Hello Solr!  We've transitioned our
search
 system to Solr with a Linux/Tomcat back-end, and it's been a champ.  We
now
 use solr not only for standard keyword search, but also to drive queries
for
 lots of different content sections on our site.  Solr has moved beyond
 mission critical in our operation.

 As we've proceeded, we've built out a nice C# client library to abstract
the
 interaction from C# to Solr.  It's mostly generic and designed for
 extensibility.  With a few modifications, this could be a stand-alone
library
 that works for others.

 I have clearance from the organization to contribute our library to the
 community if there's interest.  I'd first like to gauge the interest of
 everyone before doing so; please reply if you do.

 cheers,
 jeff r.




C# API for Solr

2007-03-31 Thread Jeff Rodenburg

We built our first search system architecture around Lucene.Net back in 2005
and continued to make modifications through 2006.  We quickly learned that
search management is so much more than query algorithms and indexing
choices.  We were not readily prepared for the operational overhead that our
Lucene-based search required: always-on availability, fast response times,
batch and real-time updates, etc.

Fast forward to 2007.  Our front-end is Microsoft-based, but we needed to
support parallel development on non-Microsoft architecture, and thus needed
a cross-platform search system.  Hello Solr!  We've transitioned our search
system to Solr with a Linux/Tomcat back-end, and it's been a champ.  We now
use solr not only for standard keyword search, but also to drive queries for
lots of different content sections on our site.  Solr has moved beyond
mission critical in our operation.

As we've proceeded, we've built out a nice C# client library to abstract the
interaction from C# to Solr.  It's mostly generic and designed for
extensibility.  With a few modifications, this could be a stand-alone library
that works for others.

I have clearance from the organization to contribute our library to the
community if there's interest.  I'd first like to gauge the interest of
everyone before doing so; please reply if you do.

cheers,
jeff r.


Re: C# API for Solr

2007-03-31 Thread Jeff Rodenburg

Good thought, Yonik.  I haven't looked at the Java client, would certainly
be worthwhile.  I'll move to prepping the files so they're completely
generic and can work for anyone.

One administrative question: can I contribute these files to be stored under
/lucene/solr/trunk/client?  I don't have a handy place for making these
publicly accessible at the moment.

thanks,
jeff

On 3/31/07, Yonik Seeley [EMAIL PROTECTED] wrote:



C# and Java are so similar, perhaps the Java client in SOLR-20 could
learn something from yours (or vice-versa).

-Yonik



Controlling read/write access for replicated indexes

2007-03-28 Thread Jeff Rodenburg

I'm curious what mechanisms everyone is using to control read/write access
for distributed replicated indexes.  We're moving to a replication
environment very soon, and our client applications (quite a few) all have
configuration pointers to the URLs for solr instances.  As a precaution, I
don't want errant configuration values to inadvertently send write requests
to read servers, as an example.  As an aside, we're running solr under
tomcat 5.5.x which has its own control aspects as well.

Any best practices, i.e. something that's not a maintenance headache later,
from those who have done this would be greatly appreciated.

thanks,
j.r.


Re: Error with bin/optimize and multiple solr webapps

2007-03-06 Thread Jeff Rodenburg

This issue has been logged as:

https://issues.apache.org/jira/browse/SOLR-188

A patch file is included for those who are interested.  I've unit tested in
my environment, please validate it for your own environment.

cheers,
j



On 3/5/07, Jeff Rodenburg [EMAIL PROTECTED] wrote:


Thanks Hoss.  I'll add an issue in JIRA and attach the patch.



On 3/5/07, Chris Hostetter [EMAIL PROTECTED]  wrote:


 : This line assumes a single solr installation under Tomcat, whereas the

 : multiple webapp scenario runs from a different location (the /solr
 part).
 : I'm sure this applies elsewhere.

 good catch ... it looks like all of our scripts assume /solr/update is

 the correct path to POST commit/optimize messages to.

 : I would submit a patch for JIRA, but couldn't find these files under
 version
 : control.  Any recommendations?

 They live in src/scripts ... a patch would certainly be appreciated.

 FYI: there is an evolution underway to allow XML based update messages
 to
 be sent to any path (and the fixed path /update is being deprecated)
 so it would be handy if the entire URL path was configurable (not just
 the
 webapp name)


 -Hoss





Re: Error with bin/optimize and multiple solr webapps

2007-03-06 Thread Jeff Rodenburg

Oops, my bad -- I didn't see either 186 or 187 before entering 188.  :-)

-- j

On 3/6/07, Graham Stead [EMAIL PROTECTED] wrote:


Apologies in advance if SOLR-187 and SOLR-188 look the same -- they are
the
same issue. I have been using adjusted scripts locally but hadn't used
Jira
before and wasn't sure of the process. I decided to figure it out after
answering Gola's question this morning...then saw that Jeff had mentioned
a
similar issue last night. I apologize again for confusion over the double
entry.

Thanks,
-Graham

 -Original Message-
 From: Jeff Rodenburg [mailto:[EMAIL PROTECTED]
 Sent: Tuesday, March 06, 2007 4:34 PM
 To: solr-user@lucene.apache.org
 Subject: Re: Error with bin/optimize and multiple solr webapps

 This issue has been logged as:

 https://issues.apache.org/jira/browse/SOLR-188

 A patch file is included for those who are interested.  I've
 unit tested in my environment, please validate it for your
 own environment.

 cheers,
 j



 On 3/5/07, Jeff Rodenburg [EMAIL PROTECTED] wrote:
 
  Thanks Hoss.  I'll add an issue in JIRA and attach the patch.
 
 
 
  On 3/5/07, Chris Hostetter [EMAIL PROTECTED]  wrote:
  
  
   : This line assumes a single solr installation under
 Tomcat, whereas
   the
  
   : multiple webapp scenario runs from a different location
 (the /solr
   part).
   : I'm sure this applies elsewhere.
  
   good catch ... it looks like all of our scripts assume
   /solr/update is
  
   the correct path to POST commit/optimize messages to.
  
   : I would submit a patch for JIRA, but couldn't find these files
   under version
   : control.  Any recommendations?
  
   They live in src/scripts ... a patch would certainly be
 appreciated.
  
   FYI: there is an evolution underway to allow XML based update
   messages to be sent to any path (and the fixed path /update is
   being deprecated) so it would be handy if the entire URL path was
   configurable (not just the webapp name)
  
  
   -Hoss
  
  
 






Error with bin/optimize and multiple solr webapps

2007-03-05 Thread Jeff Rodenburg

I noticed an issue with the optimize bash script in /bin.  Per the line:

rs=`curl http://${solr_hostname}:${solr_port}/solr/update -s -d "<optimize/>"`

This line assumes a single solr installation under Tomcat, whereas the
multiple webapp scenario runs from a different location (the /solr part).
I'm sure this applies elsewhere.

I would submit a patch for JIRA, but couldn't find these files under version
control.  Any recommendations?

-- j
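[Editor's note: the fix in the patch is essentially to make the webapp path a parameter instead of the hard-coded /solr. A sketch of the idea in Python rather than the actual bash scripts; the function and defaults are illustrative:]

```python
def update_url(hostname, port, webapp="solr"):
    # Parameterize the webapp name so the scripts still work when Solr
    # is deployed under something other than /solr in Tomcat.
    return f"http://{hostname}:{port}/{webapp}/update"

print(update_url("localhost", 8080))            # the old hard-coded case
print(update_url("localhost", 8080, "solr2"))   # a second webapp
```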


Re: Solr graduates and joins Lucene as sub-project

2007-01-17 Thread Jeff Rodenburg

Congrats to all involved committers on the project as well.  Solr is an
invaluable system in my operation.  Great job.

On 1/17/07, Yonik Seeley [EMAIL PROTECTED] wrote:


Solr has just graduated from the Incubator, and has been accepted as a
Lucene sub-project!
Thanks to all the Lucene and Solr users, contributors, and developers
who helped make this happen!

I have a feeling we're just getting started :-)
-Yonik



Re: One item, multiple fields, and range queries

2007-01-17 Thread Jeff Rodenburg

Now I follow.  I was misreading the first comments, thinking that the field
content would be deconstructed to smaller components or pieces.  Too much
(or not enough) coffee.

I'm expecting the index doc needs to be constructed with lat/long/dates in
sequential order, i.e.:

<add>
<doc>
  <field name="event_id">123</field>

  <field name="latitude">32.123456</field>
  <field name="longitude">-88.987654</field>
  <field name="when">01/31/2007</field>

  <field name="latitude">42.123456</field>
  <field name="longitude">-98.987654</field>
  <field name="when">01/31/2007</field>

  <field name="latitude">40.123456</field>
  <field name="longitude">-108.987654</field>
  <field name="when">01/30/2007</field>
...etc.

Assuming slop count of 0, while the intention is to match lat/long/when in
that order, could it possibly match long/when/lat, or when/lat/long?  Does
PhraseQuery enforce order and starting point as well?

Assuming all of this, how does range query come into play?  Or could the
PhraseQuery portion be applied as a filter?



On 1/17/07, Chris Hostetter [EMAIL PROTECTED] wrote:



: OK, you lost me.  It sounds as if this PhraseQuery-ish approach involves
: breaking datetime and lat/long values into pieces, and evaluation occurs
: with positioning.  Is that accurate?

i'm not sure what you mean by pieces ... the idea is that you would have a
single latitude field and a single longitude field and a single when
field, and if an item had a single event, you would store a single value
in each field ... but if the item has multiple events, you would store
them in the same relative ordering, and then use the same kind of logic
PhraseQuery uses to verify that if the latitude field has a value in the
right range, and the longitude field has a value in the right range, and
the when field has a value in the right range, that all of those values
have the same position (specificly: are within a set amount of slop from
each other, which you would always set to 0)

:  It seems like this could even be done in the same field if one had a
:  query type that allowed querying for tokens at the same position.
:  Just index _noun at the same position as house (and make sure
:  there can't be collisions between real terms and markers via escaping,
:  or use \0 instead of _, etc).

true ... but the point doug made way back when is that with a generalized
multi-field phrase query you wouldn't have to do that escaping ... the
hard part in this case is the numeric ranges.


-Hoss




Re: One item, multiple fields, and range queries

2007-01-16 Thread Jeff Rodenburg

Yonik/Hoss -

OK, you lost me.  It sounds as if this PhraseQuery-ish approach involves
breaking datetime and lat/long values into pieces, and evaluation occurs
with positioning.  Is that accurate?



On 1/16/07, Yonik Seeley [EMAIL PROTECTED] wrote:


On 1/15/07, Chris Hostetter [EMAIL PROTECTED] wrote:
 PhraseQuery artificially enforces that the Terms you add to it are
 in the same field ... you could easily write a PhraseQuery-ish query
that
 takes Terms from differnet fields, and ensures that they appear near
 eachother in terms of their token sequence -- the context of that
comment
 was searching for instances of words with specific usage (ie: house
used
 as a noun) by putting the usage type of each term in a different term in
a
separate parallel field, but with identical token positions.

It seems like this could even be done in the same field if one had a
query type that allowed querying for tokens at the same position.
Just index _noun at the same position as house (and make sure
there can't be collisions between real terms and markers via escaping,
or use \0 instead of _, etc).

-Yonik



Re: One item, multiple fields, and range queries

2007-01-15 Thread Jeff Rodenburg

Thanks Hoss.  Interesting approach, but the N bound could be well into the
hundreds and would vary (some maximum number, but different across events.)

I've not yet used dynamic fields in this manner.  With that number range,
what limitations could I encounter?  Given the size of that, I would need
the solr engine to formulate that query, correct?  I can't imagine I could
pass that entire subquery statement in the http request, as the character
limit would likely be exceeded.

Some of my comments may not make sense, so I'll check into dynamic fields
and such in the meantime.

thanks,
j


On 1/14/07, Chris Hostetter [EMAIL PROTECTED] wrote:



: 2) use multivalued fields as correlated vectors, so the first start
: date corresponds
:to the first end date corresponds to the first lat and long value.
: You get them all back
:in a query though, so your app would need to do extra work to sort
: out which matched.

if you expect a bounded number of correlated events per item, you can
use dynaimc fields, and build up N correlated subqueries where N is the
upper bound on the number of events you expect any item to have, ie...

  (+lat1:[x TO y] +lon1:[w TO z] +time1:[a TO b])
   OR (+lat2:[x TO y] +lon2:[w TO z] +time2:[a TO b])
   OR (+lat3:[x TO y] +lon3:[w TO z] +time3:[a TO b])
   ...




-Hoss
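[Editor's note: Hoss's N correlated subqueries are mechanical to generate on the client side. A hedged Python sketch — the lat1/lon1/time1 field names are the hypothetical dynamic fields from his example, not anything Solr defines:]

```python
def correlated_event_query(n, lat, lon, when):
    # Build one (+latI:[..] +lonI:[..] +timeI:[..]) clause per possible
    # event slot and OR them together, as in Hoss's example.
    clauses = [
        f"(+lat{i}:[{lat[0]} TO {lat[1]}]"
        f" +lon{i}:[{lon[0]} TO {lon[1]}]"
        f" +time{i}:[{when[0]} TO {when[1]}])"
        for i in range(1, n + 1)
    ]
    return " OR ".join(clauses)

q = correlated_event_query(3, ("30", "40"), ("-110", "-90"),
                           ("2007-01-01", "2007-02-01"))
print(q)
```

With N in the hundreds this string gets long, which is why Jeff's follow-up worries about the HTTP request size.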




Re: One item, multiple fields, and range queries

2007-01-13 Thread Jeff Rodenburg

Thanks Yonik.


1) model a single document as a single event at a single place with a start

and end date.

This was my first approach, but at presentation time I need to display the
event once -- with multiple start/end dates and locations beneath it.

Is treating the given event uniqueId as a facet the way to go?

thanks,
jeff


On 1/12/07, Yonik Seeley [EMAIL PROTECTED] wrote:


On 1/12/07, Jeff Rodenburg [EMAIL PROTECTED] wrote:
 I'm stuck with a query issue that at present seems unresolvable.  Hoping
the
 community has some insight to this.

 My index contains events that have multiple beginning/ending date ranges
and
 multiple locations.  For example, event A (uniqueId = 123) occurs every
 weekend, sometimes in one location, sometimes in many locations.  Dates
have
 a beginning and ending date, and locations have a latitude &
longitude.  I
 need to query for the set of events for a given area, where area =
 bounding box.  So, a single event has multiple beginning and ending
dates
 and multiple locations.

 So, the beginning date, ending date, latitude and longitude values only
 apply collectively as a unit.  However, I need to do range queries on
both
 the dates and the lat/long values.

1) model a single document as a single event at a single place with a
start and end date.
  OR
2) use multivalued fields as correlated vectors, so the first start
date corresponds
   to the first end date corresponds to the first lat and long value.
You get them all back
   in a query though, so your app would need to do extra work to sort
out which matched.

I'd do (1) if you can... it's simpler.

-Yonik



One item, multiple fields, and range queries

2007-01-12 Thread Jeff Rodenburg

I'm stuck with a query issue that at present seems unresolvable.  Hoping the
community has some insight to this.

My index contains events that have multiple beginning/ending date ranges and
multiple locations.  For example, event A (uniqueId = 123) occurs every
weekend, sometimes in one location, sometimes in many locations.  Dates have
a beginning and ending date, and locations have a latitude & longitude.  I
need to query for the set of events for a given area, where area =
bounding box.  So, a single event has multiple beginning and ending dates
and multiple locations.

So, the beginning date, ending date, latitude and longitude values only
apply collectively as a unit.  However, I need to do range queries on both
the dates and the lat/long values.

Any suggested strategies for indexing and query formulation?

thanks,
j


WordDelimiterFilter usage

2007-01-11 Thread Jeff Rodenburg

I'm trying to determine how to index/query for a certain use case, and the
WordDelimiterFilterFactory appears to be what I need to use.  Here's the
scenario:

- Text field being indexed
- Field exists as a full name
- Data might be "cold play"
- This should match against searches for "cold play" and "coldplay" (just
"cold" and just "play" are OK as well)

I'm not able to match "cold play" against searches for "coldplay" at
present.  I'm certain this is a common scenario and I'm missing something
obvious.  Any suggestions of how/where to look/fix this issue?

thanks,
j


Re: WordDelimiterFilter usage

2007-01-11 Thread Jeff Rodenburg

Thanks Hoss - it is a finite list, but in the tens of thousands.  I'm going
to easy route -- adding another field that indexes the terms with no
included whitespace.  This is used in an ajax-style lookup, so it works for
this scenario.  Not something I'd normally do in a typical index, for sure.

thanks,
jeff


On 1/11/07, Chris Hostetter [EMAIL PROTECTED] wrote:



WordDelimiterFilter won't really help you in this situation ... but it
would help if you find a lot of users are searching for ColdPlay or
cold-play.

if you have a finite list of popular terms like this that you need to deal
with, the SynonymFilter can help you out.


: Date: Thu, 11 Jan 2007 13:30:39 -0800
: From: Jeff Rodenburg [EMAIL PROTECTED]
: Reply-To: solr-user@lucene.apache.org
: To: solr-user@lucene.apache.org
: Subject: WordDelimiterFilter usage
:
: I'm trying to determine how to index/query for a certain use case, and
the
: WordDelimiterFilterFactory appears to be what I need to use.  Here's the
: scenario:
:
: - Text field being indexed
: - Field exists as a full name
: - Data might be "cold play"
: - This should match against searches for "cold play" and "coldplay" (just
: "cold" and just "play" are OK as well)
:
: I'm not able to match "cold play" against searches for "coldplay" at
: present.  I'm certain this is a common scenario and I'm missing
something
: obvious.  Any suggestions of how/where to look/fix this issue?
:
: thanks,
: j
:



-Hoss
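[Editor's note: Jeff's "easy route" above — a second field holding the whitespace-collapsed form — can be populated at index time by the client. A minimal sketch; the field and helper names are mine:]

```python
def collapsed(name):
    # Produce the run-together form ("coldplay") to index alongside the
    # original ("cold play"), so a lookup matches either spelling.
    return "".join(name.lower().split())

doc = {"name": "cold play", "name_nospace": collapsed("cold play")}
print(doc["name_nospace"])
```

This works well for an ajax-style lookup; for general search, Hoss's SynonymFilter suggestion keeps the mapping inside Solr instead.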




Re: Multiple indexes

2007-01-08 Thread Jeff Rodenburg

This is good information, thanks Chris.  My preference was to keep things
separate, just needed some external info from others to back me up.

thanks,
jeff

On 1/7/07, Chris Hostetter [EMAIL PROTECTED] wrote:



I don't know if there really are any general purpose best practices ... it
really depends on use cases -- the main motivation for allowing JNDI
context specification of the solr.home location so that multiple instances
of Solr can run in a single instance of a servlet container was so that if
you *wanted* to run multiple instances in a single JVM, they could share
one heap space, and you wouldn't have to guess how much memory to
allocate to multiple instances -- but whether or not you *want* to have a
single instance or not is really up to you.

the plus side (as i mentioned) is that you can throw all of your available
memory at that single JVM instance, and not worry about how much ram each
solr instance really needs.

the down side is that if any one solr instance really gets hammered to
hell by it's users and rolls over and dies, it could bring down your other
solr instances as well -- which may not be a big deal if in your use cases
all solr instances get hit equally (via a meta searcher) but might be
quite a big problem if those seperate instances are completely independent
(ie: each paid for by seperate clients)

personally: if you've got the resources (money/boxes/RAM) i would
recommend keeping everything isolated.

(the nice thing about my job is that while i frequently walk out of
meetings with the directive to make it faster, I've never been asked to
make it use less RAM)


-Hoss




Multiple indexes

2007-01-05 Thread Jeff Rodenburg

I've followed a host of past threads on this subject and am trying to
determine what's best for our implementation.  For those who've chimed in on
this, I think I'm just looking for a good summary (as Hoss recently
mentioned, perhaps a FAQ).

We presently have one index running under Solr/Tomcat55/Linux, which is
continually growing in size.  I have a need to add two other separate
indexes (or is it indices?), which would all carry separate configs.  One
will be small and won't change, the other will grow in size.  For
redundancy, I expect to get into the Solr distribution model.  Collectively,
all three indexes will venture into the 2GB range, so nothing too extensive.

All things considered -- jvm memory management, availability, other things
I've left off the list -- are there any determinations of best practice for
deployment under the topic of multiple index/multiple instance?  Any
specific recommendations for the given details I've provided here?

thanks,
j


Replacing a nightly build

2006-11-07 Thread Jeff Rodenburg

What is the recommended path to deployment for replacing a solr nightly
build with another?  In our scenario, our current build is roughly 3 months
old, and we're updating to the latest.

Aside from replacing the bits and restarting, are there any steps that
everyone is following in maintaining the code stack under deployment?

thanks.


Re: Error in faceted browsing

2006-09-13 Thread Jeff Rodenburg

Thanks Chris.

I bumped the facet.limit to 10 and it works like a charm.

Thanks for the heads up on the merchant_name.  I would probably just keep a
dictionary in memory, but if I wanted to pull the stored merchant_name back,
how would/can I do that?

thanks,
j

On 9/13/06, Chris Hostetter [EMAIL PROTECTED] wrote:



: I just pulled down the nightly solr build from 9/12 and have it up and
: running.  I copied an index created in a solr version that's about 3
: months old.

it looks like my changes to have a sensible default (which is when
facet.limit=-1 became legal) didn't make it into solr-2006-09-12.zip, but
it is in solr-2006-09-13.zip.

with the version you are using leaving out the facet.limit should achieve
what you want ... but based on your schema, using merchant_name as a facet
field may not work like you expect -- you'll probably want an exact String
version of the merchant_name field (or just use merchant_id and lookup the
name in a handy Map)

:
: I have a query formulated like this:
:
: http://solrbox:8080/solr/select?q=description:dell&rows=0&facet=true&facet.limit=-1&facet.field=merchant_name
:
: The fields definition from schema.xml:
:
:    <field name="item_id" type="long" indexed="true" stored="true"/>
:    <field name="title" type="text" indexed="true" stored="true"/>
:    <field name="description" type="text" indexed="true" stored="true"/>
:    <field name="merchant_id" type="integer" indexed="true" stored="true"/>
:    <field name="merchant_name" type="text" indexed="true" stored="true"/>
:
: The result:
: <response>
:   <responseHeader>
:     <status>0</status>
:     <QTime>2</QTime>
:   </responseHeader>
:   <result numFound="52" start="0"/>
:   <lst name="facet_counts">
:     <lst name="facet_queries"/>
:     <str name="exception">
: java.util.NoSuchElementException
: at java.util.TreeMap.key(TreeMap.java:433)
: at java.util.TreeMap.lastKey(TreeMap.java:297)
: at java.util.TreeSet.last(TreeSet.java:417)
: at org.apache.solr.util.BoundedTreeSet.adjust(BoundedTreeSet.java:54)
: at org.apache.solr.util.BoundedTreeSet.setMaxSize(BoundedTreeSet.java:50)
: at org.apache.solr.util.BoundedTreeSet.<init>(BoundedTreeSet.java:31)
: at org.apache.solr.request.SimpleFacets.getFacetTermEnumCounts(SimpleFacets.java:187)
: at org.apache.solr.request.SimpleFacets.getFacetFieldCounts(SimpleFacets.java:137)
: at org.apache.solr.request.SimpleFacets.getFacetCounts(SimpleFacets.java:84)
: at org.apache.solr.request.StandardRequestHandler.getFacetInfo(StandardRequestHandler.java:180)
: at org.apache.solr.request.StandardRequestHandler.handleRequest(StandardRequestHandler.java:120)
: at org.apache.solr.core.SolrCore.execute(SolrCore.java:586)
: at org.apache.solr.servlet.SolrServlet.doGet(SolrServlet.java:91)
: at javax.servlet.http.HttpServlet.service(HttpServlet.java:689)
: at javax.servlet.http.HttpServlet.service(HttpServlet.java:802)
: at org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:252)
: at org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:173)
: at org.apache.catalina.core.StandardWrapperValve.invoke(StandardWrapperValve.java:213)
: at org.apache.catalina.core.StandardContextValve.invoke(StandardContextValve.java:178)
: at org.apache.catalina.core.StandardHostValve.invoke(StandardHostValve.java:126)
: at org.apache.catalina.valves.ErrorReportValve.invoke(ErrorReportValve.java:105)
: at org.apache.catalina.core.StandardEngineValve.invoke(StandardEngineValve.java:107)
: at org.apache.catalina.connector.CoyoteAdapter.service(CoyoteAdapter.java:148)
: at org.apache.coyote.http11.Http11Processor.process(Http11Processor.java:869)
: at org.apache.coyote.http11.Http11BaseProtocol$Http11ConnectionHandler.processConnection(Http11BaseProtocol.java:664)
: at org.apache.tomcat.util.net.PoolTcpEndpoint.processSocket(PoolTcpEndpoint.java:527)
: at org.apache.tomcat.util.net.LeaderFollowerWorkerThread.runIt(LeaderFollowerWorkerThread.java:80)
: at org.apache.tomcat.util.threads.ThreadPool$ControlRunnable.run(ThreadPool.java:684)
: at java.lang.Thread.run(Thread.java:595)
: </str>
:   </lst>
: </response>
:
:
: What am I missing?
:
: -- j
:



-Hoss




Re: Error in faceted browsing

2006-09-13 Thread Jeff Rodenburg

Outstanding, thanks.

- j

On 9/13/06, Yonik Seeley [EMAIL PROTECTED] wrote:


On 9/13/06, Jeff Rodenburg [EMAIL PROTECTED] wrote:
 Thanks for the heads up on the merchant_name.  I would probably just keep a
 dictionary in memory, but if I wanted to pull the stored merchant_name back,
 how would/can I do that?

If you don't want merchant_name tokenized at all, just change the type
to string.
If you want an additional field for faceting on with merchant_name
untokenized, then use copyField in schema.xml to copy merchant_name to
merchant_name_exact
and define
  <field name="merchant_name_exact" type="string" indexed="true" stored="false" />

-Yonik
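Yonik's copyField suggestion, sketched as a schema.xml fragment; the merchant_name_exact field name comes from his message, and the merchant_name definition mirrors the schema quoted earlier in the thread:

```xml
<!-- tokenized field, stored so the original value can be returned -->
<field name="merchant_name" type="text" indexed="true" stored="true"/>
<!-- untokenized copy, used only for faceting -->
<field name="merchant_name_exact" type="string" indexed="true" stored="false"/>
<copyField source="merchant_name" dest="merchant_name_exact"/>
```

The facet request then uses facet.field=merchant_name_exact, while the stored merchant_name value still comes back in the document fields of the response.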



Re: Re: IIS web server and Solr integration

2006-09-10 Thread Jeff Rodenburg

Tim -

If you can help it, I would suggest running Solr under Tomcat under Linux.
Speaking from experience in a mixed mode environment, the Linux/Tomcat/Solr
implementation just works.  We're not newbies under Linux, but we're also a
native Windows shop.  The memory management and system availability is just
outstanding in that stack.

If you must run Windows, Tomcat does integrate with IIS, but be prepared to
jump through a few hoops.  Spend time on making that combination work, and
you'll be 90% there.

Hope this helps.

-- j

On 9/10/06, Tim Archambault [EMAIL PROTECTED] wrote:


Good news. The rookie did just that. Thanks Chris. Just having a
difficult time how to send my query parameters to the engine from
Coldfusion [intelligently]. I'm going to download the PHP app and see
if I can figure it out. Having lots of fun with this for sure.

Tim

On 9/10/06, Chris Hostetter [EMAIL PROTECTED] wrote:

 : Should it run on a separate port than IIS or integrated using ISAPI
plug-in?

 I can't make any specific recommendations about Windows or IIS, but i
 personally wouldn't Run Solr in the same webserver/appserver that your
 users hit -- from a security standpoint, i would protect your solr
 instance the same way you would protect a database, let the applications
 running in your webserver connect to it and run queries against it, but
 don't expose it to the outside world directly.


 -Hoss





Faceted browsing: status

2006-08-14 Thread Jeff Rodenburg

From the Tasklist wiki:



  - Simple faceted browsing (grouping) support in the standard query
    handler
      - group by field (provide counts for each distinct value in that field)
      - group by (query1, query2, query3, query4, query5)


How far/close is this task to completion?  (I'm trying to gauge time/effort
here.)

-- j


Re: Documentation?

2006-05-16 Thread Jeff Rodenburg

Thanks Chris/Yonik, don't know why I didn't see those yet.

-- j

On 5/15/06, Chris Hostetter [EMAIL PROTECTED] wrote:



: I was checking around the solr site and pages at apache.org and wasn't
: finding much.  Before jumping into the code, I'd like to get as familiar
: with solr as I could from existing docs or the like.  Can someone point
: me in the right direction?

The best documentation about using Solr is the tutorial...
http://incubator.apache.org/solr/tutorial.html

The documentation on Solr's internals and developing Query plugins are
pretty sparse at the moment.  It's on my todo list (hopefully this week)

If you want a good chunk of code to sink your teeth into as a starting
point, take a look at StandardRequestHandler, and the APIs it uses from
other classes.

-Hoss




Documentation?

2006-05-15 Thread Jeff Rodenburg

I was checking around the solr site and pages at apache.org and wasn't
finding much.  Before jumping into the code, I'd like to get as familiar
with solr as I could from existing docs or the like.  Can someone point me
in the right direction?

thanks,
jeff r.