Is there an issue with hyphens in SpellChecker with StandardTokenizer?

2011-12-15 Thread Brandon Fish
I am getting an error using the SpellChecker component with the query
"another-test":
java.lang.StringIndexOutOfBoundsException: String index out of range: -7

This appears to be related to this issue,
https://issues.apache.org/jira/browse/SOLR-1630, which
has been marked as fixed. My configuration and the test case that follow
appear to reproduce the error I am seeing: both "another" and "test" come
back as tokens with start and end offsets of 0 and 12.
  <analyzer>
    <tokenizer class="solr.StandardTokenizerFactory"/>
    <filter class="solr.StopFilterFactory" ignoreCase="true"
            words="stopwords.txt"/>
    <filter class="solr.LowerCaseFilterFactory"/>
  </analyzer>

 spellcheck=true&spellcheck.collate=true

Is this an issue with my configuration/test, or is there an issue with the
SpellingQueryConverter? Is there a recommended workaround, such as the
WhitespaceTokenizer mentioned in the issue comments?

Thank you for your help.

package org.apache.solr.spelling;

import static org.junit.Assert.assertTrue;

import java.util.Collection;

import org.apache.lucene.analysis.Token;
import org.apache.lucene.analysis.standard.StandardAnalyzer;
import org.apache.lucene.util.Version;
import org.apache.solr.common.util.NamedList;
import org.junit.Test;

public class SimpleQueryConverterTest {

  @Test
  public void testSimpleQueryConversion() {
    SpellingQueryConverter converter = new SpellingQueryConverter();
    converter.init(new NamedList());
    converter.setAnalyzer(new StandardAnalyzer(Version.LUCENE_35));
    String original = "another-test";
    Collection<Token> tokens = converter.convert(original);
    assertTrue("Token offsets do not match",
        isOffsetCorrect(original, tokens));
  }

  private boolean isOffsetCorrect(String s, Collection<Token> tokens) {
    for (Token token : tokens) {
      int start = token.startOffset();
      int end = token.endOffset();
      // Each token's text must match the slice of the original query it claims to cover
      if (!s.substring(start, end).equals(token.toString()))
        return false;
    }
    return true;
  }
}


Re: Is there an issue with hyphens in SpellChecker with StandardTokenizer?

2011-12-15 Thread Brandon Fish
Hi Steve,

I was using branch 3.5. I will try this on the tip of branch_3x too.

Thanks.

On Thu, Dec 15, 2011 at 4:14 PM, Steven A Rowe sar...@syr.edu wrote:

 Hi Brandon,

 When I add the following to SpellingQueryConverterTest.java on the tip of
 branch_3x (will be released as Solr 3.6), the test succeeds:

 @Test
 public void testStandardAnalyzerWithHyphen() {
   SpellingQueryConverter converter = new SpellingQueryConverter();
   converter.init(new NamedList());
   converter.setAnalyzer(new StandardAnalyzer(Version.LUCENE_35));
   String original = "another-test";
   Collection<Token> tokens = converter.convert(original);
   assertTrue("tokens is null and it shouldn't be", tokens != null);
   assertEquals("tokens Size: " + tokens.size() + " is not 2",
       2, tokens.size());
   assertTrue("Token offsets do not match",
       isOffsetCorrect(original, tokens));
 }

 What version of Solr/Lucene are you using?

 Steve

  -Original Message-
  From: Brandon Fish [mailto:brandon.j.f...@gmail.com]
  Sent: Thursday, December 15, 2011 3:08 PM
  To: solr-user@lucene.apache.org
  Subject: Is there an issue with hyphens in SpellChecker with
  StandardTokenizer?
 
  I am getting an error using the SpellChecker component with the query
  "another-test":
  java.lang.StringIndexOutOfBoundsException: String index out of range: -7

  This appears to be related to this issue,
  https://issues.apache.org/jira/browse/SOLR-1630, which
  has been marked as fixed. My configuration and the test case that follow
  appear to reproduce the error I am seeing: both "another" and "test" come
  back as tokens with start and end offsets of 0 and 12.
    <analyzer>
      <tokenizer class="solr.StandardTokenizerFactory"/>
      <filter class="solr.StopFilterFactory" ignoreCase="true"
              words="stopwords.txt"/>
      <filter class="solr.LowerCaseFilterFactory"/>
    </analyzer>

   spellcheck=true&spellcheck.collate=true

  Is this an issue with my configuration/test, or is there an issue with the
  SpellingQueryConverter? Is there a recommended workaround, such as the
  WhitespaceTokenizer mentioned in the issue comments?

  Thank you for your help.

  package org.apache.solr.spelling;

  import static org.junit.Assert.assertTrue;

  import java.util.Collection;

  import org.apache.lucene.analysis.Token;
  import org.apache.lucene.analysis.standard.StandardAnalyzer;
  import org.apache.lucene.util.Version;
  import org.apache.solr.common.util.NamedList;
  import org.junit.Test;

  public class SimpleQueryConverterTest {

    @Test
    public void testSimpleQueryConversion() {
      SpellingQueryConverter converter = new SpellingQueryConverter();
      converter.init(new NamedList());
      converter.setAnalyzer(new StandardAnalyzer(Version.LUCENE_35));
      String original = "another-test";
      Collection<Token> tokens = converter.convert(original);
      assertTrue("Token offsets do not match",
          isOffsetCorrect(original, tokens));
    }

    private boolean isOffsetCorrect(String s, Collection<Token> tokens) {
      for (Token token : tokens) {
        int start = token.startOffset();
        int end = token.endOffset();
        // Each token's text must match the slice of the original query it claims to cover
        if (!s.substring(start, end).equals(token.toString()))
          return false;
      }
      return true;
    }
  }



Re: Is there an issue with hyphens in SpellChecker with StandardTokenizer?

2011-12-15 Thread Brandon Fish
Yes, branch_3x works for me as well. The addition of the OffsetAttribute
probably corrected this issue. I will either switch to the WhitespaceAnalyzer,
patch my distribution, or wait for 3.6 to resolve this.
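
For reference, the switch I have in mind would look something like this in
my test class (an untested sketch against 3.5; the expectation that the
offsets come out right with the whitespace analyzer is exactly what I would
still need to verify):

import org.apache.lucene.analysis.WhitespaceAnalyzer;

@Test
public void testWhitespaceAnalyzerWithHyphen() {
  SpellingQueryConverter converter = new SpellingQueryConverter();
  converter.init(new NamedList());
  // Swap in WhitespaceAnalyzer so "another-test" is not split on the hyphen
  converter.setAnalyzer(new WhitespaceAnalyzer(Version.LUCENE_35));
  String original = "another-test";
  Collection<Token> tokens = converter.convert(original);
  // The whole string should come back as one token whose offsets line up
  // with the original query
  assertTrue("Token offsets do not match",
      isOffsetCorrect(original, tokens));
}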

Thanks.

On Thu, Dec 15, 2011 at 4:17 PM, Brandon Fish brandon.j.f...@gmail.com wrote:

 Hi Steve,

 I was using branch 3.5. I will try this on the tip of branch_3x too.

 Thanks.


 On Thu, Dec 15, 2011 at 4:14 PM, Steven A Rowe sar...@syr.edu wrote:

 Hi Brandon,

 When I add the following to SpellingQueryConverterTest.java on the tip of
 branch_3x (will be released as Solr 3.6), the test succeeds:

 @Test
 public void testStandardAnalyzerWithHyphen() {
   SpellingQueryConverter converter = new SpellingQueryConverter();
   converter.init(new NamedList());
   converter.setAnalyzer(new StandardAnalyzer(Version.LUCENE_35));
   String original = "another-test";
   Collection<Token> tokens = converter.convert(original);
   assertTrue("tokens is null and it shouldn't be", tokens != null);
   assertEquals("tokens Size: " + tokens.size() + " is not 2",
       2, tokens.size());
   assertTrue("Token offsets do not match",
       isOffsetCorrect(original, tokens));
 }

 What version of Solr/Lucene are you using?

 Steve

  -Original Message-
  From: Brandon Fish [mailto:brandon.j.f...@gmail.com]
  Sent: Thursday, December 15, 2011 3:08 PM
  To: solr-user@lucene.apache.org
  Subject: Is there an issue with hyphens in SpellChecker with
  StandardTokenizer?
 
  I am getting an error using the SpellChecker component with the query
  "another-test":
  java.lang.StringIndexOutOfBoundsException: String index out of range: -7

  This appears to be related to this issue,
  https://issues.apache.org/jira/browse/SOLR-1630, which
  has been marked as fixed. My configuration and the test case that follow
  appear to reproduce the error I am seeing: both "another" and "test" come
  back as tokens with start and end offsets of 0 and 12.
    <analyzer>
      <tokenizer class="solr.StandardTokenizerFactory"/>
      <filter class="solr.StopFilterFactory" ignoreCase="true"
              words="stopwords.txt"/>
      <filter class="solr.LowerCaseFilterFactory"/>
    </analyzer>

   spellcheck=true&spellcheck.collate=true

  Is this an issue with my configuration/test, or is there an issue with the
  SpellingQueryConverter? Is there a recommended workaround, such as the
  WhitespaceTokenizer mentioned in the issue comments?

  Thank you for your help.

  package org.apache.solr.spelling;

  import static org.junit.Assert.assertTrue;

  import java.util.Collection;

  import org.apache.lucene.analysis.Token;
  import org.apache.lucene.analysis.standard.StandardAnalyzer;
  import org.apache.lucene.util.Version;
  import org.apache.solr.common.util.NamedList;
  import org.junit.Test;

  public class SimpleQueryConverterTest {

    @Test
    public void testSimpleQueryConversion() {
      SpellingQueryConverter converter = new SpellingQueryConverter();
      converter.init(new NamedList());
      converter.setAnalyzer(new StandardAnalyzer(Version.LUCENE_35));
      String original = "another-test";
      Collection<Token> tokens = converter.convert(original);
      assertTrue("Token offsets do not match",
          isOffsetCorrect(original, tokens));
    }

    private boolean isOffsetCorrect(String s, Collection<Token> tokens) {
      for (Token token : tokens) {
        int start = token.startOffset();
        int end = token.endOffset();
        // Each token's text must match the slice of the original query it claims to cover
        if (!s.substring(start, end).equals(token.toString()))
          return false;
      }
      return true;
    }
  }





Re: How to check if replication is running

2011-09-16 Thread Brandon Fish
Hi Yury,

You could try the details command of the replication handler:
http://slave_host:port/solr/replication?command=details
which includes information such as isReplicating.

You could also look at the script attached to this issue, which does a
thorough check of a slave's replication status and could be polled to
trigger a restart if there is an error.
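
If it helps, a crude way to poll that from plain Java could look like the
sketch below (the URL is a placeholder for your slave, and the string match
against the XML response is an assumption on my part; I have not run this):

import java.io.BufferedReader;
import java.io.InputStreamReader;
import java.net.URL;

public class ReplicationCheck {
  public static void main(String[] args) throws Exception {
    // Placeholder slave URL; substitute your own host, port and core path
    URL details = new URL("http://localhost:8983/solr/replication?command=details");
    BufferedReader in = new BufferedReader(
        new InputStreamReader(details.openStream(), "UTF-8"));
    StringBuilder response = new StringBuilder();
    String line;
    while ((line = in.readLine()) != null) {
      response.append(line);
    }
    in.close();
    // Crude check for the isReplicating flag in the slave details;
    // a real monitor would parse the response properly instead
    boolean replicating =
        response.toString().contains("<str name=\"isReplicating\">true</str>");
    System.out.println("isReplicating: " + replicating);
  }
}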

Brandon

2011/9/16 Yury Kats yuryk...@yahoo.com

 Let's say I'm forcing a replication of a core using the fetchindex command.
 No new content is being added to the master.

 I can check whether replication has finished by periodically querying
 master and slave for their indexversion and comparing the two.

 But what's the best way to check whether replication is actually happening
 and hasn't been dropped, if, for example, there was a network outage
 between the master and the slave? In that case, I want to restart
 replication.

 Thanks,
 Yury




Re: How to check if replication is running

2011-09-16 Thread Brandon Fish
Adding missing link to the issue I mentioned:
https://issues.apache.org/jira/browse/SOLR-1855?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12851462#action_12851462

2011/9/16 Yury Kats yuryk...@yahoo.com

 Let's say I'm forcing a replication of a core using the fetchindex command.
 No new content is being added to the master.

 I can check whether replication has finished by periodically querying
 master and slave for their indexversion and comparing the two.

 But what's the best way to check whether replication is actually happening
 and hasn't been dropped, if, for example, there was a network outage
 between the master and the slave? In that case, I want to restart
 replication.

 Thanks,
 Yury




Re: Data Import from a Queue

2011-07-19 Thread Brandon Fish
Let me provide some more details on the question:

I was unable to find any example implementations where individual documents
(a single document per message) are read from a message queue (like ActiveMQ
or RabbitMQ) and then added to Solr via SolrJ, an HTTP POST, or another
method. Does anyone know of any available examples for this type of import?

If no examples exist, what would be a recommended commit strategy for
performance? My best guess for this would be to have a queue per core and
commit once the queue is empty.
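
To make the question more concrete, the kind of consumer I am imagining
looks roughly like the sketch below (a JMS listener with made-up field
names and a deliberately naive per-message commit; not something I have
running):

import javax.jms.Message;
import javax.jms.MessageListener;
import javax.jms.TextMessage;

import org.apache.solr.client.solrj.SolrServer;
import org.apache.solr.client.solrj.impl.CommonsHttpSolrServer;
import org.apache.solr.common.SolrInputDocument;

public class SolrQueueListener implements MessageListener {

  private final SolrServer solr;

  public SolrQueueListener(String solrUrl) throws Exception {
    // CommonsHttpSolrServer is the SolrJ 3.x HTTP client
    this.solr = new CommonsHttpSolrServer(solrUrl);
  }

  public void onMessage(Message message) {
    try {
      // One document per message; here the body is just used as the id,
      // a real consumer would parse the payload into fields
      String body = ((TextMessage) message).getText();
      SolrInputDocument doc = new SolrInputDocument();
      doc.addField("id", body); // "id" is a placeholder field name
      solr.add(doc);
      // Committing per message is the simplest thing that works but is slow;
      // committing when the queue drains, on a timer, or via autoCommit in
      // solrconfig.xml is the performance question above
      solr.commit();
    } catch (Exception e) {
      // Rethrow so the container can redeliver the message instead of losing it
      throw new RuntimeException(e);
    }
  }
}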

Thanks.

On Mon, Jul 18, 2011 at 6:52 PM, Erick Erickson erickerick...@gmail.com wrote:

 This is a really cryptic problem statement.

 you might want to review:

 http://wiki.apache.org/solr/UsingMailingLists

 Best
 Erick

 On Fri, Jul 15, 2011 at 1:52 PM, Brandon Fish brandon.j.f...@gmail.com
 wrote:
  Does anyone know of any existing examples of importing data from a queue
  into Solr?
 
  Thank you.
 



Data Import from a Queue

2011-07-15 Thread Brandon Fish
Does anyone know of any existing examples of importing data from a queue
into Solr?

Thank you.


Re: Server Restart Required for Schema Changes After Document Delete All?

2011-06-27 Thread Brandon Fish
I'm not having any issues. I was just asking to see if any backward-incompatible
changes exist that would require a server restart. Thanks.

2011/6/27 Tomás Fernández Löbbe tomasflo...@gmail.com

 This should work with dynamic fields too. Are you having any problems with
 it?


 On Thu, Jun 23, 2011 at 3:14 PM, Brandon Fish brandon.j.f...@gmail.com
 wrote:

  Are there any schema changes that would cause problems with the following
  procedure from the FAQ
  (http://wiki.apache.org/solr/FAQ#How_can_I_rebuild_my_index_from_scratch_if_I_change_my_schema.3F)?

  1. Use the match-all-docs query in a delete-by-query command before
     shutting down Solr: <delete><query>*:*</query></delete>
  2. Reload the core
  3. Re-index your data

  Would this work when dynamic fields are removed?
 



Server Restart Required for Schema Changes After Document Delete All?

2011-06-23 Thread Brandon Fish
Are there any schema changes that would cause problems with the following
procedure from the FAQ
(http://wiki.apache.org/solr/FAQ#How_can_I_rebuild_my_index_from_scratch_if_I_change_my_schema.3F)?

1. Use the match-all-docs query in a delete-by-query command before
   shutting down Solr: <delete><query>*:*</query></delete>
2. Reload the core
3. Re-index your data

Would this work when dynamic fields are removed?
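
For context, here is roughly how I would drive that procedure from SolrJ
(URLs and the core name are placeholders, and I have not verified this end
to end):

import org.apache.solr.client.solrj.SolrServer;
import org.apache.solr.client.solrj.impl.CommonsHttpSolrServer;
import org.apache.solr.client.solrj.request.CoreAdminRequest;

public class RebuildIndex {
  public static void main(String[] args) throws Exception {
    // Placeholder URLs and core name; adjust for your installation
    SolrServer core = new CommonsHttpSolrServer("http://localhost:8983/solr/core0");
    SolrServer admin = new CommonsHttpSolrServer("http://localhost:8983/solr");

    // 1. Match-all delete by query, same as posting <delete><query>*:*</query></delete>
    core.deleteByQuery("*:*");
    core.commit();

    // 2. Reload the core so the schema changes take effect
    CoreAdminRequest.reloadCore("core0", admin);

    // 3. Re-index the data (application-specific)
  }
}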


Modifying Configuration from a Browser

2011-06-14 Thread Brandon Fish
Does anyone have any examples of modifying a configuration file, like
elevate.xml, from a browser? Is there an API that would help with this?

If nothing exists for this, I am considering implementing something that
would change the elevate.xml file and then reload the core. Or is there a
better approach to dynamic configuration?
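
The rough idea, as an untested sketch with placeholder paths, core name and
elevation entries, would be to rewrite elevate.xml and then hit the
CoreAdmin RELOAD command:

import java.io.FileWriter;
import java.net.HttpURLConnection;
import java.net.URL;

public class ElevateUpdater {
  public static void main(String[] args) throws Exception {
    // Placeholder path to the core's conf directory
    FileWriter out = new FileWriter("/path/to/solr/core0/conf/elevate.xml");
    out.write("<elevate>\n"
        + "  <query text=\"example query\">\n"
        + "    <doc id=\"1\"/>\n"
        + "  </query>\n"
        + "</elevate>\n");
    out.close();

    // Reload the core so the new elevate.xml is picked up
    URL reload = new URL(
        "http://localhost:8983/solr/admin/cores?action=RELOAD&core=core0");
    HttpURLConnection conn = (HttpURLConnection) reload.openConnection();
    System.out.println("RELOAD returned HTTP " + conn.getResponseCode());
    conn.disconnect();
  }
}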

Thank you.