RE: solr.StandardTokenizerFactory: more info needed

2011-07-06 Thread Steven A Rowe
Hi Dmitry,

The underlying Lucene implementation is here: 
http://svn.apache.org/repos/asf/lucene/java/tags/lucene_2_9_1/src/java/org/apache/lucene/analysis/standard/

StandardTokenizerImpl.jflex is probably where you should start.

Steve

-----Original Message-----
From: Dmitry Kan [mailto:dmitry@gmail.com] 
Sent: Wednesday, July 06, 2011 3:23 AM
To: solr-user@lucene.apache.org
Subject: solr.StandardTokenizerFactory: more info needed

Hi all!

solr.StandardTokenizerFactory -- is it possible to see the full description of
its behaviour for Solr 1.4 somewhere? The Wiki entry
http://wiki.apache.org/solr/AnalyzersTokenizersTokenFilters#solr.StandardTokenizerFactory
is very short.

--
Regards,

Dmitry Kan


Re: solr.StandardTokenizerFactory: more info needed

2011-07-06 Thread Dmitry Kan
Hi Steven,

This looks very good. Thanks. Do I understand correctly that, if I were to
change the tokenizer rules, I could go and change e.g. the token class
definitions (like NUM) in this file and recompile the code?

-- 
Regards,

Dmitry Kan


RE: solr.StandardTokenizerFactory: more info needed

2011-07-06 Thread Steven A Rowe
Yes, you can change the rules and recompile.

Before you recompile, you have to run 'ant jflex' to generate the Java source.

Steve


Re: solr.StandardTokenizerFactory: more info needed

2011-07-06 Thread Dmitry Kan
OK, thanks. Do you know if there are tokenizer-specific tests to run after
compilation?

-- 
Regards,

Dmitry Kan


Re: solr.StandardTokenizerFactory: more info needed

2011-07-06 Thread Erick Erickson
See ..src/test/org/apache/solr/analysis.

But... you'll be changing the grammar, so
I don't know how the existing tests would actually help you. Actually,
I'd expect them to break. And you'd have to write some
new ones of your own to exercise your changes and ensure
that they do what you want.

Best
Erick
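
[Editorial sketch, not part of the original message.] The kind of test
described above could look roughly like the following: feed a fixed input
through the tokenizer and compare the tokens against what the changed rules
are supposed to produce. To stay self-contained, this sketch uses a
regex-based stand-in instead of the real Lucene classes. The class name
TokenizerRuleCheck, the TOKEN pattern, and the tokenize and check helpers are
all hypothetical, and the pattern is not the grammar from
StandardTokenizerImpl.jflex. In practice the body of tokenize would call the
rebuilt tokenizer.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class TokenizerRuleCheck {
    // Hypothetical stand-in for a modified rule set: a NUM-like class
    // (digit groups joined by '.' or ',') is tried before plain
    // letter/digit runs. This is NOT the StandardTokenizerImpl.jflex grammar.
    private static final Pattern TOKEN =
        Pattern.compile("\\d+(?:[.,]\\d+)*|[\\p{L}\\p{N}]+");

    // Stand-in tokenizer: returns every match of TOKEN, in order.
    // Replace this body with a call to the rebuilt tokenizer.
    static List<String> tokenize(String text) {
        List<String> tokens = new ArrayList<String>();
        Matcher m = TOKEN.matcher(text);
        while (m.find()) {
            tokens.add(m.group());
        }
        return tokens;
    }

    // True when the tokenizer output equals the expected token sequence.
    static boolean check(String input, String... expected) {
        return tokenize(input).equals(Arrays.asList(expected));
    }

    public static void main(String[] args) {
        // One case per behaviour the changed rules are meant to have.
        System.out.println(check("price 1,234.50 usd", "price", "1,234.50", "usd"));
        System.out.println(check("v2 build", "v2", "build"));
    }
}
```

Each check call pins down one behaviour you expect from the new rules, so a
later grammar edit that changes it shows up as a failing case.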


Re: solr.StandardTokenizerFactory: more info needed

2011-07-06 Thread Dmitry Kan
Thanks, Erick.

-- 
Regards,

Dmitry Kan