Re: FastSSFuzzy for faster fuzzy queries in Lucene

2009-01-05 Thread Robert Muir
got a list of people's names we want to do spellchecking on for example. -J -- Robert Muir rcm...@gmail.com

Re: FastSSFuzzy for faster fuzzy queries in Lucene

2009-01-06 Thread Robert Muir
bo...@ifi.uzh.ch wrote: Hi Robert, Robert Muir wrote: hi, I'm actually working on doing just this (though I haven't created a jira ticket). the way i have it working is by creating a secondary lucene index. the size of this secondary index is determined primarily by number of unique

fastssfuzzy code

2009-01-06 Thread Robert Muir
be helpful :) -- Robert Muir rcm...@gmail.com

Re: Partial / starts with searching

2009-02-13 Thread Robert Muir
- To unsubscribe, e-mail: java-user-unsubscr...@lucene.apache.org For additional commands, e-mail: java-user-h...@lucene.apache.org -- Robert Muir rcm...@gmail.com

Re: Hebrew and Hindi analyzers

2009-02-17 Thread Robert Muir
-- Robert Muir rcm...@gmail.com

Re: Regarding ArabicLetterTokenizer and the StandardTokenizer - best of both worlds!

2009-02-20 Thread Robert Muir
-- Robert Muir rcm...@gmail.com

Re: 2.3.2 - 2.4.0 StandardTokenizer issue

2009-02-21 Thread Robert Muir
instead of 0..1 conversions we'd be doing 1..2 conversions during indexing and searching. -Original Message- From: Robert Muir [mailto:rcm...@gmail.com] Sent: Saturday, February 21, 2009 8:35 AM To: java-user@lucene.apache.org Subject: Re: 2.3.2 - 2.4.0 StandardTokenizer issue normalize

Re: i18n numbers

2009-03-26 Thread Robert Muir
-- Robert Muir rcm...@gmail.com

Re: i18n numbers

2009-03-27 Thread Robert Muir
. - To unsubscribe, e-mail: java-user-unsubscr...@lucene.apache.org For additional commands, e-mail: java-user-h...@lucene.apache.org -- Robert Muir rcm...@gmail.com -- View this message in context: http://www.nabble.com/i18n-numbers-tp22731528p22736807.html Sent

Re: Encoding detection free software?

2009-03-27 Thread Robert Muir
-- Robert Muir rcm...@gmail.com

Re: Faceting, Sort and DocIDSet

2009-04-20 Thread Robert Muir
-- Robert Muir rcm...@gmail.com

Re: Question wrt Lucene analyzer for different language

2009-05-14 Thread Robert Muir
or ArabicAnalyzer), is it still able to handle it correctly if user info is mixed with English characters/words? Any answers are really appreciated. :-) -- Robert Muir rcm...@gmail.com

Re: Question wrt Lucene analyzer for different language

2009-05-14 Thread Robert Muir
and English? :-) I am not really familiar with the Arabic language. What do you mean by change Arabic tokens? Does Arabic have something like upper/lower case as English does? On Thu, May 14, 2009 at 10:47 AM, Robert Muir rcm...@gmail.com wrote: in the case of ArabicAnalyzer it will only change Arabic

Re: How to query/search unicoded docs in lucene using unicode text as query?

2009-05-21 Thread Robert Muir
is going wrong here? May be I've to have a look over the analyzer solr was using in the default setting[i used the default setting only, and pretty sure it was using lot many analyzers/filter factory]. Thanks for all your time and appreciation. Thanks, KK. -- Robert Muir rcm...@gmail.com

Re: How to support stemming and case folding for english content mixed with non-english content?

2009-06-03 Thread Robert Muir
mention that we don't have stemming and case folding for this non-English content. I'm stuck with this. Someone please let me know how to proceed with fixing this issue. Thanks, KK. -- Robert Muir rcm...@gmail.com

Re: How to support stemming and case folding for english content mixed with non-english content?

2009-06-04 Thread Robert Muir
way of approaching the problem? Any thoughts! Thanks, KK. On Wed, Jun 3, 2009 at 9:42 PM, Robert Muir rcm...@gmail.com wrote: KK, is all of your latin script text actually english? Is there stuff like german or french mixed in? And for your non-english content (your examples have

Re: How to support stemming and case folding for english content mixed with non-english content?

2009-06-04 Thread Robert Muir
eMail: u...@thetaphi.de -Original Message- From: Robert Muir [mailto:rcm...@gmail.com] Sent: Thursday, June 04, 2009 1:18 PM To: java-user@lucene.apache.org Subject: Re: How to support stemming and case folding for english content mixed with non-english content? KK, ok, so

Re: How to support stemming and case folding for english content mixed with non-english content?

2009-06-04 Thread Robert Muir
to lucene and know basics of Java coding. Thank you very much. --KK. On Thu, Jun 4, 2009 at 5:30 PM, Robert Muir rcm...@gmail.com wrote: yes this is true. for starters KK, might be good to startup solr and look at http://localhost:8983/solr/admin/analysis.jsp?highlight=on if you want

Re: How to support stemming and case folding for english content mixed with non-english content?

2009-06-04 Thread Robert Muir
-- Robert Muir rcm...@gmail.com

Re: How to support stemming and case folding for english content mixed with non-english content?

2009-06-04 Thread Robert Muir
be that difficult. BTW, can you point me to some sample code/tutorials on writing custom analyzers? I could not find anything in LIA2ndEdn. Is something there? Do let me know. Thanks, KK. On Thu, Jun 4, 2009 at 6:19 PM, Robert Muir rcm...@gmail.com wrote: KK, for your case, you don't really need

Re: How to support stemming and case folding for english content mixed with non-english content?

2009-06-05 Thread Robert Muir
this sound OK? I think it will do the job...let me try it out.. I dont need custom filter as per my requirement, at least not for these basic things I'm doing? I think so... Thanks, KK. On Thu, Jun 4, 2009 at 6:36 PM, Robert Muir rcm...@gmail.com wrote: KK well you can always get some good

Re: How to support stemming and case folding for english content mixed with non-english content?

2009-06-05 Thread Robert Muir
a lot. KK On Fri, Jun 5, 2009 at 5:30 PM, Robert Muir rcm...@gmail.com wrote: i think you are on the right track... once you build your analyzer, put it in your classpath and play around with it in luke and see if it does what you want. On Fri, Jun 5, 2009 at 3:19 AM, KK

Re: How to support stemming and case folding for english content mixed with non-english content?

2009-06-05 Thread Robert Muir
the use of the other one. Anyway can you guide me getting rid of the above error. And yes I'll change the order of applying the filters as you said. Thanks, KK. On Fri, Jun 5, 2009 at 5:48 PM, Robert Muir rcm...@gmail.com wrote: KK, you got the right idea. though I think you

Re: How to support stemming and case folding for english content mixed with non-english content?

2009-06-05 Thread Robert Muir
me getting rid of the above error. And yes I'll change the order of applying the filters as you said. Thanks, KK. On Fri, Jun 5, 2009 at 5:48 PM, Robert Muir rcm...@gmail.com wrote: KK, you got the right idea. though I think you might want to change the order, move the stopfilter

Re: How to support stemming and case folding for english content mixed with non-english content?

2009-06-06 Thread Robert Muir
to get rid of that? Any thoughts? It seems setting those values in some proper way might fix the problem, I'm not sure, though. Thanks, KK. On Fri, Jun 5, 2009 at 7:37 PM, Robert Muir rcm...@gmail.com wrote: kk an easier solution to your first problem is to use worddelimiterfilterfactory

Re: How to support stemming and case folding for english content mixed with non-english content?

2009-06-08 Thread Robert Muir
, Robert Muir rcm...@gmail.com wrote: kk, i haven't had that experience with worddelimiterfilter on indian languages, is it possible you could provide me an example of how its creating nuisance? On Sat, Jun 6, 2009 at 9:42 AM, KK dioxide.softw...@gmail.com wrote: Robert, I tried to use

Re: Lucene and multi-lingual Unicode - advice needed

2009-06-15 Thread Robert Muir
? Thanks in advance for advice :) -- Robert Muir rcm...@gmail.com

Re: Lucene and multi-lingual Unicode - advice needed

2009-06-15 Thread Robert Muir
that encapsulates as much low level indexing/search technology as possible and have it integrate nicely with Spring. It looked like Compass was/is a good encapsulation of the functionality. I'll take a look at SolR though, thanks for the pointer. -Original Message- From: Robert Muir [mailto:rcm

Re: Lucene and multi-lingual Unicode - advice needed

2009-06-15 Thread Robert Muir
what is required (at a minimum) to build an analyzer, sandbox has a few of them varying in complexity. -Original Message- From: Robert Muir [mailto:rcm...@gmail.com] Sent: Monday, June 15, 2009 4:51 PM To: java-user@lucene.apache.org Subject: Re: Lucene and multi-lingual Unicode

Re: Lucene and multi-lingual Unicode - advice needed

2009-06-15 Thread Robert Muir
Xhosa Yiddish Yoruba Zulu -Original Message- From: Robert Muir [mailto:rcm...@gmail.com] Sent: Monday, June 15, 2009 5:56 PM To: java-user@lucene.apache.org Subject: Re: Lucene and multi-lingual Unicode - advice needed its not too bad, here would be a simple one that only breaks words

Re: Lucene and multi-lingual Unicode - advice needed

2009-06-15 Thread Robert Muir
of Eastern and Eastern European ones and of course CJK. -Original Message- From: Robert Muir [mailto:rcm...@gmail.com] Sent: Monday, June 15, 2009 9:52 PM To: java-user@lucene.apache.org Subject: Re: Lucene and multi-lingual Unicode - advice needed Really, you have a requirement

Re: Lucene and multi-lingual Unicode - advice needed

2009-06-16 Thread Robert Muir
-- Robert Muir rcm...@gmail.com

Re: Hindi, diacritics and search results

2009-07-10 Thread Robert Muir
) that includes that letter. Any comments much appreciated. Thanks. -- Robert Muir rcm...@gmail.com

Re: Hindi, diacritics and search results

2009-07-10 Thread Robert Muir
that would be used in Lucene by default. Which one should I use? -Original Message- From: Robert Muir [mailto:rcm...@gmail.com] Sent: Friday, July 10, 2009 6:13 PM To: java-user@lucene.apache.org Subject: Re: Hindi, diacritics and search results Which analyzer in particular are you using

Re: Search in non-linguistic text

2009-07-16 Thread Robert Muir
-- Robert Muir rcm...@gmail.com

Re: question on custom filter

2009-07-20 Thread Robert Muir
the right numbers for left to right languages and it is a bit more challenging to do it for right to left ones but for mixed text it is quite hard. Thanks. -- Robert Muir rcm...@gmail.com

Re: question on custom filter

2009-07-20 Thread Robert Muir
something? -Original Message- From: Robert Muir [mailto:rcm...@gmail.com] Sent: Monday, July 20, 2009 1:43 PM To: java-user@lucene.apache.org Subject: Re: question on custom filter Obender, I don't think its as difficult as you think. Your filter does not need to be aware of this issue

Re: question on custom filter

2009-07-20 Thread Robert Muir
: This is how it should be written: http://unicode.org/cldr/utility/transform.jsp?a=name&b=%D7%A2%D6%B6%D7%A8%D6%B6%D7%91+%D7%98%D7%95%D6%B9%D7%91 -Original Message- From: Robert Muir [mailto:rcm...@gmail.com] Sent: Monday, July 20, 2009 2:07 PM To: java-user@lucene.apache.org Subject

Re: question on custom filter

2009-07-20 Thread Robert Muir
) { TokenStream ts = new WhitespaceTokenizer( reader ); ts = new XFilter( ts ); return ts; } } -Original Message- From: Robert Muir [mailto:rcm...@gmail.com] Sent: Monday, July 20, 2009 2:26 PM To: java-user@lucene.apache.org

Re: question on custom filter

2009-07-20 Thread Robert Muir
; } } -Original Message- From: Robert Muir [mailto:rcm...@gmail.com] Sent: Monday, July 20, 2009 2:26 PM To: java-user@lucene.apache.org Subject: Re: question on custom filter Obender, I think something in your environment / display environment might be causing some confusion. Are you using

Re: question on custom filter

2009-07-20 Thread Robert Muir
: Interesting, the question now is why am I seeing (even in println) what I'm seeing :) I'm reading a string from the file which is in UTF-8 encoding. Could this somehow be related...? -Original Message- From: Robert Muir [mailto:rcm...@gmail.com] Sent: Monday, July 20, 2009 3:03 PM To: java

Re: question on custom filter

2009-07-20 Thread Robert Muir
- From: Robert Muir [mailto:rcm...@gmail.com] Sent: Monday, July 20, 2009 3:18 PM To: java-user@lucene.apache.org Subject: Re: question on custom filter Obender, based on your previous comments (that you see text displayed in the wrong order), I again recommend that you enable support for RTL

Re: arabic analyzer

2009-07-23 Thread Robert Muir
-- Robert Muir rcm...@gmail.com

Re: arabic analyzer

2009-07-24 Thread Robert Muir
one, we only got matches for: | فّ فُ فٌ فف فِِ فٍ ف  and the likes of that. -walid On Thu, 2009-07-23 at 09:33 -0400, Robert Muir wrote: walid, can you provide any more information other than very poor result? Others have not measured much difference between morphological analysis

Re: Quick question about Lucene and UCS4

2009-07-31 Thread Robert Muir
-- Robert Muir rcm...@gmail.com

Re: Quick question about Lucene and UCS4

2009-07-31 Thread Robert Muir
-- Robert Muir rcm...@gmail.com

Re: Quick question about Lucene and UCS4

2009-07-31 Thread Robert Muir
-- Robert Muir rcm...@gmail.com

Re: arabic analyzer

2009-08-02 Thread Robert Muir
that. -- Robert Muir rcm...@gmail.com

Re: arabic analyzer

2009-08-03 Thread Robert Muir
of the insert type and not the prefix/suffix. thank you :) -walid On Sun, 2009-08-02 at 15:08 -0400, Robert Muir wrote: the fact is, plural (as an example) is not supported, and that is one of the most common things that a person doing some search will expect to Walid, I'm not sure

Re: Language Detection for Analysis?

2009-08-06 Thread Robert Muir
on language detection so we can figure out what analyzers to use? Are there commercial solutions? Much appreciated! -- http://www.roadtofailure.com -- The Fringes of Scalability, Social Media, and Computer Science -- Robert Muir rcm...@gmail.com

Re: Language Detection for Analysis?

2009-08-06 Thread Robert Muir
it at the script level? On Thu, Aug 6, 2009 at 10:55 PM, Robert Muir rcm...@gmail.com wrote: Bradford, there is an arabic analyzer in trunk. for farsi there is currently a patch available: http://issues.apache.org/jira/browse/LUCENE-1628 one option is not to detect languages at all. it could

Re: Any Tokenizator friendly to C++, C#, .NET, etc ?

2009-08-20 Thread Robert Muir
-- Robert Muir rcm...@gmail.com

Re: Any Tokenizator friendly to C++, C#, .NET, etc ?

2009-08-20 Thread Robert Muir
for this task?.. regards Valery Robert Muir wrote: Valery, One thing you could try would be to create a JFlex-based tokenizer, specifying a grammar with the rules you want. You could use the source code grammar of StandardTokenizer as a starting point. On Thu, Aug 20, 2009 at 10:28 AM

Re: Any Tokenizator friendly to C++, C#, .NET, etc ?

2009-08-21 Thread Robert Muir
list archive at Nabble.com. -- Robert Muir rcm...@gmail.com

Re: Can this regex be done?

2009-09-02 Thread Robert Muir
-- Robert Muir rcm...@gmail.com

Re: Can this regex be done?

2009-09-03 Thread Robert Muir
-- Robert Muir rcm...@gmail.com

Re: Best way to create own version of StandardTokenizer ?

2009-09-04 Thread Robert Muir
see a list of the changes here: http://www.unicode.org/versions/Unicode4.0.0/ -- Robert Muir rcm...@gmail.com

Re: Best way to create own version of StandardTokenizer ?

2009-09-04 Thread Robert Muir
-- Robert Muir rcm...@gmail.com

Re: Best way to create own version of StandardTokenizer ?

2009-09-04 Thread Robert Muir
they can be incorporated in (maybe not there, but somewhere). On Fri, Sep 4, 2009 at 3:41 PM, Paul Taylor paul_t...@fastmail.fm wrote: Robert Muir wrote: Paul, thanks for the examples. In my opinion, only one of these is a tokenizer problem :) none of these will be affected by a unicode upgrade

Re: Best way to create own version of StandardTokenizer ?

2009-09-07 Thread Robert Muir
#Inaccessible_punctuation -- Robert Muir rcm...@gmail.com

Re: Best way to create own version of StandardTokenizer ?

2009-09-07 Thread Robert Muir
On Mon, Sep 7, 2009 at 10:47 AM, Paul Taylor paul_t...@fastmail.fm wrote: Robert Muir wrote: I think we would like to implement the complete unicode rules, so if you could provide us with some code that would be great. ok, I will followup... what version of lucene are you using, 2.9

Re: Search with whitespaces

2009-09-25 Thread Robert Muir
-- Alex -- Robert Muir rcm...@gmail.com

Re: Problem searching non analyzed fields

2009-09-29 Thread Robert Muir
-- Robert Muir rcm...@gmail.com

Re: Search with whitespaces

2009-09-29 Thread Robert Muir
...@gmail.com wrote: To use ShingleFilter, I'd like to change its TOKEN_SEPARATOR, but it's final. Furthermore, I tried to compile its source code but the compiler isn't finding some methods like addAttribute. Does someone know how I could do that? Alex On Fri, Sep 25, 2009 at 2:42 PM, Robert Muir rcm

Re: Parsing Error while indexing in Lucene WordNet package

2009-10-21 Thread Robert Muir
, but wordnet package included in it still has the same problem given above.* -- Parag H. Dave -- Robert Muir rcm...@gmail.com

Re: Using org.apache.lucene.analysis.compound

2009-10-21 Thread Robert Muir
as a concept than, say, Fleischüberwachung? (again, that could be based on a dictionary probably). thanks in advance paul Le 21-oct.-09 à 04:00, Robert Muir a écrit : hi, it will work because it will also decompound Rindfleisch into Rind and fleisch, with posIncr=0 so if you index
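
The decompounding idea in the snippet above can be sketched in plain Java, without Lucene: the compound keeps its own position, and each dictionary sub-word is emitted at the same position (position increment 0). This is only an illustration under stated assumptions; `CompoundSketch`, `Tok` and the two-word dictionary are hypothetical names, not Lucene's compound filter API.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Locale;
import java.util.Set;

public class CompoundSketch {
    /** A term plus its position increment relative to the previous token. */
    public static final class Tok {
        public final String term;
        public final int posIncr;
        public Tok(String term, int posIncr) { this.term = term; this.posIncr = posIncr; }
        @Override public String toString() { return term + "(+" + posIncr + ")"; }
    }

    /** Emits the word itself (posIncr=1), then every dictionary entry found inside it (posIncr=0). */
    public static List<Tok> decompound(String word, Set<String> dictionary) {
        List<Tok> out = new ArrayList<>();
        String lower = word.toLowerCase(Locale.ROOT);
        out.add(new Tok(lower, 1));
        for (int start = 0; start < lower.length(); start++) {
            for (int end = start + 1; end <= lower.length(); end++) {
                String part = lower.substring(start, end);
                // Skip the whole word; it has already been emitted above.
                if (!part.equals(lower) && dictionary.contains(part)) {
                    out.add(new Tok(part, 0));
                }
            }
        }
        return out;
    }

    public static void main(String[] args) {
        // Hypothetical toy dictionary for illustration only.
        Set<String> dict = new HashSet<>(Arrays.asList("rind", "fleisch"));
        // prints [rindfleisch(+1), rind(+0), fleisch(+0)]
        System.out.println(decompound("Rindfleisch", dict));
    }
}
```

Because the sub-words share the compound's position, a query for either "rind" or "fleisch" can match a document that only contains "Rindfleisch".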

Re: Parsing Error while indexing in Lucene WordNet package

2009-10-21 Thread Robert Muir
tycoon *Recently lucene-2.9.0 has been released, but wordnet package included in it still has the same problem given above.* -- Parag H. Dave -- Robert Muir rcm...@gmail.com

Re: Using org.apache.lucene.analysis.compound

2009-10-21 Thread Robert Muir
mechanism that would rank überwachungsgesetz higher than gesetzüberwachung or fleischgesetz? -- Robert Muir rcm...@gmail.com

Re: Using org.apache.lucene.analysis.compound

2009-10-21 Thread Robert Muir
:09 PM, Benjamin Douglas bbdoug...@basistech.com wrote: OK, that makes sense. So I just need to add all of the sub-compounds that are real words at posIncr=0, even if they are combinations of other sub-compounds. Thanks! -Original Message- From: Robert Muir [mailto:rcm...@gmail.com

Re: Using org.apache.lucene.analysis.compound

2009-10-21 Thread Robert Muir
or? paul Le 21-oct.-09 à 21:09, Benjamin Douglas a écrit : OK, that makes sense. So I just need to add all of the sub-compounds that are real words at posIncr=0, even if they are combinations of other sub-compounds. Thanks! -Original Message- From: Robert Muir [mailto:rcm

Re: Using org.apache.lucene.analysis.compound

2009-10-21 Thread Robert Muir
, Paul Libbrecht p...@activemath.org wrote: Great, now the next question: which dictionary do you guys use? How big can it be? Is 5 words acceptable? paul Le 21-oct.-09 à 21:23, Robert Muir a écrit : Paul, i think in general scoring should take care of this too, its all about

Re: Split single string into several fields?

2009-10-27 Thread Robert Muir
-- Robert Muir rcm...@gmail.com

Re: IO exception during merge/optimize

2009-10-28 Thread Robert Muir
-- Robert Muir rcm...@gmail.com

Re: IO exception during merge/optimize

2009-10-28 Thread Robert Muir
thats exactly the result I saw FWIW On Wed, Oct 28, 2009 at 11:25 AM, Michael McCandless luc...@mikemccandless.com wrote: Right, I would expect Lucene would silently truncate the term at the U+, and not lead to this odd exception. Mike On Wed, Oct 28, 2009 at 11:23 AM, Robert Muir rcm

Re: scoring adjacent terms without proximity search

2009-10-30 Thread Robert Muir
-- Robert Muir rcm...@gmail.com

Re: scoring adjacent terms without proximity search

2009-10-30 Thread Robert Muir
injecting additional tokens will cause more harm than good, because it has an adverse effect on lengthnorm. -- Robert Muir rcm...@gmail.com

Re: Questions about SEN patch submissions

2009-11-09 Thread Robert Muir
repost. Thanks, Mark -- Mark Bennett / New Idea Engineering, Inc. / mbenn...@ideaeng.com Direct: 408-733-0387 / Main: 866-IDEA-ENG / Cell: 408-829-6513 -- Robert Muir rcm...@gmail.com

Re: Questions about SEN patch submissions

2009-11-09 Thread Robert Muir
:01 PM, Mark Bennett mbenn...@ideaeng.com wrote: Hello Robert, On Mon, Nov 9, 2009 at 12:34 PM, Robert Muir rcm...@gmail.com wrote: Mark, has there been any change to the LGPL dependency? On Mon, Nov 9, 2009 at 2:55 PM, Mark Bennett mbenn...@ideaeng.com wrote: The only code I'm

Re: Questions about SEN patch submissions

2009-11-09 Thread Robert Muir
. Open Source License Incompatibility Issues, hm... -- Mark Bennett / New Idea Engineering, Inc. / mbenn...@ideaeng.com Direct: 408-733-0387 / Main: 866-IDEA-ENG / Cell: 408-829-6513 On Mon, Nov 9, 2009 at 1:07 PM, Robert Muir rcm...@gmail.com wrote: Mark, I think my concern is that Sen itself

Re: Questions about SEN patch submissions

2009-11-09 Thread Robert Muir
the author. -- Mark Bennett / New Idea Engineering, Inc. / mbenn...@ideaeng.com Direct: 408-733-0387 / Main: 866-IDEA-ENG / Cell: 408-829-6513 On Mon, Nov 9, 2009 at 2:49 PM, Robert Muir rcm...@gmail.com wrote: Hi Mark, I think apache 2.0 would be easiest. But I think BSD also works. Its

Re: Questions about SEN patch submissions

2009-11-09 Thread Robert Muir
trying to minimize the hassle factor for him. -- Mark Bennett / New Idea Engineering, Inc. / mbenn...@ideaeng.com Direct: 408-733-0387 / Main: 866-IDEA-ENG / Cell: 408-829-6513 On Mon, Nov 9, 2009 at 2:56 PM, Robert Muir rcm...@gmail.com wrote: if he is ok with it i think we need to setup

Re: Questions about SEN patch submissions

2009-11-09 Thread Robert Muir
served on the dev list? I've been a bit confused about that in the past (on the similar named Solr lists) Mark -- Mark Bennett / New Idea Engineering, Inc. / mbenn...@ideaeng.com Direct: 408-733-0387 / Main: 866-IDEA-ENG / Cell: 408-829-6513 On Mon, Nov 9, 2009 at 3:13 PM, Robert Muir rcm

Re: Questions about SEN patch submissions

2009-11-09 Thread Robert Muir
Marvin, in this case its the same folks: https://sen.dev.java.net/servlets/ProjectDocumentList?folderID=755&expandFolder=755&folderID=0 ... dunno if that matters On Mon, Nov 9, 2009 at 7:02 PM, Marvin Humphrey mar...@rectangular.com wrote: On Mon, Nov 09, 2009 at 04:07:55PM -0500, Robert Muir

Re: Questions about SEN patch submissions

2009-11-09 Thread Robert Muir
, Robert Muir wrote: Mark, I think my concern is that Sen itself is LGPL ( https://sen.dev.java.net/). this lucene-ja is just a lucene interface to this LGPL library. I think this dependency might be a problem, but I am not the expert: http://www.apache.org/legal/resolved.html#category

Re: Redundant fields Token class?

2009-11-13 Thread Robert Muir
-- Robert

Re: Redundant fields Token class?

2009-11-13 Thread Robert Muir
. On Fri, Nov 13, 2009 at 4:20 PM, Robert Muir rcm...@gmail.com wrote: Another example is if you used a stemmer, it might change the termLength: (walking - walk), but the offsets of the original unstemmed word (walking) stay the same. On Fri, Nov 13, 2009 at 6:01 PM, Uwe Schindler u
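
The point made in the snippet above (term length and offsets are not redundant) can be shown with a small plain-Java sketch. `OffsetSketch`, `Tok` and the suffix-stripping `stem` method are hypothetical stand-ins for illustration, not Lucene's Token class or a real stemmer.

```java
public class OffsetSketch {
    /** A term together with the character offsets of the original word in the source text. */
    public static final class Tok {
        public final String term;
        public final int startOffset;
        public final int endOffset;
        public Tok(String term, int startOffset, int endOffset) {
            this.term = term;
            this.startOffset = startOffset;
            this.endOffset = endOffset;
        }
    }

    /** Toy stemmer: strips a trailing "ing" from longer words and leaves the offsets untouched. */
    public static Tok stem(Tok in) {
        String t = in.term;
        if (t.length() > 5 && t.endsWith("ing")) {
            t = t.substring(0, t.length() - 3);
        }
        return new Tok(t, in.startOffset, in.endOffset);
    }

    public static void main(String[] args) {
        String text = "go walking";
        Tok original = new Tok("walking", 3, 10);
        Tok stemmed = stem(original);
        System.out.println(stemmed.term);                                          // walk
        System.out.println(stemmed.term.length());                                 // 4
        System.out.println(stemmed.endOffset - stemmed.startOffset);               // 7
        System.out.println(text.substring(stemmed.startOffset, stemmed.endOffset)); // walking
    }
}
```

The offsets still point at the unstemmed word in the source text, which is what a highlighter needs, while the term length reflects what was actually indexed.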

Re: Polishing up my Lucene integration, customizing analyzer

2009-11-15 Thread Robert Muir
-- Robert Muir rcm...@gmail.com

Re: ChainedFilter in Lucene 2.9

2009-11-19 Thread Robert Muir
. But in Lucene 2.9, I can't find ChainedFilter anywhere. Is there still a way to do this? It's crucial for my application. Thanks! - Mike aka...@gmail.com -- Robert Muir rcm...@gmail.com

Re: LowerCaseFilter fails one letter (I) of Turkish alphabet

2009-11-30 Thread Robert Muir
-- Robert Muir rcm...@gmail.com

Re: LowerCaseFilter fails one letter (I) of Turkish alphabet

2009-11-30 Thread Robert Muir
as LowerCaseFilter. -- Robert Muir rcm...@gmail.com

Re: LowerCaseFilter fails one letter (I) of Turkish alphabet

2009-11-30 Thread Robert Muir
-- Robert Muir rcm...@gmail.com

Re: LowerCaseFilter fails one letter (I) of Turkish alphabet

2009-11-30 Thread Robert Muir
noticed CKF creates a String out of the char[]. If the code already does that, why not use String.toLowerCase(Locale)? Shai On Mon, Nov 30, 2009 at 9:46 PM, Simon Willnauer simon.willna...@googlemail.com wrote: On Mon, Nov 30, 2009 at 8:08 PM, Robert Muir rcm...@gmail.com wrote: I am
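
The `String.toLowerCase(Locale)` suggestion in the snippet above relies on standard JDK behaviour, which a short sketch can show: with a Turkish locale, capital I lowercases to dotless ı (U+0131), while locale-independent lowercasing gives i. The class name `TurkishLowerCase` is hypothetical; whether this is the right fix inside a Lucene filter is the open question of the thread, and this sketch does not settle it.

```java
import java.util.Locale;

public class TurkishLowerCase {
    /** Thin wrapper so the locale-sensitive call is explicit. */
    public static String lower(String s, Locale locale) {
        return s.toLowerCase(locale);
    }

    public static void main(String[] args) {
        Locale turkish = Locale.forLanguageTag("tr");
        String word = "A\u011EACI"; // "AĞACI", written with escapes to avoid source-encoding issues

        // Locale-independent lowercasing maps 'I' to 'i'.
        System.out.println(lower(word, Locale.ROOT).equals("a\u011Faci"));   // true
        // Turkish lowercasing maps 'I' to dotless 'ı' (U+0131).
        System.out.println(lower(word, turkish).equals("a\u011Fac\u0131")); // true
    }
}
```

The trade-off is that locale-sensitive lowercasing has to be applied consistently at index time and query time, otherwise the same word produces different terms.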

Re: LowerCaseFilter fails one letter (I) of Turkish alphabet

2009-11-30 Thread Robert Muir
}); assertAnalyzesTo(analyzer, AĞACI, new String[] { ağaci }); } -- Robert Muir rcm...@gmail.com

Re: LowerCaseFilter fails one letter (I) of Turkish alphabet

2009-11-30 Thread Robert Muir
included lots of folding tables, except for http://unicode.org/Public/UNIDATA/CaseFolding.txt. I guess I counted on LowerCaseFilter too much. Is that the table you're working w/ in LUCENE-1488? I assume you use more of course :) Shai On Mon, Nov 30, 2009 at 10:00 PM, Robert Muir rcm

Re: LowerCaseFilter fails one letter (I) of Turkish alphabet

2009-11-30 Thread Robert Muir
'go look in the stringbuffer') Also, in ICUCaseFoldingFilter, I believe termAtt can be declared final? yeah, probably some other things too, thanks :) Thanks, Shai On Mon, Nov 30, 2009 at 10:46 PM, Robert Muir rcm...@gmail.com wrote: Shai, no, behind the scenes I am using just

Re: LowerCaseFilter fails one letter (I) of Turkish alphabet

2009-12-01 Thread Robert Muir
want. Thank you for your consideration. Ahmet -- Robert Muir rcm...@gmail.com

Re: Lucene-3.0.0 web demo problem

2009-12-07 Thread Robert Muir
, Brian -- Robert Muir rcm...@gmail.com

Re: How to do alias(Pinyin) search in Lucene

2009-12-15 Thread Robert Muir
containing 中国 or even Chinese Anybody here know how to achieve this? -- Weiwei Wang Alex Wang 王巍巍 Room 403, Mengmin Wei Building Computer Science Department Gulou Campus of Nanjing University Nanjing, P.R.China, 210093 Homepage: http://cs.nju.edu.cn/rl/weiweiwang -- Robert Muir

Re: How to do alias(Pinyin) search in Lucene

2009-12-15 Thread Robert Muir
look at the latest patch file attached to the issue, it should work with lucene 2.9 or greater (I think) 2009/12/15 Weiwei Wang ww.wang...@gmail.com where can i find the source code? On Tue, Dec 15, 2009 at 9:40 PM, Robert Muir rcm...@gmail.com wrote: there is an icu transform tokenfilter

Re: How to do alias(Pinyin) search in Lucene

2009-12-15 Thread Robert Muir
PM, Robert Muir rcm...@gmail.com wrote: look at the latest patch file attached to the issue, it should work with lucene 2.9 or greater (I think) 2009/12/15 Weiwei Wang ww.wang...@gmail.com where can i find the source code? On Tue, Dec 15, 2009 at 9:40 PM, Robert Muir rcm
