Re: DocumentWriter, StopFilter should use HashMap... (patch)

2004-03-14 Thread Stephane James Vaucher
Just found the rest of the thread. I'll shut up now ;) sv On Sun, 14 Mar 2004, Stephane James Vaucher wrote: Back from a week's vacation, so this reply is a little late, maybe out of order as well ;). Comment inline: On Tue, 9 Mar 2004, Kevin A. Burton wrote: Doug Cutting wrote:

Re: DocumentWriter, StopFilter should use HashMap... (patch)

2004-03-11 Thread Scott ganyo
I don't buy it. HashSet is but one implementation of a Set. By choosing the HashSet implementation you are not only tying the class to a hash-based implementation, you are tying the interface to *that specific* hash-based implementation or its subclasses. In the end, either you buy the

Re: DocumentWriter, StopFilter should use HashMap... (patch)

2004-03-11 Thread Doug Cutting
Erik Hatcher wrote: Yes, I saw it. But is there a reason not to just expose HashSet given that it is the data structure that is most efficient? I bought into Kevin's arguments that it made sense to just expose HashSet. Just the general principle that one shouldn't expose more of the

Re: DocumentWriter, StopFilter should use HashMap... (patch)

2004-03-11 Thread Erik Hatcher
I will refactor again using Set with no copying this time (except for the String[] and Hashtable) constructors. This was my original preference, but I got caught up in the arguments by Kevin and lost my ideals temporarily :) I expect to do this later tonight or tomorrow. Erik On Mar 11,

Re: DocumentWriter, StopFilter should use HashMap... (patch)

2004-03-11 Thread Kevin A. Burton
Scott ganyo wrote: I don't buy it. HashSet is but one implementation of a Set. By choosing the HashSet implementation you are not only tying the class to a hash-based implementation, you are tying the interface to *that specific* hash-based implementation or its subclasses. In the end,

Re: DocumentWriter, StopFilter should use HashMap... (patch)

2004-03-11 Thread Kevin A. Burton
Erik Hatcher wrote: I will refactor again using Set with no copying this time (except for the String[] and Hashtable) constructors. This was my original preference, but I got caught up in the arguments by Kevin and lost my ideals temporarily :) I expect to do this later tonight or tomorrow.

Re: DocumentWriter, StopFilter should use HashMap... (patch)

2004-03-11 Thread Erik Hatcher
Part of the dilemma of which implementation is actually used will be solved implicitly since our function to construct the Set will return a HashSet - and this will surely be the method most folks would use. But I will be sure to note in the Javadoc that the implementation of the Set is
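
The trade-off discussed above (declare the interface, default to a hash-based implementation) might be sketched as follows. This is only an illustration in modern Java with generics, not the code that was committed; the class name SetBackedStopWords and the factory name buildStopSet are hypothetical.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Set;

// Hypothetical stand-in for a stop-word holder; not Lucene's StopFilter.
final class SetBackedStopWords {
    private final Set<String> stopWords;

    // Accepts any Set implementation and keeps the reference as-is,
    // so no copy is made when the caller already has a Set.
    SetBackedStopWords(Set<String> stopWords) {
        this.stopWords = stopWords;
    }

    // Hypothetical factory: the declared return type is Set, while the
    // implementation handed back happens to be a HashSet.
    static Set<String> buildStopSet(String... words) {
        return new HashSet<>(Arrays.asList(words));
    }

    boolean isStopWord(String term) {
        return stopWords.contains(term);
    }
}
```

Callers who go through the factory get a HashSet without naming it, while a caller with a different Set implementation (a TreeSet, say) can still pass it in unchanged.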

Re: DocumentWriter, StopFilter should use HashMap... (patch)

2004-03-10 Thread Erik Hatcher
On Mar 9, 2004, at 10:23 PM, Kevin A. Burton wrote: You need to make it a HashSet: table = new HashSet( stopTable.keySet() ); Done. Also... while you're at it... the private variable name is 'table' which this HashSet certainly is *not* ;) Well, depends on your definition of 'table' I suppose

Re: DocumentWriter, StopFilter should use HashMap... (patch)

2004-03-10 Thread Kevin A. Burton
Erik Hatcher wrote: Also... while you're at it... the private variable name is 'table' which this HashSet certainly is *not* ;) Well, depends on your definition of 'table' I suppose :) I changed it to a type-agnostic stopWords. Did you know that internally HashSet uses a HashMap? I sure

Re: DocumentWriter, StopFilter should use HashMap... (patch)

2004-03-10 Thread Erik Hatcher
On Mar 10, 2004, at 2:59 PM, Kevin A. Burton wrote: I refuse to expose HashSet... sorry! :) But I did wrap what is passed in, like above, in a HashSet in my latest commit. Hm... You're doing this EVEN if the caller passes a HashSet directly?! Well it was in the ctor. But I guess I'm not seeing

Re: DocumentWriter, StopFilter should use HashMap... (patch)

2004-03-10 Thread Doug Cutting
Erik Hatcher wrote: Also... your HashSet constructor has to copy values from the original HashSet into the new HashSet ... not very clean and this can just be removed by forcing the caller to use a HashSet (which they should). I've caved in and gone HashSet all the way. Did you not see my

Re: DocumentWriter, StopFilter should use HashMap... (patch)

2004-03-10 Thread Kevin A. Burton
Doug Cutting wrote: Erik Hatcher wrote: Also... your HashSet constructor has to copy values from the original HashSet into the new HashSet ... not very clean and this can just be removed by forcing the caller to use a HashSet (which they should). I've caved in and gone HashSet all the

Re: DocumentWriter, StopFilter should use HashMap... (patch)

2004-03-10 Thread Erik Hatcher
On Mar 10, 2004, at 10:28 PM, Doug Cutting wrote: Erik Hatcher wrote: Also... your HashSet constructor has to copy values from the original HashSet into the new HashSet ... not very clean and this can just be removed by forcing the caller to use a HashSet (which they should). I've caved in

Re: DocumentWriter, StopFilter should use HashMap... (patch)

2004-03-09 Thread Otis Gospodnetic
I really don't think this will make any noticeable difference, but why not. Could you please send a diff -uN patch, please? I made the same changes locally about a year ago, but have since thrown away my local changes (for no good reason that I recall). Thanks, Otis --- Kevin A. Burton [EMAIL

Re: DocumentWriter, StopFilter should use HashMap... (patch)

2004-03-09 Thread Kevin A. Burton
Otis Gospodnetic wrote: I really don't think this will make any noticeable difference, but why not. Could you please send a diff -uN patch, please? I made the same changes locally about a year ago, but have since thrown away my local changes (for no good reason that I recall). Just diff it

Re: DocumentWriter, StopFilter should use HashMap... (patch)

2004-03-09 Thread Kevin A. Burton
Erik Hatcher wrote: I don't see any reason for this to be a Hashtable. It seems an acceptable alternative to not share analyzer/filter instances across threads - they don't really take up much space, so is there a reason to share them? Or I'm guessing you're sharing it implicitly through

Re: DocumentWriter, StopFilter should use HashMap... (patch)

2004-03-09 Thread Erik Hatcher
Well, one issue you didn't consider is changing a public method signature. I will make this change, but leave the Hashtable signature method there. I suppose we could change the signature to use a Map instead, but I believe there are some issues with doing something like this if you do not

Re: DocumentWriter, StopFilter should use HashMap... (patch)

2004-03-09 Thread Doug Cutting
Erik Hatcher wrote: Well, one issue you didn't consider is changing a public method signature. I will make this change, but leave the Hashtable signature method there. I suppose we could change the signature to use a Map instead, but I believe there are some issues with doing something like

Re: DocumentWriter, StopFilter should use HashMap... (patch)

2004-03-09 Thread David Spencer
Maybe I missed something but I always thought the stop list should be a Set, not a Map (or Hashtable/Dictionary). After all, all you need to know is existence and that's what a Set does. Doug Cutting wrote: Erik Hatcher wrote: Well, one issue you didn't consider is changing a public method

Re: DocumentWriter, StopFilter should use HashMap... (patch)

2004-03-09 Thread Doug Cutting
David Spencer wrote: Maybe I missed something but I always thought the stop list should be a Set, not a Map (or Hashtable/Dictionary). After all, all you need to know is existence and that's what a Set does. Good point. Doug -

Re: DocumentWriter, StopFilter should use HashMap... (patch)

2004-03-09 Thread Kevin A. Burton
Doug Cutting wrote: Erik Hatcher wrote: Well, one issue you didn't consider is changing a public method signature. I will make this change, but leave the Hashtable signature method there. I suppose we could change the signature to use a Map instead, but I believe there are some issues with

Re: DocumentWriter, StopFilter should use HashMap... (patch)

2004-03-09 Thread Kevin A. Burton
David Spencer wrote: Maybe I missed something but I always thought the stop list should be a Set, not a Map (or Hashtable/Dictionary). After all, all you need to know is existence and that's what a Set does. It stores the word as the key and the value... I don't care either way... There was no
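
To make the Set-versus-table point concrete, here is a small sketch of the two shapes a stop list can take. It assumes modern Java with generics; the class and method names (StopListShapes, asTable, asSet) are made up for illustration and do not come from the patch.

```java
import java.util.HashSet;
import java.util.Hashtable;
import java.util.Set;

// Hypothetical helper contrasting the two representations of a stop list.
final class StopListShapes {
    // Table form: each word is stored as both key and value,
    // although only the key is ever consulted.
    static Hashtable<String, String> asTable(String[] words) {
        Hashtable<String, String> table = new Hashtable<>();
        for (String word : words) {
            table.put(word, word);
        }
        return table;
    }

    // Set form: only membership is recorded, which is all a stop list needs.
    static Set<String> asSet(String[] words) {
        Set<String> set = new HashSet<>();
        for (String word : words) {
            set.add(word);
        }
        return set;
    }
}
```

Both forms answer the same question (containsKey on the table, contains on the set); the Set simply says so in its type.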

Re: DocumentWriter, StopFilter should use HashMap... (patch)

2004-03-09 Thread Kevin A. Burton
Doug Cutting wrote: Erik Hatcher wrote: Well, one issue you didn't consider is changing a public method signature. I will make this change, but leave the Hashtable signature method there. I suppose we could change the signature to use a Map instead, but I believe there are some issues with

Re: DocumentWriter, StopFilter should use HashMap... (patch)

2004-03-09 Thread Kevin A. Burton
Doug Cutting wrote: David Spencer wrote: Maybe I missed something but I always thought the stop list should be a Set, not a Map (or Hashtable/Dictionary). After all, all you need to know is existence and that's what a Set does. Good point. It's easy to migrate to a HashSet... either way...

Re: DocumentWriter, StopFilter should use HashMap... (patch)

2004-03-09 Thread Erik Hatcher
Kevin - I've made this change and committed it, using a Set. Let me know if there are any issues with what I've committed - I believe I've faithfully preserved backwards compatibility. Erik p.s. ... On Mar 9, 2004, at 2:00 PM, Kevin A. Burton wrote: public StopFilter(TokenStream in,

Re: DocumentWriter, StopFilter should use HashMap... (patch)

2004-03-09 Thread Incze Lajos
This would no longer compile with the change Kevin proposes. To make things back-compatible we must: 1. Keep but deprecate StopFilter(Hashtable) constructor; 2. Keep but deprecate StopFilter.makeStopTable(String[]); 3. Add a new constructor: StopFilter(HashMap); If you'd use
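
A rough sketch of the keep-but-deprecate approach listed above, in modern Java with generics. It uses a Set for the new constructor, as other messages in this thread settle on, rather than the HashMap named in step 3. CompatStopFilter is a hypothetical class and omits the TokenStream argument that the real constructor takes.

```java
import java.util.HashSet;
import java.util.Hashtable;
import java.util.Set;

// Hypothetical class showing how an old signature can survive a change
// of internal data structure; not the actual StopFilter source.
final class CompatStopFilter {
    private final Set<String> stopWords;

    /**
     * Old entry point, kept so existing callers still compile.
     * Only the keys are used, so they are copied into a HashSet.
     *
     * @deprecated prefer the Set-based constructor.
     */
    @Deprecated
    CompatStopFilter(Hashtable<String, ?> stopTable) {
        this(new HashSet<>(stopTable.keySet()));
    }

    // New entry point: takes the Set directly, with no copy.
    CompatStopFilter(Set<String> stopWords) {
        this.stopWords = stopWords;
    }

    boolean isStopWord(String term) {
        return stopWords.contains(term);
    }
}
```

Copying the keys in the deprecated constructor also detaches the filter from the synchronized Hashtable, so later lookups do not go through its lock.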

Re: DocumentWriter, StopFilter should use HashMap... (patch)

2004-03-09 Thread Kevin A. Burton
Erik Hatcher wrote: Kevin - I've made this change and committed it, using a Set. Let me know if there are any issues with what I've committed - I believe I've faithfully preserved backwards compatibility. Great... I'll take a look! p.s. ... On Mar 9, 2004, at 2:00 PM, Kevin A. Burton wrote:

Re: DocumentWriter, StopFilter should use HashMap... (patch)

2004-03-09 Thread Kevin A. Burton
Erik Hatcher wrote: Kevin - I've made this change and committed it, using a Set. Let me know if there are any issues with what I've committed - I believe I've faithfully preserved backwards compatibility. Actually... Erik.. I don't think your Hashtable constructor will work... By default

DocumentWriter, StopFilter should use HashMap... (patch)

2004-03-08 Thread Kevin A. Burton
I'm looking at StopFilter.java right now... I did a kill -3 java and a number of my threads were blocked here: ksa-task-thread-34 prio=1 tid=0xad89fbe8 nid=0x1c6e waiting for monitor entry [b9bff000..b9bff8d0] at java.util.Hashtable.get(Hashtable.java:332) - waiting to lock
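
The threads in the dump above are waiting for the monitor that java.util.Hashtable.get acquires, since Hashtable's methods are synchronized. A minimal sketch of a lock-free alternative, assuming the stop list is built once and never modified afterwards, might look like this. StopWordLookup is a hypothetical class written in modern Java, not the patch itself.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.HashSet;
import java.util.Set;

// Hypothetical stand-in for a stop-word lookup shared across threads.
final class StopWordLookup {
    // Populated in the constructor and never mutated afterwards; the final
    // field makes the fully built set visible to other threads, so
    // concurrent reads need no synchronization.
    private final Set<String> stopWords;

    StopWordLookup(String[] words) {
        this.stopWords =
            Collections.unmodifiableSet(new HashSet<>(Arrays.asList(words)));
    }

    // Unlike Hashtable.get, this call takes no monitor.
    boolean isStopWord(String term) {
        return stopWords.contains(term);
    }
}
```

The assumption that nothing writes to the set after construction is what makes this safe; if stop words could change at runtime, a concurrent collection would be needed instead.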

Re: DocumentWriter, StopFilter should use HashMap... (patch)

2004-03-08 Thread Erik Hatcher
I don't see any reason for this to be a Hashtable. It seems an acceptable alternative to not share analyzer/filter instances across threads - they don't really take up much space, so is there a reason to share them? Or I'm guessing you're sharing it implicitly through an IndexWriter, huh?