Just found the rest of the thread. I'll shut up now ;)
sv
On Sun, 14 Mar 2004, Stephane James Vaucher wrote:
Back from a week's vacation, so this reply is a little late, maybe out of
order as well ;). Comment inline:
On Tue, 9 Mar 2004, Kevin A. Burton wrote:
Doug Cutting wrote:
I don't buy it. HashSet is but one implementation of a Set. By
choosing the HashSet implementation you are not only tying the class to
a hash-based implementation, you are tying the interface to *that
specific* hash-based implementation or its subclasses. In the end,
either you buy the
Erik Hatcher wrote:
Yes, I saw it. But is there a reason not to just expose HashSet given
that it is the data structure that is most efficient? I bought into
Kevin's arguments that it made sense to just expose HashSet.
Just the general principle that one shouldn't expose more of the
I will refactor again using Set with no copying this time (except for
the String[] and Hashtable constructors). This was my original
preference, but I got caught up in the arguments by Kevin and lost my
ideals temporarily :)
I expect to do this later tonight or tomorrow.
Erik
On Mar 11,
Scott Ganyo wrote:
I don't buy it. HashSet is but one implementation of a Set. By
choosing the HashSet implementation you are not only tying the class
to a hash-based implementation, you are tying the interface to *that
specific* hash-based implementation or its subclasses. In the end,
Erik Hatcher wrote:
I will refactor again using Set with no copying this time (except for
the String[] and Hashtable constructors). This was my original
preference, but I got caught up in the arguments by Kevin and lost my
ideals temporarily :)
I expect to do this later tonight or tomorrow.
Part of the dilemma of which implementation is actually used will be
solved implicitly since our function to construct the Set will return a
HashSet - and this will surely be the method most folks would use. But
I will be sure to note in the Javadoc that the implementation of the
Set is
On Mar 9, 2004, at 10:23 PM, Kevin A. Burton wrote:
You need to make it a HashSet:
table = new HashSet( stopTable.keySet() );
Done.
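For anyone following along, here is a minimal sketch of what that one-liner does: it copies the keys of the old Hashtable into a HashSet. The class and method names (StopTableConversion, toStopSet) are made up for illustration and are not from the committed code, and the generics are a modern convenience the 2004 code would not have had.

```java
import java.util.HashSet;
import java.util.Hashtable;
import java.util.Set;

// Hypothetical helper class for illustration; not part of Lucene.
class StopTableConversion {

    // Copies the keys of a legacy Hashtable into a new, unsynchronized HashSet.
    // The result is an independent copy: later changes to the Hashtable
    // do not show up in the returned set.
    static Set<String> toStopSet(Hashtable<String, String> stopTable) {
        return new HashSet<String>(stopTable.keySet());
    }

    public static void main(String[] args) {
        Hashtable<String, String> stopTable = new Hashtable<String, String>();
        stopTable.put("the", "the");
        stopTable.put("a", "a");

        Set<String> stopWords = toStopSet(stopTable);
        System.out.println(stopWords.contains("the"));    // prints true
        System.out.println(stopWords.contains("lucene")); // prints false
    }
}
```

The copy is what the later messages in the thread object to when the caller already holds a HashSet.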
Also... while you're at it... the private variable name is 'table'
which this HashSet certainly is *not* ;)
Well, depends on your definition of 'table' I suppose
Erik Hatcher wrote:
Also... while you're at it... the private variable name is 'table'
which this HashSet certainly is *not* ;)
Well, depends on your definition of 'table' I suppose :) I changed it
to a type-agnostic stopWords.
Did you know that internally HashSet uses a HashMap?
I sure
On Mar 10, 2004, at 2:59 PM, Kevin A. Burton wrote:
I refuse to expose HashSet... sorry! :) But I did wrap what is
passed in, like above, in a HashSet in my latest commit.
Hm... You're doing this EVEN if the caller passes a HashSet directly?!
Well it was in the ctor. But I guess I'm not seeing
Erik Hatcher wrote:
Also... your HashSet constructor has to copy values from the
original HashSet into the new HashSet ... not very clean and this can
just be removed by forcing the caller to use a HashSet (which they
should).
I've caved in and gone HashSet all the way.
Did you not see my
Doug Cutting wrote:
Erik Hatcher wrote:
Also... your HashSet constructor has to copy values from the
original HashSet into the new HashSet ... not very clean and this
can just be removed by forcing the caller to use a HashSet (which
they should).
I've caved in and gone HashSet all the
On Mar 10, 2004, at 10:28 PM, Doug Cutting wrote:
Erik Hatcher wrote:
Also... your HashSet constructor has to copy values from the
original HashSet into the new HashSet ... not very clean and this
can just be removed by forcing the caller to use a HashSet (which
they should).
I've caved in
I really don't think this will make any noticeable difference, but why
not. Could you please send a diff -uN patch?
I made the same changes locally about a year ago, but have since thrown
away my local changes (for no good reason that I recall).
Thanks,
Otis
--- Kevin A. Burton [EMAIL
Otis Gospodnetic wrote:
I really don't think this will make any noticeable difference, but why
not. Could you please send a diff -uN patch?
I made the same changes locally about a year ago, but have since thrown
away my local changes (for no good reason that I recall).
Just diff it
Erik Hatcher wrote:
I don't see any reason for this to be a Hashtable.
It seems an acceptable alternative to not share analyzer/filter
instances across threads - they don't really take up much space, so
is there a reason to share them? Or I'm guessing you're sharing it
implicitly through
Well, one issue you didn't consider is changing a public method
signature. I will make this change, but leave the Hashtable signature
method there. I suppose we could change the signature to use a Map
instead, but I believe there are some issues with doing something like
this if you do not
Erik Hatcher wrote:
Well, one issue you didn't consider is changing a public method
signature. I will make this change, but leave the Hashtable signature
method there. I suppose we could change the signature to use a Map
instead, but I believe there are some issues with doing something like
Maybe I missed something but I always thought the stop list should be a
Set, not a Map (or Hashtable/Dictionary). After all, all you need to
know is existence and that's what a Set does.
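The point above about existence checks can be sketched roughly like this. Everything here (StopSetSketch, removeStopWords, the token list) is a hypothetical stand-in and not Lucene's StopFilter; it only shows that a stop list needs nothing beyond Set.contains(), so the parameter can be typed as Set and the caller can pick the implementation.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Hypothetical illustration; this is not Lucene's StopFilter.
class StopSetSketch {

    // Keeps every token that is not in the stop list.
    // Only membership is needed, so any Set implementation will do.
    static List<String> removeStopWords(List<String> tokens, Set<String> stopWords) {
        List<String> kept = new ArrayList<String>();
        for (String token : tokens) {
            if (!stopWords.contains(token)) {
                kept.add(token);
            }
        }
        return kept;
    }

    public static void main(String[] args) {
        Set<String> stopWords = new HashSet<String>(Arrays.asList("the", "a"));
        List<String> tokens = Arrays.asList("the", "quick", "fox");
        System.out.println(removeStopWords(tokens, stopWords)); // prints [quick, fox]
    }
}
```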
Doug Cutting wrote:
Erik Hatcher wrote:
Well, one issue you didn't consider is changing a public method
David Spencer wrote:
Maybe I missed something but I always thought the stop list should be a
Set, not a Map (or Hashtable/Dictionary). After all, all you need to
know is existence and that's what a Set does.
Good point.
Doug
-
Doug Cutting wrote:
Erik Hatcher wrote:
Well, one issue you didn't consider is changing a public method
signature. I will make this change, but leave the Hashtable
signature method there. I suppose we could change the signature to
use a Map instead, but I believe there are some issues with
David Spencer wrote:
Maybe I missed something but I always thought the stop list should be
a Set, not a Map (or Hashtable/Dictionary). After all, all you need to
know is existence and that's what a Set does.
It stores the word as the key and the value...
I don't care either way... There was no
Doug Cutting wrote:
Erik Hatcher wrote:
Well, one issue you didn't consider is changing a public method
signature. I will make this change, but leave the Hashtable
signature method there. I suppose we could change the signature to
use a Map instead, but I believe there are some issues with
Doug Cutting wrote:
David Spencer wrote:
Maybe I missed something but I always thought the stop list should be
a Set, not a Map (or Hashtable/Dictionary). After all, all you need
to know is existence and that's what a Set does.
Good point.
It's easy to migrate to a HashSet... either way...
Kevin - I've made this change and committed it, using a Set.
Let me know if there are any issues with what I've committed - I
believe I've faithfully preserved backwards compatibility.
Erik
p.s. ...
On Mar 9, 2004, at 2:00 PM, Kevin A. Burton wrote:
public StopFilter(TokenStream in,
This would no longer compile with the change Kevin proposes.
To make things back-compatible we must:
1. Keep but deprecate StopFilter(Hashtable) constructor;
2. Keep but deprecate StopFilter.makeStopTable(String[]);
3. Add a new constructor: StopFilter(HashMap);
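A rough sketch of how the keep-and-deprecate approach in the list above might look. This is an assumption-laden illustration, not the committed code: the class StopWordHolder and the methods makeStopSet and isStopWord are hypothetical stand-ins, and the new constructor takes a Set (where the thread ends up) rather than the HashMap named in item 3.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Hashtable;
import java.util.Set;

// Hypothetical stand-in for a filter class; not Lucene's StopFilter.
class StopWordHolder {
    private final Set<String> stopWords;

    // New preferred constructor: accepts any Set, stored without copying.
    StopWordHolder(Set<String> stopWords) {
        this.stopWords = stopWords;
    }

    /** @deprecated kept so existing callers still compile; copies the keys. */
    @Deprecated
    StopWordHolder(Hashtable<String, String> stopTable) {
        this(new HashSet<String>(stopTable.keySet()));
    }

    /** @deprecated kept for old callers; builds a word-to-word table. */
    @Deprecated
    static Hashtable<String, String> makeStopTable(String[] words) {
        Hashtable<String, String> table = new Hashtable<String, String>();
        for (String word : words) {
            table.put(word, word);
        }
        return table;
    }

    // Assumed replacement factory that returns a Set instead of a Hashtable.
    static Set<String> makeStopSet(String[] words) {
        return new HashSet<String>(Arrays.asList(words));
    }

    boolean isStopWord(String word) {
        return stopWords.contains(word);
    }

    public static void main(String[] args) {
        String[] words = {"the", "a"};
        // The old call path and the new call path both still compile and agree.
        StopWordHolder oldStyle = new StopWordHolder(makeStopTable(words));
        StopWordHolder newStyle = new StopWordHolder(makeStopSet(words));
        System.out.println(oldStyle.isStopWord("the") && newStyle.isStopWord("the")); // prints true
    }
}
```

Overloading on the argument type keeps old call sites source-compatible while steering new code toward the Set version.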
If you'd use
Erik Hatcher wrote:
Kevin - I've made this change and committed it, using a Set.
Let me know if there are any issues with what I've committed - I
believe I've faithfully preserved backwards compatibility.
Great... I'll take a look!
p.s. ...
On Mar 9, 2004, at 2:00 PM, Kevin A. Burton wrote:
Erik Hatcher wrote:
Kevin - I've made this change and committed it, using a Set.
Let me know if there are any issues with what I've committed - I
believe I've faithfully preserved backwards compatibility.
Actually... Erik.. I don't think your Hashtable constructor will work...
By default
I'm looking at StopFilter.java right now...
I did a kill -3 java and a number of my threads were blocked here:
ksa-task-thread-34 prio=1 tid=0xad89fbe8 nid=0x1c6e waiting for
monitor entry [b9bff000..b9bff8d0]
at java.util.Hashtable.get(Hashtable.java:332)
- waiting to lock
I don't see any reason for this to be a Hashtable.
It seems an acceptable alternative to not share analyzer/filter
instances across threads - they don't really take up much space, so is
there a reason to share them? Or I'm guessing you're sharing it
implicitly through an IndexWriter, huh?
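For context on the blocked threads: every Hashtable.get() call is synchronized on the table, so threads sharing one instance queue up on its monitor. A HashSet has no such lock, and concurrent reads are fine as long as nobody modifies it after it is built. The sketch below assumes exactly that read-only usage; the class and method names (SharedStopSetSketch, countHits) are hypothetical and not from Lucene.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.HashSet;
import java.util.Set;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical illustration of sharing a read-only stop set across threads.
class SharedStopSetSketch {

    // Each worker scans all tokens and counts stop-word hits.
    // The set is only read, never modified, so no locking is needed on it.
    static int countHits(final Set<String> stopWords, final String[] tokens, int threadCount) {
        final AtomicInteger hits = new AtomicInteger();
        Thread[] workers = new Thread[threadCount];
        for (int i = 0; i < threadCount; i++) {
            workers[i] = new Thread(new Runnable() {
                public void run() {
                    for (String token : tokens) {
                        if (stopWords.contains(token)) {
                            hits.incrementAndGet();
                        }
                    }
                }
            });
            workers[i].start();
        }
        for (Thread worker : workers) {
            try {
                worker.join();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new IllegalStateException("interrupted while waiting for workers", e);
            }
        }
        return hits.get();
    }

    public static void main(String[] args) {
        // Wrapping in unmodifiableSet guards against accidental writes.
        Set<String> stopWords = Collections.unmodifiableSet(
                new HashSet<String>(Arrays.asList("the", "a")));
        String[] tokens = {"the", "quick", "a", "fox"};
        System.out.println(countHits(stopWords, tokens, 4)); // prints 8
    }
}
```

If the set did need to change while in use, this would no longer be safe and some form of synchronization or a concurrent collection would be needed.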