[jira] Resolved: (LUCENE-2216) OpenBitSet#hashCode() may return false for identical sets.

2010-01-16 Thread Yonik Seeley (JIRA)
! > OpenBitSet#hashCode() may return false for identical sets. > -- > > Key: LUCENE-2216 > URL: https://issues.apache.org/jira/browse/LUCENE-2216 > Project: Lucene - Java >

[jira] Commented: (LUCENE-2216) OpenBitSet#hashCode() may return false for identical sets.

2010-01-16 Thread Yonik Seeley (JIRA)
g as they depend a lot on how you test, but I can do it out of sheer curiosity - will report tomorrow. Cool. I'd recommend testing in the context of OpenBitSet (i.e. don't try testing ntz directly). Perhaps just create a large random set (~1M bits) with a certain percent of bits

[jira] Commented: (LUCENE-2216) OpenBitSet#hashCode() may return false for identical sets.

2010-01-16 Thread Dawid Weiss (JIRA)
the same as in hacker's delight, but it isn't. Microbenchmarks will always be misleading as they depend a lot on how you test, but I can do it out of sheer curiosity -- will report tomorrow. > OpenBitSet#hashCode() may return false f

[jira] Commented: (LUCENE-2216) OpenBitSet#hashCode() may return false for identical sets.

2010-01-16 Thread Yonik Seeley (JIRA)
imizations and different implementations before I settled on the one used in BitUtil, so it would be nice to do some benchmarks to see if it's truly faster now (and also what the performance difference is for users of JVMs before this optimization was implemented). > OpenBitSet#hashCode(

[jira] Commented: (LUCENE-2216) OpenBitSet#hashCode() may return false for identical sets.

2010-01-16 Thread Dawid Weiss (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-2216?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12801269#action_12801269 ] Dawid Weiss commented on LUCENE-2216: - Ok, argument accepted. > OpenBitSet#h

[jira] Commented: (LUCENE-2216) OpenBitSet#hashCode() may return false for identical sets.

2010-01-16 Thread Yonik Seeley (JIRA)
e for the code doing redundant checking you didn't want. > OpenBitSet#hashCode() may return false for identical sets. > -- > > Key: LUCENE-2216 > URL: https://issues.apache.org/jira/browse

[jira] Commented: (LUCENE-2216) OpenBitSet#hashCode() may return false for identical sets.

2010-01-16 Thread Dawid Weiss (JIRA)
they are used. > OpenBitSet#hashCode() may return false for identical sets. > -- > > Key: LUCENE-2216 > URL: https://issues.apache.org/jira/browse/LUCENE-2216 > Project: Lucene -

[jira] Commented: (LUCENE-2216) OpenBitSet#hashCode() may return false for identical sets.

2010-01-16 Thread Dawid Weiss (JIRA)
len every time the tail changes, or make explicit changes to the documentation that inform about suboptimal performance for zero-tailed sets). > OpenBitSet#hashCode() may return false for identical sets. > -- > >

[jira] Issue Comment Edited: (LUCENE-2216) OpenBitSet#hashCode() may return false for identical sets.

2010-01-16 Thread Yonik Seeley (JIRA)
aders to read the set as long as a writer wasn't writing it. But equals and hashCode would need to be categorized under "write" methods for this to work... (definitely unexpected) otherwise all sorts

[jira] Commented: (LUCENE-2216) OpenBitSet#hashCode() may return false for identical sets.

2010-01-16 Thread Yonik Seeley (JIRA)
y unexpected) otherwise all sorts of bad stuff would happen. > OpenBitSet#hashCode() may return false for identical sets. > -- > > Key: LUCENE-2216 > URL: https://issues.apache.org/jira/browse/LU

[jira] Commented: (LUCENE-2216) OpenBitSet#hashCode() may return false for identical sets.

2010-01-16 Thread Dawid Weiss (JIRA)
mTrailingZeros() should be invoked prior to publishing the object for other threads for increased performance (in case you fiddle with bits and clear the tail). In the second options, your patch does a fine job of not mutating the object and correcting the bug. Thanks for an interesting discus

[jira] Commented: (LUCENE-2216) OpenBitSet#hashCode() may return false for identical sets.

2010-01-16 Thread Yonik Seeley (JIRA)
pens-before between the reads and the modifications to the object. Of course... I said "may be safely shared", not that any method one chooses to share it is correct. It still seems that promoting hashCode and equals to mutating operations is wrong, no? > OpenBitSet#hashCode() may

[jira] Issue Comment Edited: (LUCENE-2216) OpenBitSet#hashCode() may return false for identical sets.

2010-01-16 Thread Dawid Weiss (JIRA)
deadlock. Client mode and interpreted mode are not optimized, so it passes. > OpenBitSet#hashCode() may return false for identical sets. > -- > > Key: LUCENE-2216 > URL: https://issues.apache.or

[jira] Commented: (LUCENE-2216) OpenBitSet#hashCode() may return false for identical sets.

2010-01-16 Thread Dawid Weiss (JIRA)
ll (should... or does on two machines I own) deadlock. Client mode and interpreted mode are not optimized, so it passes. > OpenBitSet#hashCode() may return false for identical sets. > -- > > Key: LUCENE-2216 >

[jira] Commented: (LUCENE-2216) OpenBitSet#hashCode() may return false for identical sets.

2010-01-16 Thread Dawid Weiss (JIRA)
hing to consider deeply, at least in my personal opinion. > OpenBitSet#hashCode() may return false for identical sets. > -- > > Key: LUCENE-2216 > URL: https://issues.apache.org/jira

[jira] Commented: (LUCENE-2216) OpenBitSet#hashCode() may return false for identical sets.

2010-01-16 Thread Yonik Seeley (JIRA)
a, but hashCode and equals shouldn't modify the object's state in any meaningful way. > OpenBitSet#hashCode() may return false for identical sets. > -- > > Key: LUCENE-2216 > URL: https

[jira] Commented: (LUCENE-2216) OpenBitSet#hashCode() may return false for identical sets.

2010-01-16 Thread Dawid Weiss (JIRA)
ntation may be useful for folks with older VMs... > OpenBitSet#hashCode() may return false for identical sets. > -- > > Key: LUCENE-2216 > URL: https://issues.apache.org/jira/browse/LUCENE-2216 >

[jira] Commented: (LUCENE-2216) OpenBitSet#hashCode() may return false for identical sets.

2010-01-16 Thread Dawid Weiss (JIRA)
other operation on OpenBitSets is affected by the value inside wlen). Your patch also solves the issue, of course. I just don't see the point in _not_ updating wlen since you're scanning through memory anyway... The implementation of OpenBitSet is different in this regard to

[jira] Updated: (LUCENE-2216) OpenBitSet#hashCode() may return false for identical sets.

2010-01-16 Thread Yonik Seeley (JIRA)
imple solution. Start with a zero hashcode while iterating backward and the trailing zeros won't affect the hashcode. > OpenBitSet#hashCode() may return false for identical sets. > -- > > Key: LUCENE-2216 >
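A minimal Java sketch of the idea in the comment above: if the hash starts at zero and the words are visited from the last one backward, trailing all-zero words leave the accumulator at zero, so two sets that differ only in unused tail words hash alike. The class name, the rotate-by-one mixing step and the final fold to 32 bits are illustrative assumptions, not necessarily what the committed patch does.

```java
// Hypothetical helper, not the Lucene class itself.
final class TrailingZeroSafeHash {
    /** Hashes the backing words from the highest index down to index 0. */
    static int hash(long[] words) {
        long h = 0; // zero start: zero words seen first cannot change h
        for (int i = words.length - 1; i >= 0; i--) {
            h ^= words[i];
            h = (h << 1) | (h >>> 63); // rotate left by one bit; 0 stays 0
        }
        // Fold the 64-bit accumulator into 32 bits.
        return (int) ((h >>> 32) ^ h);
    }
}
```

With this shape, hashCode does not need to modify the set (for example by trimming wlen) to stay consistent with equals.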

[jira] Commented: (LUCENE-2216) OpenBitSet#hashCode() may return false for identical sets.

2010-01-16 Thread Yonik Seeley (JIRA)
s it only adds to the cost of hashCode/equals (which are already very expensive with large bitsets and should be avoided if possible anyway). > OpenBitSet#hashCode() may return false for identical sets. > -- > >

[jira] Updated: (LUCENE-2216) OpenBitSet#hashCode() may return false for identical sets.

2010-01-16 Thread Dawid Weiss (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-2216?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Dawid Weiss updated LUCENE-2216: Attachment: openbitset.patch > OpenBitSet#hashCode() may return false for identical s

[jira] Created: (LUCENE-2216) OpenBitSet#hashCode() may return false for identical sets.

2010-01-16 Thread Dawid Weiss (JIRA)
OpenBitSet#hashCode() may return false for identical sets. -- Key: LUCENE-2216 URL: https://issues.apache.org/jira/browse/LUCENE-2216 Project: Lucene - Java Issue Type: Bug

[jira] Commented: (LUCENE-1899) Inefficient growth of OpenBitSet

2009-09-09 Thread Uwe Schindler (JIRA)
ould initialize the OpenBitSet in Collector.setNextReader(). > Inefficient growth of OpenBitSet > > > Key: LUCENE-1899 > URL: https://issues.apache.org/jira/browse/LUCENE-1899 > Project: Lucene - Java

[jira] Resolved: (LUCENE-1899) Inefficient growth of OpenBitSet

2009-09-09 Thread Michael McCandless (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-1899?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Michael McCandless resolved LUCENE-1899. Resolution: Fixed Thanks Nadav! > Inefficient growth of OpenBit

[jira] Commented: (LUCENE-1899) Inefficient growth of OpenBitSet

2009-09-09 Thread Nadav Har'El (JIRA)
r actually, 11%=0.125/(1+0.125) of the space after an enlargement is wasted. I don't know where I got this 6% from ;-) > Inefficient growth of OpenBitSet > > > Key: LUCENE-1899 > URL: https://issues.
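The 11% figure can be checked with a few lines of Java. This is only a sketch of the arithmetic, assuming the array was exactly full before it grew and that the new capacity is the old one times (1 + growth); the class and method names are made up for illustration.

```java
// Hypothetical helper for the arithmetic quoted above.
final class GrowthWaste {
    /**
     * Fraction of the enlarged array that is unused right after growing,
     * assuming the old array was full: growth / (1 + growth).
     */
    static double wastedFraction(double growth) {
        return growth / (1.0 + growth);
    }
}
```

For growth = 0.125 this gives 0.125 / 1.125 = 1/9, roughly 11%; for doubling (growth = 1.0) it gives 50%.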

[jira] Updated: (LUCENE-1899) Inefficient growth of OpenBitSet

2009-09-09 Thread Michael McCandless (JIRA)
... > Inefficient growth of OpenBitSet > > > Key: LUCENE-1899 > URL: https://issues.apache.org/jira/browse/LUCENE-1899 > Project: Lucene - Java > Issue Type: Bug > Components: Store >

[jira] Updated: (LUCENE-1899) Inefficient growth of OpenBitSet

2009-09-09 Thread Michael McCandless (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-1899?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Michael McCandless updated LUCENE-1899: --- Fix Version/s: 2.9 > Inefficient growth of OpenBit

[jira] Commented: (LUCENE-1899) Inefficient growth of OpenBitSet

2009-09-09 Thread Michael McCandless (JIRA)
collection stops growing and is re-used for a long time, in which case the long-term wasted RAM is (I think) more important than the one-time short-term CPU cost of finding the "natural" size. > Inefficient growth of OpenBitSet > > >

[jira] Assigned: (LUCENE-1899) Inefficient growth of OpenBitSet

2009-09-09 Thread Michael McCandless (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-1899?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Michael McCandless reassigned LUCENE-1899: -- Assignee: Michael McCandless > Inefficient growth of OpenBit

[jira] Commented: (LUCENE-1899) Inefficient growth of OpenBitSet

2009-09-08 Thread Nadav Har'El (JIRA)
.5, 0.01 or 1.0? I'm not saying that 1.0 (doubling) is best, just that I don't know why 0.125 is. > Inefficient growth of OpenBitSet > > > Key: LUCENE-1899 > URL: https://issues.apache.org/jira/browse

[jira] Commented: (LUCENE-1899) Inefficient growth of OpenBitSet

2009-09-08 Thread Michael McCandless (JIRA)
re's ArrayUtil.getNextSize (a Lucene class) which seems to grow arrays in a mild fashion. the method is well documented, and I think it should be used by ensureCapacityWords. +1 > Inefficient growth of OpenBitSet > > > Key: LUCENE-1899 >

[jira] Commented: (LUCENE-1899) Inefficient growth of OpenBitSet

2009-09-08 Thread Shai Erera (JIRA)
ize every time. There's ArrayUtil.getNextSize (a Lucene class) which seems to grow arrays in a mild fashion. the method is well documented, and I think it should be used by ensureCapacityWords. > Inefficient growth of OpenBitSet > > >

[jira] Created: (LUCENE-1899) Inefficient growth of OpenBitSet

2009-09-08 Thread Nadav Har'El (JIRA)
Inefficient growth of OpenBitSet Key: LUCENE-1899 URL: https://issues.apache.org/jira/browse/LUCENE-1899 Project: Lucene - Java Issue Type: Bug Components: Store Affects Versions: 2.9

[jira] Updated: (LUCENE-1767) Add sizeof to OpenBitSet

2009-08-09 Thread Mark Miller (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-1767?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Mark Miller updated LUCENE-1767: Fix Version/s: (was: 2.9) 3.1 > Add sizeof to OpenBit

[jira] Commented: (LUCENE-1767) Add sizeof to OpenBitSet

2009-08-06 Thread Mark Miller (JIRA)
s someone speaks up > Add sizeof to OpenBitSet > > > Key: LUCENE-1767 > URL: https://issues.apache.org/jira/browse/LUCENE-1767 > Project: Lucene - Java > Issue Type: Improvement > Compon

[jira] Commented: (LUCENE-1767) Add sizeof to OpenBitSet

2009-08-04 Thread Mark Miller (JIRA)
tter want to take this on in the next couple days? If not, I'm going to push it out of 2.9. > Add sizeof to OpenBitSet > > > Key: LUCENE-1767 > URL: https://issues.apache.org/jira/browse/LUCENE-1767 >

[jira] Commented: (LUCENE-1767) Add sizeof to OpenBitSet

2009-07-29 Thread Simon Willnauer (JIRA)
r confuse users / developers. If we add it I would rather go for a very meaningful name like allocatedBytes. simon > Add sizeof to OpenBitSet > > > Key: LUCENE-1767 > URL: https://issues.apache.org/jira/browse/LUCENE-1767 >

[jira] Updated: (LUCENE-1767) Add sizeof to OpenBitSet

2009-07-29 Thread Jason Rutherglen (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-1767?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jason Rutherglen updated LUCENE-1767: - Attachment: LUCENE-1767.patch Added sizeOf method > Add sizeof to OpenBit

[jira] Created: (LUCENE-1767) Add sizeof to OpenBitSet

2009-07-29 Thread Jason Rutherglen (JIRA)
Add sizeof to OpenBitSet Key: LUCENE-1767 URL: https://issues.apache.org/jira/browse/LUCENE-1767 Project: Lucene - Java Issue Type: Improvement Components: Index Affects Versions: 2.4.1

[jira] Commented: (LUCENE-1485) Use OpenBitSet instead of BitVector in SegmentReader

2008-12-11 Thread Yonik Seeley (JIRA)
iven OpenBitSet is supposed to be the "fastest" bitset For doing population counts, and intersection/union population counts, yes. And it's also "Open" so if there is a faster method of doing something, it can still be done. The point was not to make a faster get(bitnum) -
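To illustrate what an intersection population count looks like over word arrays, here is a minimal Java sketch. It uses the JDK's Long.bitCount rather than the hand-tuned routines discussed in these threads, and the class and method names are assumptions for illustration only.

```java
// Hypothetical helper; operates on raw long[] words, not on OpenBitSet.
final class BitCountSketch {
    /** Number of bits set in both a and b, without building the intersection. */
    static long intersectionCount(long[] a, long[] b) {
        int n = Math.min(a.length, b.length); // words past the shorter array cannot intersect
        long count = 0;
        for (int i = 0; i < n; i++) {
            count += Long.bitCount(a[i] & b[i]);
        }
        return count;
    }
}
```

Counting this way avoids allocating a temporary set, which is why such counts matter for faceting-style workloads even when single-bit get() is no faster.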

[jira] Commented: (LUCENE-1485) Use OpenBitSet instead of BitVector in SegmentReader

2008-12-11 Thread Jason Rutherglen (JIRA)
"slightly" in the noise? " Seems to be. Perhaps it needs more performance tests. It is somewhat surprising given OpenBitSet is supposed to be the "fastest" bitset. It seems that Lucene should have ways to incorporate new bitset implementations in the future using interfac

[jira] Commented: (LUCENE-1485) Use OpenBitSet instead of BitVector in SegmentReader

2008-12-10 Thread Grant Ingersoll (JIRA)
noise? > Use OpenBitSet instead of BitVector in SegmentReader > > > Key: LUCENE-1485 > URL: https://issues.apache.org/jira/browse/LUCENE-1485 > Project: Lucene - Java >

[jira] Commented: (LUCENE-1485) Use OpenBitSet instead of BitVector in SegmentReader

2008-12-09 Thread Jason Rutherglen (JIRA)
the -client option in the JVM on Mac OS X. Using -server the numbers look almost the same for OpenBitSet and BitVector with BitVector being slightly faster. > Use OpenBitSet instead of BitVector in SegmentReader > > >

[jira] Updated: (LUCENE-1485) Use OpenBitSet instead of BitVector in SegmentReader

2008-12-09 Thread Jason Rutherglen (JIRA)
BitVector and OpenBitSet. FastGet is called on OpenBitSet. > Use OpenBitSet instead of BitVector in SegmentReader > > > Key: LUCENE-1485 > URL: https://issues.apache.org/jira/browse/LUCENE-1485 >

[jira] Updated: (LUCENE-1485) Use OpenBitSet instead of BitVector in SegmentReader

2008-12-09 Thread Jason Rutherglen (JIRA)
h are about the same after running 25 times in milliseconds. It is assumed that implementing DocIdSetIterator in SegmentTermDocs will speed things up more. bit set size: 10,485,760 set bits count: 524,032 openbitset: 68 bitvector: 89 24% speed increase. I will implement a patch that add

[jira] Created: (LUCENE-1485) Use OpenBitSet instead of BitVector in SegmentReader

2008-12-09 Thread Jason Rutherglen (JIRA)
Use OpenBitSet instead of BitVector in SegmentReader Key: LUCENE-1485 URL: https://issues.apache.org/jira/browse/LUCENE-1485 Project: Lucene - Java Issue Type: Improvement

[jira] Resolved: (LUCENE-1467) Consolidate Solr's and Lucene's OpenBitSet classes

2008-11-25 Thread Michael Busch (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-1467?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Michael Busch resolved LUCENE-1467. --- Resolution: Fixed Committed revision 720609. > Consolidate Solr's and Lucene'

[jira] Commented: (LUCENE-1467) Consolidate Solr's and Lucene's OpenBitSet classes

2008-11-25 Thread Michael Busch (JIRA)
ment which states that, I will make it clearer in the javadocs of nextDoc(). > Consolidate Solr's and Lucene's OpenBitSet classes > -- > > Key: LUCENE-1467 > URL: https://issues.apach

[jira] Commented: (LUCENE-1467) Consolidate Solr's and Lucene's OpenBitSet classes

2008-11-25 Thread Michael McCandless (JIRA)
and next(int) return when there are no more docs (ie the iterator is done)? > Consolidate Solr's and Lucene's OpenBitSet classes > -- > > Key: LUCENE-1467 > URL: https://issues.apach

[jira] Updated: (LUCENE-1467) Consolidate Solr's and Lucene's OpenBitSet classes

2008-11-24 Thread Michael Busch (JIRA)
ay or so. > Consolidate Solr's and Lucene's OpenBitSet classes > -- > > Key: LUCENE-1467 > URL: https://issues.apache.org/jira/browse/LUCENE-1467 > Project: Lucene -

[jira] Updated: (LUCENE-1467) Consolidate Solr's and Lucene's OpenBitSet classes

2008-11-23 Thread Michael Busch (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-1467?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Michael Busch updated LUCENE-1467: -- Priority: Minor (was: Major) > Consolidate Solr's and Lucene's OpenB

[jira] Created: (LUCENE-1467) Consolidate Solr's and Lucene's OpenBitSet classes

2008-11-23 Thread Michael Busch (JIRA)
Consolidate Solr's and Lucene's OpenBitSet classes -- Key: LUCENE-1467 URL: https://issues.apache.org/jira/browse/LUCENE-1467 Project: Lucene - Java Issue Type: Task

Re: OpenBitSet

2006-06-07 Thread Yonik Seeley
FYI, I updated the bug with Opteron performance numbers. More in line with what I originally expected - the intersection count functions are the true standouts, and what you care about for faceted browsing. Anything else is gravy. http://issues.apache.org/jira/browse/SOLR-15 -Yonik http://incubat

Re: OpenBitSet

2006-05-16 Thread eks dev
head spinning idea, to utilize graphics card HW to do super fast bit vector operations. These thingies today are really optimized for basic bit operations. I am just curious to see what he comes up with. I hope I will have some time next week or so to polish some tests for OpenBitSet a bi

Re: OpenBitSet

2006-05-16 Thread Chris Hostetter
: I measured also on different densities, and it looks about the same. : When I find a few spare minutes will make one PerfTest that generates : gnuplot diagrams. Would be interesting to see how all key methods behave : as a function of density/size. I was thinking the same thing ... i just haven'

Re: OpenBitSet

2006-05-16 Thread eks dev
>Weird... I'm not sure how that could be. Are you sure you didn't get >the numbers reversed? that is exactly what happened, sorry for wrong numbers, now it looks as it should: java -version Java(TM) SE Runtime Environment (build 1.6.0-beta2-b83) Java HotSpot(TM) Client VM (build 1.6.0-beta2-b8

Re: OpenBitSet

2006-05-15 Thread Yonik Seeley
't get the numbers reversed? I just tried 1.6, and bitset/openbitset = 1.26 for me. Are any memory controllers optimized for forward streaming more than reverse? My union loop counts down to zero, which is often faster since the register status flags are already set as the result of the decremen

Re: OpenBitSet

2006-05-14 Thread eks dev
this Yonik, Hoss and Paul already made rather acceptable extend/deprecate plans. Maybe separate package for various BitSet / IntegerSet implementation would not be such a bad idea as there is no single best implementation? Just let me remind on what we have around: BitSet (OpenBitSet

Re: OpenBitSet

2006-05-12 Thread Yonik Seeley
ntz8 or ntz8a could possibly be faster than what I have now for low density bit sets: http://www.hackersdelight.org/HDcode/ntz.cc I don't know how to expand those to 64 bit, but they could always be used on the two 32 bit chunks I guess. Anyway, for higher density bit sets, my current implementa

Re: OpenBitSet

2006-05-12 Thread Yonik Seeley
Code is here for those interested: http://issues.apache.org/jira/browse/SOLR-15 -Yonik http://incubator.apache.org/solr Solr, the open-source Lucene search server - To unsubscribe, e-mail: [EMAIL PROTECTED] For additional comman

Re: OpenBitSet

2006-05-12 Thread Yonik Seeley
Oh, and the performance for nextSetBit() was 46% faster (at least on my box at home, which I developed on, and hence this stuff is tuned for). -Yonik On 5/12/06, Yonik Seeley <[EMAIL PROTECTED]> wrote: > Is there also a nextSetBit(bitNr) somewhere on http://www.hackersdelight.org ? > This metho

Re: OpenBitSet

2006-05-12 Thread Yonik Seeley
Is there also a nextSetBit(bitNr) somewhere on http://www.hackersdelight.org ? This method is essential for filtering a query search. They have some algorithms for ntz (number of trailing zeros) for a single int value. That's the harder part. Using ntz to implement nextSetBit in an int or arra
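A sketch of how ntz can drive nextSetBit over an array of 64-bit words, as the message above describes. It assumes the JDK's Long.numberOfTrailingZeros in place of a hand-written ntz, and the class name and the convention of returning -1 when no bit is found are illustrative choices.

```java
// Hypothetical helper; operates on raw long[] words.
final class NextSetBitSketch {
    /** Index of the first set bit at or after fromIndex, or -1 if there is none. */
    static int nextSetBit(long[] words, int fromIndex) {
        if (fromIndex < 0) return -1;
        int w = fromIndex >>> 6;              // word that holds fromIndex
        if (w >= words.length) return -1;
        // Clear the bits below fromIndex; Java uses only the low 6 bits of the shift count.
        long word = words[w] & (-1L << fromIndex);
        while (true) {
            if (word != 0) {
                return (w << 6) + Long.numberOfTrailingZeros(word);
            }
            if (++w == words.length) return -1;
            word = words[w];                  // later words are scanned whole
        }
    }
}
```

Only the first word needs masking; every later word is either skipped because it is zero or resolved with a single ntz call.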

Re: OpenBitSet

2006-05-12 Thread Ype Kingma
t; > > so, where it belongs. > > > - lucene.util? BitSet is hard-coded into Lucene in enough places that > > > I don't know if it would be useful to people there or not. > > > - solr.util? > > > > > > The next step would be to actually use it... replacing B

Re: OpenBitSet

2006-05-12 Thread Paul Elschot
t; > > so, where it belongs. > > > - lucene.util?  BitSet is hard-coded into Lucene in enough places that > > > I don't know if it would be useful to people there or not. > > > - solr.util? > > > > > > The next step would be to actually use it... replacing B