[jira] [Commented] (SOLR-2242) Get distinct count of names for a facet field
[ https://issues.apache.org/jira/browse/SOLR-2242?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13049014#comment-13049014 ] Simon Willnauer commented on SOLR-2242: --- Bill, this seems like an important issue. Many votes etc. I am on travel right now so give me some days to come back and I will work with you to get this done. Thanks for your patience simon > Get distinct count of names for a facet field > - > > Key: SOLR-2242 > URL: https://issues.apache.org/jira/browse/SOLR-2242 > Project: Solr > Issue Type: New Feature > Components: Response Writers >Affects Versions: 4.0 >Reporter: Bill Bell >Assignee: Simon Willnauer >Priority: Minor > Fix For: 4.0 > > Attachments: SOLR-2242.patch, SOLR-2242.shard.patch, > SOLR-2242.solr3.1.patch, SOLR.2242.solr3.1.patch, SOLR.2242.v2.patch > > > When returning facet.field= you will get a list of matches for > distinct values. This is normal behavior. This patch tells you how many > distinct values you have (# of rows). Use with limit=-1 and mincount=1. > The feature is called "namedistinct". Here is an example: > http://localhost:8983/solr/select?shards=localhost:8983/solr,localhost:7574/solr&indent=true&q=*:*&facet=true&facet.mincount=1&facet.numFacetTerms=2&facet.limit=-1&facet.field=price > http://localhost:8983/solr/select?shards=localhost:8983/solr,localhost:7574/solr&indent=true&q=*:*&facet=true&facet.mincount=1&facet.numFacetTerms=0&facet.limit=-1&facet.field=price > http://localhost:8983/solr/select?shards=localhost:8983/solr,localhost:7574/solr&indent=true&q=*:*&facet=true&facet.mincount=1&facet.numFacetTerms=1&facet.limit=-1&facet.field=price > This currently only works on facet.field. > {code} > > > 14 > 31 name="19.95">111 name="179.99">111 name="329.95">111 name="479.95">111 > > > {code} > Several people use this to get the group.field count (the # of groups). -- This message is automatically generated by JIRA. 
For more information on JIRA, see: http://www.atlassian.com/software/jira - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
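The counting behavior the issue describes — reporting how many distinct facet values (rows) a field has, rather than summing document counts — can be sketched client-side. This is a minimal illustration of the idea, not the patch's code; the field name and counts are made up:

```java
import java.util.LinkedHashMap;
import java.util.Map;

/** Illustrates what facet.numFacetTerms reports: the number of distinct
 *  facet values (rows), not the sum of their document counts. */
public class DistinctFacetCount {

    /** Count facet entries whose document count is at least mincount. */
    public static int countDistinct(Map<String, Integer> facetCounts, int mincount) {
        int distinct = 0;
        for (int count : facetCounts.values()) {
            if (count >= mincount) {
                distinct++;
            }
        }
        return distinct;
    }

    public static void main(String[] args) {
        // Hypothetical facet response for field "price": value -> doc count.
        Map<String, Integer> price = new LinkedHashMap<String, Integer>();
        price.put("19.95", 3);
        price.put("179.99", 1);
        price.put("329.95", 1);
        price.put("479.95", 0);
        // With mincount=1 the zero-count bucket is excluded.
        System.out.println(countDistinct(price, 1)); // prints 3
    }
}
```

This also shows why the issue recommends limit=-1 and mincount=1: the count is only meaningful when every nonzero bucket is returned.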
[jira] [Assigned] (SOLR-2242) Get distinct count of names for a facet field
[ https://issues.apache.org/jira/browse/SOLR-2242?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Simon Willnauer reassigned SOLR-2242:

Assignee: Simon Willnauer
[jira] [Commented] (SOLR-2523) SolrJ QueryResponse doesn't support range facets
[ https://issues.apache.org/jira/browse/SOLR-2523?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13049012#comment-13049012 ] Martijn van Groningen commented on SOLR-2523:

SOLR-1896 looks like what this issue needs. The change that I suggested involves having the datemath syntax in the regular Solr query parser, but I think SOLR-1896 is just fine for this issue. Maybe changing the regular query parser is a bit too much.

> SolrJ QueryResponse doesn't support range facets
>
> Key: SOLR-2523
> URL: https://issues.apache.org/jira/browse/SOLR-2523
> Project: Solr
> Issue Type: Improvement
> Components: clients - java
> Reporter: Martijn van Groningen
> Assignee: Martijn van Groningen
> Priority: Trivial
> Fix For: 3.3, 4.0
>
> Attachments: SOLR-2523.patch, SOLR-2523.patch
>
> It is possible to get date facets and pivot facets in SolrJ.
> {code:java}
> queryResponse.getFacetDate();
> queryResponse.getFacetPivot();
> {code}
> Having this also for range fields would be nice. Adding it is trivial.
> Maybe we should deprecate the date facet methods in the QueryResponse class,
> since they are superseded by range facets. Also, some set / add / remove
> methods for setting facet range parameters on the SolrQuery class would be nice.
Re: Lucene Facet path
I think it can be a subtask of LUCENE-3079 and we should first focus on the general faceting features. As far as I know there is no bitset impl. out there for faceting.

On 14 Jun 2011 00:08, "Jason Rutherglen" wrote:
> Martijn, if the title is correct ("Post grouping faceting") then maybe
> the bit set based system should be a separate issue? E.g., is there a
> bit set implementation today in LUCENE-3079?
>
> On Mon, Jun 13, 2011 at 2:58 PM, Martijn v Groningen wrote:
>> There is already an issue open for this: LUCENE-3079
>>
>> As the issue describes, the faceting in Solr relies on the schema (and of
>> course the UIF). So having the notion of a FieldType in the facet module
>> would be very helpful for selecting the right facet implementation.
>> Currently in Solr there is only one facet method for field faceting that
>> works per-segment, but I think in the end we would want all facet types
>> and methods to work on a per-segment basis.
>> Martijn
>> On 13 June 2011 23:47, Jason Rutherglen wrote:
>>> I think it's a better approach than rewriting Solr's internals. E.g.,
>>> small development steps could be taken, using the knowledge learned
>>> from Solr's facet system. E.g., caching and intersecting bit sets would
>>> be an easy-ish first step?
>>>
>>> On Mon, Jun 13, 2011 at 2:37 PM, Simon Willnauer wrote:
>>> > I believe people are already looking into that but I am not sure.
>>> > Sounds reasonable to me, but I think it's going to be lots of work.
>>> >
>>> > simon
>>> >
>>> > On Mon, Jun 13, 2011 at 11:34 PM, Jason Rutherglen wrote:
>>> >> Are we going the direction of creating full facet features outside of
>>> >> Solr? E.g., we have UIF extrapolated out, we can probably make a module
>>> >> for bit set intersections as well. In the process the faceting will
>>> >> go per-segment.
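The "caching and intersecting bit sets" step discussed in this thread can be sketched with plain `java.util.BitSet`: cache one set of matching docIDs per facet value, intersect it with the query's result set, and report the cardinality as the facet count. All names and data below are illustrative, not Lucene or Solr code:

```java
import java.util.BitSet;

/** Sketch of bit-set faceting as discussed in the thread: one cached
 *  BitSet of docIDs per facet value, intersected with the query result. */
public class BitSetFacetSketch {

    /** Facet count = |queryDocs AND valueDocs|, without mutating either set. */
    public static int facetCount(BitSet queryDocs, BitSet valueDocs) {
        BitSet intersection = (BitSet) queryDocs.clone(); // clone so the cached set survives
        intersection.and(valueDocs);
        return intersection.cardinality();
    }

    public static void main(String[] args) {
        BitSet queryDocs = new BitSet();
        queryDocs.set(1); queryDocs.set(3); queryDocs.set(5);

        BitSet colorRed = new BitSet(); // docs whose hypothetical "color" field is "red"
        colorRed.set(3); colorRed.set(4); colorRed.set(5);

        System.out.println(facetCount(queryDocs, colorRed)); // docs 3 and 5 match: 2
    }
}
```

Going per-segment, as the thread suggests, would mean keeping one such set per segment and summing the cardinalities, so cached sets survive merges of other segments.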
Re: commit-check target for ant?
> about, we could have the same argument about requiring 1.7 to compile but
> supporting binary releases that run on 1.6; or an argument about whether we
> should use a commercial tool that committers have a license for to "build"
> java code from a source grammar -- the point is that as an open source
> project i think it's really important that *all* our users be allowed to

I think I got lost in this discussion somehow -- users will be able to compile from sources, just not with the 1.5 compiler... But they'd still be able to compile their source code to 1.5 binaries with an open source toolchain.

Anyway, to me, the points for 1.6 are:
1) array resizing intrinsics in Arrays.* (at least theoretically, no need for double allocation during array resizing).
2) @Override on interface-overriding methods.

None of these are critical, but retroweaving back to 1.5 bytecode can provide a clean way of using both and keep 1.5 users happy. Like you mentioned -- this probably does come down to the argument of switching to 1.6 as the supported platform; then, a 1.5 compatibility backport would be a nice touch for those who still need it.

Dawid
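Point 1 above can be made concrete: `Arrays.copyOf`, new in Java 6, grows an array in one call, where Java 5 code had to spell out the allocate-then-copy idiom. A small sketch comparing the two:

```java
import java.util.Arrays;

/** Demonstrates the Java 6 Arrays.copyOf mentioned in the thread versus
 *  the Java 5 allocate-then-arraycopy idiom it replaces. */
public class ArrayResize {

    // Java 5 style: the allocation and copy are two explicit steps.
    static int[] growJava5(int[] src, int newLength) {
        int[] dst = new int[newLength];
        System.arraycopy(src, 0, dst, 0, src.length);
        return dst;
    }

    // Java 6 style: one call, which the JVM may treat as an intrinsic.
    static int[] growJava6(int[] src, int newLength) {
        return Arrays.copyOf(src, newLength);
    }

    public static void main(String[] args) {
        int[] a = {1, 2, 3};
        // Both produce {1, 2, 3, 0, 0, 0}.
        System.out.println(Arrays.equals(growJava5(a, 6), growJava6(a, 6)));
    }
}
```

Whether HotSpot actually avoids the zero-fill of the destination array is the "at least theoretically" hedge in the message; the source-level simplification is the uncontroversial part.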
[jira] [Commented] (SOLR-2551) Checking dataimport.properties for write access during startup
[ https://issues.apache.org/jira/browse/SOLR-2551?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13049000#comment-13049000 ] C S commented on SOLR-2551:

I might be wrong, but isn't dataimport.properties written even if no delta-query is configured? So even if you'd just be interested in full imports, you'll run into an exception at the end of your full import when Solr attempts to write the timestamp into dataimport.properties.

However, if that's not the case (i.e. a non-writable dataimport.properties does not break a full import), I'd suggest that the check whether dataimport.properties is writable should only be done if a delta-query is defined, and in that case it should refuse to start the import.

> Checking dataimport.properties for write access during startup
>
> Key: SOLR-2551
> URL: https://issues.apache.org/jira/browse/SOLR-2551
> Project: Solr
> Issue Type: Improvement
> Components: contrib - DataImportHandler
> Affects Versions: 1.4.1, 3.1
> Reporter: C S
> Assignee: Shalin Shekhar Mangar
> Priority: Minor
>
> A common mistake is that the /conf directory (respectively the dataimport.properties file)
> is not writable for Solr. It would be great if that were detected when starting a dataimport job.
> Currently an import might grind away for days and fail if it can't write its
> timestamp to the dataimport.properties file.
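The startup check the issue asks for is straightforward to sketch. This is not the DataImportHandler code, just an illustration of failing fast when the properties file (or, if it does not exist yet, its directory) is not writable:

```java
import java.io.File;
import java.io.IOException;

/** Sketch of a pre-import writability check for dataimport.properties:
 *  fail before the import starts instead of after days of indexing. */
public class WritableCheck {

    /** True if the file can be written, or can be created in its directory. */
    public static boolean canPersistTimestamp(File props) {
        if (props.exists()) {
            return props.canWrite();
        }
        File dir = props.getAbsoluteFile().getParentFile();
        return dir != null && dir.canWrite();
    }

    public static void main(String[] args) throws IOException {
        // Stand-in for <solr-home>/conf/dataimport.properties.
        File props = File.createTempFile("dataimport", ".properties");
        props.deleteOnExit();
        if (!canPersistTimestamp(props)) {
            throw new IllegalStateException(props + " is not writable; refusing to start import");
        }
        System.out.println("ok to import");
    }
}
```

Per the comment above, a real implementation might run this check unconditionally (if full imports also write the timestamp) or only when a delta-query is configured.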
[jira] [Commented] (LUCENE-3199) Add non-destructive sort to BytesRefHash
[ https://issues.apache.org/jira/browse/LUCENE-3199?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13048987#comment-13048987 ] Jason Rutherglen commented on LUCENE-3199:

I think the issue with this, as it relates to realtime search, is that in order to sort, we'll need to freeze indexing.

> Add non-destructive sort to BytesRefHash
>
> Key: LUCENE-3199
> URL: https://issues.apache.org/jira/browse/LUCENE-3199
> Project: Lucene - Java
> Issue Type: Improvement
> Components: core/index
> Affects Versions: 4.0
> Reporter: Jason Rutherglen
> Priority: Minor
>
> Currently the BytesRefHash is destructive. We can add a method that returns
> a non-destructively generated int[].
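The improvement requested here — a sorted int[] view that leaves the hash untouched — can be illustrated generically: sort a copy of the ids by the values they point at, leaving the original order intact. This is a sketch of the idea with made-up data, not the BytesRefHash code:

```java
import java.util.Arrays;
import java.util.Comparator;

/** Sketch of a non-destructive sort: return a sorted copy of the ids
 *  instead of reordering the structure in place, so concurrent readers
 *  (e.g. realtime search) still see the original layout. */
public class NonDestructiveSort {

    /** Stand-in for the bytes each ord points at. */
    static final String[] VALUES = {"delta", "alpha", "charlie", "bravo"};

    /** Returns a new int[] of ords sorted by value; the input is untouched. */
    public static int[] sortedOrds(int[] ords) {
        Integer[] boxed = new Integer[ords.length];
        for (int i = 0; i < ords.length; i++) boxed[i] = ords[i];
        Arrays.sort(boxed, new Comparator<Integer>() {
            public int compare(Integer a, Integer b) {
                return VALUES[a].compareTo(VALUES[b]);
            }
        });
        int[] result = new int[ords.length];
        for (int i = 0; i < ords.length; i++) result[i] = boxed[i];
        return result;
    }

    public static void main(String[] args) {
        int[] ords = {0, 1, 2, 3};
        int[] sorted = sortedOrds(ords);
        System.out.println(Arrays.toString(ords));   // [0, 1, 2, 3] - preserved
        System.out.println(Arrays.toString(sorted)); // [1, 3, 2, 0] - by value
    }
}
```

Jason's comment still applies to this sketch: the copy is only a consistent snapshot if the underlying values do not change mid-sort, which is why realtime indexing would need to pause or version the data.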
[JENKINS] Lucene-trunk - Build # 1594 - Still Failing
Build: https://builds.apache.org/job/Lucene-trunk/1594/

3 tests failed.

FAILED: org.apache.lucene.index.TestPerFieldCodecSupport.testStressPerFieldCodec
Error Message: GC overhead limit exceeded
Stack Trace:
java.lang.OutOfMemoryError: GC overhead limit exceeded
    at org.apache.lucene.util.fst.FST.getBytesReader(FST.java:849)
    at org.apache.lucene.util.fst.FST.readFirstRealArc(FST.java:565)
    at org.apache.lucene.util.fst.NodeHash.hash(NodeHash.java:92)
    at org.apache.lucene.util.fst.NodeHash.addNew(NodeHash.java:141)
    at org.apache.lucene.util.fst.NodeHash.rehash(NodeHash.java:161)
    at org.apache.lucene.util.fst.NodeHash.add(NodeHash.java:126)
    at org.apache.lucene.util.fst.Builder.compileNode(Builder.java:146)
    at org.apache.lucene.util.fst.Builder.compilePrevTail(Builder.java:232)
    at org.apache.lucene.util.fst.Builder.add(Builder.java:349)
    at org.apache.lucene.util.fst.Builder.add(Builder.java:262)
    at org.apache.lucene.index.codecs.simpletext.SimpleTextFieldsReader$SimpleTextTerms.loadTerms(SimpleTextFieldsReader.java:494)
    at org.apache.lucene.index.codecs.simpletext.SimpleTextFieldsReader$SimpleTextTerms.<init>(SimpleTextFieldsReader.java:460)
    at org.apache.lucene.index.codecs.simpletext.SimpleTextFieldsReader.terms(SimpleTextFieldsReader.java:561)
    at org.apache.lucene.index.PerFieldCodecWrapper$FieldsReader$FieldsIterator.terms(PerFieldCodecWrapper.java:152)
    at org.apache.lucene.index.MultiFieldsEnum.terms(MultiFieldsEnum.java:113)
    at org.apache.lucene.index.codecs.FieldsConsumer.merge(FieldsConsumer.java:50)
    at org.apache.lucene.index.SegmentMerger.mergeTerms(SegmentMerger.java:573)
    at org.apache.lucene.index.SegmentMerger.merge(SegmentMerger.java:116)
    at org.apache.lucene.index.IndexWriter.mergeMiddle(IndexWriter.java:3461)
    at org.apache.lucene.index.IndexWriter.merge(IndexWriter.java:3105)
    at org.apache.lucene.index.SerialMergeScheduler.merge(SerialMergeScheduler.java:37)
    at org.apache.lucene.index.IndexWriter.maybeMerge(IndexWriter.java:1873)
    at org.apache.lucene.index.IndexWriter.maybeMerge(IndexWriter.java:1868)
    at org.apache.lucene.index.IndexWriter.maybeMerge(IndexWriter.java:1864)
    at org.apache.lucene.index.IndexWriter.updateDocument(IndexWriter.java:1482)
    at org.apache.lucene.index.IndexWriter.addDocument(IndexWriter.java:1234)
    at org.apache.lucene.index.IndexWriter.addDocument(IndexWriter.java:1215)
    at org.apache.lucene.index.TestPerFieldCodecSupport.testStressPerFieldCodec(TestPerFieldCodecSupport.java:306)

FAILED: org.apache.lucene.search.TestPhraseQuery.testRandomPhrases
Error Message: GC overhead limit exceeded
Stack Trace:
java.lang.OutOfMemoryError: GC overhead limit exceeded
    at org.apache.lucene.index.codecs.intblock.FixedIntBlockIndexInput.reader(FixedIntBlockIndexInput.java:51)
    at org.apache.lucene.index.codecs.intblock.FixedIntBlockIndexInput.reader(FixedIntBlockIndexInput.java:39)
    at org.apache.lucene.index.codecs.sep.SepPostingsReaderImpl$SepDocsAndPositionsEnum.<init>(SepPostingsReaderImpl.java:522)
    at org.apache.lucene.index.codecs.sep.SepPostingsReaderImpl.docsAndPositions(SepPostingsReaderImpl.java:283)
    at org.apache.lucene.index.codecs.BlockTermsReader$FieldReader$SegmentTermsEnum.docsAndPositions(BlockTermsReader.java:707)
    at org.apache.lucene.index.MultiTermsEnum.docsAndPositions(MultiTermsEnum.java:388)
    at org.apache.lucene.index.codecs.TermsConsumer.merge(TermsConsumer.java:92)
    at org.apache.lucene.index.codecs.FieldsConsumer.merge(FieldsConsumer.java:53)
    at org.apache.lucene.index.SegmentMerger.mergeTerms(SegmentMerger.java:573)
    at org.apache.lucene.index.SegmentMerger.merge(SegmentMerger.java:116)
    at org.apache.lucene.index.IndexWriter.mergeMiddle(IndexWriter.java:3461)
    at org.apache.lucene.index.IndexWriter.merge(IndexWriter.java:3105)
    at org.apache.lucene.index.SerialMergeScheduler.merge(SerialMergeScheduler.java:37)
    at org.apache.lucene.index.IndexWriter.maybeMerge(IndexWriter.java:1873)
    at org.apache.lucene.index.IndexWriter.optimize(IndexWriter.java:1683)
    at org.apache.lucene.index.IndexWriter.optimize(IndexWriter.java:1638)
    at org.apache.lucene.index.IndexWriter.optimize(IndexWriter.java:1608)
    at org.apache.lucene.index.RandomIndexWriter.doRandomOptimize(RandomIndexWriter.java:315)
    at org.apache.lucene.index.RandomIndexWriter.getReader(RandomIndexWriter.java:328)
    at org.apache.lucene.index.RandomIndexWriter.getReader(RandomIndexWriter.java:308)
    at org.apache.lucene.search.TestPhraseQuery.testRandomPhrases(TestPhraseQuery.java:662)

FAILED: org.apache.lucene.util.fst.TestFSTs.testBigSet
Error Message: Java heap space
Stack Trace:
java.lang.OutOfMemoryError: J
[jira] [Commented] (SOLR-2242) Get distinct count of names for a facet field
[ https://issues.apache.org/jira/browse/SOLR-2242?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13048967#comment-13048967 ] Bill Bell commented on SOLR-2242:

Lance, this patch just takes the # of lines coming out of the facet section for a field and tells you how many you have. It does not do anything to change the facet, or deal with white space, or anything complicated. This is a simple counter.

Bill
[jira] [Commented] (SOLR-2242) Get distinct count of names for a facet field
[ https://issues.apache.org/jira/browse/SOLR-2242?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13048966#comment-13048966 ] Bill Bell commented on SOLR-2242:

Thanks Mike. I think it is committable since shards work now. We might need to fix some broken tests (and I am willing to do that). Then we can move to range and queries... Thanks.
[jira] [Updated] (SOLR-2242) Get distinct count of names for a facet field
[ https://issues.apache.org/jira/browse/SOLR-2242?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Bill Bell updated SOLR-2242: Description: When returning facet.field= you will get a list of matches for distinct values. This is normal behavior. This patch tells you how many distinct values you have (# of rows). Use with limit=-1 and mincount=1. The feature is called "namedistinct". Here is an example: http://localhost:8983/solr/select?shards=localhost:8983/solr,localhost:7574/solr&indent=true&q=*:*&facet=true&facet.mincount=1&facet.numFacetTerms=2&facet.limit=-1&facet.field=price http://localhost:8983/solr/select?shards=localhost:8983/solr,localhost:7574/solr&indent=true&q=*:*&facet=true&facet.mincount=1&facet.numFacetTerms=0&facet.limit=-1&facet.field=price http://localhost:8983/solr/select?shards=localhost:8983/solr,localhost:7574/solr&indent=true&q=*:*&facet=true&facet.mincount=1&facet.numFacetTerms=1&facet.limit=-1&facet.field=price This currently only works on facet.field. {code} 14 31 {code} Several people use this to get the group.field count (the # of groups). was: When returning facet.field= you will get a list of matches for distinct values. This is normal behavior. This patch tells you how many distinct values you have (# of rows). Use with limit=-1 and mincount=1. The feature is called "namedistinct". 
Here is an example:
http://localhost:8983/solr/select?shards=localhost:8983/solr,localhost:7574/solr&indent=true&q=*:*&facet=true&facet.mincount=1&facet.numFacetTerms=2&facet.limit=-1&facet.field=price
http://localhost:8983/solr/select?shards=localhost:8983/solr,localhost:7574/solr&indent=true&q=*:*&facet=true&facet.mincount=1&facet.numFacetTerms=0&facet.limit=-1&facet.field=price
http://localhost:8983/solr/select?shards=localhost:8983/solr,localhost:7574/solr&indent=true&q=*:*&facet=true&facet.mincount=1&facet.numFacetTerms=1&facet.limit=-1&facet.field=price
Here is an example on field "hgid" (without namedistinct):
{code}
- - 1 1 1 1 1 5 1
{code}
With namedistinct (HGPY045FD36D4000A, HGPY0FBC6690453A9, HGPY1E44ED6C4FB3B, HGPY1FA631034A1B8, HGPY3317ABAC43B48, HGPY3A17B2294CB5A, HGPY3ADD2B3D48C39). This returns the number of rows (7), not the number of values (11).
{code}
- - 7
{code}
This actually works really well for getting the total number of groups for group.field=hgid. Enjoy!
[jira] [Updated] (SOLR-2242) Get distinct count of names for a facet field
[ https://issues.apache.org/jira/browse/SOLR-2242?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Bill Bell updated SOLR-2242: Description: When returning facet.field= you will get a list of matches for distinct values. This is normal behavior. This patch tells you how many distinct values you have (# of rows). Use with limit=-1 and mincount=1. The feature is called "namedistinct". Here is an example: http://localhost:8983/solr/select?shards=localhost:8983/solr,localhost:7574/solr&indent=true&q=*:*&facet=true&facet.mincount=1&facet.numFacetTerms=2&facet.limit=-1&facet.field=price http://localhost:8983/solr/select?shards=localhost:8983/solr,localhost:7574/solr&indent=true&q=*:*&facet=true&facet.mincount=1&facet.numFacetTerms=0&facet.limit=-1&facet.field=price http://localhost:8983/solr/select?shards=localhost:8983/solr,localhost:7574/solr&indent=true&q=*:*&facet=true&facet.mincount=1&facet.numFacetTerms=1&facet.limit=-1&facet.field=price Here is an example on field "hgid" (without namedistinct): {code} - - 1 1 1 1 1 5 1 {code} With namedistinct (HGPY045FD36D4000A, HGPY0FBC6690453A9, HGPY1E44ED6C4FB3B, HGPY1FA631034A1B8, HGPY3317ABAC43B48, HGPY3A17B2294CB5A, HGPY3ADD2B3D48C39). This returns number of rows (7), not the number of values (11). {code} - - 7 {code} This works actually really good to get total number of fields for a group.field=hgid. Enjoy! was: When returning facet.field= you will get a list of matches for distinct values. This is normal behavior. This patch tells you how many distinct values you have (# of rows). Use with limit=-1 and mincount=1. The feature is called "namedistinct". 
Here is an example:
http://localhost:8983/solr/select?q=*:*&facet=true&facet.field=manu&facet.mincount=1&facet.limit=-1&f.manu.facet.namedistinct=0&facet.field=price&f.price.facet.namedistinct=1
Here is an example on field "hgid" (without namedistinct):
{code}
- - 1 1 1 1 1 5 1
{code}
With namedistinct (HGPY045FD36D4000A, HGPY0FBC6690453A9, HGPY1E44ED6C4FB3B, HGPY1FA631034A1B8, HGPY3317ABAC43B48, HGPY3A17B2294CB5A, HGPY3ADD2B3D48C39). This returns the number of rows (7), not the number of values (11).
{code}
- - 7
{code}
This actually works really well for getting the total number of groups for group.field=hgid. Enjoy!
[jira] [Commented] (SOLR-2554) RandomSortField values are cached in the FieldCache
[ https://issues.apache.org/jira/browse/SOLR-2554?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13048965#comment-13048965 ] Yonik Seeley commented on SOLR-2554:

Thanks Vadim, I can reproduce this on 3.1 and branch_3x (but trunk seems to work fine), and I'll look into fixing it tomorrow.

> RandomSortField values are cached in the FieldCache
>
> Key: SOLR-2554
> URL: https://issues.apache.org/jira/browse/SOLR-2554
> Project: Solr
> Issue Type: Bug
> Components: search
> Affects Versions: 3.1
> Reporter: Vadim Geshel
>
> The values of RandomSortField get cached in the FieldCache. When using many
> RandomSortFields over time, this leads to running out of memory.
> This may be one of the cases already covered in SOLR- but I'm not sure.
[jira] [Commented] (SOLR-2242) Get distinct count of names for a facet field
[ https://issues.apache.org/jira/browse/SOLR-2242?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13048964#comment-13048964 ] Bill Bell commented on SOLR-2242:

Lance, there are literally 15 lines of code changes; not sure how you cannot follow it. I could use no memory and just loop through the results, but that would not be cached, so the speed would still be slow since I need to pull in the array in order to count it.

The field is not called namedistinct anymore... it is called facet.numFacetTerms=2,1,0. All other parameters are good. Also, you do not need anything else to get it to work, since I set the defaults to work for you now. I'll see if I can write some more tests.

Here is the rub: I would be happy to write hundreds of test cases if I knew someone was going to actually help me get this done. I am used to having a committer actually work with me - Mike McCandless is awesome and we worked on several issues together. But I have seen tons of features die when no one is willing to help. So here I am wanting, willing and able to get this done, and I have no one willing to assist from a committer perspective...

The patch works fine in sharded and normal mode, so people can use it today; it is just not committed. I have 4 clients using it in production and one has 100M page views a year, and so far no problems.
http://localhost:8983/solr/select?shards=localhost:8983/solr,localhost:7574/solr&indent=true&q=*:*&facet=true&facet.mincount=1&facet.numFacetTerms=2&facet.limit=-1&facet.field=price > Get distinct count of names for a facet field > - > > Key: SOLR-2242 > URL: https://issues.apache.org/jira/browse/SOLR-2242 > Project: Solr > Issue Type: New Feature > Components: Response Writers >Affects Versions: 4.0 >Reporter: Bill Bell >Priority: Minor > Fix For: 4.0 > > Attachments: SOLR-2242.patch, SOLR-2242.shard.patch, > SOLR-2242.solr3.1.patch, SOLR.2242.solr3.1.patch, SOLR.2242.v2.patch > > > When returning facet.field= you will get a list of matches for > distinct values. This is normal behavior. This patch tells you how many > distinct values you have (# of rows). Use with limit=-1 and mincount=1. > The feature is called "namedistinct". Here is an example: > http://localhost:8983/solr/select?q=*:*&facet=true&facet.field=manu&facet.mincount=1&facet.limit=-1&f.manu.facet.namedistinct=0&facet.field=price&f.price.facet.namedistinct=1 > Here is an example on field "hgid" (without namedistinct): > {code} > - > - > 1 > 1 > 1 > 1 > 1 > 5 > 1 > > > {code} > With namedistinct (HGPY045FD36D4000A, HGPY0FBC6690453A9, > HGPY1E44ED6C4FB3B, HGPY1FA631034A1B8, HGPY3317ABAC43B48, > HGPY3A17B2294CB5A, HGPY3ADD2B3D48C39). This returns number of rows > (7), not the number of values (11). > {code} > - > - > 7 > > > {code} > This works actually really good to get total number of fields for a > group.field=hgid. Enjoy! -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
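The semantics of the namedistinct/numFacetTerms count can be sketched without Solr at all: it is the number of facet rows whose count clears mincount, not the sum of the counts. A minimal Java sketch of that distinction, mirroring the hgid example above (class and method names are illustrative, and the per-term counts here are assumed for illustration, not taken from the actual index):

```java
import java.util.LinkedHashMap;
import java.util.Map;

public class NamedDistinctSketch {
    // Returns the number of distinct facet terms with count >= mincount
    // (the "numFacetTerms" result), as opposed to the sum of the counts.
    static int distinctTerms(Map<String, Integer> facetCounts, int mincount) {
        int distinct = 0;
        for (int count : facetCounts.values()) {
            if (count >= mincount) distinct++;
        }
        return distinct;
    }

    public static void main(String[] args) {
        // Illustrative counts: 7 distinct hgid values over 11 matching docs.
        // (Which key carries the count of 5 is assumed, not known from the issue.)
        Map<String, Integer> counts = new LinkedHashMap<>();
        counts.put("HGPY045FD36D4000A", 1);
        counts.put("HGPY0FBC6690453A9", 1);
        counts.put("HGPY1E44ED6C4FB3B", 1);
        counts.put("HGPY1FA631034A1B8", 1);
        counts.put("HGPY3317ABAC43B48", 1);
        counts.put("HGPY3A17B2294CB5A", 5);
        counts.put("HGPY3ADD2B3D48C39", 1);
        System.out.println(distinctTerms(counts, 1)); // 7 rows, not 11 values
    }
}
```

This is why the patch recommends limit=-1 and mincount=1: the count is only meaningful when all qualifying rows are enumerated.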
[jira] [Updated] (LUCENE-3200) Cleanup MMapDirectory to use only one MMapIndexInput impl with mapping sized of powers of 2
[ https://issues.apache.org/jira/browse/LUCENE-3200?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Robert Muir updated LUCENE-3200: Attachment: LUCENE-3200.patch same as uwe's patch, but i also nuked the previous hack in TestTermVectorsReader, as MMapDir returns read past EOF now like the others. > Cleanup MMapDirectory to use only one MMapIndexInput impl with mapping sized > of powers of 2 > --- > > Key: LUCENE-3200 > URL: https://issues.apache.org/jira/browse/LUCENE-3200 > Project: Lucene - Java > Issue Type: Improvement >Reporter: Uwe Schindler >Assignee: Uwe Schindler > Attachments: LUCENE-3200.patch, LUCENE-3200.patch, LUCENE-3200.patch, > LUCENE-3200.patch, LUCENE-3200_tests.patch > > > Robert and me discussed a little bit after Mike's investigations, that using > SingleMMapIndexinput together with MultiMMapIndexInput leads to hotspot > slowdowns sometimes. > We had the following ideas: > - MultiMMapIndexInput is almost as fast as SingleMMapIndexInput, as the > switching between buffer boundaries is done in exception catch blocks. So > normal code path is always the same like for Single* > - Only the seek method uses strange calculations (the modulo is totally > bogus, it could be simply: int bufOffset = (int) (pos % maxBufSize); - very > strange way of calculating modulo in the original code) > - Because of speed we suggest to no longer use arbitrary buffer sizes. We > should pass only the power of 2 to the indexinput as size. All calculations > in seek and anywhere else would be simple bit shifts and AND operations (the > and masks for the modulo can be calculated in the ctor like NumericUtils does > when calculating precisionSteps). > - the maximum buffer size will now be 2^30, not 2^31-1. But thats not an > issue at all. In my opinion, a buffer size of 2^31-1 is stupid in all cases, > as it will no longer fit page boundaries and mmapping gets harder for the O/S. > We will provide a patch with those cleanups. 
[jira] [Updated] (LUCENE-3200) Cleanup MMapDirectory to use only one MMapIndexInput impl with mapping sized of powers of 2
[ https://issues.apache.org/jira/browse/LUCENE-3200?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Uwe Schindler updated LUCENE-3200: -- Attachment: LUCENE-3200.patch Same patch with Robert's tests included. > Cleanup MMapDirectory to use only one MMapIndexInput impl with mapping sized > of powers of 2 > --- > > Key: LUCENE-3200 > URL: https://issues.apache.org/jira/browse/LUCENE-3200 > Project: Lucene - Java > Issue Type: Improvement >Reporter: Uwe Schindler >Assignee: Uwe Schindler > Attachments: LUCENE-3200.patch, LUCENE-3200.patch, LUCENE-3200.patch, > LUCENE-3200_tests.patch > > > Robert and me discussed a little bit after Mike's investigations, that using > SingleMMapIndexinput together with MultiMMapIndexInput leads to hotspot > slowdowns sometimes. > We had the following ideas: > - MultiMMapIndexInput is almost as fast as SingleMMapIndexInput, as the > switching between buffer boundaries is done in exception catch blocks. So > normal code path is always the same like for Single* > - Only the seek method uses strange calculations (the modulo is totally > bogus, it could be simply: int bufOffset = (int) (pos % maxBufSize); - very > strange way of calculating modulo in the original code) > - Because of speed we suggest to no longer use arbitrary buffer sizes. We > should pass only the power of 2 to the indexinput as size. All calculations > in seek and anywhere else would be simple bit shifts and AND operations (the > and masks for the modulo can be calculated in the ctor like NumericUtils does > when calculating precisionSteps). > - the maximum buffer size will now be 2^30, not 2^31-1. But thats not an > issue at all. In my opinion, a buffer size of 2^31-1 is stupid in all cases, > as it will no longer fit page boundaries and mmapping gets harder for the O/S. > We will provide a patch with those cleanups. -- This message is automatically generated by JIRA. 
[jira] [Updated] (LUCENE-3200) Cleanup MMapDirectory to use only one MMapIndexInput impl with mapping sized of powers of 2
[ https://issues.apache.org/jira/browse/LUCENE-3200?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Uwe Schindler updated LUCENE-3200: -- Attachment: LUCENE-3200.patch New patch with some minor issues fixed: - fixed the RuntimeException - fixed readByte to throw EOF if we are at the end of the (n-1)th buffer. As buffer n may be size 0, we will throw BufferUnderflow in the catch block. I added hasRemaining() there, so it's consistent with readBytes. - The check for an invalid power was bogus (0 is allowed, leads to buffer size 1) - The check for RandomAccessFile too big for maximum buffer size did not respect the additional buffer. nrBuffers can then overflow easily > Cleanup MMapDirectory to use only one MMapIndexInput impl with mapping sized > of powers of 2 > --- > > Key: LUCENE-3200 > URL: https://issues.apache.org/jira/browse/LUCENE-3200 > Project: Lucene - Java > Issue Type: Improvement >Reporter: Uwe Schindler >Assignee: Uwe Schindler > Attachments: LUCENE-3200.patch, LUCENE-3200.patch, > LUCENE-3200_tests.patch > > > Robert and me discussed a little bit after Mike's investigations, that using > SingleMMapIndexinput together with MultiMMapIndexInput leads to hotspot > slowdowns sometimes. > We had the following ideas: > - MultiMMapIndexInput is almost as fast as SingleMMapIndexInput, as the > switching between buffer boundaries is done in exception catch blocks. So > normal code path is always the same like for Single* > - Only the seek method uses strange calculations (the modulo is totally > bogus, it could be simply: int bufOffset = (int) (pos % maxBufSize); - very > strange way of calculating modulo in the original code) > - Because of speed we suggest to no longer use arbitrary buffer sizes. We > should pass only the power of 2 to the indexinput as size. 
All calculations > in seek and anywhere else would be simple bit shifts and AND operations (the > and masks for the modulo can be calculated in the ctor like NumericUtils does > when calculating precisionSteps). > - the maximum buffer size will now be 2^30, not 2^31-1. But thats not an > issue at all. In my opinion, a buffer size of 2^31-1 is stupid in all cases, > as it will no longer fit page boundaries and mmapping gets harder for the O/S. > We will provide a patch with those cleanups. -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
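The arithmetic the description proposes is the standard power-of-two trick: with a chunk size of 2^k, division becomes a right shift and the modulo becomes an AND against a mask precomputed in the constructor. A minimal sketch of that calculation (field and class names here are illustrative, not the actual patch's):

```java
public class PowerOfTwoSeek {
    final int chunkSizePower;  // k, so each mapped chunk is 2^k bytes
    final long chunkSizeMask;  // (2^k - 1), the AND mask replacing the modulo

    PowerOfTwoSeek(int chunkSizePower) {
        this.chunkSizePower = chunkSizePower;
        this.chunkSizeMask = (1L << chunkSizePower) - 1;
    }

    // which mapped buffer a file position falls into: pos / 2^k as a shift
    int bufferIndex(long pos) { return (int) (pos >>> chunkSizePower); }

    // offset within that buffer: pos % 2^k as a mask
    int bufferOffset(long pos) { return (int) (pos & chunkSizeMask); }

    public static void main(String[] args) {
        // tiny 16-byte chunks (k = 4) just for illustration
        PowerOfTwoSeek s = new PowerOfTwoSeek(4);
        System.out.println(s.bufferIndex(37));  // 37 / 16 = 2
        System.out.println(s.bufferOffset(37)); // 37 % 16 = 5
    }
}
```

With a maximum k of 30 this also explains the 2^30 versus 2^31-1 point: 2^31-1 is not a power of two, so neither the shift/mask trick nor OS page alignment works with it.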
[jira] [Updated] (LUCENE-3200) Cleanup MMapDirectory to use only one MMapIndexInput impl with mapping sized of powers of 2
[ https://issues.apache.org/jira/browse/LUCENE-3200?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Robert Muir updated LUCENE-3200: Attachment: LUCENE-3200_tests.patch here are some additional stress tests for mmap > Cleanup MMapDirectory to use only one MMapIndexInput impl with mapping sized > of powers of 2 > --- > > Key: LUCENE-3200 > URL: https://issues.apache.org/jira/browse/LUCENE-3200 > Project: Lucene - Java > Issue Type: Improvement >Reporter: Uwe Schindler >Assignee: Uwe Schindler > Attachments: LUCENE-3200.patch, LUCENE-3200_tests.patch > > > Robert and me discussed a little bit after Mike's investigations, that using > SingleMMapIndexinput together with MultiMMapIndexInput leads to hotspot > slowdowns sometimes. > We had the following ideas: > - MultiMMapIndexInput is almost as fast as SingleMMapIndexInput, as the > switching between buffer boundaries is done in exception catch blocks. So > normal code path is always the same like for Single* > - Only the seek method uses strange calculations (the modulo is totally > bogus, it could be simply: int bufOffset = (int) (pos % maxBufSize); - very > strange way of calculating modulo in the original code) > - Because of speed we suggest to no longer use arbitrary buffer sizes. We > should pass only the power of 2 to the indexinput as size. All calculations > in seek and anywhere else would be simple bit shifts and AND operations (the > and masks for the modulo can be calculated in the ctor like NumericUtils does > when calculating precisionSteps). > - the maximum buffer size will now be 2^30, not 2^31-1. But thats not an > issue at all. In my opinion, a buffer size of 2^31-1 is stupid in all cases, > as it will no longer fit page boundaries and mmapping gets harder for the O/S. > We will provide a patch with those cleanups. -- This message is automatically generated by JIRA. 
[jira] [Reopened] (SOLR-2554) RandomSortField values are cached in the FieldCache
[ https://issues.apache.org/jira/browse/SOLR-2554?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Vadim Geshel reopened SOLR-2554: Sorry, I should have been more specific. This happens if you use a RandomSortField in a query, not as a sort criterion: http://localhost:8983/solr/select/?q={!func}random_foo You should immediately see this in stats.jsp#cache, I see this: entry#1 : 'org.apache.lucene.store.MMapDirectory$MMapIndexInput@37f02eaa'=>'random_foo',class org.apache.lucene.search.FieldCache$StringIndex,null=>org.apache.lucene.search.FieldCache$StringIndex#2138852435 I'm using Solr 3.1 > RandomSortField values are cached in the FieldCache > --- > > Key: SOLR-2554 > URL: https://issues.apache.org/jira/browse/SOLR-2554 > Project: Solr > Issue Type: Bug > Components: search >Affects Versions: 3.1 >Reporter: Vadim Geshel > > The values of RandomSortField get cached in the FieldCache. When using many > RandomSortFields over time, this leads to running out of memory. > This may be one of the cases already covered in SOLR- but I'm not sure. -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
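The memory problem exists because each distinct random field name materializes a per-document value array in the FieldCache. The alternative is to never materialize anything: a deterministic hash of a field-name seed and the doc id can be computed on the fly, giving a stable but shuffled order with no cached state. A rough sketch of that idea (the mixing constant and names are illustrative, not RandomSortField's actual code):

```java
public class RandomOrderSketch {
    // Derive a deterministic pseudo-random value from a field-name seed and a
    // document id on the fly, instead of caching an array of values per doc.
    static int randomValue(int seed, int docId) {
        int h = seed ^ docId;
        h = h * 0x27d4eb2d;  // multiplicative mixing (constant chosen here arbitrarily)
        h ^= h >>> 15;
        return h;
    }

    public static void main(String[] args) {
        int seed = "random_foo".hashCode();
        // Same (seed, doc) always yields the same value -> a stable sort order
        System.out.println(randomValue(seed, 42) == randomValue(seed, 42)); // true
        // A different seed (i.e. a different dynamic field name) reshuffles
        // the order, still without any per-field cached state.
        System.out.println(randomValue(seed, 42) == randomValue(seed + 1, 42));
    }
}
```

The bug report above is exactly the case where this on-the-fly path is bypassed: used as a function query ({!func}random_foo), the field goes through the FieldCache machinery and each new name leaks an entry.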
[JENKINS] Lucene-Solr-tests-only-3.x - Build # 8823 - Still Failing
Build: https://builds.apache.org/job/Lucene-Solr-tests-only-3.x/8823/ No tests ran. Build Log (for compile errors): [...truncated 62 lines...]
[JENKINS] Lucene-Solr-tests-only-trunk - Build # 8819 - Still Failing
Build: https://builds.apache.org/job/Lucene-Solr-tests-only-trunk/8819/ No tests ran. Build Log (for compile errors): [...truncated 62 lines...]
[JENKINS] Lucene-Solr-tests-only-3.x - Build # 8822 - Failure
Build: https://builds.apache.org/job/Lucene-Solr-tests-only-3.x/8822/ No tests ran. Build Log (for compile errors): [...truncated 62 lines...]
[JENKINS] Lucene-Solr-tests-only-trunk - Build # 8818 - Failure
Build: https://builds.apache.org/job/Lucene-Solr-tests-only-trunk/8818/ 1 tests failed. REGRESSION: org.apache.solr.common.util.ContentStreamTest.testURLStream Error Message: Server returned HTTP response code: 403 for URL: http://svn.apache.org/repos/asf/lucene/dev/trunk/ Stack Trace: java.io.IOException: Server returned HTTP response code: 403 for URL: http://svn.apache.org/repos/asf/lucene/dev/trunk/ at sun.net.www.protocol.http.HttpURLConnection.getInputStream(HttpURLConnection.java:1269) at org.apache.solr.common.util.ContentStreamTest.testURLStream(ContentStreamTest.java:74) at org.apache.lucene.util.LuceneTestCase$LuceneTestCaseRunner.runChild(LuceneTestCase.java:1403) at org.apache.lucene.util.LuceneTestCase$LuceneTestCaseRunner.runChild(LuceneTestCase.java:1321) Build Log (for compile errors): [...truncated 8444 lines...]
[jira] [Commented] (LUCENE-3200) Cleanup MMapDirectory to use only one MMapIndexInput impl with mapping sized of powers of 2
[ https://issues.apache.org/jira/browse/LUCENE-3200?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13048923#comment-13048923 ] Robert Muir commented on LUCENE-3200: - at a glance the patch is looking really good overall! I'll help with some review and testing. > Cleanup MMapDirectory to use only one MMapIndexInput impl with mapping sized > of powers of 2 > --- > > Key: LUCENE-3200 > URL: https://issues.apache.org/jira/browse/LUCENE-3200 > Project: Lucene - Java > Issue Type: Improvement >Reporter: Uwe Schindler >Assignee: Uwe Schindler > Attachments: LUCENE-3200.patch > > > Robert and me discussed a little bit after Mike's investigations, that using > SingleMMapIndexinput together with MultiMMapIndexInput leads to hotspot > slowdowns sometimes. > We had the following ideas: > - MultiMMapIndexInput is almost as fast as SingleMMapIndexInput, as the > switching between buffer boundaries is done in exception catch blocks. So > normal code path is always the same like for Single* > - Only the seek method uses strange calculations (the modulo is totally > bogus, it could be simply: int bufOffset = (int) (pos % maxBufSize); - very > strange way of calculating modulo in the original code) > - Because of speed we suggest to no longer use arbitrary buffer sizes. We > should pass only the power of 2 to the indexinput as size. All calculations > in seek and anywhere else would be simple bit shifts and AND operations (the > and masks for the modulo can be calculated in the ctor like NumericUtils does > when calculating precisionSteps). > - the maximum buffer size will now be 2^30, not 2^31-1. But thats not an > issue at all. In my opinion, a buffer size of 2^31-1 is stupid in all cases, > as it will no longer fit page boundaries and mmapping gets harder for the O/S. > We will provide a patch with those cleanups. -- This message is automatically generated by JIRA. 
[jira] [Assigned] (LUCENE-3200) Cleanup MMapDirectory to use only one MMapIndexInput impl with mapping sized of powers of 2
[ https://issues.apache.org/jira/browse/LUCENE-3200?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Uwe Schindler reassigned LUCENE-3200: - Assignee: Uwe Schindler > Cleanup MMapDirectory to use only one MMapIndexInput impl with mapping sized > of powers of 2 > --- > > Key: LUCENE-3200 > URL: https://issues.apache.org/jira/browse/LUCENE-3200 > Project: Lucene - Java > Issue Type: Improvement >Reporter: Uwe Schindler >Assignee: Uwe Schindler > Attachments: LUCENE-3200.patch > > > Robert and me discussed a little bit after Mike's investigations, that using > SingleMMapIndexinput together with MultiMMapIndexInput leads to hotspot > slowdowns sometimes. > We had the following ideas: > - MultiMMapIndexInput is almost as fast as SingleMMapIndexInput, as the > switching between buffer boundaries is done in exception catch blocks. So > normal code path is always the same like for Single* > - Only the seek method uses strange calculations (the modulo is totally > bogus, it could be simply: int bufOffset = (int) (pos % maxBufSize); - very > strange way of calculating modulo in the original code) > - Because of speed we suggest to no longer use arbitrary buffer sizes. We > should pass only the power of 2 to the indexinput as size. All calculations > in seek and anywhere else would be simple bit shifts and AND operations (the > and masks for the modulo can be calculated in the ctor like NumericUtils does > when calculating precisionSteps). > - the maximum buffer size will now be 2^30, not 2^31-1. But thats not an > issue at all. In my opinion, a buffer size of 2^31-1 is stupid in all cases, > as it will no longer fit page boundaries and mmapping gets harder for the O/S. > We will provide a patch with those cleanups. -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Updated] (LUCENE-3200) Cleanup MMapDirectory to use only one MMapIndexInput impl with mapping sized of powers of 2
[ https://issues.apache.org/jira/browse/LUCENE-3200?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Uwe Schindler updated LUCENE-3200: -- Attachment: LUCENE-3200.patch Here the patch. > Cleanup MMapDirectory to use only one MMapIndexInput impl with mapping sized > of powers of 2 > --- > > Key: LUCENE-3200 > URL: https://issues.apache.org/jira/browse/LUCENE-3200 > Project: Lucene - Java > Issue Type: Improvement >Reporter: Uwe Schindler > Attachments: LUCENE-3200.patch > > > Robert and me discussed a little bit after Mike's investigations, that using > SingleMMapIndexinput together with MultiMMapIndexInput leads to hotspot > slowdowns sometimes. > We had the following ideas: > - MultiMMapIndexInput is almost as fast as SingleMMapIndexInput, as the > switching between buffer boundaries is done in exception catch blocks. So > normal code path is always the same like for Single* > - Only the seek method uses strange calculations (the modulo is totally > bogus, it could be simply: int bufOffset = (int) (pos % maxBufSize); - very > strange way of calculating modulo in the original code) > - Because of speed we suggest to no longer use arbitrary buffer sizes. We > should pass only the power of 2 to the indexinput as size. All calculations > in seek and anywhere else would be simple bit shifts and AND operations (the > and masks for the modulo can be calculated in the ctor like NumericUtils does > when calculating precisionSteps). > - the maximum buffer size will now be 2^30, not 2^31-1. But thats not an > issue at all. In my opinion, a buffer size of 2^31-1 is stupid in all cases, > as it will no longer fit page boundaries and mmapping gets harder for the O/S. > We will provide a patch with those cleanups. -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[JENKINS] Lucene-3.x - Build # 407 - Still Failing
Build: https://builds.apache.org/job/Lucene-3.x/407/ 2 tests failed. FAILED: org.apache.lucene.search.TestPhraseQuery.testRandomPhrases Error Message: this writer hit an OutOfMemoryError; cannot complete optimize Stack Trace: java.lang.IllegalStateException: this writer hit an OutOfMemoryError; cannot complete optimize at org.apache.lucene.index.IndexWriter.optimize(IndexWriter.java:2527) at org.apache.lucene.index.IndexWriter.optimize(IndexWriter.java:2475) at org.apache.lucene.index.IndexWriter.optimize(IndexWriter.java:2445) at org.apache.lucene.index.RandomIndexWriter.doRandomOptimize(RandomIndexWriter.java:179) at org.apache.lucene.index.RandomIndexWriter.getReader(RandomIndexWriter.java:195) at org.apache.lucene.index.RandomIndexWriter.getReader(RandomIndexWriter.java:189) at org.apache.lucene.search.TestPhraseQuery.testRandomPhrases(TestPhraseQuery.java:662) at org.apache.lucene.util.LuceneTestCase$LuceneTestCaseRunner.runChild(LuceneTestCase.java:1268) at org.apache.lucene.util.LuceneTestCase$LuceneTestCaseRunner.runChild(LuceneTestCase.java:1186) FAILED: org.apache.lucene.util.fst.TestFSTs.testBigSet Error Message: Java heap space Stack Trace: java.lang.OutOfMemoryError: Java heap space at java.util.HashMap.resize(HashMap.java:479) at java.util.HashMap.addEntry(HashMap.java:772) at java.util.HashMap.put(HashMap.java:402) at org.apache.lucene.util.fst.TestFSTs$FSTTester.verifyPruned(TestFSTs.java:791) at org.apache.lucene.util.fst.TestFSTs$FSTTester.doTest(TestFSTs.java:499) at org.apache.lucene.util.fst.TestFSTs$FSTTester.doTest(TestFSTs.java:363) at org.apache.lucene.util.fst.TestFSTs.doTest(TestFSTs.java:211) at org.apache.lucene.util.fst.TestFSTs.testRandomWords(TestFSTs.java:944) at org.apache.lucene.util.fst.TestFSTs.testBigSet(TestFSTs.java:964) at org.apache.lucene.util.LuceneTestCase$LuceneTestCaseRunner.runChild(LuceneTestCase.java:1268) at org.apache.lucene.util.LuceneTestCase$LuceneTestCaseRunner.runChild(LuceneTestCase.java:1186) Build Log (for 
compile errors): [...truncated 12491 lines...]
[jira] [Updated] (LUCENE-2979) Simplify configuration API of contrib Query Parser
[ https://issues.apache.org/jira/browse/LUCENE-2979?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Phillipe Ramalho updated LUCENE-2979: - Attachment: LUCENE-2979_phillipe_reamalho.patch This is finally my first patch. Sorry for taking so long, but I started changing the API and it broke a lot of code, which took forever to fix. Now it's working and all junits are passing. So far, I changed the entire configuration API. Next step is to write more junits and update/write javadocs. > Simplify configuration API of contrib Query Parser > -- > > Key: LUCENE-2979 > URL: https://issues.apache.org/jira/browse/LUCENE-2979 > Project: Lucene - Java > Issue Type: Improvement > Components: modules/other >Affects Versions: 2.9, 3.0 >Reporter: Adriano Crestani >Assignee: Adriano Crestani > Labels: api-change, gsoc, gsoc2011, lucene-gsoc-11, mentor > Fix For: 3.3 > > Attachments: LUCENE-2979_phillipe_reamalho.patch > > > The current configuration API is very complicated and inherit the concept > used by Attribute API to store token information in token streams. However, > the requirements for both (QP config and token stream) are not the same, so > they shouldn't be using the same thing. > I propose to simplify QP config and make it less scary for people intending > to use contrib QP. The task is not difficult, it will just require a lot of > code change and figure out the best way to do it. That's why it's a good > candidate for a GSoC project. > I would like to hear good proposals about how to make the API more friendly > and less scaring :) -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
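One common way to make such a configuration API "less scary" than the Attribute-style approach is a map with typed keys, where the generic parameter on each key carries its value type. A hypothetical sketch of that direction (all names here are invented for illustration, not taken from the attached patch):

```java
import java.util.HashMap;
import java.util.Map;

public class QueryConfigSketch {
    // A typed key: the generic parameter ties the key to its value type,
    // so lookups are type-safe without Attribute-interface machinery.
    static final class ConfigKey<T> {
        final String name;
        ConfigKey(String name) { this.name = name; }
    }

    private final Map<ConfigKey<?>, Object> values = new HashMap<>();

    <T> void set(ConfigKey<T> key, T value) { values.put(key, value); }

    @SuppressWarnings("unchecked") // safe: set() enforced the key's type
    <T> T get(ConfigKey<T> key) { return (T) values.get(key); }

    public static void main(String[] args) {
        ConfigKey<Boolean> LOWERCASE = new ConfigKey<>("lowercaseExpandedTerms");
        ConfigKey<Integer> FUZZY_PREFIX = new ConfigKey<>("fuzzyPrefixLength");

        QueryConfigSketch config = new QueryConfigSketch();
        config.set(LOWERCASE, true);
        config.set(FUZZY_PREFIX, 2);
        System.out.println(config.get(LOWERCASE));    // true
        System.out.println(config.get(FUZZY_PREFIX)); // 2
    }
}
```

The appeal over the token-stream-style Attribute API is that a key is just a value, not an interface per setting, so adding a new config option requires one constant rather than new classes.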
[jira] [Commented] (SOLR-2554) RandomSortField values are cached in the FieldCache
[ https://issues.apache.org/jira/browse/SOLR-2554?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13048916#comment-13048916 ] Yonik Seeley commented on SOLR-2554: The reporter probably meant the filterCache (although the filterCache should be sized to avoid OOM errors). Anyway, I plan on starting work soon on a "cache=false" option for queries. > RandomSortField values are cached in the FieldCache > --- > > Key: SOLR-2554 > URL: https://issues.apache.org/jira/browse/SOLR-2554 > Project: Solr > Issue Type: Bug > Components: search >Affects Versions: 3.1 >Reporter: Vadim Geshel > > The values of RandomSortField get cached in the FieldCache. When using many > RandomSortFields over time, this leads to running out of memory. > This may be one of the cases already covered in SOLR- but I'm not sure. -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
Re: tentative! release notes drafts
On Mon, Jun 13, 2011 at 8:14 PM, Chris Hostetter wrote: > > : Since nobody objected to the idea, I created the following templates > : for 3.3, and added a few already-committed things. > > I like the idea ... but why not just keep it in SVN? > > (that way patches can suggest wording for the release notes if/when the patch contains a notable feature worthy of "Release Highlights") > no reason really, I didn't think of putting it in SVN and the wiki was easy at the time... we could just as easily put it in SVN instead...
Re: tentative! release notes drafts
: Since nobody objected to the idea, I created the following templates : for 3.3, and added a few already-committed things. I like the idea ... but why not just keep it in SVN? (that way patches can suggest wording for the release notes if/when the patch contains a notable feature worthy of "Release Highlights") -Hoss
[jira] [Commented] (LUCENE-3201) improved compound file handling
[ https://issues.apache.org/jira/browse/LUCENE-3201?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13048912#comment-13048912 ] Robert Muir commented on LUCENE-3201: - I think for this one, I prefer to wait for Uwe's refactoring of MMap on LUCENE-3200. Then mmap is simpler, and i think we can even use the same indexinput implementation here. This would mean no slowdown when searching CFS. > improved compound file handling > --- > > Key: LUCENE-3201 > URL: https://issues.apache.org/jira/browse/LUCENE-3201 > Project: Lucene - Java > Issue Type: Improvement >Reporter: Robert Muir > > Currently CompoundFileReader could use some improvements, i see the following > problems > * its CSIndexInput extends bufferedindexinput, which is stupid for > directories like mmap. > * it seeks on every readInternal > * its not possible for a directory to override or improve the handling of > compound files. > for example: it seems if you were impl'ing this thing from scratch, you would > just wrap the II directly (not extend BufferedIndexInput, > and add compound file offset X to seek() calls, and override length(). But of > course, then you couldnt throw read past EOF always when you should, > as a user could read into the next file and be left unaware. > however, some directories could handle this better. for example MMapDirectory > could return an indexinput that simply mmaps the 'slice' of the CFS file. > its underlying bytebuffer etc naturally does bounds checks already etc, so it > wouldnt need to be buffered, not even needing to add any offsets to seek(), > as its position would just work. 
> So I think we should try to refactor this so that a Directory can customize > how compound files are handled, the simplest > case for the least code change would be to add this to Directory.java: > {code} > public Directory openCompoundInput(String filename) { > return new CompoundFileReader(this, filename); > } > {code} > Because most code depends upon the fact compound files are implemented as a > Directory and transparent. at least then a subclass could override... > but the 'recursion' is a little ugly... we could still label it > expert+internal+experimental or whatever. -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Commented] (LUCENE-3201) improved compound file handling
[ https://issues.apache.org/jira/browse/LUCENE-3201?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13048906#comment-13048906 ] Michael McCandless commented on LUCENE-3201: +1 > improved compound file handling > --- > > Key: LUCENE-3201 > URL: https://issues.apache.org/jira/browse/LUCENE-3201 > Project: Lucene - Java > Issue Type: Improvement >Reporter: Robert Muir > > Currently CompoundFileReader could use some improvements, i see the following > problems > * its CSIndexInput extends bufferedindexinput, which is stupid for > directories like mmap. > * it seeks on every readInternal > * its not possible for a directory to override or improve the handling of > compound files. > for example: it seems if you were impl'ing this thing from scratch, you would > just wrap the II directly (not extend BufferedIndexInput, > and add compound file offset X to seek() calls, and override length(). But of > course, then you couldnt throw read past EOF always when you should, > as a user could read into the next file and be left unaware. > however, some directories could handle this better. for example MMapDirectory > could return an indexinput that simply mmaps the 'slice' of the CFS file. > its underlying bytebuffer etc naturally does bounds checks already etc, so it > wouldnt need to be buffered, not even needing to add any offsets to seek(), > as its position would just work. > So I think we should try to refactor this so that a Directory can customize > how compound files are handled, the simplest > case for the least code change would be to add this to Directory.java: > {code} > public Directory openCompoundInput(String filename) { > return new CompoundFileReader(this, filename); > } > {code} > Because most code depends upon the fact compound files are implemented as a > Directory and transparent. at least then a subclass could override... > but the 'recursion' is a little ugly... 
we could still label it > expert+internal+experimental or whatever. -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Created] (LUCENE-3201) improved compound file handling
improved compound file handling --- Key: LUCENE-3201 URL: https://issues.apache.org/jira/browse/LUCENE-3201 Project: Lucene - Java Issue Type: Improvement Reporter: Robert Muir Currently CompoundFileReader could use some improvements, i see the following problems * its CSIndexInput extends bufferedindexinput, which is stupid for directories like mmap. * it seeks on every readInternal * its not possible for a directory to override or improve the handling of compound files. for example: it seems if you were impl'ing this thing from scratch, you would just wrap the II directly (not extend BufferedIndexInput, and add compound file offset X to seek() calls, and override length(). But of course, then you couldnt throw read past EOF always when you should, as a user could read into the next file and be left unaware. however, some directories could handle this better. for example MMapDirectory could return an indexinput that simply mmaps the 'slice' of the CFS file. its underlying bytebuffer etc naturally does bounds checks already etc, so it wouldnt need to be buffered, not even needing to add any offsets to seek(), as its position would just work. So I think we should try to refactor this so that a Directory can customize how compound files are handled, the simplest case for the least code change would be to add this to Directory.java: {code} public Directory openCompoundInput(String filename) { return new CompoundFileReader(this, filename); } {code} Because most code depends upon the fact compound files are implemented as a Directory and transparent. at least then a subclass could override... but the 'recursion' is a little ugly... we could still label it expert+internal+experimental or whatever. -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
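The "slice" idea described above can be sketched outside Lucene's actual classes. This is a minimal, hypothetical illustration (the names `SliceInput`, `data`, `offset` are invented for this sketch, not Lucene's API): a view over one sub-file of a larger compound file that adds the sub-file's offset on every access, overrides `length()`, and bounds-checks so a read past the slice's end fails instead of silently returning the next file's bytes.

```java
import java.io.EOFException;
import java.io.IOException;

// Hypothetical sketch: a "slice" over a larger byte region, standing in
// for an IndexInput over one file inside a compound (CFS) file.
class SliceInput {
    private final byte[] data;   // stands in for the mmapped CFS file
    private final long offset;   // where this sub-file starts
    private final long length;   // length of the sub-file
    private long pos;            // position relative to the slice

    SliceInput(byte[] data, long offset, long length) {
        this.data = data;
        this.offset = offset;
        this.length = length;
    }

    long length() { return length; }

    // Seek is relative to the slice; the file offset is added on read.
    void seek(long p) throws IOException {
        if (p < 0 || p > length) throw new IOException("seek past EOF");
        pos = p;
    }

    // Bounds check so a caller cannot read into the next sub-file.
    byte readByte() throws IOException {
        if (pos >= length) throw new EOFException("read past EOF");
        return data[(int) (offset + pos++)];
    }
}
```

An MMap-style directory could return such a slice directly over its mapped buffer, with no buffering layer at all, which is the point of letting the Directory customize compound file handling.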
[jira] [Resolved] (SOLR-2554) RandomSortField values are cached in the FieldCache
[ https://issues.apache.org/jira/browse/SOLR-2554?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Hoss Man resolved SOLR-2554. Resolution: Cannot Reproduce Hmm reviewing the code i don't see any way RandomSortField would use the FieldCache. (or ever could have in any previous release) I did some very basic testing with the example solr configs on trunk and i can not reproduce... starting solr up clean, loading the sample data and then executing these queries... * http://localhost:8983/solr/select/?q=*%3A*&sort=random_foo+asc * http://localhost:8983/solr/select/?q=*%3A*&sort=random_bar+asc * http://localhost:8983/solr/select/?q=*%3A*&sort=random_yak+asc ...i got three different orderings, but when i then checked http://localhost:8983/solr/admin/stats.jsp#cache i verified that fieldCache was empty. If you get different results, please re-open and be specific about the version of solr you are using, the steps to reproduce, and the info about fieldCache that you get back from stats.jsp > RandomSortField values are cached in the FieldCache > --- > > Key: SOLR-2554 > URL: https://issues.apache.org/jira/browse/SOLR-2554 > Project: Solr > Issue Type: Bug > Components: search >Affects Versions: 3.1 >Reporter: Vadim Geshel > > The values of RandomSortField get cached in the FieldCache. When using many > RandomSortFields over time, this leads to running out of memory. > This may be one of the cases already covered in SOLR- but I'm not sure. -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
positional index
Hi, I know there is a positional index in the Lucene implementation. If anyone is familiar with it, could you let me know how it is used in Lucene, and with which algorithms? Best, --- Minh
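For readers following along, what a positional index stores can be shown with a toy example (this is an illustration only, not Lucene code): for each term, the documents containing it plus the token positions within each document. Phrase matching then becomes a position-adjacency check.

```java
import java.util.*;

// Toy positional inverted index: term -> (docId -> token positions).
class PositionalIndex {
    final Map<String, Map<Integer, List<Integer>>> postings = new HashMap<>();

    void addDocument(int docId, String text) {
        String[] tokens = text.toLowerCase().split("\\s+");
        for (int pos = 0; pos < tokens.length; pos++) {
            postings.computeIfAbsent(tokens[pos], t -> new TreeMap<>())
                    .computeIfAbsent(docId, d -> new ArrayList<>())
                    .add(pos);
        }
    }

    // Docs where `first` is immediately followed by `second`.
    List<Integer> phrase(String first, String second) {
        List<Integer> hits = new ArrayList<>();
        Map<Integer, List<Integer>> a =
            postings.getOrDefault(first, Collections.emptyMap());
        Map<Integer, List<Integer>> b =
            postings.getOrDefault(second, Collections.emptyMap());
        for (int doc : a.keySet()) {
            if (!b.containsKey(doc)) continue;
            for (int p : a.get(doc)) {
                if (b.get(doc).contains(p + 1)) { hits.add(doc); break; }
            }
        }
        return hits;
    }
}
```

Lucene's real postings are delta-encoded and streamed rather than held in maps, but the position-adjacency idea for phrase queries is the same.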
[jira] [Commented] (LUCENE-3200) Cleanup MMapDirectory to use only one MMapIndexInput impl with mapping sized of powers of 2
[ https://issues.apache.org/jira/browse/LUCENE-3200?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13048858#comment-13048858 ] Robert Muir commented on LUCENE-3200: - also, we can fix the issue Shai brought up for the 3.1 VOTE while we are here. in seek(long pos) i think we should do: {code} try { ... position() ... } catch (IllegalArgumentException e) { if (pos < 0) throw exc; else throw new IOException("read past EOF"); } {code} This would be more consistent with NIOFS/SimpleFS from an exception perspective. > Cleanup MMapDirectory to use only one MMapIndexInput impl with mapping sized > of powers of 2 > --- > > Key: LUCENE-3200 > URL: https://issues.apache.org/jira/browse/LUCENE-3200 > Project: Lucene - Java > Issue Type: Improvement >Reporter: Uwe Schindler > > Robert and me discussed a little bit after Mike's investigations, that using > SingleMMapIndexinput together with MultiMMapIndexInput leads to hotspot > slowdowns sometimes. > We had the following ideas: > - MultiMMapIndexInput is almost as fast as SingleMMapIndexInput, as the > switching between buffer boundaries is done in exception catch blocks. So > normal code path is always the same like for Single* > - Only the seek method uses strange calculations (the modulo is totally > bogus, it could be simply: int bufOffset = (int) (pos % maxBufSize); - very > strange way of calculating modulo in the original code) > - Because of speed we suggest to no longer use arbitrary buffer sizes. We > should pass only the power of 2 to the indexinput as size. All calculations > in seek and anywhere else would be simple bit shifts and AND operations (the > and masks for the modulo can be calculated in the ctor like NumericUtils does > when calculating precisionSteps). > - the maximum buffer size will now be 2^30, not 2^31-1. But thats not an > issue at all. 
In my opinion, a buffer size of 2^31-1 is stupid in all cases, > as it will no longer fit page boundaries and mmapping gets harder for the O/S. > We will provide a patch with those cleanups. -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Created] (LUCENE-3200) Cleanup MMapDirectory to use only one MMapIndexInput impl with mapping sized of powers of 2
Cleanup MMapDirectory to use only one MMapIndexInput impl with mapping sized of powers of 2 --- Key: LUCENE-3200 URL: https://issues.apache.org/jira/browse/LUCENE-3200 Project: Lucene - Java Issue Type: Improvement Reporter: Uwe Schindler Robert and me discussed a little bit after Mike's investigations, that using SingleMMapIndexinput together with MultiMMapIndexInput leads to hotspot slowdowns sometimes. We had the following ideas: - MultiMMapIndexInput is almost as fast as SingleMMapIndexInput, as the switching between buffer boundaries is done in exception catch blocks. So normal code path is always the same like for Single* - Only the seek method uses strange calculations (the modulo is totally bogus, it could be simply: int bufOffset = (int) (pos % maxBufSize); - very strange way of calculating modulo in the original code) - Because of speed we suggest to no longer use arbitrary buffer sizes. We should pass only the power of 2 to the indexinput as size. All calculations in seek and anywhere else would be simple bit shifts and AND operations (the and masks for the modulo can be calculated in the ctor like NumericUtils does when calculating precisionSteps). - the maximum buffer size will now be 2^30, not 2^31-1. But thats not an issue at all. In my opinion, a buffer size of 2^31-1 is stupid in all cases, as it will no longer fit page boundaries and mmapping gets harder for the O/S. We will provide a patch with those cleanups. -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
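The power-of-2 arithmetic proposed above can be sketched concretely (field names here are assumptions for illustration, not the actual MultiMMapIndexInput members): with a chunk size of 2^k, the buffer index is a right shift by k and the offset within the buffer is an AND with a mask precomputed in the constructor, replacing division and modulo entirely.

```java
// Sketch of seek() arithmetic when the buffer size is forced to 2^k.
class ChunkMath {
    final int chunkSizePower;   // e.g. 30 for 1 GB chunks
    final long chunkSizeMask;   // precomputed in ctor, like NumericUtils does

    ChunkMath(int chunkSizePower) {
        this.chunkSizePower = chunkSizePower;
        this.chunkSizeMask = (1L << chunkSizePower) - 1;
    }

    // Which mapped buffer holds `pos`: pos / 2^k as a shift.
    int bufferIndex(long pos)  { return (int) (pos >>> chunkSizePower); }

    // Offset inside that buffer: pos % 2^k as a mask.
    int bufferOffset(long pos) { return (int) (pos & chunkSizeMask); }
}
```

This is why the maximum chunk becomes 2^30 rather than 2^31-1: only an exact power of two admits the shift/mask form (and, as the issue notes, 2^31-1 does not align to page boundaries anyway).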
[jira] [Created] (LUCENE-3199) Add non-destructive sort to BytesRefHash
Add non-destructive sort to BytesRefHash - Key: LUCENE-3199 URL: https://issues.apache.org/jira/browse/LUCENE-3199 Project: Lucene - Java Issue Type: Improvement Components: core/index Affects Versions: 4.0 Reporter: Jason Rutherglen Priority: Minor Currently the BytesRefHash is destructive. We can add a method that returns a non-destructively generated int[]. -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
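The shape of a non-destructive sort can be sketched without the real BytesRefHash (class and field names below are invented for the sketch): copy the live ids into a fresh int[], sort the copy by the values the ids point to, and leave the hash's internal id array untouched so the hash stays usable afterwards.

```java
import java.util.Arrays;
import java.util.Comparator;

// Toy stand-in for BytesRefHash: values addressed by id.
class TinyHash {
    final String[] byId;   // stands in for the stored BytesRef values
    final int[] ids;       // internal order; must not be disturbed

    TinyHash(String... values) {
        byId = values;
        ids = new int[values.length];
        for (int i = 0; i < ids.length; i++) ids[i] = i;
    }

    // Returns sorted ids without mutating internal state.
    int[] sortedIds() {
        Integer[] copy = new Integer[ids.length];
        for (int i = 0; i < ids.length; i++) copy[i] = ids[i];
        Arrays.sort(copy, Comparator.comparing(id -> byId[id]));
        int[] result = new int[copy.length];
        for (int i = 0; i < copy.length; i++) result[i] = copy[i];
        return result;
    }
}
```

The trade-off is the extra int[] allocation per call, which is presumably why the destructive in-place sort exists in the first place.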
Re: Lucene Facet path
Martijn, If the title is correct "Post grouping faceting" then maybe the bit set based system should be a separate issue? Eg, is there a bit set implementation today in LUCENE-3079? On Mon, Jun 13, 2011 at 2:58 PM, Martijn v Groningen wrote: > There is already an issue open for this: > LUCENE-3079 > > As the issues describes, the faceting in Solr relies on the schema (and off > course the UIF). > So having the noting of a FieldType in the facet module would be very > helpful for selecting the right facet implementation. > Currently in Solr there is only one facet method for field facet that work > per-segment, > but I think in the end we would want all facet types and methods to work on > a per-segment basis. > Martijn > On 13 June 2011 23:47, Jason Rutherglen wrote: >> >> I think it's a better approach than rewriting Solr's internals. Eg, >> small development steps could be taken, using the knowledge learned >> from Solr's facet system. Eg, caching and intersecting bit sets would >> be an easy-ish first step? >> >> On Mon, Jun 13, 2011 at 2:37 PM, Simon Willnauer >> wrote: >> > I believe people are already looking into that but I am not sure. >> > sounds reasonable to me but I think its going to be lots of work >> > >> > simon >> > >> > On Mon, Jun 13, 2011 at 11:34 PM, Jason Rutherglen >> > wrote: >> >> Are we going the direction of creating full facet features outside of >> >> Solr? Eg, we have UIF extrapolated out, we can probably make a module >> >> for bit set intersections as well. In the process the faceting will >> >> go per-segment. 
>> >> >> >> - >> >> To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org >> >> For additional commands, e-mail: dev-h...@lucene.apache.org >> >> >> >> >> > >> > - >> > To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org >> > For additional commands, e-mail: dev-h...@lucene.apache.org >> > >> > >> >> - >> To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org >> For additional commands, e-mail: dev-h...@lucene.apache.org >> > - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Commented] (LUCENE-3198) Change default Directory impl on 64bit linux to MMap
[ https://issues.apache.org/jira/browse/LUCENE-3198?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13048821#comment-13048821 ] Uwe Schindler commented on LUCENE-3198: --- Thats fine, I just wanted to talk about the whole issue what to enable when and bring together all possible platform possibilities. In general we should per default only enable SimpleFSDirectory on unknown platforms. Maybe NIO is heavily broken on OS XY (Android *lol*)? > Change default Directory impl on 64bit linux to MMap > > > Key: LUCENE-3198 > URL: https://issues.apache.org/jira/browse/LUCENE-3198 > Project: Lucene - Java > Issue Type: Improvement > Components: core/store >Reporter: Michael McCandless > Fix For: 3.3, 4.0 > > > Consistently in my NRT testing on Fedora 13 Linux, 64 bit JVM (Oracle > 1.6.0_21) I see MMapDir getting better search and merge performance when > compared to NIOFSDir. > I think we should fix the default. -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Commented] (SOLR-2523) SolrJ QueryResponse doesn't support range facets
[ https://issues.apache.org/jira/browse/SOLR-2523?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13048820#comment-13048820 ] Hoss Man commented on SOLR-2523: bq. I'm not a date-math expert, but is there a problem with using the gap w/o having to parse it (i.e. can we always append it?) that is exactly how it was designed to be used. But ultimately i *really* want to implement SOLR-1896 so no client (in any language) ever has to think about any of this. bq. Good idea! This would be really useful for any client. I think we can change this in SolrQueryParser#getRangeQuery() method or in DateField#parseMath(...). i'm a little lost ... I don't understand what "change" is being suggested in this sentence ... can't the client already access both the values and the gap and concat them? > SolrJ QueryResponse doesn't support range facets > > > Key: SOLR-2523 > URL: https://issues.apache.org/jira/browse/SOLR-2523 > Project: Solr > Issue Type: Improvement > Components: clients - java >Reporter: Martijn van Groningen >Assignee: Martijn van Groningen >Priority: Trivial > Fix For: 3.3, 4.0 > > Attachments: SOLR-2523.patch, SOLR-2523.patch > > > It is possible to get date facets and pivot facets in SolrJ. > {code:java} > queryResponse.getFacetDate(); > queryResponse.getFacetPivot(); > {code} > Having this also for range fields would be nice. Adding this is trivial. > Maybe we should deprecate date facet methods in QueryResponse class? Since it > is superseded by range facets. Also some set / add / remove methods for > setting facet range parameters on the SolrQuery class would be nice. -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
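Hoss's point that the client can "always append it" comes down to plain string handling (sketch only, no Solr classes involved): a range facet bucket's end is its start value with the gap expression concatenated, which Solr's date math then evaluates server-side.

```java
// Sketch: computing a bucket's upper bound by appending the gap,
// e.g. a Solr date-math gap like "+1DAY".
class RangeBucket {
    static String bucketEnd(String start, String gap) {
        return start + gap;   // no parsing of the gap needed
    }
}
```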
[jira] [Commented] (SOLR-2588) Solr doesn't work without Velocity on classpath
[ https://issues.apache.org/jira/browse/SOLR-2588?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13048817#comment-13048817 ] Ryan McKinley commented on SOLR-2588: - "bug" is a stretch... I think what this is getting at is that velocity is now required for solr to work at all. With some small changes, Velocity could be optional. I think something as easy as: {code} Index: solr/src/java/org/apache/solr/core/SolrCore.java === --- solr/src/java/org/apache/solr/core/SolrCore.java(revision 1134331) +++ solr/src/java/org/apache/solr/core/SolrCore.java(working copy) @@ -1381,7 +1381,12 @@ m.put("ruby", new RubyResponseWriter()); m.put("raw", new RawResponseWriter()); m.put("javabin", new BinaryResponseWriter()); -m.put("velocity", new VelocityResponseWriter()); +try { + m.put("velocity", new VelocityResponseWriter()); +} +catch( Throwable t ) { + log.warn("Error initializing VelocityResponseWriter", t ); +} m.put("csv", new CSVResponseWriter()); DEFAULT_RESPONSE_WRITERS = Collections.unmodifiableMap(m); } {code} Is all he is talking about... but I'm not sure how/if we want to deal with the error being gobbled... perhaps something smarter to see if Velocity can be created before trying? > Solr doesn't work without Velocity on classpath > --- > > Key: SOLR-2588 > URL: https://issues.apache.org/jira/browse/SOLR-2588 > Project: Solr > Issue Type: Bug >Affects Versions: 3.2 >Reporter: Gunnar Wagenknecht > Fix For: 3.3 > > > In 1.4. it was fine to run Solr without Velocity on the classpath. However, > in 3.2. SolrCore won't load because of a hard reference to the Velocity > response writer in a static initializer. > {noformat} > ... 
ERROR org.apache.solr.core.CoreContainer - > java.lang.NoClassDefFoundError: org/apache/velocity/context/Context > at org.apache.solr.core.SolrCore.(SolrCore.java:1447) > at org.apache.solr.core.CoreContainer.create(CoreContainer.java:463) > at org.apache.solr.core.CoreContainer.load(CoreContainer.java:316) > at org.apache.solr.core.CoreContainer.load(CoreContainer.java:207) > {noformat} -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
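One way to realize Ryan's "see if Velocity can be created before trying" suggestion is an explicit class-presence probe rather than catching an arbitrary Throwable. This is a sketch of that alternative (the helper name is invented; it is not the actual SolrCore code):

```java
// Sketch: probe for a dependency's class before registering the
// component that needs it, so a missing jar is an expected condition
// rather than a swallowed Throwable.
class OptionalWriterLoader {
    static boolean classPresent(String className) {
        try {
            Class.forName(className);
            return true;
        } catch (ClassNotFoundException | LinkageError e) {
            return false;
        }
    }
}
```

SolrCore's static initializer could then check for "org.apache.velocity.context.Context" and only register the Velocity writer (with a warning otherwise) when the probe succeeds, keeping the failure mode visible in the logs.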
[jira] [Commented] (LUCENE-3198) Change default Directory impl on 64bit linux to MMap
[ https://issues.apache.org/jira/browse/LUCENE-3198?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13048818#comment-13048818 ] Robert Muir commented on LUCENE-3198: - why jump the gun, we can just enable it for linux/64-bit. if others like freebsd or macos X are tested, then we add those to the list, but mmap is a little bit scary to just apply as a blanket default? in all cases it should be like the current logic: if (XYZ_OS && 64_bit && *UNMAP_SUPPORTED*) > Change default Directory impl on 64bit linux to MMap > > > Key: LUCENE-3198 > URL: https://issues.apache.org/jira/browse/LUCENE-3198 > Project: Lucene - Java > Issue Type: Improvement > Components: core/store >Reporter: Michael McCandless > Fix For: 3.3, 4.0 > > > Consistently in my NRT testing on Fedora 13 Linux, 64 bit JVM (Oracle > 1.6.0_21) I see MMapDir getting better search and merge performance when > compared to NIOFSDir. > I think we should fix the default. -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
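Robert's gating condition can be written out as a sketch (method and parameter names are assumptions for illustration; the real Lucene code centralizes such checks in its Constants class): MMap becomes the default only when the OS is explicitly on the tested allow-list, the JVM is 64-bit, and unmapping is supported.

```java
// Sketch of the "if (XYZ_OS && 64_bit && UNMAP_SUPPORTED)" gate.
class DirectoryDefaults {
    static boolean useMMap(String osName, String dataModel, boolean unmapSupported) {
        boolean knownGoodOs = osName.startsWith("Linux"); // extend as platforms are tested
        boolean is64Bit = "64".equals(dataModel);          // e.g. sun.arch.data.model
        return knownGoodOs && is64Bit && unmapSupported;
    }
}
```

Unknown platforms fall through to the conservative default, which matches Uwe's point about only enabling SimpleFSDirectory where nothing is known.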
[jira] [Commented] (SOLR-2588) Solr doesn't work without Velocity on classpath
[ https://issues.apache.org/jira/browse/SOLR-2588?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13048815#comment-13048815 ] Uwe Schindler commented on SOLR-2588: - I generally also do not use the webapp directly, so thats not uncommon! > Solr doesn't work without Velocity on classpath > --- > > Key: SOLR-2588 > URL: https://issues.apache.org/jira/browse/SOLR-2588 > Project: Solr > Issue Type: Bug >Affects Versions: 3.2 >Reporter: Gunnar Wagenknecht > Fix For: 3.3 > > > In 1.4. it was fine to run Solr without Velocity on the classpath. However, > in 3.2. SolrCore won't load because of a hard reference to the Velocity > response writer in a static initializer. > {noformat} > ... ERROR org.apache.solr.core.CoreContainer - > java.lang.NoClassDefFoundError: org/apache/velocity/context/Context > at org.apache.solr.core.SolrCore.(SolrCore.java:1447) > at org.apache.solr.core.CoreContainer.create(CoreContainer.java:463) > at org.apache.solr.core.CoreContainer.load(CoreContainer.java:316) > at org.apache.solr.core.CoreContainer.load(CoreContainer.java:207) > {noformat} -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
Re: Lucene Facet path
There is already an issue open for this: LUCENE-3079 As the issues describes, the faceting in Solr relies on the schema (and off course the UIF). So having the noting of a FieldType in the facet module would be very helpful for selecting the right facet implementation. Currently in Solr there is only one facet method for field facet that work per-segment, but I think in the end we would want all facet types and methods to work on a per-segment basis. Martijn On 13 June 2011 23:47, Jason Rutherglen wrote: > I think it's a better approach than rewriting Solr's internals. Eg, > small development steps could be taken, using the knowledge learned > from Solr's facet system. Eg, caching and intersecting bit sets would > be an easy-ish first step? > > On Mon, Jun 13, 2011 at 2:37 PM, Simon Willnauer > wrote: > > I believe people are already looking into that but I am not sure. > > sounds reasonable to me but I think its going to be lots of work > > > > simon > > > > On Mon, Jun 13, 2011 at 11:34 PM, Jason Rutherglen > > wrote: > >> Are we going the direction of creating full facet features outside of > >> Solr? Eg, we have UIF extrapolated out, we can probably make a module > >> for bit set intersections as well. In the process the faceting will > >> go per-segment. > >> > >> - > >> To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org > >> For additional commands, e-mail: dev-h...@lucene.apache.org > >> > >> > > > > - > > To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org > > For additional commands, e-mail: dev-h...@lucene.apache.org > > > > > > - > To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org > For additional commands, e-mail: dev-h...@lucene.apache.org > >
[jira] [Commented] (SOLR-2588) Solr doesn't work without Velocity on classpath
[ https://issues.apache.org/jira/browse/SOLR-2588?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13048814#comment-13048814 ] Mark Miller commented on SOLR-2588: --- Perhaps he is not using the webapp? > Solr doesn't work without Velocity on classpath > --- > > Key: SOLR-2588 > URL: https://issues.apache.org/jira/browse/SOLR-2588 > Project: Solr > Issue Type: Bug >Affects Versions: 3.2 >Reporter: Gunnar Wagenknecht > Fix For: 3.3 > > > In 1.4. it was fine to run Solr without Velocity on the classpath. However, > in 3.2. SolrCore won't load because of a hard reference to the Velocity > response writer in a static initializer. > {noformat} > ... ERROR org.apache.solr.core.CoreContainer - > java.lang.NoClassDefFoundError: org/apache/velocity/context/Context > at org.apache.solr.core.SolrCore.(SolrCore.java:1447) > at org.apache.solr.core.CoreContainer.create(CoreContainer.java:463) > at org.apache.solr.core.CoreContainer.load(CoreContainer.java:316) > at org.apache.solr.core.CoreContainer.load(CoreContainer.java:207) > {noformat} -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Commented] (LUCENE-3198) Change default Directory impl on 64bit linux to MMap
[ https://issues.apache.org/jira/browse/LUCENE-3198?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13048812#comment-13048812 ] Uwe Schindler commented on LUCENE-3198: --- That means we can now enable MMap for all 64 bit platforms? Solaris, windows, Linux - any others except FreeBSD? FreeBSD needs to be checked, but I assume its also faster there. We can check on lucene.zones maybe. > Change default Directory impl on 64bit linux to MMap > > > Key: LUCENE-3198 > URL: https://issues.apache.org/jira/browse/LUCENE-3198 > Project: Lucene - Java > Issue Type: Improvement > Components: core/store >Reporter: Michael McCandless > Fix For: 3.3, 4.0 > > > Consistently in my NRT testing on Fedora 13 Linux, 64 bit JVM (Oracle > 1.6.0_21) I see MMapDir getting better search and merge performance when > compared to NIOFSDir. > I think we should fix the default. -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
Re: Lucene Facet path
I think it's a better approach than rewriting Solr's internals. Eg, small development steps could be taken, using the knowledge learned from Solr's facet system. Eg, caching and intersecting bit sets would be an easy-ish first step? On Mon, Jun 13, 2011 at 2:37 PM, Simon Willnauer wrote: > I believe people are already looking into that but I am not sure. > sounds reasonable to me but I think its going to be lots of work > > simon > > On Mon, Jun 13, 2011 at 11:34 PM, Jason Rutherglen > wrote: >> Are we going the direction of creating full facet features outside of >> Solr? Eg, we have UIF extrapolated out, we can probably make a module >> for bit set intersections as well. In the process the faceting will >> go per-segment. >> >> - >> To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org >> For additional commands, e-mail: dev-h...@lucene.apache.org >> >> > > - > To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org > For additional commands, e-mail: dev-h...@lucene.apache.org > > - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Commented] (SOLR-2588) Solr doesn't work without Velocity on classpath
[ https://issues.apache.org/jira/browse/SOLR-2588?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13048799#comment-13048799 ] Hoss Man commented on SOLR-2588: I don't understand this bug? in SOLR-1957 the velocity response writer was promoted from being a contrib to being part of the solr core so that the jars are all included in the solr.war and the velocity writer would be one of the writers provided by deault. nothing special should be needed on the classpath. > Solr doesn't work without Velocity on classpath > --- > > Key: SOLR-2588 > URL: https://issues.apache.org/jira/browse/SOLR-2588 > Project: Solr > Issue Type: Bug >Affects Versions: 3.2 >Reporter: Gunnar Wagenknecht > Fix For: 3.3 > > > In 1.4. it was fine to run Solr without Velocity on the classpath. However, > in 3.2. SolrCore won't load because of a hard reference to the Velocity > response writer in a static initializer. > {noformat} > ... ERROR org.apache.solr.core.CoreContainer - > java.lang.NoClassDefFoundError: org/apache/velocity/context/Context > at org.apache.solr.core.SolrCore.(SolrCore.java:1447) > at org.apache.solr.core.CoreContainer.create(CoreContainer.java:463) > at org.apache.solr.core.CoreContainer.load(CoreContainer.java:316) > at org.apache.solr.core.CoreContainer.load(CoreContainer.java:207) > {noformat} -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
Re: Lucene Facet path
I believe people are already looking into that but I am not sure. sounds reasonable to me but I think its going to be lots of work simon On Mon, Jun 13, 2011 at 11:34 PM, Jason Rutherglen wrote: > Are we going the direction of creating full facet features outside of > Solr? Eg, we have UIF extrapolated out, we can probably make a module > for bit set intersections as well. In the process the faceting will > go per-segment. > > - > To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org > For additional commands, e-mail: dev-h...@lucene.apache.org > > - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Resolved] (LUCENE-3196) Optimize FixedStraightBytes for bytes size == 1
[ https://issues.apache.org/jira/browse/LUCENE-3196?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Simon Willnauer resolved LUCENE-3196. - Resolution: Fixed Committed in revision 1135293. > Optimize FixedStraightBytes for bytes size == 1 > --- > > Key: LUCENE-3196 > URL: https://issues.apache.org/jira/browse/LUCENE-3196 > Project: Lucene - Java > Issue Type: Improvement > Components: core/index, core/search >Affects Versions: 4.0 >Reporter: Simon Willnauer >Assignee: Simon Willnauer >Priority: Minor > Fix For: 4.0 > > Attachments: LUCENE-3196.patch > > > Currently we read all the bytes in a PagedBytes instance wich is unneeded for > single byte values like norms. For fast access this should simply be a > straight array. -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
Lucene Facet path
Are we going the direction of creating full facet features outside of Solr? Eg, we have UIF extrapolated out, we can probably make a module for bit set intersections as well. In the process the faceting will go per-segment. - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Commented] (LUCENE-3196) Optimize FixedStraightBytes for bytes size == 1
[ https://issues.apache.org/jira/browse/LUCENE-3196?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13048787#comment-13048787 ] Uwe Schindler commented on LUCENE-3196: --- Looks fine, using this approach, separate norms impl can hopefully go away quite fast *g* For the PreFlex codec I even have an idea for the codec and backwards compatibility: The old norms file could be exposed as standard DocValues field by PreFlex codec. The r/w StandardCodec would never write separate norms files, instead simply write docvalues using this 1 byte approach (of course configureable to have e.g. read float norms, and other additional BM25 statistics or whatever). Just ideas, Uwe > Optimize FixedStraightBytes for bytes size == 1 > --- > > Key: LUCENE-3196 > URL: https://issues.apache.org/jira/browse/LUCENE-3196 > Project: Lucene - Java > Issue Type: Improvement > Components: core/index, core/search >Affects Versions: 4.0 >Reporter: Simon Willnauer >Assignee: Simon Willnauer >Priority: Minor > Fix For: 4.0 > > Attachments: LUCENE-3196.patch > > > Currently we read all the bytes in a PagedBytes instance wich is unneeded for > single byte values like norms. For fast access this should simply be a > straight array. -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
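The optimization in the issue description reduces to a simple idea, sketched here with invented names (not the actual FixedStraightBytes code): when every value is exactly one byte, as with norms, a paged or indirect representation buys nothing, and a flat byte[] indexed by docID gives direct access.

```java
// Sketch: single-byte-per-document values as a straight array.
class SingleByteValues {
    private final byte[] values;  // one byte per document

    SingleByteValues(byte[] values) { this.values = values; }

    byte get(int docId) { return values[docId]; }  // no paging indirection
}
```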
[jira] [Commented] (LUCENE-3196) Optimize FixedStraightBytes for bytes size == 1
[ https://issues.apache.org/jira/browse/LUCENE-3196?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13048757#comment-13048757 ] Simon Willnauer commented on LUCENE-3196: - I am planning to commit this soon if nobody objects. > Optimize FixedStraightBytes for bytes size == 1 > --- > > Key: LUCENE-3196 > URL: https://issues.apache.org/jira/browse/LUCENE-3196 > Project: Lucene - Java > Issue Type: Improvement > Components: core/index, core/search >Affects Versions: 4.0 >Reporter: Simon Willnauer >Assignee: Simon Willnauer >Priority: Minor > Fix For: 4.0 > > Attachments: LUCENE-3196.patch > > > Currently we read all the bytes in a PagedBytes instance wich is unneeded for > single byte values like norms. For fast access this should simply be a > straight array. -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
Re: commit-check target for ant?
: Ok, I get your point and I'm not going to force it, but I don't agree : people still running 1.5 should be able to compile from sources. I : mean: 1.5 has been dead for a longer while now; the same argument : could be made for java 1.4 or whatever most recent version has been that's an argument for changing our compatibility requirement to 1.6 -- i don't object to having that argument (not sure how i feel about the actual idea given the licensing hubub and lucene's nature as a *library* that lots of people embed in lots of apps - but i digress) but i don't see that as legitimate agrument in favor of having higher overhead for compiling then for running. for this discussion, it shouldn't matter what java version we are talking about, we could have the same argument about requiring 1.7 to compile but supporting binary releases that run on 1.6; or an argument about wether we should use a commercial tool that commiters have a license for to "build" java code from a source grammer -- the point is that as an open source project i think it's really important that *all* our users be allowed to compile from "source". -Hoss - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Commented] (SOLR-2586) example work & logs directories needed?
[ https://issues.apache.org/jira/browse/SOLR-2586?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13048755#comment-13048755 ] Uwe Schindler commented on SOLR-2586: - bq. Bottom line I think, is if someone wants to ensure that Solr works well on Tomcat for example, then they should make a patch so that our tests test with this container too (e.g. in hudson, etc). Once its baked in hudson for a while, then I would say its easy for us to recommend it, too. That is the hardest task. Jetty is so cool, because it can be used and configured "embedded". To start up Tomcat, you have to provide final configuration files in the default folder layout and start a main() static method from a class. Something so easy like jettyServer.addServletFilter() & similar things are not possible with Tomcat out of the box. This makes Jetty (in my opinion) the best servlet container around. I sometimes also use it that way (embedded in my Java app). > example work & logs directories needed? > --- > > Key: SOLR-2586 > URL: https://issues.apache.org/jira/browse/SOLR-2586 > Project: Solr > Issue Type: Improvement > Components: Build >Reporter: David Smiley >Priority: Minor > > Firstly, what prompted this issue was me wanting to use a git solr mirror but > finding that git's lack of empty-directory support made the "example" ant > task fail. This task requires examples/work to be in place so that it can > delete its contents. Fixing this was a simple matter of adding: > {code:xml} > > {code} > Right before the delete task. > But then it occurred to me, why even have a "work" directory since Jetty will > apparently use a temp directory instead. -- try for yourself (stdout snippet): > bq. 
2011-06-11 00:51:26.177:INFO::Extract > file:/SmileyDev/Search/lucene-solr/solr/example/webapps/solr.war to > /var/folders/zo/zoQJvqc9E0076p0THiri+k+++TI/-Tmp-/Jetty_0_0_0_0_8983_solr.war__solr__k1kf17/webapp > On my Mac, this same directory was used for multiple runs, so somehow Jetty > or the VM figures out how to reuse it. > Since this "example" setup isn't a *real* installation -- it's just for > demonstration, arguably it should not contain what it doesn't need. > Likewise, perhaps the empty example/logs directory should be deleted. It's > not used by default any way. -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
Re: Indexing slower in trunk
On Mon, Jun 13, 2011 at 8:13 PM, Erick Erickson wrote: > I half remember that this has come up before, but I couldn't find the > thread. I was running some tests over the weekend that involved > indexing 1.9M documents from the English Wiki dump. > > I'm consistently seeing that trunk takes about twice as long to index > the docs as 1.4, 3.2 and 3x. Optimize is also taking quite a bit > longer I admit that these aren't very sophisticated tests, and I only > ran the trunk process twice (although both those were consistent). > > I'm pretty sure my rambuffersize and autocommit settings are > identical. I remove the data/index directory before each run. These > results are running the indexing program in IntelliJ, on my Mac, both > the server and the indexing programs were running locally. > > No, trunk isn't compiling before running . > > Here's the server definition: > new StreamingUpdateSolrServer(url, 10, 4); > > and I'm batching up the documents and sending them to Solr in batches of > 1,000. > > So, my question is whether this should be pursued. Note that I'm still > getting around 3K docs/second, which I can't complain about. Not that > that stops me, you understand. And in return for a memory footprint > reduction from 389M to 90M after some off-the-wall sorting and > faceting I'll take it! > > H, speaking of which, the memory usage changes seem like a good > candidate for a page on the Wiki, anyone want to suggest a home? > > > Solr 1.4.1 > Total Time Taken-> 257 seconds > Total documents added-> 1917728 > Docs/sec-> 7461 > starting optimize > optimizing took 26 seconds > > Solr 3.2 > Total Time Taken-> 243 seconds > Total documents added-> 1917728 > Docs/sec-> 7891 > starting optimize > optimizing took 21 seconds > > Solr 3x > Total Time Taken-> 269 seconds > Total documents added-> 1917728 > Docs/sec-> 7129 > starting optimize > optimizing took 21 seconds > > Solr trunk. 
2011-6-11: 17:24 EST > Total Time Taken-> 592 seconds > Total documents added-> 1917728 > Docs/sec-> 3239 > starting optimize > optimizing took 159 seconds > > What do folks think? Is there anything I can/should do to narrow this down? Hi Eric, this looks weird, I have some questions: - you are indexing into the same disk as you read the data from? - what are you rambuffer settings? - how many threads are you using to send data to solr? - what is your autocommit setting? simon > > Erick > > - > To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org > For additional commands, e-mail: dev-h...@lucene.apache.org > > - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
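The batching pattern Erick describes ("batching up the documents and sending them to Solr in batches of 1,000") can be sketched in isolation. The class below is illustrative, not SolrJ API: a real client would call StreamingUpdateSolrServer.add(batch) at the point marked in flush(); here we just count flushes so the sketch is self-contained.

```java
import java.util.ArrayList;
import java.util.List;

// Accumulates documents and flushes them in fixed-size batches.
public class DocBatcher {
    private final int batchSize;
    private final List<String> pending = new ArrayList<String>();
    private int flushes = 0;

    public DocBatcher(int batchSize) { this.batchSize = batchSize; }

    public void add(String doc) {
        pending.add(doc);
        if (pending.size() >= batchSize) flush(); // full batch: ship it
    }

    public void flush() {
        if (pending.isEmpty()) return;
        // server.add(pending) would go here in a real SolrJ client
        pending.clear();
        flushes++;
    }

    public int getFlushes() { return flushes; }

    public static void main(String[] args) {
        DocBatcher b = new DocBatcher(1000);
        for (int i = 0; i < 2500; i++) b.add("doc" + i);
        b.flush(); // send the trailing partial batch
        System.out.println("flushes: " + b.getFlushes()); // 3
    }
}
```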
[jira] [Commented] (SOLR-2586) example work & logs directories needed?
[ https://issues.apache.org/jira/browse/SOLR-2586?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13048746#comment-13048746 ] Yonik Seeley commented on SOLR-2586: bq. why even have a "work" directory since Jetty will apparently use a temp directory instead. Some things have reasons that we barely remember ;-) In this case, I think the main motivating factor might have been SOLR-118 I remember a number of people reporting failing JSPs over time, and it took quite a while to track it down. > example work & logs directories needed? > --- > > Key: SOLR-2586 > URL: https://issues.apache.org/jira/browse/SOLR-2586 > Project: Solr > Issue Type: Improvement > Components: Build >Reporter: David Smiley >Priority: Minor > > Firstly, what prompted this issue was me wanting to use a git solr mirror but > finding that git's lack of empty-directory support made the "example" ant > task fail. This task requires examples/work to be in place so that it can > delete its contents. Fixing this was a simple matter of adding: > {code:xml} > > {code} > Right before the delete task. > But then it occurred to me, why even have a "work" directory since Jetty will > apparently use a temp directory instead. -- try for yourself (stdout snippet): > bq. 2011-06-11 00:51:26.177:INFO::Extract > file:/SmileyDev/Search/lucene-solr/solr/example/webapps/solr.war to > /var/folders/zo/zoQJvqc9E0076p0THiri+k+++TI/-Tmp-/Jetty_0_0_0_0_8983_solr.war__solr__k1kf17/webapp > On my Mac, this same directory was used for multiple runs, so somehow Jetty > or the VM figures out how to reuse it. > Since this "example" setup isn't a *real* installation -- it's just for > demonstration, arguably it should not contain what it doesn't need. > Likewise, perhaps the empty example/logs directory should be deleted. It's > not used by default any way. -- This message is automatically generated by JIRA. 
For more information on JIRA, see: http://www.atlassian.com/software/jira - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Updated] (LUCENE-3097) Post grouping faceting
[ https://issues.apache.org/jira/browse/LUCENE-3097?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Martijn van Groningen updated LUCENE-3097: -- Attachment: LUCENE-3097.patch An updated version of the patch. This is still work in progress. I basically rewrote the code in the same way as the other collectors were rewritten for LUCENE-3099. Things to do are creating tests and adding some more documentation. This patch only covers the second facet / grouping method. > Post grouping faceting > -- > > Key: LUCENE-3097 > URL: https://issues.apache.org/jira/browse/LUCENE-3097 > Project: Lucene - Java > Issue Type: New Feature > Components: modules/grouping >Reporter: Martijn van Groningen >Assignee: Martijn van Groningen >Priority: Minor > Fix For: 3.3 > > Attachments: LUCENE-3097.patch, LUCENE-3097.patch > > > This issue focuses on implementing post grouping faceting. > * How to handle multivalued fields. What field value to show with the facet. > * Where the facet counts should be based on > ** Facet counts can be based on the normal documents. Ungrouped counts. > ** Facet counts can be based on the groups. Grouped counts. > ** Facet counts can be based on the combination of group value and facet > value. Matrix counts. > And probably more implementation options. > The first two methods are implemented in the SOLR-236 patch. For the first > option it calculates a DocSet based on the individual documents from the > query result. For the second option it calculates a DocSet for all the most > relevant documents of a group. Once the DocSet is computed the FacetComponent > and StatsComponent use one of these DocSets to create facets and statistics. > This last one is a bit more complex. I think it is best explained with an > example. Let's say we search on travel offers: > ||hotel||departure_airport||duration|| > |Hotel a|AMS|5| > |Hotel a|DUS|10| > |Hotel b|AMS|5| > |Hotel b|AMS|10| > If we group by hotel and have a facet for airport.
Most end users expect > (according to my experience of course) the following airport facet: > AMS: 2 > DUS: 1 > The above result can't be achieved by the first two methods. You either get > counts AMS:3 and DUS:1 or 1 for both airports. -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
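The "matrix counts" behaviour from the hotel example can be sketched independently of Lucene: for each facet value, count the number of distinct group values it co-occurs with, rather than raw documents. With the offers in the issue description this yields AMS:2, DUS:1 (both hotels have an AMS offer; only hotel a has a DUS offer). Class and method names below are illustrative, not the patch's API.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.Set;

public class GroupedFacetCounts {
    // rows: each entry is {groupValue, facetValue}; the count for a facet
    // value is the number of distinct group values seen with it.
    public static Map<String, Integer> count(List<String[]> rows) {
        Map<String, Set<String>> seen = new LinkedHashMap<String, Set<String>>();
        for (String[] row : rows) {
            String group = row[0], facet = row[1];
            Set<String> groups = seen.get(facet);
            if (groups == null) seen.put(facet, groups = new HashSet<String>());
            groups.add(group); // a Set, so duplicate (group, facet) pairs count once
        }
        Map<String, Integer> counts = new LinkedHashMap<String, Integer>();
        for (Map.Entry<String, Set<String>> e : seen.entrySet())
            counts.put(e.getKey(), e.getValue().size());
        return counts;
    }

    public static void main(String[] args) {
        List<String[]> offers = Arrays.asList(
            new String[]{"Hotel a", "AMS"}, new String[]{"Hotel a", "DUS"},
            new String[]{"Hotel b", "AMS"}, new String[]{"Hotel b", "AMS"});
        System.out.println(count(offers)); // {AMS=2, DUS=1}
    }
}
```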
Re: Archive questions
http://mail-archives.apache.org/mod_mbox/lucene-openrelevance-dev/ On 13/06/2011 18:04, Patrick Durusau wrote: Itamar, Sorry to reply to my own answer but yes, the 1969-12-31 date is obviously a software glitch. But, the wiki says the project started in June 2009. So, my question is where are the email archives between June 2009 and September 2010? There may not be any but would be an answer. Hope you are at the start of a great week! Patrick On 06/12/2011 03:59 PM, Patrick Durusau wrote: Itamar, Thanks! Hope you are having a great weekend! Patrick On 6/12/2011 3:47 PM, Itamar Syn-Hershko wrote: Hi, On 12/06/2011 22:42, Patrick Durusau wrote: Questions: 1) The message: http://www.lucidimagination.com/search/document/4e91498fae518260/orp_newbie displays "Date: 1969-12-31" but the text of the message says: " On Tue, Sep 21, 2010 at 7:37 AM, Tommaso Teofili" I was under the impression this project started in 2009? Shouldn't the email archives start in 2009? Thats probably a software glitch. The project is fairly new. 2) The message mentioned above makes reference to the OpenRelevance Viewer: http://www.lucidimagination.com/search/out?u=https%3A%2F%2Fcwiki.apache.org%2Fconfluence%2Fdisplay%2FORP%2FOpen%2BRelevance%2BViewer%29 That returns a page not found message. https://cwiki.apache.org/confluence/display/ORP/Open+Relevance+Viewer
RE: Welcome Jan Høydahl as Lucene/Solr committer
Congratulations, Jan! Karl -Original Message- From: ext Mark Miller [mailto:markrmil...@gmail.com] Sent: Monday, June 13, 2011 10:43 AM To: dev@lucene.apache.org Subject: Welcome Jan Høydahl as Lucene/Solr committer I'm happy to announce that the Lucene/Solr PMC has voted in Jan Høydahl as our newest committer. Jan, if you don't mind, could you introduce yourself with a brief bio as has become our tradition? Congratulations and welcome aboard! - Mark Miller lucidimagination.com - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
Re: Welcome Jan Høydahl as Lucene/Solr committer
Welcome Jan ! Shai On Mon, Jun 13, 2011 at 10:25 PM, Dawid Weiss wrote: > Welcome Jan! > > On Mon, Jun 13, 2011 at 6:44 PM, Shalin Shekhar Mangar > wrote: > > Welcome Jan! > > > > On Mon, Jun 13, 2011 at 8:13 PM, Mark Miller > wrote: > >> > >> I'm happy to announce that the Lucene/Solr PMC has voted in Jan Høydahl > as > >> our newest committer. > >> > >> Jan, if you don't mind, could you introduce yourself with a brief bio as > >> has become our tradition? > >> > >> Congratulations and welcome aboard! > >> > >> > >> - Mark Miller > >> lucidimagination.com > >> > >> > >> > >> > >> > >> > >> > >> > >> > >> - > >> To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org > >> For additional commands, e-mail: dev-h...@lucene.apache.org > >> > > > > > > > > -- > > Regards, > > Shalin Shekhar Mangar. > > > > - > To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org > For additional commands, e-mail: dev-h...@lucene.apache.org > >
[jira] [Updated] (LUCENE-3198) Change default Directory impl on 64bit linux to MMap
[ https://issues.apache.org/jira/browse/LUCENE-3198?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Michael McCandless updated LUCENE-3198: --- Component/s: core/store Fix Version/s: 4.0 3.3 > Change default Directory impl on 64bit linux to MMap > > > Key: LUCENE-3198 > URL: https://issues.apache.org/jira/browse/LUCENE-3198 > Project: Lucene - Java > Issue Type: Improvement > Components: core/store >Reporter: Michael McCandless > Fix For: 3.3, 4.0 > > > Consistently in my NRT testing on Fedora 13 Linux, 64 bit JVM (Oracle > 1.6.0_21) I see MMapDir getting better search and merge performance when > compared to NIOFSDir. > I think we should fix the default. -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Created] (LUCENE-3198) Change default Directory impl on 64bit linux to MMap
Change default Directory impl on 64bit linux to MMap Key: LUCENE-3198 URL: https://issues.apache.org/jira/browse/LUCENE-3198 Project: Lucene - Java Issue Type: Improvement Reporter: Michael McCandless Consistently in my NRT testing on Fedora 13 Linux, 64 bit JVM (Oracle 1.6.0_21) I see MMapDir getting better search and merge performance when compared to NIOFSDir. I think we should fix the default. -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
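A rough sketch of the selection logic this issue proposes — prefer memory-mapped I/O where 64-bit address space makes it safe. The real change belongs in Lucene's FSDirectory.open; the property values and the Windows special-casing below are illustrative assumptions, not the committed behaviour.

```java
import java.util.Locale;

public class DirectoryChooser {
    // Picks a Directory implementation name from platform hints:
    // plenty of address space (64-bit, non-Windows) -> MMapDirectory.
    public static String choose(String osName, String archDataModel) {
        boolean is64Bit = "64".equals(archDataModel);
        boolean isWindows = osName.toLowerCase(Locale.ROOT).startsWith("windows");
        if (is64Bit && !isWindows) return "MMapDirectory";
        return isWindows ? "SimpleFSDirectory" : "NIOFSDirectory";
    }

    public static void main(String[] args) {
        // "sun.arch.data.model" is a Sun/Oracle JVM property; other JVMs
        // may not define it, in which case this falls through to non-mmap.
        System.out.println(choose(System.getProperty("os.name"),
                                  System.getProperty("sun.arch.data.model")));
    }
}
```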
Re: Welcome Jan Høydahl as Lucene/Solr committer
Welcome Jan! On Mon, Jun 13, 2011 at 6:44 PM, Shalin Shekhar Mangar wrote: > Welcome Jan! > > On Mon, Jun 13, 2011 at 8:13 PM, Mark Miller wrote: >> >> I'm happy to announce that the Lucene/Solr PMC has voted in Jan Høydahl as >> our newest committer. >> >> Jan, if you don't mind, could you introduce yourself with a brief bio as >> has become our tradition? >> >> Congratulations and welcome aboard! >> >> >> - Mark Miller >> lucidimagination.com >> >> >> >> >> >> >> >> >> >> - >> To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org >> For additional commands, e-mail: dev-h...@lucene.apache.org >> > > > > -- > Regards, > Shalin Shekhar Mangar. > - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Assigned] (LUCENE-2341) explore morfologik integration
[ https://issues.apache.org/jira/browse/LUCENE-2341?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Dawid Weiss reassigned LUCENE-2341: --- Assignee: Dawid Weiss > explore morfologik integration > -- > > Key: LUCENE-2341 > URL: https://issues.apache.org/jira/browse/LUCENE-2341 > Project: Lucene - Java > Issue Type: New Feature > Components: modules/analysis >Reporter: Robert Muir >Assignee: Dawid Weiss > > Dawid Weiss mentioned on LUCENE-2298 that there is another Polish stemmer > available: > http://sourceforge.net/projects/morfologik/ > This works differently than LUCENE-2298, and ideally would be another option > for users. -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Updated] (LUCENE-3197) Optimize runs forever if you keep deleting docs at the same time
[ https://issues.apache.org/jira/browse/LUCENE-3197?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Michael McCandless updated LUCENE-3197: --- Component/s: core/index > Optimize runs forever if you keep deleting docs at the same time > > > Key: LUCENE-3197 > URL: https://issues.apache.org/jira/browse/LUCENE-3197 > Project: Lucene - Java > Issue Type: Bug > Components: core/index >Reporter: Michael McCandless >Priority: Minor > Fix For: 3.3, 4.0 > > > Because we "cascade" merges for an optimize... if you also delete documents > while the merges are running, then the merge policy will see the resulting > single segment as still not optimized (since it has pending deletes) and do a > single-segment merge, and will repeat indefinitely (as long as your app keeps > deleting docs). -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Created] (LUCENE-3197) Optimize runs forever if you keep deleting docs at the same time
Optimize runs forever if you keep deleting docs at the same time Key: LUCENE-3197 URL: https://issues.apache.org/jira/browse/LUCENE-3197 Project: Lucene - Java Issue Type: Bug Reporter: Michael McCandless Priority: Minor Fix For: 3.3, 4.0 Because we "cascade" merges for an optimize... if you also delete documents while the merges are running, then the merge policy will see the resulting single segment as still not optimized (since it has pending deletes) and do a single-segment merge, and will repeat indefinitely (as long as your app keeps deleting docs). -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
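The non-termination described above can be modelled with a toy loop (purely illustrative, no Lucene classes): as long as fresh deletes arrive between merges, the merge policy always sees pending deletes on the single remaining segment and schedules another single-segment merge. The cap below exists only so the sketch terminates.

```java
public class OptimizeLoopModel {
    /** Returns the number of single-segment merges performed before the cap. */
    public static int mergesUntil(int cap, boolean deletesKeepArriving) {
        boolean pendingDeletes = true; // deletes present when optimize starts
        int merges = 0;
        while (pendingDeletes && merges < cap) {
            merges++;                             // single-segment merge runs
            pendingDeletes = deletesKeepArriving; // did the app delete again mid-merge?
        }
        return merges;
    }

    public static void main(String[] args) {
        System.out.println(mergesUntil(100, true));  // 100 - never converges
        System.out.println(mergesUntil(100, false)); // 1 - done after one merge
    }
}
```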
[Lucene.Net] [jira] [Created] (LUCENENET-425) MMapDirectory implementation
MMapDirectory implementation Key: LUCENENET-425 URL: https://issues.apache.org/jira/browse/LUCENENET-425 Project: Lucene.Net Issue Type: New Feature Affects Versions: Lucene.Net 2.9.4g Reporter: Digy Priority: Trivial Fix For: Lucene.Net 2.9.4g Attachments: MMapDirectory.patch Since this is not a direct port of MMapDirectory.java, I'll put it under "Support" and implement MMapDirectory as {code} public class MMapDirectory : Lucene.Net.Support.MemoryMappedDirectory { } {code} If a Mem-Map cannot be created (for example, if the file is too big to fit in the 32 bit address range), it will default to FSDirectory.FSIndexInput. In my tests, I didn't see any performance gain in a 32bit environment and I consider it as better than nothing. I would be happy if someone could send test results on a 64bit platform. DIGY -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
[Lucene.Net] [jira] [Updated] (LUCENENET-425) MMapDirectory implementation
[ https://issues.apache.org/jira/browse/LUCENENET-425?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Digy updated LUCENENET-425: --- Attachment: MMapDirectory.patch > MMapDirectory implementation > > > Key: LUCENENET-425 > URL: https://issues.apache.org/jira/browse/LUCENENET-425 > Project: Lucene.Net > Issue Type: New Feature >Affects Versions: Lucene.Net 2.9.4g >Reporter: Digy >Priority: Trivial > Fix For: Lucene.Net 2.9.4g > > Attachments: MMapDirectory.patch > > > Since this is not a direct port of MMapDirectory.java, I'll put it under > "Support" and implement MMapDirectory as > {code} > public class MMapDirectory : Lucene.Net.Support.MemoryMappedDirectory > { > } > {code} > If a Mem-Map cannot be created (for example, if the file is too big to fit in the 32 > bit address range), it will default to FSDirectory.FSIndexInput > In my tests, I didn't see any performance gain in a 32bit environment and I > consider it as better than nothing. > I would be happy if someone could send test results on a 64bit platform. > DIGY -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Resolved] (LUCENE-3193) TwoPhaseCommit interface
[ https://issues.apache.org/jira/browse/LUCENE-3193?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Shai Erera resolved LUCENE-3193. Resolution: Fixed Committed revision 1135204 (trunk). Committed revision 1135215 (3x). > TwoPhaseCommit interface > > > Key: LUCENE-3193 > URL: https://issues.apache.org/jira/browse/LUCENE-3193 > Project: Lucene - Java > Issue Type: New Feature > Components: core/index >Reporter: Shai Erera >Assignee: Shai Erera > Fix For: 3.3, 4.0 > > Attachments: LUCENE-3193.patch, LUCENE-3193.patch > > > I would like to propose a TwoPhaseCommit interface which declares the methods > necessary to implement a 2-phase commit algorithm: > * prepareCommit() > * commit() > * rollback() > The prepare/commit ones have variants that take a (Map > commitData) following the ones we have in IndexWriter. > In addition, a TwoPhaseCommitTool which implements a 2-phase commit amongst > several TPCs. > Having IndexWriter implement that interface will allow running the 2-phase > commit algorithm on multiple IWs or IW + any other object that implements the > interface. > We should mark the interface @lucene.internal so as to not block ourselves in > the future. This is pretty advanced stuff anyway. > Will post a patch soon -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
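The interface and coordinator described in the issue can be sketched as plain Java. The method names follow the issue text (prepareCommit/commit/rollback); the Participant class and the execute() helper are illustrative stand-ins in the spirit of TwoPhaseCommitTool, not the committed API: prepare all participants first, and only run the commits if every prepare succeeds.

```java
public class TwoPhaseDemo {
    interface TwoPhaseCommit {
        void prepareCommit() throws Exception;
        void commit() throws Exception;
        void rollback() throws Exception;
    }

    // Toy participant that records the calls it receives.
    static class Participant implements TwoPhaseCommit {
        final StringBuilder log = new StringBuilder();
        public void prepareCommit() { log.append("P"); }
        public void commit() { log.append("C"); }
        public void rollback() { log.append("R"); }
    }

    /** Runs 2PC over all participants; rolls everyone back on any failure. */
    public static boolean execute(TwoPhaseCommit... objects) {
        try {
            for (TwoPhaseCommit o : objects) o.prepareCommit(); // phase 1
            for (TwoPhaseCommit o : objects) o.commit();        // phase 2
            return true;
        } catch (Exception e) {
            for (TwoPhaseCommit o : objects) {
                try { o.rollback(); } catch (Exception ignored) {}
            }
            return false;
        }
    }

    public static void main(String[] args) {
        Participant a = new Participant(), b = new Participant();
        System.out.println(execute(a, b) + " " + a.log + " " + b.log); // true PC PC
    }
}
```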
[jira] [Resolved] (SOLR-2590) javadoc.link.lucene property value in solr/common-build.xml is obsolete
[ https://issues.apache.org/jira/browse/SOLR-2590?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Steven Rowe resolved SOLR-2590. --- Resolution: Fixed Committed: - r1135206: trunk - r1135207: branch_3x > javadoc.link.lucene property value in solr/common-build.xml is obsolete > --- > > Key: SOLR-2590 > URL: https://issues.apache.org/jira/browse/SOLR-2590 > Project: Solr > Issue Type: Bug > Components: Build >Affects Versions: 4.0 >Reporter: Steven Rowe >Assignee: Steven Rowe >Priority: Minor > Fix For: 3.3, 4.0 > > Attachments: SOLR-2590.patch > > > The link to the Jenkins-built Lucene javadocs used by Solr's "javadoc" target > no longer works. > From > https://builds.apache.org/job/Lucene-Solr-tests-only-trunk/8811/consoleText : > {noformat} > [javadoc] javadoc: warning - Error fetching URL: > https://hudson.apache.org/hudson/job/Lucene-trunk/javadoc/all/package-list > ... > BUILD FAILED > /usr/home/hudson/hudson-slave/workspace/Lucene-Solr-tests-only-trunk/checkout/solr/build.xml:213: > The following error occurred while executing this line: > /usr/home/hudson/hudson-slave/workspace/Lucene-Solr-tests-only-trunk/checkout/solr/common-build.xml:389: > Javadocs warnings were found! > {noformat} > The link should instead be > https://builds.apache.org/job/Lucene-trunk/javadoc/all/package-list -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
Indexing slower in trunk
I half remember that this has come up before, but I couldn't find the thread. I was running some tests over the weekend that involved indexing 1.9M documents from the English Wiki dump. I'm consistently seeing that trunk takes about twice as long to index the docs as 1.4, 3.2 and 3x. Optimize is also taking quite a bit longer I admit that these aren't very sophisticated tests, and I only ran the trunk process twice (although both those were consistent). I'm pretty sure my rambuffersize and autocommit settings are identical. I remove the data/index directory before each run. These results are running the indexing program in IntelliJ, on my Mac, both the server and the indexing programs were running locally. No, trunk isn't compiling before running . Here's the server definition: new StreamingUpdateSolrServer(url, 10, 4); and I'm batching up the documents and sending them to Solr in batches of 1,000. So, my question is whether this should be pursued. Note that I'm still getting around 3K docs/second, which I can't complain about. Not that that stops me, you understand. And in return for a memory footprint reduction from 389M to 90M after some off-the-wall sorting and faceting I'll take it! H, speaking of which, the memory usage changes seem like a good candidate for a page on the Wiki, anyone want to suggest a home? Solr 1.4.1 Total Time Taken-> 257 seconds Total documents added-> 1917728 Docs/sec-> 7461 starting optimize optimizing took 26 seconds Solr 3.2 Total Time Taken-> 243 seconds Total documents added-> 1917728 Docs/sec-> 7891 starting optimize optimizing took 21 seconds Solr 3x Total Time Taken-> 269 seconds Total documents added-> 1917728 Docs/sec-> 7129 starting optimize optimizing took 21 seconds Solr trunk. 2011-6-11: 17:24 EST Total Time Taken-> 592 seconds Total documents added-> 1917728 Docs/sec-> 3239 starting optimize optimizing took 159 seconds What do folks think? Is there anything I can/should do to narrow this down? 
Erick - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Updated] (SOLR-2590) javadoc.link.lucene property value in solr/common-build.xml is obsolete
[ https://issues.apache.org/jira/browse/SOLR-2590?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Steven Rowe updated SOLR-2590: -- Attachment: SOLR-2590.patch Patch for trunk with the fixed Jenkins-built Lucene trunk javadocs URL. Locally, "ant javadoc" under solr/ succeeds. Committing shortly. > javadoc.link.lucene property value in solr/common-build.xml is obsolete > --- > > Key: SOLR-2590 > URL: https://issues.apache.org/jira/browse/SOLR-2590 > Project: Solr > Issue Type: Bug > Components: Build >Affects Versions: 4.0 >Reporter: Steven Rowe >Assignee: Steven Rowe >Priority: Minor > Fix For: 3.3, 4.0 > > Attachments: SOLR-2590.patch > > > The link to the Jenkins-built Lucene javadocs used by Solr's "javadoc" target > no longer works. > From > https://builds.apache.org/job/Lucene-Solr-tests-only-trunk/8811/consoleText : > {noformat} > [javadoc] javadoc: warning - Error fetching URL: > https://hudson.apache.org/hudson/job/Lucene-trunk/javadoc/all/package-list > ... > BUILD FAILED > /usr/home/hudson/hudson-slave/workspace/Lucene-Solr-tests-only-trunk/checkout/solr/build.xml:213: > The following error occurred while executing this line: > /usr/home/hudson/hudson-slave/workspace/Lucene-Solr-tests-only-trunk/checkout/solr/common-build.xml:389: > Javadocs warnings were found! > {noformat} > The link should instead be > https://builds.apache.org/job/Lucene-trunk/javadoc/all/package-list -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Commented] (SOLR-2590) javadoc.link.lucene property value in solr/common-build.xml is obsolete
[ https://issues.apache.org/jira/browse/SOLR-2590?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13048676#comment-13048676 ] Steven Rowe commented on SOLR-2590: --- If you paste the old link into a browser, you get redirected to https://builds.apache.org/ - I guess the admins got tired of supporting the old Hudson links? > javadoc.link.lucene property value in solr/common-build.xml is obsolete > --- > > Key: SOLR-2590 > URL: https://issues.apache.org/jira/browse/SOLR-2590 > Project: Solr > Issue Type: Bug > Components: Build >Affects Versions: 4.0 >Reporter: Steven Rowe >Assignee: Steven Rowe >Priority: Minor > Fix For: 3.3, 4.0 > > > The link to the Jenkins-built Lucene javadocs used by Solr's "javadoc" target > no longer works. > From > https://builds.apache.org/job/Lucene-Solr-tests-only-trunk/8811/consoleText : > {noformat} > [javadoc] javadoc: warning - Error fetching URL: > https://hudson.apache.org/hudson/job/Lucene-trunk/javadoc/all/package-list > ... > BUILD FAILED > /usr/home/hudson/hudson-slave/workspace/Lucene-Solr-tests-only-trunk/checkout/solr/build.xml:213: > The following error occurred while executing this line: > /usr/home/hudson/hudson-slave/workspace/Lucene-Solr-tests-only-trunk/checkout/solr/common-build.xml:389: > Javadocs warnings were found! > {noformat} > The link should instead be > https://builds.apache.org/job/Lucene-trunk/javadoc/all/package-list -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Created] (SOLR-2590) javadoc.link.lucene property value in solr/common-build.xml is obsolete
javadoc.link.lucene property value in solr/common-build.xml is obsolete --- Key: SOLR-2590 URL: https://issues.apache.org/jira/browse/SOLR-2590 Project: Solr Issue Type: Bug Components: Build Affects Versions: 4.0 Reporter: Steven Rowe Assignee: Steven Rowe Priority: Minor Fix For: 3.3, 4.0 The link to the Jenkins-built Lucene javadocs used by Solr's "javadoc" target no longer works. From https://builds.apache.org/job/Lucene-Solr-tests-only-trunk/8811/consoleText : {noformat} [javadoc] javadoc: warning - Error fetching URL: https://hudson.apache.org/hudson/job/Lucene-trunk/javadoc/all/package-list ... BUILD FAILED /usr/home/hudson/hudson-slave/workspace/Lucene-Solr-tests-only-trunk/checkout/solr/build.xml:213: The following error occurred while executing this line: /usr/home/hudson/hudson-slave/workspace/Lucene-Solr-tests-only-trunk/checkout/solr/common-build.xml:389: Javadocs warnings were found! {noformat} The link should instead be https://builds.apache.org/job/Lucene-trunk/javadoc/all/package-list -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
Re: repo doesn't like me this morning.
I'm getting updates again, apparently the problem is fixed. On Mon, Jun 13, 2011 at 12:36 PM, Upayavira wrote: > Is the EU one getting updates? I've seen a suggestion that commits from > the US aren't getting to it. > > Upayavira > > On Mon, 13 Jun 2011 10:05 -0400, "Robert Muir" wrote: >> Yeah, but you can svn update/checkout/make patches etc until the main >> one comes back online, then switch back for committing >> >> On Mon, Jun 13, 2011 at 10:03 AM, Uwe Schindler wrote: >> > The European one works read only. Committing of course also fails. >> > >> > Uwe >> > -- >> > Uwe Schindler >> > H.-H.-Meier-Allee 63, 28213 Bremen >> > http://www.thetaphi.de >> > >> > >> > >> > Robert Muir schrieb: >> >> >> >> svn switch --relocate >> >> https://svn.apache.org/repos/asf/lucene/dev/trunk >> >> https://svn.eu.apache.org/repos/asf/lucene/dev/trunk >> >> >> >> On Mon, Jun 13, 2011 at 9:55 AM, Uwe Schindler wrote: >> >> > It's currently broken. See monitoring page. >> >> > >> >> > Uwe >> >> > -- >> >> > Uwe Schindler >> >> > H.-H.-Meier-Allee 63, 28213 Bremen >> >> > http://www.thetaphi.de >> >> > >> >> > >> >> > >> >> > Erick Erickson schrieb: >> >> >> >> >> >> Trying to do a simple "svn update" on the trunk gives me an "svn: >> >> >> access to 'http://svn.apache.org/repos/asf/lucene/dev/trunk' >> >> >> forbidden" error. This is from the shell on OS X. I was able to check >> >> >> out the trunk last night >> >> >> >> >> >> Are others seeing this or am I special (perhaps part way through >> >> >> getting full rights)? Or do I have to authenticate now? It's no huge >> >> >> deal, I'm not in a particular hurry if it's just me, but if it's >> >> >> everybody it's more serious...
>> >> >> >> >> >> Thanks, >> >> >> Erick >> >> >> >> >> >> >> >> >> >> >> To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org >> >> >> For additional commands, e-mail: dev-h...@lucene.apache.org >> >> >> >> >> >> >> >> > >> >> >> >> >> >> To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org >> >> For additional commands, e-mail: dev-h...@lucene.apache.org >> >> >> >> >> > >> >> - >> To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org >> For additional commands, e-mail: dev-h...@lucene.apache.org >> >> > --- > Enterprise Search Consultant at Sourcesense UK, > Making Sense of Open Source > > > - > To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org > For additional commands, e-mail: dev-h...@lucene.apache.org > > - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Reopened] (SOLR-2574) Add SLF4J-nop dependency
[ https://issues.apache.org/jira/browse/SOLR-2574?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Shalin Shekhar Mangar reopened SOLR-2574: - Assignee: Shalin Shekhar Mangar slf4j in branch 3x still needs to be updated to 1.6 Thanks for reminding, Gabriele :) > Add SLF4J-nop dependency > > > Key: SOLR-2574 > URL: https://issues.apache.org/jira/browse/SOLR-2574 > Project: Solr > Issue Type: Bug >Reporter: Gabriele Kahlout >Assignee: Shalin Shekhar Mangar >Priority: Minor > Fix For: 4.0 > > Attachments: solrjtest.zip > > > Whatever the merits of slf4j, a quick solrj test should work. > I've attached a sample 1-line project with a dependency on solrj-3.2; on run it > prints: > {code} > java.lang.NoClassDefFoundError: org/slf4j/LoggerFactory > at > org.apache.solr.client.solrj.impl.CommonsHttpSolrServer.(CommonsHttpSolrServer.java:72) > at com.mysimpatico.solrjtest.App.main(App.java:12) > {code} > Uncomment the nop dependency and it will work. -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Commented] (SOLR-2574) Add SLF4J-nop dependency
[ https://issues.apache.org/jira/browse/SOLR-2574?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13048666#comment-13048666 ] Gabriele Kahlout commented on SOLR-2574: Thank you. What about the 3_x branch? > Add SLF4J-nop dependency > > > Key: SOLR-2574 > URL: https://issues.apache.org/jira/browse/SOLR-2574 > Project: Solr > Issue Type: Bug >Reporter: Gabriele Kahlout >Priority: Minor > Fix For: 4.0 > > Attachments: solrjtest.zip > > > Whatever the merits of slf4j, a quick solrj test should work. > I've attached a sample 1-line project with a dependency on solrj-3.2; on run it > prints: > {code} > java.lang.NoClassDefFoundError: org/slf4j/LoggerFactory > at > org.apache.solr.client.solrj.impl.CommonsHttpSolrServer.(CommonsHttpSolrServer.java:72) > at com.mysimpatico.solrjtest.App.main(App.java:12) > {code} > Uncomment the nop dependency and it will work. -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Assigned] (SOLR-2551) Checking dataimport.properties for write access during startup
[ https://issues.apache.org/jira/browse/SOLR-2551?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Shalin Shekhar Mangar reassigned SOLR-2551: --- Assignee: Shalin Shekhar Mangar > Checking dataimport.properties for write access during startup > -- > > Key: SOLR-2551 > URL: https://issues.apache.org/jira/browse/SOLR-2551 > Project: Solr > Issue Type: Improvement > Components: contrib - DataImportHandler >Affects Versions: 1.4.1, 3.1 >Reporter: C S >Assignee: Shalin Shekhar Mangar >Priority: Minor > > A common mistake is that the /conf directory (respectively the dataimport.properties > file) is not writable for Solr. It would be great if that were detected on > starting a dataimport job. > Currently an import might grind away for days and fail if it can't write its > timestamp to the dataimport.properties file. -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Commented] (SOLR-2551) Checking dataimport.properties for write access during startup
[ https://issues.apache.org/jira/browse/SOLR-2551?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13048662#comment-13048662 ] Shalin Shekhar Mangar commented on SOLR-2551: - If DIH is unable to write dataimport.properties, it logs a message saying so. We don't want the import to fail in this case because a lot of people use only full-imports, which do not need the dataimport.properties at all. What do you suggest? > Checking dataimport.properties for write access during startup > -- > > Key: SOLR-2551 > URL: https://issues.apache.org/jira/browse/SOLR-2551 > Project: Solr > Issue Type: Improvement > Components: contrib - DataImportHandler >Affects Versions: 1.4.1, 3.1 >Reporter: C S >Priority: Minor > > A common mistake is that the /conf directory (respectively the dataimport.properties > file) is not writable for Solr. It would be great if that were detected on > starting a dataimport job. > Currently an import might grind away for days and fail if it can't write its > timestamp to the dataimport.properties file. -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
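The up-front check the reporter asks for could be as small as the following hypothetical helper (this is not DIH's actual code — DIH currently just logs when the write fails; the class and method names here are invented for illustration):

```java
import java.io.File;

public class DataImportCheck {
    // Hypothetical helper: before a long delta-import starts, report whether
    // the timestamp in dataimport.properties can actually be persisted.
    public static boolean canPersistTimestamp(File confDir) {
        File props = new File(confDir, "dataimport.properties");
        if (props.exists()) {
            return props.canWrite();
        }
        // The file does not exist yet; it is created in confDir on the first
        // successful write, so the directory itself must be writable.
        return confDir.canWrite();
    }
}
```

A caller could run this at handler startup and log a warning instead of failing, which keeps full-import-only setups unaffected.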
[JENKINS] Lucene-Solr-tests-only-trunk - Build # 8811 - Failure
Build: https://builds.apache.org/job/Lucene-Solr-tests-only-trunk/8811/ All tests passed Build Log (for compile errors): [...truncated 13695 lines...] - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Resolved] (SOLR-2574) Add SLF4J-nop dependency
[ https://issues.apache.org/jira/browse/SOLR-2574?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Shalin Shekhar Mangar resolved SOLR-2574. - Resolution: Fixed Fix Version/s: (was: 1.4) 4.0 slf4j is v1.6.1 in trunk > Add SLF4J-nop dependency > > > Key: SOLR-2574 > URL: https://issues.apache.org/jira/browse/SOLR-2574 > Project: Solr > Issue Type: Bug >Reporter: Gabriele Kahlout >Priority: Minor > Fix For: 4.0 > > Attachments: solrjtest.zip > > > Whatever the merits of slf4j, a quick solrj test should work. > I've attached a sample 1-line project with a dependency on solrj-3.2; on run it > prints: > {code} > java.lang.NoClassDefFoundError: org/slf4j/LoggerFactory > at > org.apache.solr.client.solrj.impl.CommonsHttpSolrServer.(CommonsHttpSolrServer.java:72) > at com.mysimpatico.solrjtest.App.main(App.java:12) > {code} > Uncomment the nop dependency and it will work. -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
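The workaround the reporter describes — adding a no-op SLF4J binding next to solrj — would look roughly like this in a Maven POM (artifact coordinates assumed from the artifacts published at the time; the slf4j-nop version should match the slf4j-api version solrj pulls in):

```xml
<dependency>
  <groupId>org.apache.solr</groupId>
  <artifactId>solr-solrj</artifactId>
  <version>3.2.0</version>
</dependency>
<!-- Swap slf4j-nop for a real binding (e.g. slf4j-log4j12) to get actual logging -->
<dependency>
  <groupId>org.slf4j</groupId>
  <artifactId>slf4j-nop</artifactId>
  <version>1.6.1</version>
</dependency>
```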
[jira] [Commented] (SOLR-2136) Function Queries: if() function
[ https://issues.apache.org/jira/browse/SOLR-2136?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13048646#comment-13048646 ] Yonik Seeley commented on SOLR-2136: Thanks Koji, I just committed a fix for this cut'n'paste error. > Function Queries: if() function > --- > > Key: SOLR-2136 > URL: https://issues.apache.org/jira/browse/SOLR-2136 > Project: Solr > Issue Type: New Feature > Components: search >Affects Versions: 1.4.1 >Reporter: Jan Høydahl > Fix For: 4.0 > > Attachments: SOLR-2136.patch, SOLR-2136.patch > > > Add an if() function which will enable conditional function queries. > The function could be modeled after a spreadsheet if function (e.g: > http://wiki.services.openoffice.org/wiki/Documentation/How_Tos/Calc:_IF_function) > IF(test; value1; value2) where: > test is or refers to a logical value or expression that returns a logical > value (TRUE or FALSE). > value1 is the value that is returned by the function if test yields TRUE. > value2 is the value that is returned by the function if test yields FALSE. > If value2 is omitted it is assumed to be FALSE; if value1 is also omitted it > is assumed to be TRUE. > Example use: > if(color=="red"; 100; if(color=="green"; 50; 25)) > This function will check the document field "color", and if it is "red" > return 100, if it is "green" return 50, else return 25. -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
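The spreadsheet-style semantics quoted in the issue can be sketched in plain Java (illustrative only — Solr's real implementation evaluates ValueSource objects per document, not plain values):

```java
public class IfFuncSketch {
    // IF(test; value1; value2): return value1 when the test holds, else value2.
    static double ifFunc(boolean test, double value1, double value2) {
        return test ? value1 : value2;
    }

    // Mirrors the nested example from the issue:
    // if(color=="red"; 100; if(color=="green"; 50; 25))
    static double colorScore(String color) {
        return ifFunc("red".equals(color), 100,
               ifFunc("green".equals(color), 50, 25));
    }
}
```

Nesting falls out for free, exactly as in the spreadsheet model the issue references.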
Re: Welcome Jan Høydahl as Lucene/Solr committer
Welcome Jan! On Mon, Jun 13, 2011 at 8:13 PM, Mark Miller wrote: > I'm happy to announce that the Lucene/Solr PMC has voted in Jan Høydahl as > our newest committer. > > Jan, if you don't mind, could you introduce yourself with a brief bio as > has become our tradition? > > Congratulations and welcome aboard! > > > - Mark Miller > lucidimagination.com > > > > > > > > > > - > To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org > For additional commands, e-mail: dev-h...@lucene.apache.org > > -- Regards, Shalin Shekhar Mangar.
RE: Welcome Jan Høydahl as Lucene/Solr committer
Welcome! > -Original Message- > From: Mark Miller [mailto:markrmil...@gmail.com] > Sent: Monday, June 13, 2011 10:43 AM > To: dev@lucene.apache.org > Subject: Welcome Jan Høydahl as Lucene/Solr committer > > I'm happy to announce that the Lucene/Solr PMC has voted in Jan Høydahl > as our newest committer. > > Jan, if you don't mind, could you introduce yourself with a brief bio as > has become our tradition? > > Congratulations and welcome aboard! > > > - Mark Miller > lucidimagination.com > > > > > > > > > > - > To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org > For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Commented] (SOLR-2136) Function Queries: if() function
[ https://issues.apache.org/jira/browse/SOLR-2136?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13048642#comment-13048642 ] Koji Sekiguchi commented on SOLR-2136: -- I've run into some strange behavior. With an empty index, start the example Solr. Then hit: http://localhost:8983/solr/select/?q={!func}if(exists(f1_b),10,20)&debug=results You get an empty XML response, as expected. Hit the above URL again and you get the following exception: {code} SEVERE: java.lang.ClassCastException: org.apache.solr.search.ValueSourceParser$60$1 cannot be cast to org.apache.solr.search.function.SingleFunction at org.apache.solr.search.function.SimpleBoolFunction.equals(SimpleBoolFunction.java:66) at org.apache.solr.search.function.IfFunction.equals(IfFunction.java:137) at org.apache.solr.search.function.FunctionQuery.equals(FunctionQuery.java:202) at org.apache.solr.search.QueryResultKey.equals(QueryResultKey.java:78) at java.util.HashMap.getEntry(HashMap.java:349) at java.util.LinkedHashMap.get(LinkedHashMap.java:280) at org.apache.solr.search.LRUCache.get(LRUCache.java:129) at org.apache.solr.search.SolrIndexSearcher.getDocListC(SolrIndexSearcher.java:991) at org.apache.solr.search.SolrIndexSearcher.search(SolrIndexSearcher.java:346) at org.apache.solr.handler.component.QueryComponent.process(QueryComponent.java:441) at org.apache.solr.handler.component.SearchHandler.handleRequestBody(SearchHandler.java:239) at org.apache.solr.handler.RequestHandlerBase.handleRequest(RequestHandlerBase.java:129) at org.apache.solr.core.SolrCore.execute(SolrCore.java:1308) at org.apache.solr.servlet.SolrDispatchFilter.execute(SolrDispatchFilter.java:353) at org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:248) at org.mortbay.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1212) at org.mortbay.jetty.servlet.ServletHandler.handle(ServletHandler.java:399) at org.mortbay.jetty.security.SecurityHandler.handle(SecurityHandler.java:216) 
at org.mortbay.jetty.servlet.SessionHandler.handle(SessionHandler.java:182) at org.mortbay.jetty.handler.ContextHandler.handle(ContextHandler.java:766) at org.mortbay.jetty.webapp.WebAppContext.handle(WebAppContext.java:450) at org.mortbay.jetty.handler.ContextHandlerCollection.handle(ContextHandlerCollection.java:230) at org.mortbay.jetty.handler.HandlerCollection.handle(HandlerCollection.java:114) at org.mortbay.jetty.handler.HandlerWrapper.handle(HandlerWrapper.java:152) at org.mortbay.jetty.Server.handle(Server.java:326) at org.mortbay.jetty.HttpConnection.handleRequest(HttpConnection.java:542) at org.mortbay.jetty.HttpConnection$RequestHandler.headerComplete(HttpConnection.java:928) at org.mortbay.jetty.HttpParser.parseNext(HttpParser.java:549) at org.mortbay.jetty.HttpParser.parseAvailable(HttpParser.java:212) at org.mortbay.jetty.HttpConnection.handle(HttpConnection.java:404) at org.mortbay.jetty.bio.SocketConnector$Connection.run(SocketConnector.java:228) at org.mortbay.thread.QueuedThreadPool$PoolThread.run(QueuedThreadPool.java:582) {code} > Function Queries: if() function > --- > > Key: SOLR-2136 > URL: https://issues.apache.org/jira/browse/SOLR-2136 > Project: Solr > Issue Type: New Feature > Components: search >Affects Versions: 1.4.1 >Reporter: Jan Høydahl > Fix For: 4.0 > > Attachments: SOLR-2136.patch, SOLR-2136.patch > > > Add an if() function which will enable conditional function queries. > The function could be modeled after a spreadsheet if function (e.g: > http://wiki.services.openoffice.org/wiki/Documentation/How_Tos/Calc:_IF_function) > IF(test; value1; value2) where: > test is or refers to a logical value or expression that returns a logical > value (TRUE or FALSE). > value1 is the value that is returned by the function if test yields TRUE. > value2 is the value that is returned by the function if test yields FALSE. > If value2 is omitted it is assumed to be FALSE; if value1 is also omitted it > is assumed to be TRUE. 
> Example use: > if(color=="red"; 100; if(color=="green"; 50; 25)) > This function will check the document field "color", and if it is "red" > return 100, if it is "green" return 50, else return 25. -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
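The root cause visible in the trace above is an equals() implementation that casts its argument without first checking the runtime type, so the second request — which compares the new query against the cached QueryResultKey — blows up. The defensive pattern looks like this (illustrative class only; the committed fix may differ):

```java
public class GuardedEquals {
    private final int arg;

    public GuardedEquals(int arg) { this.arg = arg; }

    @Override
    public boolean equals(Object o) {
        if (this == o) return true;
        // Check the type before any cast: comparing against an unrelated
        // ValueSource subclass must return false, not throw ClassCastException.
        if (!(o instanceof GuardedEquals)) return false;
        return arg == ((GuardedEquals) o).arg;
    }

    @Override
    public int hashCode() { return arg; }
}
```

The equals contract requires returning false for foreign types; throwing from equals breaks every hash-based cache that stores the object as a key.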
Re: repo doesn't like me this morning.
Is the EU one getting updates? I've seen a suggestion that commits from the US aren't getting to it. Upayavira On Mon, 13 Jun 2011 10:05 -0400, "Robert Muir" wrote: > Yeah, but you can svn update/checkout/make patches etc until the main > one comes back online, then switch back for committing > > On Mon, Jun 13, 2011 at 10:03 AM, Uwe Schindler wrote: > > The European one works read only. Committing of course also fails. > > > > Uwe > > -- > > Uwe Schindler > > H.-H.-Meier-Allee 63, 28213 Bremen > > http://www.thetaphi.de > > > > > > > > Robert Muir schrieb: > >> > >> svn switch --relocate > >> https://svn.apache.org/repos/asf/lucene/dev/trunk > >> https://svn.eu.apache.org/repos/asf/lucene/dev/trunk > >> > >> On Mon, Jun 13, 2011 at 9:55 AM, Uwe Schindler wrote: > >> > It's currently broken. See monitoring page. > >> > > >> > Uwe > >> > -- > >> > Uwe Schindler > >> > H.-H.-Meier-Allee 63, 28213 Bremen > >> > http://www.thetaphi.de > >> > > >> > > >> > > >> > Erick Erickson schrieb: > >> >> > >> >> Trying to do a simple "svn update" on the trunk gives me an "svn: > >> >> access to 'http://svn.apache.org/repos/asf/lucene/dev/trunk' > >> >> forbidden" error. This is from the shell on OS X. I was able to check > >> >> out the trunk last night. > >> >> > >> >> Are others seeing this or am I special (perhaps part way through > >> >> getting full rights)? Or do I have to authenticate now? It's no huge > >> >> deal, I'm not in a particular hurry if it's just me, but if it's > >> >> everybody it's more serious... 
> >> >> > >> >> Thanks, > >> >> Erick > >> >> > >> >> > >> > >> >> To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org > >> >> For additional commands, e-mail: dev-h...@lucene.apache.org > >> >> > >> >> > >> > > >> > >> > >> To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org > >> For additional commands, e-mail: dev-h...@lucene.apache.org > >> > >> > > > > - > To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org > For additional commands, e-mail: dev-h...@lucene.apache.org > > --- Enterprise Search Consultant at Sourcesense UK, Making Sense of Open Source - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
Re: Welcome Jan Høydahl as Lucene/Solr committer
Welcome Jan! On 13 June 2011 17:30, Koji Sekiguchi wrote: > Welcome! > > > (11/06/13 23:43), Mark Miller wrote: > >> I'm happy to announce that the Lucene/Solr PMC has voted in Jan Høydahl as >> our newest committer. >> >> Jan, if you don't mind, could you introduce yourself with a brief bio as >> has become our tradition? >> >> Congratulations and welcome aboard! >> >> >> - Mark Miller >> lucidimagination.com >> >> >> >> >> >> >> >> >> >> - >> To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org >> For additional commands, e-mail: dev-h...@lucene.apache.org >> >> >> > > -- > http://www.rondhuit.com/en/ > > > - > To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org > For additional commands, e-mail: dev-h...@lucene.apache.org > > -- Met vriendelijke groet, Martijn van Groningen
Participation Requested: Survey about Open-Source Software Development
Hi, Drs. Jeffrey Carver, Rosanna Guadagno, Debra McCallum, and Mr. Amiangshu Bosu, University of Alabama, and Dr. Lorin Hochstein, University of Southern California, are conducting a survey of open-source software developers. This survey seeks to understand how developers on distributed, virtual teams, like open-source projects, interact with each other to accomplish their tasks. You must be at least 19 years of age to complete the survey. The survey should take approximately 15 minutes to complete. If you are actively participating as a developer, please consider completing our survey. Here is the link to the survey: http://goo.gl/HQnux We apologize for inconvenience and if you receive multiple copies of this email. This survey has been approved by The University of Alabama IRB board. Thanks, Dr. Jeffrey Carver Assistant Professor University of Alabama (v) 205-348-9829 (f) 205-348-0219 http://www.cs.ua.edu/~carver - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Resolved] (SOLR-2400) FieldAnalysisRequestHandler; add information about token-relation
[ https://issues.apache.org/jira/browse/SOLR-2400?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Uwe Schindler resolved SOLR-2400. - Resolution: Fixed Committed trunk revision: 1135154 Committed 3.x branch revision: 1135156 > FieldAnalysisRequestHandler; add information about token-relation > - > > Key: SOLR-2400 > URL: https://issues.apache.org/jira/browse/SOLR-2400 > Project: Solr > Issue Type: Improvement > Components: Schema and Analysis >Reporter: Stefan Matheis (steffkes) >Assignee: Uwe Schindler >Priority: Minor > Fix For: 3.3, 4.0 > > Attachments: 110303_FieldAnalysisRequestHandler_output.xml, > 110303_FieldAnalysisRequestHandler_view.png, SOLR-2400-revision1.patch, > SOLR-2400-revision1.patch, SOLR-2400.patch, SOLR-2400.patch, SOLR-2400.patch, > SOLR-2400.patch, field.xml > > > The XML-Output (simplified example attached) is missing one small information > .. which could be very useful to build an nice Analysis-Output, and that's > "Token-Relation" (if there is special/correct word for this, please correct > me). > Meaning, that is actually not possible to "follow" the Analysis-Process > (completly) while the Tokenizers/Filters will drop out Tokens (f.e. StopWord) > or split it into multiple Tokens (f.e. WordDelimiter). > Would it be possible to include this Information? If so, it would be possible > to create an improved Analysis-Page for the new Solr Admin (SOLR-2399) - > short scribble attached -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Updated] (LUCENE-2878) Allow Scorer to expose positions and payloads aka. nuke spans
[ https://issues.apache.org/jira/browse/LUCENE-2878?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Simon Willnauer updated LUCENE-2878: Attachment: LUCENE-2878_trunk.patch here is a patch that applies to trunk. I added a simple, maybe slowish PositionTermScorer that is used when positions are required. This is really work in progress but I am uploading it just in case somebody is interested. > Allow Scorer to expose positions and payloads aka. nuke spans > -- > > Key: LUCENE-2878 > URL: https://issues.apache.org/jira/browse/LUCENE-2878 > Project: Lucene - Java > Issue Type: Improvement > Components: core/search >Affects Versions: Bulk Postings branch >Reporter: Simon Willnauer >Assignee: Simon Willnauer > Labels: gsoc2011, lucene-gsoc-11, mentor > Attachments: LUCENE-2878.patch, LUCENE-2878.patch, LUCENE-2878.patch, > LUCENE-2878.patch, LUCENE-2878_trunk.patch > > > Currently we have two somewhat separate types of queries: the ones which can > make use of positions (mainly spans) and payloads (spans). Yet Span*Query > doesn't really do scoring comparable to what other queries do, and at the end > of the day they are duplicating a lot of code all over Lucene. Span*Queries are > also limited to other Span*Query instances, such that you can not use a > TermQuery or a BooleanQuery with SpanNear or anything like that. > Besides the Span*Query limitation, other queries lack a quite interesting > feature: they can not score based on term proximity, since scorers don't > expose any positional information. All those problems bugged me for a while, > so I started working on that using the bulkpostings API. I would have done > that first cut on trunk, but TermScorer there works on a BlockReader that does not > expose positions, while the one in this branch does. 
I started adding a new > Positions class which users can pull from a scorer; to prevent unnecessary > positions enums I added ScorerContext#needsPositions and eventually > Scorer#needsPayloads to create the corresponding enum on demand. Yet, > currently only TermQuery / TermScorer implements this API and others simply > return null instead. > To show that the API really works and our BulkPostings work fine too with > positions, I cut over TermSpanQuery to use a TermScorer under the hood and > nuked TermSpans entirely. A nice side effect of this was that the Position > BulkReading implementation got some exercise, which now :) all works with > positions, while Payloads for bulkreading are kind of experimental in the > patch and only work with the Standard codec. > So all spans now work on top of TermScorer (I truly hate spans since today), > including the ones that need Payloads (StandardCodec ONLY)!! I didn't bother > to implement the other codecs yet since I want to get feedback on the API and > on this first cut before I go on with it. I will upload the corresponding > patch in a minute. > I also had to cut over SpanQuery.getSpans(IR) to > SpanQuery.getSpans(AtomicReaderContext), which I should probably do on trunk > first, but after that pain today I need a break first :). > The patch passes all core tests > (org.apache.lucene.search.highlight.HighlighterTest still fails but I didn't > look into the MemoryIndex BulkPostings API yet) -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Commented] (LUCENE-3196) Optimize FixedStraightBytes for bytes size == 1
[ https://issues.apache.org/jira/browse/LUCENE-3196?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13048616#comment-13048616 ] Michael McCandless commented on LUCENE-3196: Ahh yes great! selckin's random number generator should hit 1 frequently ;) > Optimize FixedStraightBytes for bytes size == 1 > --- > > Key: LUCENE-3196 > URL: https://issues.apache.org/jira/browse/LUCENE-3196 > Project: Lucene - Java > Issue Type: Improvement > Components: core/index, core/search >Affects Versions: 4.0 >Reporter: Simon Willnauer >Assignee: Simon Willnauer >Priority: Minor > Fix For: 4.0 > > Attachments: LUCENE-3196.patch > > > Currently we read all the bytes in a PagedBytes instance which is unneeded for > single byte values like norms. For fast access this should simply be a > straight array. -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Commented] (LUCENE-3196) Optimize FixedStraightBytes for bytes size == 1
[ https://issues.apache.org/jira/browse/LUCENE-3196?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13048615#comment-13048615 ] Simon Willnauer commented on LUCENE-3196: - bq. Do we have a test (eg a random test that picks random fixed byte[] size) that covers this...? yes, the fixed length is selected at random in the tests; I fixed that in the patch too. > Optimize FixedStraightBytes for bytes size == 1 > --- > > Key: LUCENE-3196 > URL: https://issues.apache.org/jira/browse/LUCENE-3196 > Project: Lucene - Java > Issue Type: Improvement > Components: core/index, core/search >Affects Versions: 4.0 >Reporter: Simon Willnauer >Assignee: Simon Willnauer >Priority: Minor > Fix For: 4.0 > > Attachments: LUCENE-3196.patch > > > Currently we read all the bytes in a PagedBytes instance which is unneeded for > single byte values like norms. For fast access this should simply be a > straight array. -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Commented] (LUCENE-3196) Optimize FixedStraightBytes for bytes size == 1
[ https://issues.apache.org/jira/browse/LUCENE-3196?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13048606#comment-13048606 ] Michael McCandless commented on LUCENE-3196: Looks good Simon! Probably other smallish sizes (2, 3, 4, ...) could be a single array too, ie paged or not should be separately controllable, but we can do that later; this is a great baby step since we need this for norms cutover. Do we have a test (eg a random test that picks random fixed byte[] size) that covers this...? > Optimize FixedStraightBytes for bytes size == 1 > --- > > Key: LUCENE-3196 > URL: https://issues.apache.org/jira/browse/LUCENE-3196 > Project: Lucene - Java > Issue Type: Improvement > Components: core/index, core/search >Affects Versions: 4.0 >Reporter: Simon Willnauer >Assignee: Simon Willnauer >Priority: Minor > Fix For: 4.0 > > Attachments: LUCENE-3196.patch > > > Currently we read all the bytes in a PagedBytes instance which is unneeded for > single byte values like norms. For fast access this should simply be a > straight array. -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
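The optimization being discussed can be pictured as a specialization: when every value is exactly one byte, a plain byte[] indexed by docID replaces the paged lookup. This is a sketch with assumed names, not the actual Lucene code:

```java
public class FixedBytesSketch {
    // size == 1 fast path: one byte per document, direct array access.
    static byte singleByteValue(byte[] values, int docId) {
        return values[docId];
    }

    // Generic paged lookup that the size == 1 case gets to skip entirely:
    // the high bits of the docID pick the page, the low bits the offset.
    static byte pagedValue(byte[][] pages, int pageBits, int docId) {
        return pages[docId >>> pageBits][docId & ((1 << pageBits) - 1)];
    }
}
```

The fast path removes one indirection and the shift/mask arithmetic per lookup, which matters for norms that are read for every scored document.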
[jira] [Commented] (LUCENE-3193) TwoPhaseCommit interface
[ https://issues.apache.org/jira/browse/LUCENE-3193?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13048605#comment-13048605 ] Michael McCandless commented on LUCENE-3193: Looks great Shai! > TwoPhaseCommit interface > > > Key: LUCENE-3193 > URL: https://issues.apache.org/jira/browse/LUCENE-3193 > Project: Lucene - Java > Issue Type: New Feature > Components: core/index >Reporter: Shai Erera >Assignee: Shai Erera > Fix For: 3.3, 4.0 > > Attachments: LUCENE-3193.patch, LUCENE-3193.patch > > > I would like to propose a TwoPhaseCommit interface which declares the methods > necessary to implement a 2-phase commit algorithm: > * prepareCommit() > * commit() > * rollback() > The prepare/commit ones have variants that take a (Map > commitData) following the ones we have in IndexWriter. > In addition, a TwoPhaseCommitTool which implements a 2-phase commit amongst > several TPCs. > Having IndexWriter implement that interface will allow running the 2-phase > commit algorithm on multiple IWs or IW + any other object that implements the > interface. > We should mark the interface @lucene.internal so as to not block ourselves in > the future. This is pretty advanced stuff anyway. > Will post a patch soon -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
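The proposed shape can be sketched as follows. The method names come from the issue; the checked exceptions and the Map commitData variants are omitted, and the coordinator is an assumed reading of what the TwoPhaseCommitTool would do:

```java
import java.util.List;

interface TwoPhaseCommit {
    void prepareCommit();
    void commit();
    void rollback();
}

final class TwoPhaseCommitTool {
    // Prepare every participant first; only when all prepares succeed are the
    // commits run. Any prepare failure rolls everyone back, so no participant
    // ends up committed while another one is not.
    static void execute(List<? extends TwoPhaseCommit> participants) {
        try {
            for (TwoPhaseCommit tpc : participants) {
                tpc.prepareCommit();
            }
        } catch (RuntimeException e) {
            for (TwoPhaseCommit tpc : participants) {
                tpc.rollback();
            }
            throw e;
        }
        for (TwoPhaseCommit tpc : participants) {
            tpc.commit();
        }
    }
}
```

With IndexWriter implementing the interface, one such loop can coordinate a commit across several IndexWriters, or an IndexWriter plus any other participant.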
Re: Welcome Jan Høydahl as Lucene/Solr committer
Welcome! (11/06/13 23:43), Mark Miller wrote: I'm happy to announce that the Lucene/Solr PMC has voted in Jan Høydahl as our newest committer. Jan, if you don't mind, could you introduce yourself with a brief bio as has become our tradition? Congratulations and welcome aboard! - Mark Miller lucidimagination.com - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org -- http://www.rondhuit.com/en/ - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
Re: Welcome Jan Høydahl as Lucene/Solr committer
Welcome! Mike McCandless http://blog.mikemccandless.com On Mon, Jun 13, 2011 at 10:43 AM, Mark Miller wrote: > I'm happy to announce that the Lucene/Solr PMC has voted in Jan Høydahl as > our newest committer. > > Jan, if you don't mind, could you introduce yourself with a brief bio as has > become our tradition? > > Congratulations and welcome aboard! > > > - Mark Miller > lucidimagination.com > > > > > > > > > > - > To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org > For additional commands, e-mail: dev-h...@lucene.apache.org > > - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
Re: Welcome Jan Høydahl as Lucene/Solr committer
Welcome! On Mon, Jun 13, 2011 at 5:04 PM, Robert Muir wrote: > Welcome Jan! > > On Mon, Jun 13, 2011 at 10:43 AM, Mark Miller wrote: >> I'm happy to announce that the Lucene/Solr PMC has voted in Jan Høydahl as >> our newest committer. >> >> Jan, if you don't mind, could you introduce yourself with a brief bio as has >> become our tradition? >> >> Congratulations and welcome aboard! >> >> >> - Mark Miller >> lucidimagination.com >> >> >> >> >> >> >> >> >> >> - >> To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org >> For additional commands, e-mail: dev-h...@lucene.apache.org >> >> > > - > To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org > For additional commands, e-mail: dev-h...@lucene.apache.org > > - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org