[HUDSON] Solr-3.x - Build # 313 - Failure

2011-04-02 Thread Apache Hudson Server
Build: https://hudson.apache.org/hudson/job/Solr-3.x/313/

All tests passed

Build Log (for compile errors):
[...truncated 23887 lines...]



-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Commented] (LUCENE-2979) Simplify configuration API of contrib Query Parser

2011-04-02 Thread Adriano Crestani (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-2979?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13015099#comment-13015099
 ] 

Adriano Crestani commented on LUCENE-2979:
--

Hi Phillipe,

Good proposal, very detailed.

Looking at the schedule table you created, it sounds like the project is now on 
the small side for a two-and-a-half-month project. However, I am probably 
underestimating its difficulty, mainly because I am already used to the code. So 
you probably shouldn't worry about it ;)

> Simplify configuration API of contrib Query Parser
> --
>
> Key: LUCENE-2979
> URL: https://issues.apache.org/jira/browse/LUCENE-2979
> Project: Lucene - Java
>  Issue Type: Improvement
>  Components: contrib/*
>Affects Versions: 2.9, 3.0
>Reporter: Adriano Crestani
>Assignee: Adriano Crestani
>  Labels: api-change, gsoc, gsoc2011, lucene-gsoc-11, mentor
> Fix For: 3.2, 4.0
>
>
> The current configuration API is very complicated and inherits the concept 
> used by the Attribute API to store token information in token streams. However, 
> the requirements for the two (QP config and token streams) are not the same, so 
> they shouldn't be using the same mechanism.
> I propose to simplify the QP config and make it less scary for people intending 
> to use the contrib QP. The task is not difficult; it will just require a lot of 
> code change and figuring out the best way to do it. That's why it's a good 
> candidate for a GSoC project.
> I would like to hear good proposals about how to make the API more friendly 
> and less scary :)

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[HUDSON] Lucene-trunk - Build # 1518 - Failure

2011-04-02 Thread Apache Hudson Server
Build: https://hudson.apache.org/hudson/job/Lucene-trunk/1518/

All tests passed

Build Log (for compile errors):
[...truncated 18647 lines...]



-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Commented] (SOLR-2394) Deprecate "standard" writer type (wt), just say "xml".

2011-04-02 Thread David Smiley (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-2394?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13015096#comment-13015096
 ] 

David Smiley commented on SOLR-2394:


I'm glad you are/were totally in favor of this, Hoss, because, yes, I was simply 
talking about putting "xml" as the value in the web UI instead of "standard".  
I could have sworn this was going to get into 3.1, but it did not :-(  Can 
someone commit this trivial change so it is not forgotten in the next release?

> Deprecate "standard" writer type (wt), just say "xml".
> --
>
> Key: SOLR-2394
> URL: https://issues.apache.org/jira/browse/SOLR-2394
> Project: Solr
>  Issue Type: Improvement
>  Components: web gui
>Affects Versions: 1.4.1
>Reporter: David Smiley
>Priority: Minor
>
> I think the "standard" writer type being aliased to "xml" is unnecessary and 
> unclear.  In the full interface screen, imagine you're new to Solr and you 
> look at the "output type" and you see "standard".  What does that mean?  
> "xml" is pretty clear.
> Assuming it's agreed that "standard" isn't of any value, I suggest that it 
> not necessarily go away (for backwards compatibility) but that its 
> existence be removed from the full interface screen, replaced with "xml".  
> Furthermore, a picker list would be a better GUI element, since there is only a 
> small, fixed set of writer types.
> Perhaps this small UI change can be considered for 3.1.  I'll write a small 
> patch if there's agreement on the idea.
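
(Editorial aside, not from the issue: the writer type is just the {{wt}} request 
parameter, and per the description above "standard" is simply an alias for the 
XML writer, so these two requests produce the same response; the proposal only 
changes which name the UI advertises.)

{code}
http://localhost:8983/solr/select?q=*:*&wt=xml
http://localhost:8983/solr/select?q=*:*&wt=standard
{code}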

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[HUDSON] Lucene-3.x - Build # 333 - Failure

2011-04-02 Thread Apache Hudson Server
Build: https://hudson.apache.org/hudson/job/Lucene-3.x/333/

All tests passed

Build Log (for compile errors):
[...truncated 22449 lines...]



-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Issue Comment Edited] (SOLR-2366) Facet Range Gaps

2011-04-02 Thread Hoss Man (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-2366?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13015088#comment-13015088
 ] 

Hoss Man edited comment on SOLR-2366 at 4/2/11 11:48 PM:
-

In no particular order...

* I like Jan's {{facet.range.spec}} naming suggestion better than my 
{{facet.range.buckets}} suggestion ... but i think {{facet.range.series}}, 
{{facet.range.seq}}, or {{facet.range.sequence}} might be better still.

* I think Jan's point about {{N}} vs {{+N}} in the sequence list as a way to 
mix absolute values vs increments definitely makes sense, and would be 
consistent with the existing date match expression.  

* the complexity with supporting *both* absolute values and increments would be 
the question of what solr should do with input like 
{{facet.range.seq=10,20,+50,+100,120,150}} ?  what ranges would we return? 
(10-20, 20-70, 70-???)  would it be an error? would we give back ranges 
that overlapped?  what about 
{{facet.range.seq=10,50,+50,100,150&facet.range.include=all}} .. would that 
result in one of the ranges being [100 TO 100] or would we throw that one out?  
(I think it would be wise to start out only implementing the absolute value 
approach, since that seems (to me) the more useful option of the two, and then 
consider adding the incremental values as a separate issue later after hashing 
out the semantics of these types of situations)

* A few of Jan's sample input suggestions used {{ * }} at either the start or 
end of the sequence to denote "everything before" the second value or 
"everything after" the second to last value -- i don't think we need to support 
this syntax, I think the existing {{facet.range.other}} would still be the 
right way to support this with {{facet.range.sequence}}.  if you want 
"everything before" and/or "everything after" use 
{{facet.range.include=before}} and/or {{facet.range.include=after}} .. 
otherwise it would be confusing to decide what things like 
{{facet.range.include=before&facet.range.seq=*,10,20}} and 
{{facet.range.include=none&facet.range.seq= * ,10,20}} mean.

* I *REALLY* don't think we should try to implement something like Jan's 
{{facet.range.labels}} suggestion.  I can't imagine any way of supporting it 
that wouldn't prevent or radically complicate the "..." type continuation of 
series i suggested before, and that seems like a much more powerful feature 
than labels.  if a user is going to provide a label for every range, then you 
must enumerate every range, and you might as well enumerate them (and label 
them) with {{facet.query}} where the label and the query can be side by side.

This...

{code}
facet.query={!label="One or more"}bedrooms:[1 TO *]
facet.query={!label="Two or more"}bedrooms:[2 TO *]
facet.query={!label="Three or more"}bedrooms:[3 TO *]
facet.query={!label="Four or more"}bedrooms:[4 TO *]
{code}

...seems way more readable, and less prone to user error in tweaking, than 
this...

{code}
f.bedrooms.facet.range.spec=1..*,2..*,3..*,4..*
f.bedrooms.facet.range.labels="One or more","Two or more","Three or more","Four 
or more"
{code}

* Herman commented...

bq. While using facet.query allows us to construct arbitrary ranges, we must 
then pick them out of the results separately. This becomes more difficult if we 
arbitrarily facet on two or more fields/expressions.

I don't see that as being a particularly hard problem that we need to worry 
about helping users avoid, especially since users can annotate those queries 
using localparams and set any arbitrary key=val pairs on them that they want, to 
help organize them and identify them later when parsing the response...

{code}
facet.query={!group=bed label="One or more"}bedrooms:[1 TO *]
facet.query={!group=bed label="Two or more"}bedrooms:[2 TO *]
facet.query={!group=bed label="Three or more"}bedrooms:[3 TO *]
facet.query={!group=bed label="Four or more"}bedrooms:[4 TO *]
facet.query={!group=size label="Small"}sqft:[* TO 1000]
facet.query={!group=size label="Medium"}sqft:[1000 TO 2500]
facet.query={!group=size label="Large"}sqft:[2500 TO *]
{code}





[jira] [Commented] (SOLR-2366) Facet Range Gaps

2011-04-02 Thread Hoss Man (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-2366?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13015088#comment-13015088
 ] 

Hoss Man commented on SOLR-2366:


In no particular order...

* I like Jan's {{facet.range.spec}} naming suggestion better than my 
{{facet.range.buckets}} suggestion ... but i think {{facet.range.series}}, 
{{facet.range.seq}}, or {{facet.range.sequence}} might be better still.

* I think Jan's point about {{N}} vs {{+N}} in the sequence list as a way to 
mix absolute values vs increments definitely makes sense, and would be 
consistent with the existing date match expression.  

* the complexity with supporting *both* absolute values and increments would be 
the question of what solr should do with input like 
{{facet.range.seq=10,20,+50,+100,120,150}} ?  what ranges would we return? 
(10-20, 20-70, 70-???)  would it be an error? would we give back ranges 
that overlapped?  what about 
{{facet.range.seq=10,50,+50,100,150&facet.range.include=all}} .. would that 
result in one of the ranges being [100 TO 100] or would we throw that one out?  
(I think it would be wise to start out only implementing the absolute value 
approach, since that seems (to me) the more useful option of the two, and then 
consider adding the incremental values as a separate issue later after hashing 
out the semantics of these types of situations)

* A few of Jan's sample input suggestions used {{*}} at either the start or end 
of the sequence to denote "everything before" the second value or "everything 
after" the second to last value -- i don't think we need to support this 
syntax, I think the existing {{facet.range.other}} would still be the right way 
to support this with {{facet.range.sequence}}.  if you want "everything before" 
and/or "everything after" use {{facet.range.include=before}} and/or 
{{facet.range.include=after}} .. otherwise it would be confusing to decide what 
things like {{facet.range.include=before&facet.range.seq=*,10,20}} and 
{{facet.range.include=none&facet.range.seq=*,10,20}} mean.

* I *REALLY* don't think we should try to implement something like Jan's 
{{facet.range.labels}} suggestion.  I can't imagine any way of supporting it 
that wouldn't prevent or radically complicate the "..." type continuation of 
series i suggested before, and that seems like a much more powerful feature 
than labels.  if a user is going to provide a label for every range, then you 
must enumerate every range, and you might as well enumerate them (and label 
them) with {{facet.query}} where the label and the query can be side by side.

This...

{code}
facet.query={!label="One or more"}bedrooms:[1 TO *]
facet.query={!label="Two or more"}bedrooms:[2 TO *]
facet.query={!label="Three or more"}bedrooms:[3 TO *]
facet.query={!label="Four or more"}bedrooms:[4 TO *]
{code}

...seems way more readable, and less prone to user error in tweaking, than 
this...

{code}
f.bedrooms.facet.range.spec=1..*,2..*,3..*,4..*
f.bedrooms.facet.range.labels="One or more","Two or more","Three or more","Four 
or more"
{code}

* Herman commented...

bq. While using facet.query allows us to construct arbitrary ranges, we must 
then pick them out of the results separately. This becomes more difficult if we 
arbitrarily facet on two or more fields/expressions.

I don't see that as being a particularly hard problem that we need to worry 
about helping users avoid, especially since users can annotate those queries 
using localparams and set any arbitrary key=val pairs on them that they want, to 
help organize them and identify them later when parsing the response...

{code}
facet.query={!group=bed label="One or more"}bedrooms:[1 TO *]
facet.query={!group=bed label="Two or more"}bedrooms:[2 TO *]
facet.query={!group=bed label="Three or more"}bedrooms:[3 TO *]
facet.query={!group=bed label="Four or more"}bedrooms:[4 TO *]
facet.query={!group=size label="Small"}sqft:[* TO 1000]
facet.query={!group=size label="Medium"}sqft:[1000 TO 2500]
facet.query={!group=size label="Large"}sqft:[2500 TO *]
{code}
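
(Editorial sketch, not from the comment: how a client might recover those 
{{group}}/{{label}} annotations when parsing the response. This assumes the 
facet.query strings, localparams included, come back verbatim as the result 
keys, the way SolrJ's {{QueryResponse.getFacetQuery()}} returns them; the map 
below stands in for that call and the counts are made up.)

{code}
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class FacetQueryGroups {
  private static final Pattern GROUP = Pattern.compile("group=(\\w+)");
  private static final Pattern LABEL = Pattern.compile("label=\"([^\"]+)\"");

  public static void main(String[] args) {
    // Stand-in for QueryResponse.getFacetQuery(): facet.query string -> count.
    Map<String, Integer> facetQueries = new LinkedHashMap<String, Integer>();
    facetQueries.put("{!group=bed label=\"One or more\"}bedrooms:[1 TO *]", 42);
    facetQueries.put("{!group=size label=\"Small\"}sqft:[* TO 1000]", 17);

    for (Map.Entry<String, Integer> e : facetQueries.entrySet()) {
      Matcher g = GROUP.matcher(e.getKey());
      Matcher l = LABEL.matcher(e.getKey());
      if (g.find() && l.find()) {
        // e.g. "bed / One or more = 42"
        System.out.println(g.group(1) + " / " + l.group(1) + " = " + e.getValue());
      }
    }
  }
}
{code}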




> Facet Range Gaps
> 
>
> Key: SOLR-2366
> URL: https://issues.apache.org/jira/browse/SOLR-2366
> Project: Solr
>  Issue Type: Improvement
>Reporter: Grant Ingersoll
>Priority: Minor
> Fix For: 3.2, 4.0
>
> Attachments: SOLR-2366.patch, SOLR-2366.patch
>
>
> There really is no reason why the range gap for date and numeric faceting 
> needs to be evenly spaced.  For instance, if and when SOLR-1581 is completed 
> and one were doing spatial distance calculations, one could facet by function 
> into 3 different-sized buckets: walking distance (0-5KM), driving distance 
> (5KM-150KM), and everything else (150KM+).  We should be able to 
> quantize the results into arbitrarily sized buckets.  I'd propose the syntax 
> to be a comma separate

Re: Unsupported encoding GB18030

2011-04-02 Thread Chris Hostetter

: I don't see the reason why "exampledocs" should contain docs with narrow 
charsets not guaranteed to be supported.
: In my opinion this file belongs in the test suite, also since it only 
contains "test" content, unsuitable for demoing.

its purpose for being there is to let *users* "test" whether their servlet 
container + solr combination is working with alternate encodings -- much 
the same reason utf8-example.xml and test_utf8.sh are included in 
exampledocs.

It's a perfectly valid exampledoc for Solr.  it may not work on all 
platforms, but the *.sh files aren't guaranteed to work on all platforms 
either.  if we moved it to the test directory, end users fetching binary 
releases wouldn't get it, and may not be aware that their servlet 
container isn't supporting that charset.

personally i would like to see us add a lot more exampledocs in a lot more 
esoteric encodings, precisely to help end users sanity-test this sort of 
thing.  we frequently get questions from people about character-encoding 
wonkiness, and things like test_utf8.sh, utf8-example.xml, and now 
gb18030-example.xml can help us narrow down where the problem is: their 
client code, their servlet container, or solr?


-Hoss
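
(Editorial aside, not from the thread: a minimal Java check for whether the 
running JVM supports the charset at all, independent of the servlet container.)

{code}
import java.nio.charset.Charset;

public class CharsetCheck {
  public static void main(String[] args) {
    // GB18030 is not one of the six charsets every JVM must provide,
    // so this can legitimately print "false" on some platforms.
    System.out.println("GB18030 supported: " + Charset.isSupported("GB18030"));
  }
}
{code}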

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[HUDSON] Lucene-Solr-tests-only-trunk - Build # 6654 - Still Failing

2011-04-02 Thread Apache Hudson Server
Build: https://hudson.apache.org/hudson/job/Lucene-Solr-tests-only-trunk/6654/

No tests ran.

Build Log (for compile errors):
[...truncated 25 lines...]



-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[HUDSON] Lucene-Solr-tests-only-3.x - Build # 6642 - Still Failing

2011-04-02 Thread Apache Hudson Server
Build: https://hudson.apache.org/hudson/job/Lucene-Solr-tests-only-3.x/6642/

No tests ran.

Build Log (for compile errors):
[...truncated 25 lines...]



-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[HUDSON] Lucene-Solr-tests-only-trunk - Build # 6653 - Failure

2011-04-02 Thread Apache Hudson Server
Build: https://hudson.apache.org/hudson/job/Lucene-Solr-tests-only-trunk/6653/

No tests ran.

Build Log (for compile errors):
[...truncated 25 lines...]



-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[HUDSON] Lucene-Solr-tests-only-3.x - Build # 6641 - Failure

2011-04-02 Thread Apache Hudson Server
Build: https://hudson.apache.org/hudson/job/Lucene-Solr-tests-only-3.x/6641/

No tests ran.

Build Log (for compile errors):
[...truncated 31 lines...]



-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



Re: Unsupported encoding GB18030

2011-04-02 Thread Jan Høydahl
My XP is a VMware instance, SP3 with Oracle's standard Java. I upgraded to Java 
1.6.0_24, but that did not fix it.
Then I installed support for "East Asian languages" and "right to left" in 
Control Panel, rebooted and tried again. No luck.
Then I installed the GB18030 Support Package from 
http://go.microsoft.com/fwlink/?LinkID=26235. No luck.

I don't personally have this issue since I don't run Windows; it was a test I 
did to validate that things work under Windows.

I don't see the reason why "exampledocs" should contain docs with narrow 
charsets not guaranteed to be supported.
In my opinion this file belongs in the test suite, also since it only contains 
"test" content, unsuitable for demoing.

+1 to remove gb18030-example.xml from exampledocs. Not sure if it should be 
moved to a unit test.

--
Jan Høydahl, search solution architect
Cominvent AS - www.cominvent.com

On 1. apr. 2011, at 17.16, Yonik Seeley wrote:

> Being practical, it's all about "If this is likely to fail for enough
> users", as I said in my previous post.
> I don't really know the answer to that at this point.
> 
> -Yonik
> http://www.lucenerevolution.org -- Lucene/Solr User Conference, May
> 25-26, San Francisco
> 
> 
> On Fri, Apr 1, 2011 at 11:12 AM, Uwe Schindler  wrote:
>> Hi Yonik,
>> 
>> I started my virtual box with fresh windows xp snapshot. Downloaded JDK
>> 1.6.0_24 and Solr 3.1.0. Started solr and then "java -jar post.jar *.xml" ->
>> success.
>> 
>> Before we start to "fix" something that's not an issue, you should ask this
>> person which JDK exactly he uses and where he downloaded it. Is it maybe not
>> an Oracle one? (This GB encoding is very common - if a JVM does not support
>> it (it is not required to), it can only be some western-european one like I
>> mentioned in my mail.)
>> 
>> Uwe
>> 
>> -
>> Uwe Schindler
>> H.-H.-Meier-Allee 63, D-28213 Bremen
>> http://www.thetaphi.de
>> eMail: u...@thetaphi.de
>> 
>> 
>>> -Original Message-
>>> From: ysee...@gmail.com [mailto:ysee...@gmail.com] On Behalf Of Yonik
>>> Seeley
>>> Sent: Friday, April 01, 2011 4:21 PM
>>> To: dev@lucene.apache.org
>>> Cc: Robert Muir
>>> Subject: Re: Unsupported encoding GB18030
>>> 
>>> On Fri, Apr 1, 2011 at 10:07 AM, Robert Muir  wrote:
 On Fri, Apr 1, 2011 at 10:00 AM, Yonik Seeley
  wrote:
> On Fri, Apr 1, 2011 at 9:22 AM, Jan Høydahl 
>>> wrote:
>> Testing the new Solr 3.1 release under Windows XP and Java 1.6.0_23
>> 
>> When trying to post example\exampledocs\gb18030-example.xml using
>>> post.jar I get this error:
>> % java -jar post.jar gb18030-example.xml
>> SimplePostTool: version 1.3
>> SimplePostTool: POSTing files to http://localhost:8983/solr/update..
>> SimplePostTool: POSTing file gb18030-example.xml
>> SimplePostTool: FATAL: Solr returned an error #400 Unsupported
>> encoding: GB18030
>> 
>> From the stack it is caused by com.ctc.wstx.exc.WstxIOException:
>> Unsupported encoding: GB18030
>> 
>> The same works on my MacBook with Java1.6.0_24
> 
> Interesting - things seem fine for me on Win7 Java1.6.0_24, but I
> don't have XP around any longer to see if that's the factor somehow.
> 
 
 Its worth mentioning, there is no guarantee the JRE will support
 GB18030 encoding.
 
 There are only 6 charsets guaranteed to exist:
 http://download.oracle.com/javase/6/docs/api/java/nio/charset/Charset.
 html
>>> 
>>> Indexing *.xml is a very common thing for new users to do.
>>> If this is likely to fail for enough users, we should move, remove, or at
>> least
>>> change the filename to something like gb18030-example.xml.gb18030 so it
>>> won't get picked up by accident.
>>> 
>>> -Yonik
>>> http://www.lucenerevolution.org -- Lucene/Solr User Conference, May 25-
>>> 26, San Francisco
>>> 
>>> -
>>> To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional
>>> commands, e-mail: dev-h...@lucene.apache.org
>> 
>> 
>> 
> 
> -
> To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
> For additional commands, e-mail: dev-h...@lucene.apache.org
> 


-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Updated] (SOLR-2450) Carrot2 clustering should use both its own and Solr's stop words

2011-04-02 Thread Stanislaw Osinski (JIRA)

 [ 
https://issues.apache.org/jira/browse/SOLR-2450?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Stanislaw Osinski updated SOLR-2450:


Attachment: SOLR-2450.patch

Patch for the use of stop words from the field's {{StopWordFilterFactory}} and 
{{CommonGramsFilterFactory}} in addition to Carrot2's built-in stop words.

Requires the SOLR-2448 and SOLR-2449 patches applied. 

> Carrot2 clustering should use both its own and Solr's stop words
> 
>
> Key: SOLR-2450
> URL: https://issues.apache.org/jira/browse/SOLR-2450
> Project: Solr
>  Issue Type: Improvement
>  Components: contrib - Clustering
>Reporter: Stanislaw Osinski
>Assignee: Stanislaw Osinski
>Priority: Minor
> Fix For: 3.2, 4.0
>
> Attachments: SOLR-2450.patch
>
>
> While using only Solr's stop words for clustering isn't a good idea (compared 
> to indexing, clustering needs more aggressive stop word removal to get 
> reasonable cluster labels), it would be good if Carrot2 used both its own and 
> Solr's stop words.
> I'm not sure what the best way to implement this would be though. My first 
> thought was to simply load {{stopwords.txt}} from Solr config dir and merge 
> them with Carrot2's. But then, maybe a better approach would be to get the 
> stop words from the StopFilter being used? Ideally, we should also consider 
> the per-field stop filters configured on the fields used for clustering.

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Updated] (SOLR-2449) Loading of Carrot2 resources from Solr config directory

2011-04-02 Thread Stanislaw Osinski (JIRA)

 [ 
https://issues.apache.org/jira/browse/SOLR-2449?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Stanislaw Osinski updated SOLR-2449:


Attachment: SOLR-2449.patch

The patch requires the SOLR-2448 patch applied.

> Loading of Carrot2 resources from Solr config directory
> ---
>
> Key: SOLR-2449
> URL: https://issues.apache.org/jira/browse/SOLR-2449
> Project: Solr
>  Issue Type: Improvement
>  Components: contrib - Clustering
>Reporter: Stanislaw Osinski
>Assignee: Stanislaw Osinski
> Fix For: 3.2, 4.0
>
> Attachments: SOLR-2449.patch
>
>
> Currently, Carrot2 clustering algorithms read linguistic resources (stop 
> words, stop labels) from the classpath (Carrot2 JAR), which makes them 
> difficult to edit/override. The directory from which Carrot2 should read its 
> resources (absolute, or relative to Solr config dir) could be specified in 
> the {{engine}} element. By default, the path could be e.g. 
> {{/clustering/carrot2}}.
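
(Editorial sketch of the idea, not the attached patch: resolving a Carrot2 
lexical resource relative to the Solr config dir via Solr's resource loader. The 
{{clustering/carrot2}} path is the default suggested above; the file name is 
hypothetical.)

{code}
import java.io.InputStream;
import org.apache.solr.core.SolrResourceLoader;

public class Carrot2ResourceLoading {
  // Resolve a lexical resource relative to the Solr config dir.
  static InputStream openStopwords(SolrResourceLoader loader) throws Exception {
    // openResource() checks the config dir first, then the classpath,
    // so a file dropped into conf/clustering/carrot2/ overrides the JAR.
    return loader.openResource("clustering/carrot2/stopwords.en");
  }
}
{code}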

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Updated] (SOLR-2448) Upgrade Carrot2 to version 3.5.0

2011-04-02 Thread Stanislaw Osinski (JIRA)

 [ 
https://issues.apache.org/jira/browse/SOLR-2448?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Stanislaw Osinski updated SOLR-2448:


Attachment: SOLR-2448.zip

Initial patch (git) based on Carrot2 3.5.0-dev, against Solr trunk. As soon as 
we make the stable 3.5.0 release, I'll submit the final patch for your review.

> Upgrade Carrot2 to version 3.5.0
> 
>
> Key: SOLR-2448
> URL: https://issues.apache.org/jira/browse/SOLR-2448
> Project: Solr
>  Issue Type: Task
>  Components: contrib - Clustering
>Reporter: Stanislaw Osinski
>Assignee: Stanislaw Osinski
>Priority: Minor
> Fix For: 3.2, 4.0
>
> Attachments: SOLR-2448.zip
>
>
> Carrot2 version 3.5.0 should be available very soon. After the upgrade, it 
> will be possible to implement a few improvements to the clustering plugin; 
> I'll file separate issues for these.

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[HUDSON] Lucene-Solr-tests-only-trunk - Build # 6650 - Still Failing

2011-04-02 Thread Apache Hudson Server
Build: https://hudson.apache.org/hudson/job/Lucene-Solr-tests-only-trunk/6650/

No tests ran.

Build Log (for compile errors):
[...truncated 12 lines...]



-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[HUDSON] Lucene-Solr-tests-only-3.x - Build # 6638 - Still Failing

2011-04-02 Thread Apache Hudson Server
Build: https://hudson.apache.org/hudson/job/Lucene-Solr-tests-only-3.x/6638/

No tests ran.

Build Log (for compile errors):
[...truncated 12 lines...]



-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[HUDSON] Lucene-Solr-tests-only-trunk - Build # 6649 - Still Failing

2011-04-02 Thread Apache Hudson Server
Build: https://hudson.apache.org/hudson/job/Lucene-Solr-tests-only-trunk/6649/

No tests ran.

Build Log (for compile errors):
[...truncated 53 lines...]



-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Commented] (SOLR-2378) FST-based Lookup (suggestions) for prefix matches.

2011-04-02 Thread Ryan McKinley (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-2378?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13015053#comment-13015053
 ] 

Ryan McKinley commented on SOLR-2378:
-

bq. is there any info on threading of Solr components? I am in particular 
looking for mutable object fields in the suggester (can a single suggester 
instance be accessed by multiple threads at the same time)?

Components are initialized at startup, and the same instance is used for every 
request (multi-threaded).

If you need to use the same objects in prepare and process, you can either put 
them in the request context map, or add something to ResponseBuilder (perhaps 
better if this is widely used)
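
(Editorial sketch, not from the thread: what the request-context-map option 
might look like in a custom component. The "myState" key and the StringBuilder 
state are hypothetical, and the class is left abstract so the SolrInfoMBean 
boilerplate a concrete component needs can be omitted here.)

{code}
import java.io.IOException;
import org.apache.solr.handler.component.ResponseBuilder;
import org.apache.solr.handler.component.SearchComponent;

public abstract class StatefulComponent extends SearchComponent {
  // No mutable instance fields: one instance serves all request threads.

  @Override
  public void prepare(ResponseBuilder rb) throws IOException {
    // Per-request state lives in the request context map, not in fields.
    rb.req.getContext().put("myState", new StringBuilder());
  }

  @Override
  public void process(ResponseBuilder rb) throws IOException {
    // The same request object is visible here, so the state carries over.
    StringBuilder state = (StringBuilder) rb.req.getContext().get("myState");
    state.append("processed");
  }
}
{code}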

> FST-based Lookup (suggestions) for prefix matches.
> --
>
> Key: SOLR-2378
> URL: https://issues.apache.org/jira/browse/SOLR-2378
> Project: Solr
>  Issue Type: New Feature
>  Components: spellchecker
>Reporter: Dawid Weiss
>Assignee: Dawid Weiss
>  Labels: lookup, prefix
> Fix For: 4.0
>
>
> Implement a subclass of Lookup based on finite state automata/ transducers 
> (Lucene FST package). This issue is for implementing a relatively basic 
> prefix matcher, we will handle infixes and other types of input matches 
> gradually. Impl. phases:
> - write a DFA based suggester effectively identical to ternary tree based 
> solution right now,
> - baseline benchmark against tern. tree (memory consumption, rebuilding 
> speed, indexing speed; reuse Andrzej's benchmark code)
> - modify DFA to encode term weights directly in the automaton (optimize for 
> onlyMostPopular case)
> - benchmark again
> - add infix suggestion support with prefix matches boosted higher (?)
> - benchmark again
> - modify the tutorial on the wiki [http://wiki.apache.org/solr/Suggester]

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[HUDSON] Lucene-Solr-tests-only-3.x - Build # 6637 - Still Failing

2011-04-02 Thread Apache Hudson Server
Build: https://hudson.apache.org/hudson/job/Lucene-Solr-tests-only-3.x/6637/

No tests ran.

Build Log (for compile errors):
[...truncated 12 lines...]



-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[HUDSON] Lucene-Solr-tests-only-trunk - Build # 6648 - Failure

2011-04-02 Thread Apache Hudson Server
Build: https://hudson.apache.org/hudson/job/Lucene-Solr-tests-only-trunk/6648/

No tests ran.

Build Log (for compile errors):
[...truncated 25 lines...]



-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[HUDSON] Lucene-Solr-tests-only-3.x - Build # 6636 - Failure

2011-04-02 Thread Apache Hudson Server
Build: https://hudson.apache.org/hudson/job/Lucene-Solr-tests-only-3.x/6636/

No tests ran.

Build Log (for compile errors):
[...truncated 12563 lines...]



-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



RE: svn commit: r1087649 - /lucene/dev/trunk/solr/src/test/org/apache/solr/spelling/suggest/SuggesterTest.java

2011-04-02 Thread Steven A Rowe
Maven build fix committed to trunk: r1088065.

> -Original Message-
> From: Steven A Rowe [mailto:sar...@syr.edu]
> Sent: Saturday, April 02, 2011 11:04 AM
> To: dev@lucene.apache.org
> Subject: RE: svn commit: r1087649 -
> /lucene/dev/trunk/solr/src/test/org/apache/solr/spelling/suggest/Suggester
> Test.java
> 
> FYI, this commit introduced a new Solr Core test dependency on Google
> Collections (solr/lib/guava-r05.jar).  Previously, AFAICT, only the
> Clustering contrib had this dependency.
> 
> I noticed because the Maven trunk build failed.  I'm working on addressing
> this now.
> 
> Steve
> 
> > -Original Message-
> > From: dwe...@apache.org [mailto:dwe...@apache.org]
> > Sent: Friday, April 01, 2011 7:13 AM
> > To: comm...@lucene.apache.org
> > Subject: svn commit: r1087649 -
> >
> /lucene/dev/trunk/solr/src/test/org/apache/solr/spelling/suggest/Suggester
> > Test.java
> >
> > Author: dweiss
> > Date: Fri Apr  1 11:13:10 2011
> > New Revision: 1087649
> >
> > URL: http://svn.apache.org/viewvc?rev=1087649&view=rev
> > Log:
> > SOLR-2378: Cleaning up the benchmark code a little, committing right in
> > without the patch.
> >
> > Modified:
> >
> >
> lucene/dev/trunk/solr/src/test/org/apache/solr/spelling/suggest/SuggesterT
> > est.java
> >
> > Modified:
> >
> lucene/dev/trunk/solr/src/test/org/apache/solr/spelling/suggest/SuggesterT
> > est.java
> > URL:
> >
> http://svn.apache.org/viewvc/lucene/dev/trunk/solr/src/test/org/apache/sol
> >
> r/spelling/suggest/SuggesterTest.java?rev=1087649&r1=1087648&r2=1087649&vi
> > ew=diff
> >
> ==
> > 
> > ---
> >
> lucene/dev/trunk/solr/src/test/org/apache/solr/spelling/suggest/SuggesterT
> > est.java (original)
> > +++
> >
> lucene/dev/trunk/solr/src/test/org/apache/solr/spelling/suggest/SuggesterT
> > est.java Fri Apr  1 11:13:10 2011
> > @@ -27,9 +27,13 @@ import org.apache.solr.util.TermFreqIter
> >  import org.junit.BeforeClass;
> >  import org.junit.Test;
> >
> > +import com.google.common.collect.Lists;
> > +
> >  import java.io.File;
> > +import java.util.Arrays;
> >  import java.util.HashMap;
> >  import java.util.List;
> > +import java.util.Locale;
> >  import java.util.Map;
> >  import java.util.Random;
> >
> > @@ -130,10 +134,53 @@ public class SuggesterTest extends SolrT
> >  return tfit;
> >}
> >
> > -  private void _benchmark(Lookup lookup, Map<String, Integer> ref, boolean
> > estimate, Bench bench) throws Exception {
> > +  static class Bench {
> > +long buildTime;
> > +long lookupTime;
> > +  }
> > +
> > +  @Test
> > +  public void testBenchmark() throws Exception {
> > +// this benchmark is very time consuming
> > +boolean doTest = true;
> > +if (!doTest) {
> > +  return;
> > +}
> > +
> > +final List<Class<? extends Lookup>> benchmarkClasses =
> > Lists.newArrayList();
> > +benchmarkClasses.add(JaspellLookup.class);
> > +benchmarkClasses.add(TSTLookup.class);
> > +
> > +// Run a single pass just to see if everything works fine and
> provide
> > size estimates.
> > +final RamUsageEstimator rue = new RamUsageEstimator();
> > +for (Class<? extends Lookup> cls : benchmarkClasses) {
> > +  Lookup lookup = singleBenchmark(cls, null);
> > +  System.err.println(
> > +  String.format(Locale.ENGLISH,
> > +  "%20s, size[B]=%,d",
> > +  lookup.getClass().getSimpleName(),
> > +  rue.estimateRamUsage(lookup)));
> > +}
> > +
> > +int warmupCount = 10;
> > +int measuredCount = 100;
> > +for (Class<? extends Lookup> cls : benchmarkClasses) {
> > +  Bench b = fullBenchmark(cls, warmupCount, measuredCount);
> > +  System.err.println(String.format(Locale.ENGLISH,
> > +  "%s: buildTime[ms]=%,d lookupTime[ms]=%,d",
> > +  cls.getSimpleName(),
> > +  (b.buildTime / measuredCount),
> > +  (b.lookupTime / measuredCount / 100)));
> > +}
> > +  }
> > +
> > +  private Lookup singleBenchmark(Class<? extends Lookup> cls, Bench
> > bench) throws Exception {
> > +Lookup lookup = cls.newInstance();
> > +
> >  long start = System.currentTimeMillis();
> >  lookup.build(getTFIT());
> >  long buildTime = System.currentTimeMillis() - start;
> > +
> >  TermFreqIterator tfit = getTFIT();
> >  long elapsed = 0;
> >  while (tfit.hasNext()) {
> > @@ -148,78 +195,37 @@ public class SuggesterTest extends SolrT
> >for (LookupResult lr : res) {
> >  assertTrue(lr.key.startsWith(prefix));
> >}
> > -  if (ref != null) { // verify the counts
> > -Integer Cnt = ref.get(key);
> > -if (Cnt == null) { // first pass
> > -  ref.put(key, res.size());
> > -} else {
> > -  assertEquals(key + ", prefix: " + prefix, Cnt.intValue(),
> > res.size());
> > -}
> > -  }
> > -}
> > -if (estimate) {
> > -  RamUsageEstimator rue = new RamUsageEstimator();
> > -  long size = rue.estimateRamUsage(lookup);
> > -  System.er

RE: svn commit: r1087649 - /lucene/dev/trunk/solr/src/test/org/apache/solr/spelling/suggest/SuggesterTest.java

2011-04-02 Thread Steven A Rowe
FYI, this commit introduced a new Solr Core test dependency on Google 
Collections (solr/lib/guava-r05.jar).  Previously, AFAICT, only the Clustering 
contrib had this dependency.

I noticed because the Maven trunk build failed.  I'm working on addressing this 
now.

Steve

> -Original Message-
> From: dwe...@apache.org [mailto:dwe...@apache.org]
> Sent: Friday, April 01, 2011 7:13 AM
> To: comm...@lucene.apache.org
> Subject: svn commit: r1087649 -
> /lucene/dev/trunk/solr/src/test/org/apache/solr/spelling/suggest/Suggester
> Test.java
> 
> Author: dweiss
> Date: Fri Apr  1 11:13:10 2011
> New Revision: 1087649
> 
> URL: http://svn.apache.org/viewvc?rev=1087649&view=rev
> Log:
> SOLR-2378: Cleaning up the benchmark code a little, committing right in
> without the patch.
> 
> Modified:
> 
> lucene/dev/trunk/solr/src/test/org/apache/solr/spelling/suggest/SuggesterT
> est.java
> 
> Modified:
> lucene/dev/trunk/solr/src/test/org/apache/solr/spelling/suggest/SuggesterT
> est.java
> URL:
> http://svn.apache.org/viewvc/lucene/dev/trunk/solr/src/test/org/apache/sol
> r/spelling/suggest/SuggesterTest.java?rev=1087649&r1=1087648&r2=1087649&vi
> ew=diff
> ==
> 
> ---
> lucene/dev/trunk/solr/src/test/org/apache/solr/spelling/suggest/SuggesterT
> est.java (original)
> +++
> lucene/dev/trunk/solr/src/test/org/apache/solr/spelling/suggest/SuggesterT
> est.java Fri Apr  1 11:13:10 2011
> @@ -27,9 +27,13 @@ import org.apache.solr.util.TermFreqIter
>  import org.junit.BeforeClass;
>  import org.junit.Test;
> 
> +import com.google.common.collect.Lists;
> +
>  import java.io.File;
> +import java.util.Arrays;
>  import java.util.HashMap;
>  import java.util.List;
> +import java.util.Locale;
>  import java.util.Map;
>  import java.util.Random;
> 
> @@ -130,10 +134,53 @@ public class SuggesterTest extends SolrT
>  return tfit;
>}
> 
> -  private void _benchmark(Lookup lookup, Map<String, Integer> ref, boolean
> estimate, Bench bench) throws Exception {
> +  static class Bench {
> +long buildTime;
> +long lookupTime;
> +  }
> +
> +  @Test
> +  public void testBenchmark() throws Exception {
> +// this benchmark is very time consuming
> +boolean doTest = true;
> +if (!doTest) {
> +  return;
> +}
> +
> +final List<Class<? extends Lookup>> benchmarkClasses =
> Lists.newArrayList();
> +benchmarkClasses.add(JaspellLookup.class);
> +benchmarkClasses.add(TSTLookup.class);
> +
> +// Run a single pass just to see if everything works fine and provide
> size estimates.
> +final RamUsageEstimator rue = new RamUsageEstimator();
> +for (Class<? extends Lookup> cls : benchmarkClasses) {
> +  Lookup lookup = singleBenchmark(cls, null);
> +  System.err.println(
> +  String.format(Locale.ENGLISH,
> +  "%20s, size[B]=%,d",
> +  lookup.getClass().getSimpleName(),
> +  rue.estimateRamUsage(lookup)));
> +}
> +
> +int warmupCount = 10;
> +int measuredCount = 100;
> +for (Class<? extends Lookup> cls : benchmarkClasses) {
> +  Bench b = fullBenchmark(cls, warmupCount, measuredCount);
> +  System.err.println(String.format(Locale.ENGLISH,
> +  "%s: buildTime[ms]=%,d lookupTime[ms]=%,d",
> +  cls.getSimpleName(),
> +  (b.buildTime / measuredCount),
> +  (b.lookupTime / measuredCount / 100)));
> +}
> +  }
> +
> +  private Lookup singleBenchmark(Class<? extends Lookup> cls, Bench
> bench) throws Exception {
> +Lookup lookup = cls.newInstance();
> +
>  long start = System.currentTimeMillis();
>  lookup.build(getTFIT());
>  long buildTime = System.currentTimeMillis() - start;
> +
>  TermFreqIterator tfit = getTFIT();
>  long elapsed = 0;
>  while (tfit.hasNext()) {
> @@ -148,78 +195,37 @@ public class SuggesterTest extends SolrT
>for (LookupResult lr : res) {
>  assertTrue(lr.key.startsWith(prefix));
>}
> -  if (ref != null) { // verify the counts
> -Integer Cnt = ref.get(key);
> -if (Cnt == null) { // first pass
> -  ref.put(key, res.size());
> -} else {
> -  assertEquals(key + ", prefix: " + prefix, Cnt.intValue(),
> res.size());
> -}
> -  }
> -}
> -if (estimate) {
> -  RamUsageEstimator rue = new RamUsageEstimator();
> -  long size = rue.estimateRamUsage(lookup);
> -  System.err.println(lookup.getClass().getSimpleName() + " - size=" +
> size);
>  }
> +
>  if (bench != null) {
>bench.buildTime += buildTime;
>bench.lookupTime +=  elapsed;
>  }
> -  }
> -
> -  class Bench {
> -long buildTime;
> -long lookupTime;
> -  }
> 
> -  @Test
> -  public void testBenchmark() throws Exception {
> -// this benchmark is very time consuming
> -boolean doTest = false;
> -if (!doTest) {
> -  return;
> -}
> -Map<String, Integer> ref = new HashMap<String, Integer>();
> -JaspellLookup jaspell = new JaspellLookup();
> -TSTLookup tst = new TSTLookup();
> -
> - 

[JENKINS-MAVEN] Lucene-Solr-Maven-trunk #77: POMs out of sync

2011-04-02 Thread Apache Hudson Server
Build: https://hudson.apache.org/hudson/job/Lucene-Solr-Maven-trunk/77/

No tests ran.

Build Log (for compile errors):
[...truncated 14627 lines...]



-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Commented] (SOLR-2155) Geospatial search using geohash prefixes

2011-04-02 Thread Grant Ingersoll (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-2155?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13015012#comment-13015012
 ] 

Grant Ingersoll commented on SOLR-2155:
---

If the intent is to bring "lucene-spatial-playground" into the ASF, why 
not just start a branch?  It will make provenance so much easier.

> Geospatial search using geohash prefixes
> 
>
> Key: SOLR-2155
> URL: https://issues.apache.org/jira/browse/SOLR-2155
> Project: Solr
>  Issue Type: Improvement
>Reporter: David Smiley
>Assignee: Grant Ingersoll
> Attachments: GeoHashPrefixFilter.patch, GeoHashPrefixFilter.patch, 
> GeoHashPrefixFilter.patch, 
> SOLR-2155_GeoHashPrefixFilter_with_sorting_no_poly.patch, SOLR.2155.p3.patch, 
> SOLR.2155.p3tests.patch
>
>
> There currently isn't a solution in Solr for doing geospatial filtering on 
> documents that have a variable number of points.  This scenario occurs when 
> there is location extraction (i.e. via a "gazetteer") occurring on free text.  
> None, one, or many geospatial locations might be extracted from any given 
> document and users want to limit their search results to those occurring in a 
> user-specified area.
> I've implemented this by furthering the GeoHash based work in Lucene/Solr 
> with a geohash prefix based filter.  A geohash refers to a lat-lon box on the 
> earth.  Each successive character added further subdivides the box into a 4x8 
> (or 8x4 depending on the even/odd length of the geohash) grid.  The first 
> step in this scheme is figuring out which geohash grid squares cover the 
> user's search query.  I've added various extra methods to GeoHashUtils (and 
> added tests) to assist in this purpose.  The next step is an actual Lucene 
> Filter, GeoHashPrefixFilter, that uses these geohash prefixes in 
> TermsEnum.seek() to skip to relevant grid squares in the index.  Once a 
> matching geohash grid is found, the points therein are compared against the 
> user's query to see if it matches.  I created an abstraction GeoShape 
> extended by subclasses named PointDistance... and CartesianBox to support 
> different queried shapes so that the filter need not care about these details.
> This work was presented at LuceneRevolution in Boston on October 8th.
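
(Editorial sketch, not from the patch: a bare-bones geohash encoder showing the 
prefix-nesting property the filter exploits -- each added character subdivides 
the current cell, so a point's longer hash always extends its shorter one, and a 
prefix identifies a whole grid square.)

{code}
public class GeoHash {
  private static final String BASE32 = "0123456789bcdefghjkmnpqrstuvwxyz";

  static String encode(double lat, double lon, int precision) {
    double[] latRange = {-90, 90}, lonRange = {-180, 180};
    StringBuilder hash = new StringBuilder();
    boolean evenBit = true; // even bits refine longitude, odd bits latitude
    int bit = 0, ch = 0;
    while (hash.length() < precision) {
      double[] range = evenBit ? lonRange : latRange;
      double v = evenBit ? lon : lat;
      double mid = (range[0] + range[1]) / 2;
      ch <<= 1;
      if (v >= mid) { ch |= 1; range[0] = mid; } else { range[1] = mid; }
      evenBit = !evenBit;
      if (++bit == 5) { hash.append(BASE32.charAt(ch)); bit = 0; ch = 0; }
    }
    return hash.toString();
  }

  public static void main(String[] args) {
    System.out.println(encode(42.6, -5.6, 5)); // "ezs42" (well-known example)
    System.out.println(encode(42.6, -5.6, 3)); // "ezs" -- a prefix of the above
  }
}
{code}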

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Commented] (SOLR-2378) FST-based Lookup (suggestions) for prefix matches.

2011-04-02 Thread Dawid Weiss (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-2378?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13015011#comment-13015011
 ] 

Dawid Weiss commented on SOLR-2378:
---

Soliciting feedback on the following questions:
- suggesters currently have float weights associated with terms; can these 
floats be bucketed and returned as approximations, or do they need to be exact 
copies of the input? For automata, bucketed weights (to, let's say, 5-10 
different values) provide terrific speed/size improvements, so if exactness is 
not a rigid requirement, I'd use them.
- is there any info on threading of Solr components? I am in particular looking 
for mutable object fields in the suggester (can a single suggester instance be 
accessed by multiple threads at the same time)?

I've implemented a preliminary FST-based lookup (without weights yet). 
Speed-wise it doesn't rock, because data is converted to/from utf8 on 
input/output and sorted during construction, but it is still acceptable, even at 
this early stage, I think:

{noformat}
JaspellLookup    buildTime[ms]=112    lookupTime[ms]=288
TSTLookup        buildTime[ms]=115    lookupTime[ms]=103
FSTLookup        buildTime[ms]=464    lookupTime[ms]=145
{noformat}

now... that was speed only, check out the in-memory size :)

{noformat}
JaspellLookup    size[B]=81,078,997
TSTLookup        size[B]=53,453,696
FSTLookup        size[B]=2,909,396
{noformat}

(This benchmark stores very limited vocabulary items -- long numbers only, so 
it is skewed from reality, but it's nice to see something like this, huh?).
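
(Editorial sketch, not from the comment: one way float weights could be 
quantized into a handful of buckets, assuming the min/max weight are known when 
the automaton is built.)

{code}
public class WeightBuckets {
  // Map a float weight onto one of `buckets` discrete levels.
  static int bucket(float w, float min, float max, int buckets) {
    if (max <= min) {
      return 0; // degenerate range: everything lands in one bucket
    }
    int b = (int) (((w - min) / (max - min)) * buckets);
    return Math.min(b, buckets - 1); // clamp w == max into the top bucket
  }

  public static void main(String[] args) {
    // 5 buckets over weights in [0, 100]: [0,20) -> 0, ..., [80,100] -> 4
    System.out.println(bucket(7.5f, 0f, 100f, 5));  // 0
    System.out.println(bucket(99.9f, 0f, 100f, 5)); // 4
  }
}
{code}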


> FST-based Lookup (suggestions) for prefix matches.
> --
>
> Key: SOLR-2378
> URL: https://issues.apache.org/jira/browse/SOLR-2378
> Project: Solr
>  Issue Type: New Feature
>  Components: spellchecker
>Reporter: Dawid Weiss
>Assignee: Dawid Weiss
>  Labels: lookup, prefix
> Fix For: 4.0
>
>
> Implement a subclass of Lookup based on finite state automata/ transducers 
> (Lucene FST package). This issue is for implementing a relatively basic 
> prefix matcher, we will handle infixes and other types of input matches 
> gradually. Impl. phases:
> - write a DFA based suggester effectively identical to ternary tree based 
> solution right now,
> - baseline benchmark against tern. tree (memory consumption, rebuilding 
> speed, indexing speed; reuse Andrzej's benchmark code)
> - modify DFA to encode term weights directly in the automaton (optimize for 
> onlyMostPopular case)
> - benchmark again
> - add infix suggestion support with prefix matches boosted higher (?)
> - benchmark again
> - modify the tutorial on the wiki [http://wiki.apache.org/solr/Suggester]

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[JENKINS-MAVEN] Lucene-Solr-Maven-3.x #80: POMs out of sync

2011-04-02 Thread Apache Hudson Server
Build: https://hudson.apache.org/hudson/job/Lucene-Solr-Maven-3.x/80/

No tests ran.

Build Log (for compile errors):
[...truncated 25 lines...]



-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[HUDSON] Lucene-Solr-tests-only-trunk - Build # 6636 - Still Failing

2011-04-02 Thread Apache Hudson Server
Build: https://hudson.apache.org/hudson/job/Lucene-Solr-tests-only-trunk/6636/

No tests ran.

Build Log (for compile errors):
[...truncated 25 lines...]



-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[HUDSON] Lucene-Solr-tests-only-3.x - Build # 6624 - Failure

2011-04-02 Thread Apache Hudson Server
Build: https://hudson.apache.org/hudson/job/Lucene-Solr-tests-only-3.x/6624/

No tests ran.

Build Log (for compile errors):
[...truncated 12571 lines...]



-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[HUDSON] Lucene-Solr-tests-only-trunk - Build # 6635 - Failure

2011-04-02 Thread Apache Hudson Server
Build: https://hudson.apache.org/hudson/job/Lucene-Solr-tests-only-trunk/6635/

1 tests failed.
REGRESSION:  org.apache.solr.spelling.suggest.SuggesterTest.testBenchmark

Error Message:
Java heap space

Stack Trace:
java.lang.OutOfMemoryError: Java heap space
at java.util.IdentityHashMap.resize(IdentityHashMap.java:469)
at java.util.IdentityHashMap.put(IdentityHashMap.java:445)
at 
org.apache.lucene.util.RamUsageEstimator.size(RamUsageEstimator.java:128)
at 
org.apache.lucene.util.RamUsageEstimator.size(RamUsageEstimator.java:153)
at 
org.apache.lucene.util.RamUsageEstimator.sizeOfArray(RamUsageEstimator.java:178)
at 
org.apache.lucene.util.RamUsageEstimator.size(RamUsageEstimator.java:132)
at 
org.apache.lucene.util.RamUsageEstimator.size(RamUsageEstimator.java:153)
at 
org.apache.lucene.util.RamUsageEstimator.sizeOfArray(RamUsageEstimator.java:178)
at 
org.apache.lucene.util.RamUsageEstimator.size(RamUsageEstimator.java:132)
at 
org.apache.lucene.util.RamUsageEstimator.size(RamUsageEstimator.java:153)
at 
org.apache.lucene.util.RamUsageEstimator.sizeOfArray(RamUsageEstimator.java:178)
at 
org.apache.lucene.util.RamUsageEstimator.size(RamUsageEstimator.java:132)
at 
org.apache.lucene.util.RamUsageEstimator.size(RamUsageEstimator.java:153)
at 
org.apache.lucene.util.RamUsageEstimator.sizeOfArray(RamUsageEstimator.java:178)
at 
org.apache.lucene.util.RamUsageEstimator.size(RamUsageEstimator.java:132)
at 
org.apache.lucene.util.RamUsageEstimator.size(RamUsageEstimator.java:153)
at 
org.apache.lucene.util.RamUsageEstimator.sizeOfArray(RamUsageEstimator.java:178)
at 
org.apache.lucene.util.RamUsageEstimator.size(RamUsageEstimator.java:132)
at 
org.apache.lucene.util.RamUsageEstimator.size(RamUsageEstimator.java:153)
at 
org.apache.lucene.util.RamUsageEstimator.sizeOfArray(RamUsageEstimator.java:178)
at 
org.apache.lucene.util.RamUsageEstimator.size(RamUsageEstimator.java:132)
at 
org.apache.lucene.util.RamUsageEstimator.size(RamUsageEstimator.java:153)
at 
org.apache.lucene.util.RamUsageEstimator.sizeOfArray(RamUsageEstimator.java:178)
at 
org.apache.lucene.util.RamUsageEstimator.size(RamUsageEstimator.java:132)
at 
org.apache.lucene.util.RamUsageEstimator.size(RamUsageEstimator.java:153)
at 
org.apache.lucene.util.RamUsageEstimator.sizeOfArray(RamUsageEstimator.java:178)
at 
org.apache.lucene.util.RamUsageEstimator.size(RamUsageEstimator.java:132)
at 
org.apache.lucene.util.RamUsageEstimator.size(RamUsageEstimator.java:153)
at 
org.apache.lucene.util.RamUsageEstimator.sizeOfArray(RamUsageEstimator.java:178)
at 
org.apache.lucene.util.RamUsageEstimator.size(RamUsageEstimator.java:132)
at 
org.apache.lucene.util.RamUsageEstimator.size(RamUsageEstimator.java:153)
at 
org.apache.lucene.util.RamUsageEstimator.sizeOfArray(RamUsageEstimator.java:178)




Build Log (for compile errors):
[...truncated 8746 lines...]



-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



Lookup (suggester) instances thread safety?

2011-04-02 Thread Dawid Weiss
Are the same Lookup instances used by multiple threads after creation
or are they stored somehow per-thread? This matters because certain
data structures could be statically allocated per Lookup object and
reused, instead of being allocated dynamically. I will dig in the
code, but somebody will probably know right away.

Dawid

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[HUDSON] Lucene-Solr-tests-only-trunk - Build # 6633 - Failure

2011-04-02 Thread Apache Hudson Server
Build: https://hudson.apache.org/hudson/job/Lucene-Solr-tests-only-trunk/6633/

1 tests failed.
REGRESSION:  org.apache.solr.spelling.suggest.SuggesterTest.testBenchmark

Error Message:
Java heap space

Stack Trace:
java.lang.OutOfMemoryError: Java heap space
at java.util.IdentityHashMap.resize(IdentityHashMap.java:469)
at java.util.IdentityHashMap.put(IdentityHashMap.java:445)
at 
org.apache.lucene.util.RamUsageEstimator.size(RamUsageEstimator.java:128)
at 
org.apache.lucene.util.RamUsageEstimator.size(RamUsageEstimator.java:153)
at 
org.apache.lucene.util.RamUsageEstimator.sizeOfArray(RamUsageEstimator.java:178)
at 
org.apache.lucene.util.RamUsageEstimator.size(RamUsageEstimator.java:132)
at 
org.apache.lucene.util.RamUsageEstimator.size(RamUsageEstimator.java:153)
at 
org.apache.lucene.util.RamUsageEstimator.sizeOfArray(RamUsageEstimator.java:178)
at 
org.apache.lucene.util.RamUsageEstimator.size(RamUsageEstimator.java:132)
at 
org.apache.lucene.util.RamUsageEstimator.size(RamUsageEstimator.java:153)
at 
org.apache.lucene.util.RamUsageEstimator.sizeOfArray(RamUsageEstimator.java:178)
at 
org.apache.lucene.util.RamUsageEstimator.size(RamUsageEstimator.java:132)
at 
org.apache.lucene.util.RamUsageEstimator.size(RamUsageEstimator.java:153)
at 
org.apache.lucene.util.RamUsageEstimator.sizeOfArray(RamUsageEstimator.java:178)
at 
org.apache.lucene.util.RamUsageEstimator.size(RamUsageEstimator.java:132)
at 
org.apache.lucene.util.RamUsageEstimator.size(RamUsageEstimator.java:153)
at 
org.apache.lucene.util.RamUsageEstimator.sizeOfArray(RamUsageEstimator.java:178)
at 
org.apache.lucene.util.RamUsageEstimator.size(RamUsageEstimator.java:132)
at 
org.apache.lucene.util.RamUsageEstimator.size(RamUsageEstimator.java:153)
at 
org.apache.lucene.util.RamUsageEstimator.sizeOfArray(RamUsageEstimator.java:178)
at 
org.apache.lucene.util.RamUsageEstimator.size(RamUsageEstimator.java:132)
at 
org.apache.lucene.util.RamUsageEstimator.size(RamUsageEstimator.java:153)
at 
org.apache.lucene.util.RamUsageEstimator.sizeOfArray(RamUsageEstimator.java:178)
at 
org.apache.lucene.util.RamUsageEstimator.size(RamUsageEstimator.java:132)
at 
org.apache.lucene.util.RamUsageEstimator.size(RamUsageEstimator.java:153)
at 
org.apache.lucene.util.RamUsageEstimator.sizeOfArray(RamUsageEstimator.java:178)
at 
org.apache.lucene.util.RamUsageEstimator.size(RamUsageEstimator.java:132)
at 
org.apache.lucene.util.RamUsageEstimator.size(RamUsageEstimator.java:153)
at 
org.apache.lucene.util.RamUsageEstimator.sizeOfArray(RamUsageEstimator.java:178)
at 
org.apache.lucene.util.RamUsageEstimator.size(RamUsageEstimator.java:132)
at 
org.apache.lucene.util.RamUsageEstimator.size(RamUsageEstimator.java:153)
at 
org.apache.lucene.util.RamUsageEstimator.sizeOfArray(RamUsageEstimator.java:178)




Build Log (for compile errors):
[...truncated 8759 lines...]



-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[HUDSON] Solr-trunk - Build # 1462 - Failure

2011-04-02 Thread Apache Hudson Server
Build: https://hudson.apache.org/hudson/job/Solr-trunk/1462/

1 tests failed.
REGRESSION:  org.apache.solr.spelling.suggest.SuggesterTest.testBenchmark

Error Message:
Java heap space

Stack Trace:
java.lang.OutOfMemoryError: Java heap space
at java.util.IdentityHashMap.resize(IdentityHashMap.java:469)
at java.util.IdentityHashMap.put(IdentityHashMap.java:445)
at 
org.apache.lucene.util.RamUsageEstimator.size(RamUsageEstimator.java:128)
at 
org.apache.lucene.util.RamUsageEstimator.size(RamUsageEstimator.java:153)
at 
org.apache.lucene.util.RamUsageEstimator.sizeOfArray(RamUsageEstimator.java:178)
at 
org.apache.lucene.util.RamUsageEstimator.size(RamUsageEstimator.java:132)
at 
org.apache.lucene.util.RamUsageEstimator.size(RamUsageEstimator.java:153)
at 
org.apache.lucene.util.RamUsageEstimator.sizeOfArray(RamUsageEstimator.java:178)
at 
org.apache.lucene.util.RamUsageEstimator.size(RamUsageEstimator.java:132)
at 
org.apache.lucene.util.RamUsageEstimator.size(RamUsageEstimator.java:153)
at 
org.apache.lucene.util.RamUsageEstimator.sizeOfArray(RamUsageEstimator.java:178)
at 
org.apache.lucene.util.RamUsageEstimator.size(RamUsageEstimator.java:132)
at 
org.apache.lucene.util.RamUsageEstimator.size(RamUsageEstimator.java:153)
at 
org.apache.lucene.util.RamUsageEstimator.sizeOfArray(RamUsageEstimator.java:178)
at 
org.apache.lucene.util.RamUsageEstimator.size(RamUsageEstimator.java:132)
at 
org.apache.lucene.util.RamUsageEstimator.size(RamUsageEstimator.java:153)
at 
org.apache.lucene.util.RamUsageEstimator.sizeOfArray(RamUsageEstimator.java:178)
at 
org.apache.lucene.util.RamUsageEstimator.size(RamUsageEstimator.java:132)
at 
org.apache.lucene.util.RamUsageEstimator.size(RamUsageEstimator.java:153)
at 
org.apache.lucene.util.RamUsageEstimator.sizeOfArray(RamUsageEstimator.java:178)
at 
org.apache.lucene.util.RamUsageEstimator.size(RamUsageEstimator.java:132)
at 
org.apache.lucene.util.RamUsageEstimator.size(RamUsageEstimator.java:153)
at 
org.apache.lucene.util.RamUsageEstimator.sizeOfArray(RamUsageEstimator.java:178)
at 
org.apache.lucene.util.RamUsageEstimator.size(RamUsageEstimator.java:132)
at 
org.apache.lucene.util.RamUsageEstimator.size(RamUsageEstimator.java:153)
at 
org.apache.lucene.util.RamUsageEstimator.sizeOfArray(RamUsageEstimator.java:178)
at 
org.apache.lucene.util.RamUsageEstimator.size(RamUsageEstimator.java:132)
at 
org.apache.lucene.util.RamUsageEstimator.size(RamUsageEstimator.java:153)
at 
org.apache.lucene.util.RamUsageEstimator.sizeOfArray(RamUsageEstimator.java:178)
at 
org.apache.lucene.util.RamUsageEstimator.size(RamUsageEstimator.java:132)
at 
org.apache.lucene.util.RamUsageEstimator.size(RamUsageEstimator.java:153)
at 
org.apache.lucene.util.RamUsageEstimator.sizeOfArray(RamUsageEstimator.java:178)




Build Log (for compile errors):
[...truncated 8484 lines...]



-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[HUDSON] Lucene-Solr-tests-only-trunk - Build # 6630 - Still Failing

2011-04-02 Thread Apache Hudson Server
Build: https://hudson.apache.org/hudson/job/Lucene-Solr-tests-only-trunk/6630/

1 tests failed.
FAILED:  org.apache.solr.spelling.suggest.SuggesterTest.testBenchmark

Error Message:
Java heap space

Stack Trace:
java.lang.OutOfMemoryError: Java heap space
at java.util.IdentityHashMap.resize(IdentityHashMap.java:469)
at java.util.IdentityHashMap.put(IdentityHashMap.java:445)
at 
org.apache.lucene.util.RamUsageEstimator.size(RamUsageEstimator.java:128)
at 
org.apache.lucene.util.RamUsageEstimator.size(RamUsageEstimator.java:153)
at 
org.apache.lucene.util.RamUsageEstimator.sizeOfArray(RamUsageEstimator.java:178)
at 
org.apache.lucene.util.RamUsageEstimator.size(RamUsageEstimator.java:132)
at 
org.apache.lucene.util.RamUsageEstimator.size(RamUsageEstimator.java:153)
at 
org.apache.lucene.util.RamUsageEstimator.sizeOfArray(RamUsageEstimator.java:178)
at 
org.apache.lucene.util.RamUsageEstimator.size(RamUsageEstimator.java:132)
at 
org.apache.lucene.util.RamUsageEstimator.size(RamUsageEstimator.java:153)
at 
org.apache.lucene.util.RamUsageEstimator.sizeOfArray(RamUsageEstimator.java:178)
at 
org.apache.lucene.util.RamUsageEstimator.size(RamUsageEstimator.java:132)
at 
org.apache.lucene.util.RamUsageEstimator.size(RamUsageEstimator.java:153)
at 
org.apache.lucene.util.RamUsageEstimator.sizeOfArray(RamUsageEstimator.java:178)
at 
org.apache.lucene.util.RamUsageEstimator.size(RamUsageEstimator.java:132)
at 
org.apache.lucene.util.RamUsageEstimator.size(RamUsageEstimator.java:153)
at 
org.apache.lucene.util.RamUsageEstimator.sizeOfArray(RamUsageEstimator.java:178)
at 
org.apache.lucene.util.RamUsageEstimator.size(RamUsageEstimator.java:132)
at 
org.apache.lucene.util.RamUsageEstimator.size(RamUsageEstimator.java:153)
at 
org.apache.lucene.util.RamUsageEstimator.sizeOfArray(RamUsageEstimator.java:178)
at 
org.apache.lucene.util.RamUsageEstimator.size(RamUsageEstimator.java:132)
at 
org.apache.lucene.util.RamUsageEstimator.size(RamUsageEstimator.java:153)
at 
org.apache.lucene.util.RamUsageEstimator.sizeOfArray(RamUsageEstimator.java:178)
at 
org.apache.lucene.util.RamUsageEstimator.size(RamUsageEstimator.java:132)
at 
org.apache.lucene.util.RamUsageEstimator.size(RamUsageEstimator.java:153)
at 
org.apache.lucene.util.RamUsageEstimator.sizeOfArray(RamUsageEstimator.java:178)
at 
org.apache.lucene.util.RamUsageEstimator.size(RamUsageEstimator.java:132)
at 
org.apache.lucene.util.RamUsageEstimator.size(RamUsageEstimator.java:153)
at 
org.apache.lucene.util.RamUsageEstimator.sizeOfArray(RamUsageEstimator.java:178)
at 
org.apache.lucene.util.RamUsageEstimator.size(RamUsageEstimator.java:132)
at 
org.apache.lucene.util.RamUsageEstimator.size(RamUsageEstimator.java:153)
at 
org.apache.lucene.util.RamUsageEstimator.sizeOfArray(RamUsageEstimator.java:178)




Build Log (for compile errors):
[...truncated 8746 lines...]



-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[HUDSON] Lucene-Solr-tests-only-trunk - Build # 6629 - Still Failing

2011-04-02 Thread Apache Hudson Server
Build: https://hudson.apache.org/hudson/job/Lucene-Solr-tests-only-trunk/6629/

1 tests failed.
FAILED:  org.apache.solr.spelling.suggest.SuggesterTest.testBenchmark

Error Message:
Java heap space

Stack Trace:
java.lang.OutOfMemoryError: Java heap space
at java.util.IdentityHashMap.resize(IdentityHashMap.java:469)
at java.util.IdentityHashMap.put(IdentityHashMap.java:445)
at 
org.apache.lucene.util.RamUsageEstimator.size(RamUsageEstimator.java:128)
at 
org.apache.lucene.util.RamUsageEstimator.size(RamUsageEstimator.java:153)
at 
org.apache.lucene.util.RamUsageEstimator.sizeOfArray(RamUsageEstimator.java:178)
at 
org.apache.lucene.util.RamUsageEstimator.size(RamUsageEstimator.java:132)
at 
org.apache.lucene.util.RamUsageEstimator.size(RamUsageEstimator.java:153)
at 
org.apache.lucene.util.RamUsageEstimator.sizeOfArray(RamUsageEstimator.java:178)
at 
org.apache.lucene.util.RamUsageEstimator.size(RamUsageEstimator.java:132)
at 
org.apache.lucene.util.RamUsageEstimator.size(RamUsageEstimator.java:153)
at 
org.apache.lucene.util.RamUsageEstimator.sizeOfArray(RamUsageEstimator.java:178)
at 
org.apache.lucene.util.RamUsageEstimator.size(RamUsageEstimator.java:132)
at 
org.apache.lucene.util.RamUsageEstimator.size(RamUsageEstimator.java:153)
at 
org.apache.lucene.util.RamUsageEstimator.sizeOfArray(RamUsageEstimator.java:178)
at 
org.apache.lucene.util.RamUsageEstimator.size(RamUsageEstimator.java:132)
at 
org.apache.lucene.util.RamUsageEstimator.size(RamUsageEstimator.java:153)
at 
org.apache.lucene.util.RamUsageEstimator.sizeOfArray(RamUsageEstimator.java:178)
at 
org.apache.lucene.util.RamUsageEstimator.size(RamUsageEstimator.java:132)
at 
org.apache.lucene.util.RamUsageEstimator.size(RamUsageEstimator.java:153)
at 
org.apache.lucene.util.RamUsageEstimator.sizeOfArray(RamUsageEstimator.java:178)
at 
org.apache.lucene.util.RamUsageEstimator.size(RamUsageEstimator.java:132)
at 
org.apache.lucene.util.RamUsageEstimator.size(RamUsageEstimator.java:153)
at 
org.apache.lucene.util.RamUsageEstimator.sizeOfArray(RamUsageEstimator.java:178)
at 
org.apache.lucene.util.RamUsageEstimator.size(RamUsageEstimator.java:132)
at 
org.apache.lucene.util.RamUsageEstimator.size(RamUsageEstimator.java:153)
at 
org.apache.lucene.util.RamUsageEstimator.sizeOfArray(RamUsageEstimator.java:178)
at 
org.apache.lucene.util.RamUsageEstimator.size(RamUsageEstimator.java:132)
at 
org.apache.lucene.util.RamUsageEstimator.size(RamUsageEstimator.java:153)
at 
org.apache.lucene.util.RamUsageEstimator.sizeOfArray(RamUsageEstimator.java:178)
at 
org.apache.lucene.util.RamUsageEstimator.size(RamUsageEstimator.java:132)
at 
org.apache.lucene.util.RamUsageEstimator.size(RamUsageEstimator.java:153)
at 
org.apache.lucene.util.RamUsageEstimator.sizeOfArray(RamUsageEstimator.java:178)




Build Log (for compile errors):
[...truncated 8752 lines...]



-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org