[Lucene.Net] Fwd: Travel Assistance applications now open for ApacheCon NA 2011

2011-06-06 Thread Troy Howard
-- Forwarded message --
From: Gavin McDonald ga...@16degrees.com.au
Date: Jun 6, 2011 1:02 AM
Subject: Travel Assistance applications now open for ApacheCon NA 2011
To: committ...@apache.org

The Apache Software Foundation (ASF)'s Travel Assistance Committee (TAC) is
now accepting applications for ApacheCon North America 2011, 7-11 November
in Vancouver BC, Canada.

The TAC is seeking individuals from the Apache community at-large --users,
developers, educators, students, Committers, and Members-- who would like to
attend ApacheCon, but need some financial support in order to be able to get
there. There are limited places available, and all applicants will be scored
on their individual merit.

Financial assistance is available to cover flights/trains, accommodation and
entrance fees either in part or in full, depending on circumstances.
However, the support available for those attending only the BarCamp (7-8
November) is less than that for those attending the entire event (Conference
+ BarCamp 7-11 November). The Travel Assistance Committee aims to support
all official ASF events, including cross-project activities; as such, it may
be prudent for those in Asia and Europe to wait for an event geographically
closer to them.

More information can be found at http://www.apache.org/travel/index.html
including a link to the online application and detailed instructions for
submitting.

Applications will close on 8 July 2011 at 22:00 BST (UTC/GMT +1).

We wish good luck to all those who will apply, and thank you in advance for
tweeting, blogging, and otherwise spreading the word.

Regards,
The Travel Assistance Committee


[Lucene.Net] [FWD] Travel Assistance applications now open for ApacheCon NA 2011

2011-06-06 Thread Stefan Bodewig


[jira] [Commented] (SOLR-2242) Get distinct count of names for a facet field

2011-06-06 Thread Bill Bell (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-2242?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13044727#comment-13044727
 ] 

Bill Bell commented on SOLR-2242:
-

Since we changed the output of facet_fields, FacetComponent.java needs 
to change. This also impacts the DistribFieldFacet type. This code is not going 
to work, since price no longer holds just a list of numbers. It now holds multiple 
lists (if we set the param). We might want to always return the counts list in 
all cases, so that sharding can easily pick up on this. DistribFieldFacet 
needs to be refactored.

{code}
<lst name="facet_fields">
  <lst name="price">
    <int name="numFacetTerms">14</int>
    <lst name="counts">
      <int name="0.0">3</int>
      <int name="11.5">1</int>
      <int name="19.95">1</int>
      <int name="74.99">1</int>
      <int name="92.0">1</int>
      <int name="179.99">1</int>
      <int name="185.0">1</int>
      <int name="279.95">1</int>
      <int name="329.95">1</int>
      <int name="350.0">1</int>
      <int name="399.0">1</int>
      <int name="479.95">1</int>
      <int name="649.99">1</int>
      <int name="2199.0">1</int>
    </lst>
  </lst>
</lst>
{code}




 Get distinct count of names for a facet field
 -

 Key: SOLR-2242
 URL: https://issues.apache.org/jira/browse/SOLR-2242
 Project: Solr
  Issue Type: New Feature
  Components: Response Writers
Affects Versions: 4.0
Reporter: Bill Bell
Priority: Minor
 Fix For: 4.0

 Attachments: SOLR-2242.patch, SOLR-2242.solr3.1.patch, 
 SOLR.2242.solr3.1.patch, SOLR.2242.v2.patch


 When returning facet.field=<name of field> you will get a list of matches for 
 distinct values. This is normal behavior. This patch tells you how many 
 distinct values you have (# of rows). Use with limit=-1 and mincount=1.
 The feature is called namedistinct. Here is an example:
 http://localhost:8983/solr/select?q=*:*&facet=true&facet.field=manu&facet.mincount=1&facet.limit=-1&f.manu.facet.namedistinct=0&facet.field=price&f.price.facet.namedistinct=1
 Here is an example on field hgid (without namedistinct):
 {code}
 <lst name="facet_fields">
   <lst name="hgid">
     <int name="HGPY045FD36D4000A">1</int>
     <int name="HGPY0FBC6690453A9">1</int>
     <int name="HGPY1E44ED6C4FB3B">1</int>
     <int name="HGPY1FA631034A1B8">1</int>
     <int name="HGPY3317ABAC43B48">1</int>
     <int name="HGPY3A17B2294CB5A">5</int>
     <int name="HGPY3ADD2B3D48C39">1</int>
   </lst>
 </lst>
 {code}
 With namedistinct (HGPY045FD36D4000A, HGPY0FBC6690453A9, 
 HGPY1E44ED6C4FB3B, HGPY1FA631034A1B8, HGPY3317ABAC43B48, 
 HGPY3A17B2294CB5A, HGPY3ADD2B3D48C39), this returns the number of rows 
 (7), not the number of values (11).
 {code}
 <lst name="facet_fields">
   <lst name="hgid">
     <int name="_count_">7</int>
   </lst>
 </lst>
 {code}
 This actually works really well for getting the total number of fields for 
 group.field=hgid. Enjoy!
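
 To make the rows-versus-values distinction concrete, here is a minimal Java
 sketch that recomputes both numbers from the hgid counts in the example. The
 class and method names are hypothetical and are not part of the patch; the map
 simply mirrors the facet output.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper for illustration; not part of Solr or of the attached patch.
class DistinctFacetCount {
    // Number of distinct facet terms: what the description calls "rows".
    static int distinctTerms(Map<String, Integer> counts) {
        return counts.size();
    }

    // Sum of the per-term counts: what the description calls "values".
    static int totalValues(Map<String, Integer> counts) {
        int total = 0;
        for (int c : counts.values()) {
            total += c;
        }
        return total;
    }

    // The hgid facet counts copied from the example response.
    static Map<String, Integer> hgidExample() {
        Map<String, Integer> counts = new LinkedHashMap<>();
        counts.put("HGPY045FD36D4000A", 1);
        counts.put("HGPY0FBC6690453A9", 1);
        counts.put("HGPY1E44ED6C4FB3B", 1);
        counts.put("HGPY1FA631034A1B8", 1);
        counts.put("HGPY3317ABAC43B48", 1);
        counts.put("HGPY3A17B2294CB5A", 5);
        counts.put("HGPY3ADD2B3D48C39", 1);
        return counts;
    }

    public static void main(String[] args) {
        Map<String, Integer> counts = hgidExample();
        // Prints "7 distinct terms, 11 values"
        System.out.println(distinctTerms(counts) + " distinct terms, "
                + totalValues(counts) + " values");
    }
}
```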

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Commented] (SOLR-2242) Get distinct count of names for a facet field

2011-06-06 Thread Bill Bell (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-2242?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13044730#comment-13044730
 ] 

Bill Bell commented on SOLR-2242:
-

It would be easier for sharding not to have multiple lists... I could use some 
help if we want to change it, since I have not played with FacetComponent.java.

Otherwise, it would be a simpler fix to just add it and flatten the lists.

{code}
<lst name="facet_fields">
  <lst name="price">
    <int name="numFacetTerms">14</int>
    <int name="0.0">3</int>
    <int name="11.5">1</int>
    <int name="19.95">1</int>
    <int name="74.99">1</int>
    <int name="92.0">1</int>
    <int name="179.99">1</int>
    <int name="185.0">1</int>
    <int name="279.95">1</int>
    <int name="329.95">1</int>
    <int name="350.0">1</int>
    <int name="399.0">1</int>
    <int name="479.95">1</int>
    <int name="649.99">1</int>
    <int name="2199.0">1</int>
  </lst>
</lst>
{code}

Not ideal, but easier for v1? I could also just remove numFacetTerms=2 for now.

Will only require an if statement to ignore the type check for numFacetTerms.
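
A rough sketch of what merging the flattened lists across shards could look like, assuming each shard returns every term (facet.limit=-1) so that the distinct count can be recomputed from the merged map. ShardFacetMerge and its methods are hypothetical illustrations, not the code in FacetComponent.java or in the attached patch.

```java
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical illustration; the real merge logic lives in FacetComponent.java.
class ShardFacetMerge {
    static final String NUM_TERMS_KEY = "numFacetTerms";

    // Sum per-term counts over all shard responses, skipping the special entry.
    static Map<String, Integer> merge(List<Map<String, Integer>> shardResponses) {
        Map<String, Integer> merged = new TreeMap<>();
        for (Map<String, Integer> shard : shardResponses) {
            for (Map.Entry<String, Integer> e : shard.entrySet()) {
                if (NUM_TERMS_KEY.equals(e.getKey())) {
                    continue; // not a facet term, so it must not be summed
                }
                merged.merge(e.getKey(), e.getValue(), Integer::sum);
            }
        }
        return merged;
    }

    // Per-shard distinct counts cannot be added up (terms may overlap),
    // so the merged value is the size of the union of terms.
    static int numFacetTerms(Map<String, Integer> merged) {
        return merged.size();
    }
}
```

Summing the per-shard numFacetTerms values directly would overcount any term that appears on more than one shard, which is why the sketch drops that entry and recounts after the merge.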

Here is a patch that works with sharding.

http://localhost:8983/solr/select?shards=localhost:8983/solr,localhost:7574/solr&indent=true&q=*:*&facet=true&facet.mincount=1&facet.numFacetTerms=2&facet.limit=-1&facet.field=price

Enjoy.

Bill








lucene mailing list archives zip?

2011-06-06 Thread Gregor Heinrich
Dear list -- is there a proper archive of the Lucene dev and user Mailman 
lists?  A per-month link, or a zip or tar.gz of the mbox files, would be terrific.


Thanks in advance

gregor




[jira] [Updated] (SOLR-2242) Get distinct count of names for a facet field

2011-06-06 Thread Bill Bell (JIRA)

 [ 
https://issues.apache.org/jira/browse/SOLR-2242?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Bill Bell updated SOLR-2242:


Attachment: SOLR-2242.shard.patch

 Get distinct count of names for a facet field
 -

 Key: SOLR-2242
 URL: https://issues.apache.org/jira/browse/SOLR-2242
 Project: Solr
  Issue Type: New Feature
  Components: Response Writers
Affects Versions: 4.0
Reporter: Bill Bell
Priority: Minor
 Fix For: 4.0

 Attachments: SOLR-2242.patch, SOLR-2242.shard.patch, 
 SOLR-2242.solr3.1.patch, SOLR.2242.solr3.1.patch, SOLR.2242.v2.patch





[jira] [Commented] (SOLR-2575) post.jar does not work on trunk

2011-06-06 Thread Uwe Schindler (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-2575?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13044739#comment-13044739
 ] 

Uwe Schindler commented on SOLR-2575:
-

The problem with the example is Jetty's caching of webapps: it caches the 
unpacked WAR file. To clean up the web application, you have to remove the 
unpacked web application in the work folder of example. Maybe clean should 
do this automatically. This drove me crazy when modifying JSP files, 
too.
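
As an illustration of that cleanup step, the sketch below deletes a directory tree with plain java.nio. The class name is hypothetical, and the exact location of the work folder (for instance a work directory under the example directory) is an assumption; this is not what the build's clean target currently does.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Comparator;
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.Stream;

// Hypothetical helper; pass it the folder that holds the unpacked webapp.
class CleanJettyWork {
    static void deleteRecursively(Path dir) throws IOException {
        if (!Files.exists(dir)) {
            return; // nothing cached, nothing to do
        }
        List<Path> deepestFirst;
        try (Stream<Path> paths = Files.walk(dir)) {
            // Reverse order puts children before their parent directories.
            deepestFirst = paths.sorted(Comparator.reverseOrder())
                    .collect(Collectors.toList());
        }
        for (Path p : deepestFirst) {
            Files.delete(p);
        }
    }
}
```

After the folder is removed, Jetty should unpack the WAR again on the next start, so stop the server before running anything like this.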

 post.jar does not work on trunk
 ---

 Key: SOLR-2575
 URL: https://issues.apache.org/jira/browse/SOLR-2575
 Project: Solr
  Issue Type: Bug
Affects Versions: 4.0
Reporter: Bill Bell

 java -jar post.jar *.xml
 SimplePostTool: version 1.3
 SimplePostTool: POSTing files to http://localhost:8983/solr/update..
 SimplePostTool: POSTing file gb18030-example.xml
 SimplePostTool: POSTing file hd.xml
 SimplePostTool: POSTing file ipod_other.xml
 SimplePostTool: POSTing file ipod_video.xml
 SimplePostTool: POSTing file manufacturers.xml
 SimplePostTool: POSTing file mem.xml
 SimplePostTool: POSTing file monitor.xml
 SimplePostTool: POSTing file monitor2.xml
 SimplePostTool: POSTing file mp500.xml
 SimplePostTool: POSTing file sd500.xml
 SimplePostTool: POSTing file solr.xml
 SimplePostTool: POSTing file utf8-example.xml
 SimplePostTool: POSTing file vidcard.xml
 SimplePostTool: COMMITting Solr index changes..
 SimplePostTool: FATAL: Solr returned an error #500 
 java.lang.NoSuchMethodError: org.apache.lucene.util.CodecUtil.checkHeader(Lorg/apache/lucene/store/IndexInput;Ljava/lang/String;II)I
 java.lang.RuntimeException: java.lang.NoSuchMethodError: org.apache.lucene.util.CodecUtil.checkHeader(Lorg/apache/lucene/store/IndexInput;Ljava/lang/String;II)I
   at org.apache.solr.core.SolrCore.getSearcher(SolrCore.java:1039)
   at org.apache.solr.update.DirectUpdateHandler2.commit(DirectUpdateHandler2.java:346)
   at org.apache.solr.update.processor.RunUpdateProcessor.processCommit(RunUpdateProcessorFactory.java:85)
   at org.apache.solr.handler.XMLLoader.processUpdate(XMLLoader.java:157)
   at org.apache.solr.handler.XMLLoader.load(XMLLoader.java:77)
   at org.apache.solr.handler.ContentStreamHandlerBase.handleRequestBody(ContentStreamHandlerBase.java:67)
   at org.apache.solr.handler.RequestHandlerBase.handleRequest(RequestHandlerBase.java:129)
   at org.apache.solr.core.SolrCore.execute(SolrCore.java:1308)
   at org.apache.solr.servlet.SolrDispatchFilter.execute(SolrDispatchFilter.java:353)
   at org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:248)
   at org.mortbay.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1212)
   at org.mortbay.jetty.servlet.ServletHandler.handle(ServletHandler.java:399)
   at org.mortbay.jetty.security.SecurityHandler.handle(SecurityHandler.java:216)
   at org.mortbay.jetty.servlet.SessionHandler.handle(SessionHandler.java:182)
   at org.mortbay.jetty.handler.ContextHandler.handle(ContextHandler.java:766)
   at org.mortbay.jetty.webapp.WebAppContext.handle(WebAppContext.java:450)
   at org.mortbay.jetty.handler.ContextHandlerCollection.handle(ContextHandlerCollection.java:230)
   at org.mortbay.jetty.handler.HandlerCollection.handle(HandlerCollection.java:114)
   at org.mortbay.jetty.handler.HandlerWrapper.handle(HandlerWrapper.java:152)
   at org.mortbay.jetty.Server.handle(Server.java:326)
   at org.mortbay.jetty.HttpConnection.handleRequest(HttpConnection.java:542)
   at org.mortbay.jetty.HttpConnection$RequestHandler




[JENKINS] Lucene-Solr-tests-only-3.x - Build # 8635 - Failure

2011-06-06 Thread Apache Jenkins Server
Build: https://builds.apache.org/hudson/job/Lucene-Solr-tests-only-3.x/8635/

1 tests failed.
REGRESSION:  
org.apache.lucene.index.TestIndexFileDeleter.testDeleteLeftoverFiles

Error Message:
CheckIndex failed

Stack Trace:
java.lang.RuntimeException: CheckIndex failed
        at org.apache.lucene.util._TestUtil.checkIndex(_TestUtil.java:142)
        at org.apache.lucene.store.MockDirectoryWrapper.close(MockDirectoryWrapper.java:481)
        at org.apache.lucene.index.TestIndexFileDeleter.testDeleteLeftoverFiles(TestIndexFileDeleter.java:165)
        at org.apache.lucene.util.LuceneTestCase$LuceneTestCaseRunner.runChild(LuceneTestCase.java:1227)
        at org.apache.lucene.util.LuceneTestCase$LuceneTestCaseRunner.runChild(LuceneTestCase.java:1145)




Build Log (for compile errors):
[...truncated 5069 lines...]






Re: lucene mailing list archives zip?

2011-06-06 Thread Lukáš Vlček
Hi,

Maybe there are other links, but you can try the following.

This is a link to an application for browsing the archive:
http://mail-archives.apache.org/mod_mbox/lucene-java-user/201106.mbox/browser

And this is a link to the raw cumulative mail archive file for that
month:
http://mail-archives.apache.org/mod_mbox/lucene-java-user/201106

So you can wget these files (but I think you should be friendly to the
server).
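
If you want to script this, a small sketch like the following could generate the monthly URLs, assuming every month follows the same YYYYMM pattern as the 201106 link. MboxUrls is a hypothetical helper, and the code only builds the URL strings; it does not download anything.

```java
import java.time.YearMonth;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper; the URL pattern is generalized from the single 201106 link.
class MboxUrls {
    static final String BASE = "http://mail-archives.apache.org/mod_mbox/";

    // One raw-archive URL per month, from 'from' to 'to' inclusive.
    static List<String> monthlyUrls(String list, YearMonth from, YearMonth to) {
        List<String> urls = new ArrayList<>();
        for (YearMonth m = from; !m.isAfter(to); m = m.plusMonths(1)) {
            String yyyymm = String.format("%04d%02d", m.getYear(), m.getMonthValue());
            urls.add(BASE + list + "/" + yyyymm);
        }
        return urls;
    }

    public static void main(String[] args) {
        for (String url : monthlyUrls("lucene-java-user",
                YearMonth.of(2011, 1), YearMonth.of(2011, 6))) {
            System.out.println(url);
        }
    }
}
```

The printed list can be fed to wget one URL at a time with a pause in between, which keeps the load on the server low.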

Regards,
Lukas

On Mon, Jun 6, 2011 at 8:54 AM, Gregor Heinrich gre...@arbylon.net wrote:

 Dear list -- is there any archive proper of the lucene dev and user
 Mailman lists?  A link per-month or zip or tar.gz of the mbox files would be
 terrific.

 Thanks in advance

 gregor





Travel Assistance applications now open for ApacheCon NA 2011

2011-06-06 Thread Simon Willnauer



Re: [JENKINS] Lucene-Solr-tests-only-3.x - Build # 8635 - Failure

2011-06-06 Thread Michael McCandless
I'll dig... somehow, strangely, it seems to be caused by the test speedups...

Mike McCandless

http://blog.mikemccandless.com

On Mon, Jun 6, 2011 at 4:11 AM, Apache Jenkins Server
hud...@hudson.apache.org wrote:
 Build: https://builds.apache.org/hudson/job/Lucene-Solr-tests-only-3.x/8635/

 1 tests failed.
 REGRESSION:  
 org.apache.lucene.index.TestIndexFileDeleter.testDeleteLeftoverFiles

 Error Message:
 CheckIndex failed

 Stack Trace:
 java.lang.RuntimeException: CheckIndex failed
        at org.apache.lucene.util._TestUtil.checkIndex(_TestUtil.java:142)
        at org.apache.lucene.store.MockDirectoryWrapper.close(MockDirectoryWrapper.java:481)
        at org.apache.lucene.index.TestIndexFileDeleter.testDeleteLeftoverFiles(TestIndexFileDeleter.java:165)
        at org.apache.lucene.util.LuceneTestCase$LuceneTestCaseRunner.runChild(LuceneTestCase.java:1227)
        at org.apache.lucene.util.LuceneTestCase$LuceneTestCaseRunner.runChild(LuceneTestCase.java:1145)




 Build Log (for compile errors):
 [...truncated 5069 lines...]






Re: [JENKINS] Lucene-Solr-tests-only-3.x - Build # 8635 - Failure

2011-06-06 Thread Michael McCandless
I committed a fix.

It was a bug in the test, only uncovered because the test speedups
decreased the chance that newField would turn on term vectors if you
didn't ask for them...

Mike McCandless

http://blog.mikemccandless.com

On Mon, Jun 6, 2011 at 6:21 AM, Michael McCandless
luc...@mikemccandless.com wrote:
 I'll dig... somehow, strangely, it seems to be caused by the test speedups...

 Mike McCandless

 http://blog.mikemccandless.com

 On Mon, Jun 6, 2011 at 4:11 AM, Apache Jenkins Server
 hud...@hudson.apache.org wrote:
 Build: https://builds.apache.org/hudson/job/Lucene-Solr-tests-only-3.x/8635/

 1 tests failed.
 REGRESSION:  
 org.apache.lucene.index.TestIndexFileDeleter.testDeleteLeftoverFiles

 Error Message:
 CheckIndex failed

 Stack Trace:
 java.lang.RuntimeException: CheckIndex failed
        at org.apache.lucene.util._TestUtil.checkIndex(_TestUtil.java:142)
        at org.apache.lucene.store.MockDirectoryWrapper.close(MockDirectoryWrapper.java:481)
        at org.apache.lucene.index.TestIndexFileDeleter.testDeleteLeftoverFiles(TestIndexFileDeleter.java:165)
        at org.apache.lucene.util.LuceneTestCase$LuceneTestCaseRunner.runChild(LuceneTestCase.java:1227)
        at org.apache.lucene.util.LuceneTestCase$LuceneTestCaseRunner.runChild(LuceneTestCase.java:1145)




 Build Log (for compile errors):
 [...truncated 5069 lines...]






[JENKINS] Lucene-Solr-tests-only-trunk - Build # 8638 - Failure

2011-06-06 Thread Apache Jenkins Server
Build: https://builds.apache.org/hudson/job/Lucene-Solr-tests-only-trunk/8638/

4 tests failed.
REGRESSION:  org.apache.lucene.index.TestLazyBug.testLazyWorks

Error Message:
Read past EOF

Stack Trace:
java.io.IOException: Read past EOF
        at org.apache.lucene.store.RAMInputStream.switchCurrentBuffer(RAMInputStream.java:90)
        at org.apache.lucene.store.RAMInputStream.readByte(RAMInputStream.java:63)
        at org.apache.lucene.store.DataInput.readInt(DataInput.java:73)
        at org.apache.lucene.store.DataInput.readLong(DataInput.java:115)
        at org.apache.lucene.store.MockIndexInputWrapper.readLong(MockIndexInputWrapper.java:128)
        at org.apache.lucene.index.FieldsReader.doc(FieldsReader.java:211)
        at org.apache.lucene.index.SegmentReader.document(SegmentReader.java:463)
        at org.apache.lucene.index.DirectoryReader.document(DirectoryReader.java:565)
        at org.apache.lucene.index.TestLazyBug.doTest(TestLazyBug.java:105)
        at org.apache.lucene.index.TestLazyBug.testLazyWorks(TestLazyBug.java:129)
        at org.apache.lucene.util.LuceneTestCase$LuceneTestCaseRunner.runChild(LuceneTestCase.java:1362)
        at org.apache.lucene.util.LuceneTestCase$LuceneTestCaseRunner.runChild(LuceneTestCase.java:1280)


REGRESSION:  org.apache.lucene.index.TestLazyBug.testLazyAlsoWorks

Error Message:
Read past EOF

Stack Trace:
java.io.IOException: Read past EOF
        at org.apache.lucene.store.RAMInputStream.switchCurrentBuffer(RAMInputStream.java:90)
        at org.apache.lucene.store.RAMInputStream.readByte(RAMInputStream.java:63)
        at org.apache.lucene.store.DataInput.readInt(DataInput.java:73)
        at org.apache.lucene.store.DataInput.readLong(DataInput.java:115)
        at org.apache.lucene.store.MockIndexInputWrapper.readLong(MockIndexInputWrapper.java:128)
        at org.apache.lucene.index.FieldsReader.doc(FieldsReader.java:211)
        at org.apache.lucene.index.SegmentReader.document(SegmentReader.java:463)
        at org.apache.lucene.index.DirectoryReader.document(DirectoryReader.java:565)
        at org.apache.lucene.index.TestLazyBug.doTest(TestLazyBug.java:105)
        at org.apache.lucene.index.TestLazyBug.testLazyAlsoWorks(TestLazyBug.java:133)
        at org.apache.lucene.util.LuceneTestCase$LuceneTestCaseRunner.runChild(LuceneTestCase.java:1362)
        at org.apache.lucene.util.LuceneTestCase$LuceneTestCaseRunner.runChild(LuceneTestCase.java:1280)


REGRESSION:  org.apache.lucene.index.TestLazyBug.testLazyBroken

Error Message:
Read past EOF

Stack Trace:
java.io.IOException: Read past EOF
        at org.apache.lucene.store.RAMInputStream.switchCurrentBuffer(RAMInputStream.java:90)
        at org.apache.lucene.store.RAMInputStream.readByte(RAMInputStream.java:63)
        at org.apache.lucene.store.DataInput.readInt(DataInput.java:73)
        at org.apache.lucene.store.DataInput.readLong(DataInput.java:115)
        at org.apache.lucene.store.MockIndexInputWrapper.readLong(MockIndexInputWrapper.java:128)
        at org.apache.lucene.index.FieldsReader.doc(FieldsReader.java:211)
        at org.apache.lucene.index.SegmentReader.document(SegmentReader.java:463)
        at org.apache.lucene.index.DirectoryReader.document(DirectoryReader.java:565)
        at org.apache.lucene.index.TestLazyBug.doTest(TestLazyBug.java:105)
        at org.apache.lucene.index.TestLazyBug.testLazyBroken(TestLazyBug.java:137)
        at org.apache.lucene.util.LuceneTestCase$LuceneTestCaseRunner.runChild(LuceneTestCase.java:1362)
        at org.apache.lucene.util.LuceneTestCase$LuceneTestCaseRunner.runChild(LuceneTestCase.java:1280)


FAILED:  junit.framework.TestSuite.org.apache.lucene.index.TestLazyBug

Error Message:
MockDirectoryWrapper: cannot close: there are still open files: {_0_2.doc=3, 
_0_1.skp=3, _1_0.frq=3, _0_2.frq=3, _1_2.frq=3, _1_3.frq=3, _1_3.tib=3, 
_0_1.doc=3, _0_2.pyl=3, _1_1.tib=3, _0_0.tib=3, _0.tvd=3, _1_2.doc=3, _0.tvf=3, 
_1_0.prx=3, _0_1.frq=3, _1_2.pyl=3, _1.fdx=3, _0_3.prx=3, _0_2.skp=3, _1.fdt=3, 
_0.tvx=3, _0_2.pos=3, _1_3.prx=3, _1.nrm=3, _1_1.pyl=3, _1_0.tib=3, _0_1.tib=3, 
_1_2.tib=3, _1_1.doc=3, _0_0.prx=3, _1_1.frq=3, _0_3.frq=3, _1.tvx=3, _0.nrm=3, 
_0_0.frq=3, _1_1.pos=3, _0_3.tib=3, _0_2.tib=3, _1_1.skp=3, _1.tvf=3, 
_1_2.skp=3, _0_1.pyl=3, _0.fdx=3, _1.tvd=3, _1_2.pos=3, _0_1.pos=3, _0.fdt=3}

Stack Trace:
java.lang.RuntimeException: MockDirectoryWrapper: cannot close: there are still 
open files: {_0_2.doc=3, _0_1.skp=3, _1_0.frq=3, _0_2.frq=3, _1_2.frq=3, 
_1_3.frq=3, _1_3.tib=3, _0_1.doc=3, _0_2.pyl=3, _1_1.tib=3, _0_0.tib=3, 
_0.tvd=3, _1_2.doc=3, _0.tvf=3, _1_0.prx=3, _0_1.frq=3, _1_2.pyl=3, _1.fdx=3, 
_0_3.prx=3, _0_2.skp=3, _1.fdt=3, _0.tvx=3, _0_2.pos=3, _1_3.prx=3, _1.nrm=3, 
_1_1.pyl=3, _1_0.tib=3, _0_1.tib=3, _1_2.tib=3, _1_1.doc=3, _0_0.prx=3, 
_1_1.frq=3, _0_3.frq=3, _1.tvx=3, _0.nrm=3, _0_0.frq=3, _1_1.pos=3, _0_3.tib=3, 
_0_2.tib=3, _1_1.skp=3, _1.tvf=3, _1_2.skp=3, _0_1.pyl=3, _0.fdx=3, 

[jira] [Commented] (SOLR-2564) Integrating grouping module into Solr 4.0

2011-06-06 Thread Michael McCandless (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-2564?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13044820#comment-13044820
 ] 

Michael McCandless commented on SOLR-2564:
--

Patch looks great, Martijn!

The only thing I noticed is that cacheSizeMB is computed incorrectly from
maxDoc (for the -1 case), because that's all int math, I think? I.e.,
it'll be truncated from e.g. 13.7 MB to 13. But why not just use
Double.MAX_VALUE?
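
The truncation is easy to reproduce in isolation. The sketch below is not the code from the patch (whose formula is not shown in this comment); the class, the method names, and the byte count are made up for illustration.

```java
// Hypothetical illustration of integer truncation; not the code from the patch.
class CacheSizeMath {
    static final int BYTES_PER_MB = 1024 * 1024;

    // Integer division happens first, so the fraction is lost
    // before the result is widened to double.
    static double truncatedMB(long bytes) {
        return bytes / BYTES_PER_MB;
    }

    // Casting one operand to double keeps the fractional part.
    static double exactMB(long bytes) {
        return bytes / (double) BYTES_PER_MB;
    }

    public static void main(String[] args) {
        long bytes = 14_365_491L; // roughly 13.7 MB
        System.out.println(truncatedMB(bytes)); // prints 13.0
        System.out.println(exactMB(bytes));     // just under 13.7
    }
}
```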


 Integrating grouping module into Solr 4.0
 -

 Key: SOLR-2564
 URL: https://issues.apache.org/jira/browse/SOLR-2564
 Project: Solr
  Issue Type: Improvement
Reporter: Martijn van Groningen
Assignee: Martijn van Groningen
 Fix For: 4.0

 Attachments: LUCENE-2564.patch, SOLR-2564.patch, SOLR-2564.patch, 
 SOLR-2564.patch, SOLR-2564.patch


 Since work on the grouping module is going well, I think it is time to wire this 
 up in Solr.
 Besides the current grouping features Solr provides, Solr will then also 
 support second pass caching and total count based on groups.




[jira] [Commented] (SOLR-2564) Integrating grouping module into Solr 4.0

2011-06-06 Thread Michael McCandless (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-2564?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13044821#comment-13044821
 ] 

Michael McCandless commented on SOLR-2564:
--

bq. The other use-case is more like field collapsing and does change what 
documents match (basically, only the first documents in each group, up to 
limit, match).

I'm not sure it's that simple, ie that we can so cleanly model
collapsing as reducing the docs to consider and then running faceting
on that reduced set.

EG, the use case of getting correct facet counts for a field that has
different values within the group, can't be handled by this approach?
This is the count=2 for size=S in my example at
https://issues.apache.org/jira/browse/LUCENE-3097?focusedCommentId=13038605&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-13038605

I think to do that properly, the faceting impl needs to see all docs
in the group, not just the lead doc per group.

I think another way to visualize/model this is that we really need to be
able to configure which field counts (ID_FIELD) for the schema.
This field would then decide all counts -- total hit count, facet
counts, etc., ie each of these counts is count(unique(ID_FIELD)) of
the docs falling in that facet/result set.  The default is Lucene's docid,
but the app should be able to state any other ID_FIELD.
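
A minimal sketch of the count(unique(ID_FIELD)) idea, using made-up documents rather than the example from LUCENE-3097. The Doc class and the facetCounts method are hypothetical and say nothing about how Solr's faceting would actually implement this.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.TreeMap;

// Hypothetical illustration only; not Solr or Lucene code.
class UniqueIdFacetCounts {
    // A made-up document: the value of the configured ID_FIELD plus one facet value.
    static final class Doc {
        final String id;
        final String facetValue;

        Doc(String id, String facetValue) {
            this.id = id;
            this.facetValue = facetValue;
        }
    }

    // For each facet value, count the distinct ID_FIELD values among the docs.
    static Map<String, Integer> facetCounts(List<Doc> docs) {
        Map<String, Set<String>> idsPerValue = new TreeMap<>();
        for (Doc d : docs) {
            idsPerValue.computeIfAbsent(d.facetValue, k -> new HashSet<>()).add(d.id);
        }
        Map<String, Integer> counts = new TreeMap<>();
        for (Map.Entry<String, Set<String>> e : idsPerValue.entrySet()) {
            counts.put(e.getKey(), e.getValue().size());
        }
        return counts;
    }

    public static void main(String[] args) {
        List<Doc> docs = Arrays.asList(
                new Doc("shirtA", "S"), new Doc("shirtA", "S"),
                new Doc("shirtA", "M"), new Doc("shirtB", "S"));
        System.out.println(facetCounts(docs)); // prints {M=1, S=2}
    }
}
```

Counting by docid instead would give S=3 for the same data, because two of the S documents share the ID value shirtA.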





Re: [JENKINS] Lucene-Solr-tests-only-trunk - Build # 8638 - Failure

2011-06-06 Thread Robert Muir
bug in the test... i committed a fix

On Mon, Jun 6, 2011 at 8:14 AM, Apache Jenkins Server
hud...@hudson.apache.org wrote:
 Build: https://builds.apache.org/hudson/job/Lucene-Solr-tests-only-trunk/8638/

 4 tests failed.
 REGRESSION:  org.apache.lucene.index.TestLazyBug.testLazyWorks

 Error Message:
 Read past EOF

 Stack Trace:
 java.io.IOException: Read past EOF
        at org.apache.lucene.store.RAMInputStream.switchCurrentBuffer(RAMInputStream.java:90)
        at org.apache.lucene.store.RAMInputStream.readByte(RAMInputStream.java:63)
        at org.apache.lucene.store.DataInput.readInt(DataInput.java:73)
        at org.apache.lucene.store.DataInput.readLong(DataInput.java:115)
        at org.apache.lucene.store.MockIndexInputWrapper.readLong(MockIndexInputWrapper.java:128)
        at org.apache.lucene.index.FieldsReader.doc(FieldsReader.java:211)
        at org.apache.lucene.index.SegmentReader.document(SegmentReader.java:463)
        at org.apache.lucene.index.DirectoryReader.document(DirectoryReader.java:565)
        at org.apache.lucene.index.TestLazyBug.doTest(TestLazyBug.java:105)
        at org.apache.lucene.index.TestLazyBug.testLazyWorks(TestLazyBug.java:129)
        at org.apache.lucene.util.LuceneTestCase$LuceneTestCaseRunner.runChild(LuceneTestCase.java:1362)
        at org.apache.lucene.util.LuceneTestCase$LuceneTestCaseRunner.runChild(LuceneTestCase.java:1280)


 REGRESSION:  org.apache.lucene.index.TestLazyBug.testLazyAlsoWorks

 Error Message:
 Read past EOF

 Stack Trace:
 java.io.IOException: Read past EOF
        at 
 org.apache.lucene.store.RAMInputStream.switchCurrentBuffer(RAMInputStream.java:90)
        at 
 org.apache.lucene.store.RAMInputStream.readByte(RAMInputStream.java:63)
        at org.apache.lucene.store.DataInput.readInt(DataInput.java:73)
        at org.apache.lucene.store.DataInput.readLong(DataInput.java:115)
        at 
 org.apache.lucene.store.MockIndexInputWrapper.readLong(MockIndexInputWrapper.java:128)
        at org.apache.lucene.index.FieldsReader.doc(FieldsReader.java:211)
        at 
 org.apache.lucene.index.SegmentReader.document(SegmentReader.java:463)
        at 
 org.apache.lucene.index.DirectoryReader.document(DirectoryReader.java:565)
        at org.apache.lucene.index.TestLazyBug.doTest(TestLazyBug.java:105)
        at 
 org.apache.lucene.index.TestLazyBug.testLazyAlsoWorks(TestLazyBug.java:133)
        at 
 org.apache.lucene.util.LuceneTestCase$LuceneTestCaseRunner.runChild(LuceneTestCase.java:1362)
        at 
 org.apache.lucene.util.LuceneTestCase$LuceneTestCaseRunner.runChild(LuceneTestCase.java:1280)


 REGRESSION:  org.apache.lucene.index.TestLazyBug.testLazyBroken

 Error Message:
 Read past EOF

 Stack Trace:
 java.io.IOException: Read past EOF
        at 
 org.apache.lucene.store.RAMInputStream.switchCurrentBuffer(RAMInputStream.java:90)
        at 
 org.apache.lucene.store.RAMInputStream.readByte(RAMInputStream.java:63)
        at org.apache.lucene.store.DataInput.readInt(DataInput.java:73)
        at org.apache.lucene.store.DataInput.readLong(DataInput.java:115)
        at 
 org.apache.lucene.store.MockIndexInputWrapper.readLong(MockIndexInputWrapper.java:128)
        at org.apache.lucene.index.FieldsReader.doc(FieldsReader.java:211)
        at 
 org.apache.lucene.index.SegmentReader.document(SegmentReader.java:463)
        at 
 org.apache.lucene.index.DirectoryReader.document(DirectoryReader.java:565)
        at org.apache.lucene.index.TestLazyBug.doTest(TestLazyBug.java:105)
        at 
 org.apache.lucene.index.TestLazyBug.testLazyBroken(TestLazyBug.java:137)
        at 
 org.apache.lucene.util.LuceneTestCase$LuceneTestCaseRunner.runChild(LuceneTestCase.java:1362)
        at 
 org.apache.lucene.util.LuceneTestCase$LuceneTestCaseRunner.runChild(LuceneTestCase.java:1280)


 FAILED:  junit.framework.TestSuite.org.apache.lucene.index.TestLazyBug

 Error Message:
 MockDirectoryWrapper: cannot close: there are still open files: {_0_2.doc=3, 
 _0_1.skp=3, _1_0.frq=3, _0_2.frq=3, _1_2.frq=3, _1_3.frq=3, _1_3.tib=3, 
 _0_1.doc=3, _0_2.pyl=3, _1_1.tib=3, _0_0.tib=3, _0.tvd=3, _1_2.doc=3, 
 _0.tvf=3, _1_0.prx=3, _0_1.frq=3, _1_2.pyl=3, _1.fdx=3, _0_3.prx=3, 
 _0_2.skp=3, _1.fdt=3, _0.tvx=3, _0_2.pos=3, _1_3.prx=3, _1.nrm=3, _1_1.pyl=3, 
 _1_0.tib=3, _0_1.tib=3, _1_2.tib=3, _1_1.doc=3, _0_0.prx=3, _1_1.frq=3, 
 _0_3.frq=3, _1.tvx=3, _0.nrm=3, _0_0.frq=3, _1_1.pos=3, _0_3.tib=3, 
 _0_2.tib=3, _1_1.skp=3, _1.tvf=3, _1_2.skp=3, _0_1.pyl=3, _0.fdx=3, _1.tvd=3, 
 _1_2.pos=3, _0_1.pos=3, _0.fdt=3}

 Stack Trace:
 java.lang.RuntimeException: MockDirectoryWrapper: cannot close: there are 
 still open files: {_0_2.doc=3, _0_1.skp=3, _1_0.frq=3, _0_2.frq=3, 
 _1_2.frq=3, _1_3.frq=3, _1_3.tib=3, _0_1.doc=3, _0_2.pyl=3, _1_1.tib=3, 
 _0_0.tib=3, _0.tvd=3, _1_2.doc=3, _0.tvf=3, _1_0.prx=3, _0_1.frq=3, 
 _1_2.pyl=3, _1.fdx=3, _0_3.prx=3, _0_2.skp=3, _1.fdt=3, _0.tvx=3, _0_2.pos=3, 
 _1_3.prx=3, _1.nrm=3, _1_1.pyl=3, _1_0.tib=3, _0_1.tib=3, 

[jira] [Commented] (LUCENE-2454) Nested Document query support

2011-06-06 Thread Mark Harwood (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-2454?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13044828#comment-13044828
 ] 

Mark Harwood commented on LUCENE-2454:
--

Below are 2 example tests searching employment resumes - both using the same 
optional and mandatory clauses but in subtly different ways.
Question 1 is "who has Mahout skills and preferably used them at Lucid?" while 
the other question is "who has Mahout skills and preferably has been employed 
by Lucid?". The questions and the answers are different. Below is the XML test 
script I used to illustrate the data/queries used, define expected results and 
run as an executable test. 
Hopefully you can make sense of this:
{code:xml}
<?xml version="1.0" encoding="UTF-8"?>
<?xml-stylesheet type="text/xsl" href="test.xsl"?>
<Test description="NestedQuery tests">
  <Data>
    <Index name="ResumeIndex">
      <Analyzers class="org.apache.lucene.analysis.WhitespaceAnalyzer">
      </Analyzers>
      <Shard name="shard1">
        <!-- === -->
        <Document pk="1">
          <Field name="name">grant</Field>
          <Field name="docType">resume</Field>
        </Document>
        <!-- === -->
        <Document pk="2">
          <Field name="employer">lucid</Field>
          <Field name="docType">employment</Field>
          <Field name="skills">java lucene</Field>
        </Document>
        <!-- === -->
        <Document pk="3">
          <Field name="employer">somewhere else</Field>
          <Field name="docType">employment</Field>
          <Field name="skills">mahout and more mahout</Field>
        </Document>
        <!-- === -->
        <Document pk="4">
          <Field name="name">sean</Field>
          <Field name="docType">resume</Field>
        </Document>
        <!-- === -->
        <Document pk="5">
          <Field name="employer">foo bar</Field>
          <Field name="docType">employment</Field>
          <Field name="skills">java</Field>
        </Document>
        <!-- === -->
        <Document pk="6">
          <Field name="employer">some co</Field>
          <Field name="docType">employment</Field>
          <Field name="skills">mahout mahout and more mahout</Field>
        </Document>
      </Shard>
    </Index>
  </Data>
  <Tests>
    <Test description="Who knows Mahout and preferably used it *while employed at Lucid*?">
      <Query>
        <NestedQuery>
          <!-- testing properties of individual child employment docs -->
          <Query>
            <BooleanQuery>
              <Clause occurs="must">
                <TermsQuery fieldName="skills">mahout</TermsQuery>
              </Clause>
              <Clause occurs="should">
                <TermsQuery fieldName="employer">lucid</TermsQuery>
              </Clause>
            </BooleanQuery>
          </Query>
          <ParentsFilter>
            <TermsFilter fieldName="docType">resume</TermsFilter>
          </ParentsFilter>
        </NestedQuery>
      </Query>
      <ExpectedResults why="Grant's tenure at Lucid is 

[jira] [Created] (LUCENE-3176) TestNRTThreads test failure

2011-06-06 Thread Robert Muir (JIRA)
TestNRTThreads test failure
---

 Key: LUCENE-3176
 URL: https://issues.apache.org/jira/browse/LUCENE-3176
 Project: Lucene - Java
  Issue Type: Bug
 Environment: trunk
Reporter: Robert Muir


hit a fail in TestNRTThreads running tests over and over:





[jira] [Commented] (LUCENE-3176) TestNRTThreads test failure

2011-06-06 Thread Robert Muir (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-3176?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13044840#comment-13044840
 ] 

Robert Muir commented on LUCENE-3176:
-

{noformat}
[junit] Testsuite: org.apache.lucene.index.TestNRTThreads
[junit] Testcase: testNRTThreads(org.apache.lucene.index.TestNRTThreads):   
FAILED
[junit] expected:<8> but was:<18>
[junit] junit.framework.AssertionFailedError: expected:<8> but was:<18>
[junit] at 
org.apache.lucene.index.TestNRTThreads.testNRTThreads(TestNRTThreads.java:515)
[junit] at 
org.apache.lucene.util.LuceneTestCase$LuceneTestCaseRunner.runChild(LuceneTestCase.java:1362)
[junit] at 
org.apache.lucene.util.LuceneTestCase$LuceneTestCaseRunner.runChild(LuceneTestCase.java:1280)
[junit] 
[junit] 
[junit] Tests run: 1, Failures: 1, Errors: 0, Time elapsed: 19.812 sec
[junit] 
[junit] - Standard Output ---
[junit] doc id=157 is supposed to be deleted, but got docID=119
[junit] doc id=82 is supposed to be deleted, but got docID=68
[junit] doc id=83 is supposed to be deleted, but got docID=38
[junit] doc id=80 is supposed to be deleted, but got docID=36
[junit] doc id=81 is supposed to be deleted, but got docID=37
[junit] doc id=67 is supposed to be deleted, but got docID=24
[junit] doc id=69 is supposed to be deleted, but got docID=26
[junit] doc id=68 is supposed to be deleted, but got docID=25
[junit] doc id=672 is supposed to be deleted, but got docID=430
[junit] doc id=444 is supposed to be deleted, but got docID=344
[junit] doc id=441 is supposed to be deleted, but got docID=766
[junit] doc id=442 is supposed to be deleted, but got docID=343
[junit] doc id=443 is supposed to be deleted, but got docID=767
[junit] doc id=70 is supposed to be deleted, but got docID=67
[junit] doc id=71 is supposed to be deleted, but got docID=27
[junit] doc id=72 is supposed to be deleted, but got docID=28
[junit] doc id=73 is supposed to be deleted, but got docID=29
[junit] doc id=74 is supposed to be deleted, but got docID=30
[junit] doc id=75 is supposed to be deleted, but got docID=31
[junit] doc id=76 is supposed to be deleted, but got docID=32
[junit] doc id=219 is supposed to be deleted, but got docID=175
[junit] doc id=662 is supposed to be deleted, but got docID=425
[junit] doc id=663 is supposed to be deleted, but got docID=426
[junit] doc id=218 is supposed to be deleted, but got docID=174
[junit] doc id=361 is supposed to be deleted, but got docID=286
[junit] doc id=362 is supposed to be deleted, but got docID=287
[junit] doc id=360 is supposed to be deleted, but got docID=285
[junit] doc id=366 is supposed to be deleted, but got docID=291
[junit] doc id=365 is supposed to be deleted, but got docID=290
[junit] doc id=364 is supposed to be deleted, but got docID=289
[junit] doc id=363 is supposed to be deleted, but got docID=288
[junit] doc id=368 is supposed to be deleted, but got docID=293
[junit] doc id=367 is supposed to be deleted, but got docID=292
[junit] doc id=518 is supposed to be deleted, but got docID=361
[junit] doc id=517 is supposed to be deleted, but got docID=805
[junit] doc id=220 is supposed to be deleted, but got docID=176
[junit] doc id=324 is supposed to be deleted, but got docID=269
[junit] doc id=322 is supposed to be deleted, but got docID=268
[junit] -  ---
[junit] - Standard Error -
[junit] NOTE: reproduce with: ant test -Dtestcase=TestNRTThreads 
-Dtestmethod=testNRTThreads -Dtests.seed=0:0
[junit] NOTE: test params are: codec=RandomCodecProvider: 
{extra8=MockFixedIntBlock(blockSize=1054), 
extra9=MockVariableIntBlock(baseBlockSize=87), body=MockSep, 
extra0=MockVariableIntBlock(baseBlockSize=87), packID=Pulsing(freqCutoff=16), 
extra1=MockRandom, extra2=Standard, extra3=SimpleText, 
date=MockVariableIntBlock(baseBlockSize=87), extra4=MockSep, 
extra5=Pulsing(freqCutoff=16), extra6=MockFixedIntBlock(blockSize=1054), 
extra7=MockVariableIntBlock(baseBlockSize=87), 
docid=MockVariableIntBlock(baseBlockSize=87), title=SimpleText, 
titleTokenized=Standard}, locale=ar_JO, timezone=Europe/Oslo
[junit] NOTE: all tests run in this JVM:
[junit] [TestSearchForDuplicates, TestMockAnalyzer, TestCheckIndex, 
TestDoc, TestFlex, TestIndexReaderCloneNorms, TestIndexWriterExceptions, 
TestIndexWriterUnicode, TestMultiLevelSkipList, TestNRTThreads]
[junit] NOTE: Mac OS X 10.6.7 x86_64/Apple Inc. 1.6.0_24 
(64-bit)/cpus=4,threads=1,free=41147720,total=85000192
[junit] -  ---
[junit] TEST org.apache.lucene.index.TestNRTThreads FAILED
{noformat}



[jira] [Commented] (LUCENE-2645) False assertion of 0 position delta in StandardPostingsWriterImpl

2011-06-06 Thread David Smiley (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-2645?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13044860#comment-13044860
 ] 

David Smiley commented on LUCENE-2645:
--

Thanks for the test Korusaka. I didn't realize my bug report last year that an 
assert condition's > should become >= was insufficient for a committer to 
simply make the 1-char change. I guess I should work on creating tests for 
nearly everything for my bug reports to get more traction. :-|

 False assertion of 0 position delta in StandardPostingsWriterImpl
 --

 Key: LUCENE-2645
 URL: https://issues.apache.org/jira/browse/LUCENE-2645
 Project: Lucene - Java
  Issue Type: Bug
  Components: core/index
Affects Versions: 4.0
Reporter: David Smiley
Assignee: Michael McCandless
Priority: Minor
 Fix For: 4.0

 Attachments: LuceneTrunkAssertErrorReproducer.java


 StandardPostingsWriterImpl line 159 is:
 {code:java}
 assert delta > 0 || position == 0 || position == -1
     : "position=" + position + " lastPosition=" + lastPosition;
 // not quite right (if pos=0 is repeated twice we don't catch it)
 {code}
 I enable assertions when I run my unit tests and I've found this assertion to 
 fail when delta is 0 which occurs when the same position value is sent in 
 twice in a row.  Once I added RemoveDuplicatesTokenFilter, this problem went 
 away.  Should I really be forced to add this filter?  I think delta >= 0 
 would be a better assertion.
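
As a rough illustration of the relaxed check proposed above, here is a minimal standalone sketch. It is not Lucene's postings writer: the class and method names are hypothetical, and it uses an explicit exception instead of an assert statement so that it behaves the same whether or not the JVM runs with -ea.

```java
import java.util.Arrays;

// Hypothetical sketch, not Lucene's StandardPostingsWriterImpl.
class PositionDeltaSketch {

    // Returns the gap between each position and the one before it.
    // A gap of 0 (two tokens at the same position) is accepted; only a
    // negative gap, meaning positions went backwards, is rejected.
    static int[] deltas(int[] positions) {
        int[] out = new int[positions.length];
        int lastPosition = 0;
        for (int i = 0; i < positions.length; i++) {
            int delta = positions[i] - lastPosition;
            if (delta < 0) {
                throw new IllegalStateException(
                    "position=" + positions[i] + " lastPosition=" + lastPosition);
            }
            out[i] = delta;
            lastPosition = positions[i];
        }
        return out;
    }

    public static void main(String[] args) {
        // Positions 3 and 3 model two tokens stacked on one position.
        System.out.println(Arrays.toString(deltas(new int[] {0, 3, 3, 5})));
    }
}
```

Under the stricter delta > 0 condition the repeated position 3 would trip the check, which is the situation described in this report.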




[jira] [Commented] (SOLR-2564) Integrating grouping module into Solr 4.0

2011-06-06 Thread Yonik Seeley (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-2564?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13044867#comment-13044867
 ] 

Yonik Seeley commented on SOLR-2564:


bq.  The other use-case is more like field collapsing and does change what 
documents match (basically, only the first documents in each group, up to 
limit, match).

bq. I'm not sure it's that simple, ie that we can so cleanly model collapsing 
as reducing the docs to consider and then running faceting on that reduced set.

I *think* that's what was actually implemented in SOLR-236 IIRC, and what some 
people seem to be asking for.

bq. EG, the use case of getting correct facet counts for a field that has 
different values within the group, can't be handled by this approach?

Well, "correct" is a matter of context ;-) (for example, some have called the 
facet counts for the current grouping implementation "incorrect" because it 
didn't happen to match their use case).  Looking at the original description in 
LUCENE-3097, it seems you're talking about Martijn's 3rd method, while I was 
talking about the 2nd.  But maybe some people that were originally advocating 
for #2, really wanted #3?






[jira] [Commented] (SOLR-2399) Solr Admin Interface, reworked

2011-06-06 Thread Joan Codina (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-2399?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13044895#comment-13044895
 ] 

Joan Codina commented on SOLR-2399:
---

I made some changes to the current version of the Schema-Browser some time 
ago; you can find them in SOLR-2440. 
It has some features that I found interesting:
A. Drill down, so you can select a word in the list of most common words and 
perform a query.
B. Select the list of fields to be the output of the query.

It also sorts the field names and shows them in alphabetical order, not 
capitalised.


 Solr Admin Interface, reworked
 --

 Key: SOLR-2399
 URL: https://issues.apache.org/jira/browse/SOLR-2399
 Project: Solr
  Issue Type: Improvement
  Components: web gui
Reporter: Stefan Matheis (steffkes)
Assignee: Ryan McKinley
Priority: Minor
 Fix For: 4.0

 Attachments: SOLR-2399-110603-2.patch, SOLR-2399-110603.patch, 
 SOLR-2399-admin-interface.patch, SOLR-2399-fluid-width.patch, 
 SOLR-2399-wip-notice.patch, SOLR-2399.patch


 *The idea was to create a new, fresh (and hopefully clean) Solr Admin 
 Interface.* [Based on this 
 [ML-Thread|http://www.lucidimagination.com/search/document/ae35e236d29d225e/solr_admin_interface_reworked_go_on_go_away]]
 *Features:*
 * [Dashboard|http://files.mathe.is/solr-admin/01_dashboard.png]
 * [Query-Form|http://files.mathe.is/solr-admin/02_query.png]
 * [Plugins|http://files.mathe.is/solr-admin/05_plugins.png]
 * [Analysis|http://files.mathe.is/solr-admin/04_analysis.png] (SOLR-2476, 
 SOLR-2400)
 * [Schema-Browser|http://files.mathe.is/solr-admin/06_schema-browser.png]
 * [Dataimport|http://files.mathe.is/solr-admin/08_dataimport.png] (SOLR-2482)
 * [Core-Admin|http://files.mathe.is/solr-admin/09_coreadmin.png]
 * [Replication|http://files.mathe.is/solr-admin/10_replication.png]
 * [Zookeeper|http://files.mathe.is/solr-admin/11_cloud.png]
 * [Logging|http://files.mathe.is/solr-admin/07_logging.png] (SOLR-2459)
 ** Stub (using static data)
 (!) As Erick pointed out .. Chrome's XML-Capabilities are a bit odd, so it 
 does not render Raw-XML-Data (like we're using for displaying the Schema and 
 Config-File) -- instead it looks like this: 
 http://files.mathe.is/solr-admin/00_chrome-xml.png ; so it would be really 
 nice, to see the 
 [xinclude-Interface|http://files.mathe.is/solr-admin/xinclude/] there :)
 Newly created Wiki-Page: http://wiki.apache.org/solr/ReworkedSolrAdminGUI
 I've quickly created a Github-Repository (Just for me, to keep track of the 
 changes)
 » https://github.com/steffkes/solr-admin




[jira] [Updated] (SOLR-2462) Using spellcheck.collate can result in extremely high memory usage

2011-06-06 Thread James Dyer (JIRA)

 [ 
https://issues.apache.org/jira/browse/SOLR-2462?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

James Dyer updated SOLR-2462:
-

Attachment: SOLR-2462.patch

I guess I should have run that one myself too.  This test is very similar to 
the ones in SpellCheckCollatorTest.  I guess while the ones in SCCT test 
whether or not it can collate properly, TSCR checks that the response it sends 
back is proper.

In any case, this is just another one of my brittle tests!  Because we're using 
a different comparator, results with tied scores don't come back exactly the 
same as before.  So now this test needs more than 5 tries to find the 2nd valid 
collation.  I up'ed it from 5 to 10 and now it passes.

 Using spellcheck.collate can result in extremely high memory usage
 --

 Key: SOLR-2462
 URL: https://issues.apache.org/jira/browse/SOLR-2462
 Project: Solr
  Issue Type: Bug
  Components: spellchecker
Affects Versions: 3.1
Reporter: James Dyer
Assignee: Robert Muir
Priority: Critical
 Fix For: 3.1.1, 4.0

 Attachments: SOLR-2462.patch, SOLR-2462.patch, SOLR-2462.patch, 
 SOLR-2462.patch, SOLR-2462.patch, SOLR-2462.patch, SOLR-2462.patch, 
 SOLR-2462.patch, SOLR-2462.patch, SOLR-2462_3_1.patch


 When using spellcheck.collate, class SpellPossibilityIterator creates a 
 ranked list of *every* possible correction combination.  But if returning 
 several corrections per term, and if several words are misspelled, the 
 existing algorithm uses a huge amount of memory.
 This bug was introduced with SOLR-2010.  However, it is triggered anytime 
 spellcheck.collate is used.  It is not necessary to use any features that 
 were added with SOLR-2010.
 We were in Production with Solr for 1 1/2 days and this bug started taking 
 our Solr servers down with infinite GC loops.  It was pretty easy for this 
 to happen as occasionally a user will accidentally paste the URL into the 
 Search box on our app.  This URL results in a search with ~12 misspelled 
 words.  We have spellcheck.count set to 15. 
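
To get a feel for the numbers in this report, the following sketch works out how many correction combinations a full enumeration would produce. It assumes that each of the ~12 misspelled words really receives the full 15 suggestions allowed by spellcheck.count (the setting is only an upper bound) and that every combination is materialized; the class and method names are made up for the example.

```java
import java.math.BigInteger;

// Hypothetical back-of-the-envelope calculation, not Solr code.
class CollationCombinationsSketch {

    // Number of ways to pick one suggestion for each misspelled term.
    static BigInteger combinations(int suggestionsPerTerm, int misspelledTerms) {
        return BigInteger.valueOf(suggestionsPerTerm).pow(misspelledTerms);
    }

    public static void main(String[] args) {
        // 15 suggestions per term, 12 misspelled terms, as in the report.
        System.out.println(combinations(15, 12)); // 15^12 = 129746337890625
    }
}
```

At roughly 1.3 * 10^14 combinations, holding a ranked list of all of them in memory is not feasible, which is consistent with the GC behaviour described here.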




[jira] [Updated] (SOLR-2571) IndexBasedSpellChecker thresholdTokenFrequency fails with a ClassCastException on startup

2011-06-06 Thread James Dyer (JIRA)

 [ 
https://issues.apache.org/jira/browse/SOLR-2571?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

James Dyer updated SOLR-2571:
-

Attachment: SOLR-2571.patch

This version takes all of DirectSolrSpellChecker's parameters as Integer and 
Float objects rather than Strings, as appropriate.  Also, I changed the 
accuracy parameter to use SpellingParams.SPELLCHECK_ACCURACY ... I'm not sure 
if this would have validated any unit tests (I didn't see any tests that use 
DirectSolrSpellChecker).

I think this will make DirectSolrSpellChecker more consistent with the rest of 
solrconfig.xml's parameter requirements.  The only better option than this, 
maybe, would be to make it flexible and allow either the Int/Float or String in 
these cases.  I think this latter option is not necessary, however.

 IndexBasedSpellChecker thresholdTokenFrequency fails with a 
 ClassCastException on startup
 ---

 Key: SOLR-2571
 URL: https://issues.apache.org/jira/browse/SOLR-2571
 Project: Solr
  Issue Type: Bug
  Components: spellchecker
Affects Versions: 1.4.1, 3.1, 4.0
Reporter: James Dyer
Priority: Minor
  Labels: whereIsHossManWhenYouNeedHim
 Fix For: 3.3, 4.0

 Attachments: SOLR-2571.patch, SOLR-2571.patch, SOLR-2571.patch, 
 SOLR-2571.solr3.2.patch


 When parsing the configuration for thresholdTokenFrequency, the 
 IndexBasedSpellChecker tries to pull a Float from the DataConfig.xml-derived 
 NamedList.  However, this comes through as a String.  Therefore, a 
 ClassCastException is always thrown whenever this parameter is specified.  
 The code ought to be doing Float.parseFloat(...) on the value.
 This looks like a nice feature to use in cases where the data contains 
 misspelled or rare words leading to spurious correct queries.  I would have 
 liked to have used this with a project we just completed; however, this bug 
 prevented that.  This issue came up recently in the User's mailing list so I 
 am raising an issue now.
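
The failure mode and the suggested Float.parseFloat(...) fix can be reproduced outside Solr. The sketch below is not the attached patch: ThresholdParseSketch and toFloat are hypothetical names, and the configuration value is modelled as a plain Object rather than a real NamedList entry.

```java
// Hypothetical helper, not the committed Solr patch.
class ThresholdParseSketch {

    // Accepts a value that arrived either as a Number or as a String.
    static float toFloat(Object value) {
        if (value instanceof Number) {
            return ((Number) value).floatValue();
        }
        if (value instanceof String) {
            return Float.parseFloat(((String) value).trim());
        }
        throw new IllegalArgumentException("not a float: " + value);
    }

    public static void main(String[] args) {
        Object raw = "0.01"; // per the report, the value comes through as a String
        try {
            Float broken = (Float) raw; // the failing pattern: a blind cast
            System.out.println(broken);
        } catch (ClassCastException e) {
            System.out.println("cast failed");
        }
        System.out.println(toFloat(raw)); // parsing the String works
    }
}
```

Accepting both Number and String is the flexible variant; parsing only the String form would be enough to address the exception described in this issue.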




[jira] [Commented] (LUCENE-2645) False assertion of 0 position delta in StandardPostingsWriterImpl

2011-06-06 Thread Michael McCandless (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-2645?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13044918#comment-13044918
 ] 

Michael McCandless commented on LUCENE-2645:


While test cases are always welcome, they certainly are not necessary in a 
patch (Yonik's Law of Patches).

Which issue had you opened before?  Somehow it fell through the cracks... 
which, unfortunately, happens all the time in open-source.  Best to bump/gently 
nag on important fixes...




[jira] [Commented] (SOLR-2571) IndexBasedSpellChecker thresholdTokenFrequency fails with a ClassCastException on startup

2011-06-06 Thread James Dyer (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-2571?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13044917#comment-13044917
 ] 

James Dyer commented on SOLR-2571:
--

{quote}
what makes this 'decision' of correctlySpelled? Do you know?
{quote}

I took a quick look to find out.  It's more complicated than I thought!  Here's 
the basic gist (I think!) :
 - If the instance of SolrSpellChecker returns frequency data and all 
suggestions have frequency > 0, TRUE.
 - If the instance of SolrSpellChecker returns frequency data and any 
suggestion has frequency == 0, FALSE.
 - If the instance of SolrSpellChecker returns NO frequency data but has 
suggestions, OMIT.
 - If the instance of SolrSpellChecker returns NO suggestions, FALSE. 

Possibly this isn't fully accurate but I'm at least mostly correct here.  Seems 
like the discrepancy with DirectSolrSpellChecker is because it isn't returning 
frequency info?

This all happens in SpellCheckComponent.toNamedList() ... I'm guessing the code 
here uses the presence or absence of frequency data as kind of a proxy 
indicator of whether or not it's dealing with IndexBasedSpellChecker or 
FileBasedSpellChecker.  Possibly it would be better if each instance of 
SolrSpellChecker had an isCorrectlySpelled() method that toNamedList() could 
call?  Maybe I should go open another jira issue for that?
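
The four rules listed above can be written down as a small decision function. This models only the summary given in this comment, which is itself flagged as possibly inaccurate; it is not the code in SpellCheckComponent.toNamedList(). The Verdict enum, the decide method, and the choice to test the "no suggestions" rule first are all assumptions made for the sketch.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical model of the rules as summarized in this comment.
class CorrectlySpelledSketch {

    enum Verdict { TRUE, FALSE, OMIT }

    // frequencies == null means the checker returned no frequency data;
    // otherwise it holds one frequency per suggestion.
    static Verdict decide(int suggestionCount, List<Integer> frequencies) {
        if (suggestionCount == 0) {
            return Verdict.FALSE; // no suggestions at all (assumed to win over other rules)
        }
        if (frequencies == null) {
            return Verdict.OMIT; // suggestions, but no frequency data
        }
        for (int f : frequencies) {
            if (f == 0) {
                return Verdict.FALSE; // some suggestion has frequency 0
            }
        }
        return Verdict.TRUE; // every suggestion has frequency > 0
    }

    public static void main(String[] args) {
        System.out.println(decide(2, Arrays.asList(3, 1)));
        System.out.println(decide(2, Arrays.asList(3, 0)));
        System.out.println(decide(2, null));
        System.out.println(decide(0, null));
    }
}
```

A per-checker isCorrectlySpelled() method, as floated above, would replace the null check on frequency data with a decision owned by each spell checker implementation.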





[jira] [Assigned] (LUCENE-3176) TestNRTThreads test failure

2011-06-06 Thread Michael McCandless (JIRA)

 [ 
https://issues.apache.org/jira/browse/LUCENE-3176?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Michael McCandless reassigned LUCENE-3176:
--

Assignee: Michael McCandless

 TestNRTThreads test failure
 ---

 Key: LUCENE-3176
 URL: https://issues.apache.org/jira/browse/LUCENE-3176
 Project: Lucene - Java
  Issue Type: Bug
 Environment: trunk
Reporter: Robert Muir
Assignee: Michael McCandless

 hit a fail in TestNRTThreads running tests over and over:




[jira] [Commented] (LUCENE-2645) False assertion of 0 position delta in StandardPostingsWriterImpl

2011-06-06 Thread David Smiley (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-2645?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13044926#comment-13044926
 ] 

David Smiley commented on LUCENE-2645:
--

bq. Which issue had you opened before?

This one! ;-) -- But if you want to give Korusaka credit for it because he 
submitted a patch then fine. He went the extra mile that I didn't think was 
necessary.




[jira] [Commented] (SOLR-2564) Integrating grouping module into Solr 4.0

2011-06-06 Thread Michael McCandless (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-2564?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13044927#comment-13044927
 ] 

Michael McCandless commented on SOLR-2564:
--

I think a good criterion for "correct" is: if you were to click through on the 
facet (ie, take the current query and add a filter on facet field = facet 
value), would the hit count you see match the facet count you were just looking 
at?

Ie, drill down should be consistent.

Both approaches will give the same facet counts if the field never varies 
within the group (ie, the field belongs to the parent docs); it's only 
child fields where you need faceting to be aware of the groups, so for apps 
that never display facets on child fields, only computing facets on the group 
heads will work.

I suspect doc blocks will be the only practical way to implement faceting on 
child fields efficiently.
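
The drill-down criterion can be sketched with plain collections. This is not Solr or Lucene code: the Doc class, the sample data, and both counting methods are hypothetical, and the sketch assumes the count shown to the user after drilling down is the number of matching groups.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Hypothetical sketch; none of these names exist in Solr or Lucene.
final class DrillDownCheck {

    static final class Doc {
        final String group;  // grouping key
        final String color;  // a "child" field that may vary within a group
        final boolean head;  // true for the most relevant doc of its group

        Doc(String group, String color, boolean head) {
            this.group = group;
            this.color = color;
            this.head = head;
        }
    }

    // Facet count computed from group heads only.
    static int headOnlyCount(List<Doc> docs, String color) {
        int n = 0;
        for (Doc d : docs) {
            if (d.head && d.color.equals(color)) n++;
        }
        return n;
    }

    // Number of groups left after filtering on the facet value (the drill down).
    static int drillDownGroupCount(List<Doc> docs, String color) {
        Set<String> groups = new HashSet<>();
        for (Doc d : docs) {
            if (d.color.equals(color)) groups.add(d.group);
        }
        return groups.size();
    }

    static List<Doc> sample() {
        return Arrays.asList(
            new Doc("shirt", "red", true),
            new Doc("shirt", "blue", false),
            new Doc("hat", "blue", true));
    }

    public static void main(String[] args) {
        List<Doc> docs = sample();
        System.out.println("blue, heads only: " + headOnlyCount(docs, "blue"));
        System.out.println("blue, drill down: " + drillDownGroupCount(docs, "blue"));
    }
}
```

With this sample data the two counts for "blue" differ because the field varies inside the "shirt" group, while the counts for "red" agree.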

 Integrating grouping module into Solr 4.0
 -

 Key: SOLR-2564
 URL: https://issues.apache.org/jira/browse/SOLR-2564
 Project: Solr
  Issue Type: Improvement
Reporter: Martijn van Groningen
Assignee: Martijn van Groningen
 Fix For: 4.0

 Attachments: LUCENE-2564.patch, SOLR-2564.patch, SOLR-2564.patch, 
 SOLR-2564.patch, SOLR-2564.patch


 Since work on the grouping module is going well, I think it is time to wire this 
 up in Solr.
 Besides the current grouping features Solr provides, Solr will then also 
 support second pass caching and total count based on groups.




[jira] [Commented] (LUCENE-2645) False assertion of 0 position delta in StandardPostingsWriterImpl

2011-06-06 Thread Michael McCandless (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-2645?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13044928#comment-13044928
 ] 

Michael McCandless commented on LUCENE-2645:


D'oh!  Woops :)  I didn't see that you had opened this issue!  And I missed it 
from last September... sorry :(

I will add you to CHANGES.

And no that extra mile is not necessary.  Just some gentle nagging would help 
stuff not fall past the event horizons on our todo lists :)




[jira] [Commented] (LUCENE-2645) False assertion of 0 position delta in StandardPostingsWriterImpl

2011-06-06 Thread KuroSaka TeruHiko (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-2645?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13044929#comment-13044929
 ] 

KuroSaka TeruHiko commented on LUCENE-2645:
---

Thank you, Michael, for the quick fix, and David, for initially reporting this bug 
and giving me credit :-)





[jira] [Commented] (LUCENE-2645) False assertion of 0 position delta in StandardPostingsWriterImpl

2011-06-06 Thread Michael McCandless (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-2645?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13044930#comment-13044930
 ] 

Michael McCandless commented on LUCENE-2645:


Thank you both :)




[jira] [Commented] (SOLR-2564) Integrating grouping module into Solr 4.0

2011-06-06 Thread Martijn van Groningen (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-2564?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13044940#comment-13044940
 ] 

Martijn van Groningen commented on SOLR-2564:
-

bq. But: why not just use Double.MAX_VALUE?
Yes, I should have used that and I'll change that. I thought that the size was 
initially used to create the underlying array. But it isn't! The array inside 
the caching collector initially starts with a length of 128 and grows when needed.

How I've currently implemented LUCENE-3097 is that it will only get the most 
relevant document of each group. In terms of SOLR-236 that is the same as using 
collapse.threshold=1. I think what Yonik means is increasing the threshold so 
more documents end up in the docset that is eventually used by the facet 
component. Increasing this threshold also means setting when to start to 
collapse. So setting collapse.threshold=3 means that collapsing starts from the 
4th document. I think that the whole collapse.threshold feature doesn't scale 
very well.

Anyway, I think when we go wire the 2nd method (LUCENE-3097) into Solr, we 
should first make it work for the most relevant group documents.




[jira] [Commented] (SOLR-1844) CommonGramsQueryFilterFactory should read words in a comma-delimited format

2011-06-06 Thread Steven Rowe (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-1844?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13044971#comment-13044971
 ] 

Steven Rowe commented on SOLR-1844:
---

Hi David,

The link in the description is dead - this one mentions the new400common.txt 
file: http://www.hathitrust.org/node/181 but I'm not sure it's what you were 
after.

Looks like this is the sample you're talking about: 
http://www.hathitrust.org/blogs/large-scale-search/common-word-list-commongrams 
- I can see the comma-delimited values there.

Would you care to make a patch?

 CommonGramsQueryFilterFactory should read words in a comma-delimited format
 ---

 Key: SOLR-1844
 URL: https://issues.apache.org/jira/browse/SOLR-1844
 Project: Solr
  Issue Type: Improvement
  Components: Schema and Analysis
Affects Versions: 1.4
Reporter: David Smiley
Priority: Minor

 CommonGramsQueryFilterFactory expects that the file(s) given to the words 
 argument is a carriage-return delimited list of words.  It doesn't support 
 comments either.  This file format should be more flexible to support comma 
 delimited values.  I came across this because I was trying to use the sample 
 file provided by HathiTrust:
 http://www.hathitrust.org/node/180 (named in a file new400common.txt)




[jira] [Commented] (LUCENE-3176) TestNRTThreads test failure

2011-06-06 Thread Robert Muir (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-3176?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13044978#comment-13044978
 ] 

Robert Muir commented on LUCENE-3176:
-

on my machine: this one is tough to reproduce.

if I run the test by itself, it seems to pass.

however, if my machine is busy (e.g. running ant test-core -Dtests.seed=0:0), 
then it fails!

 TestNRTThreads test failure
 ---

 Key: LUCENE-3176
 URL: https://issues.apache.org/jira/browse/LUCENE-3176
 Project: Lucene - Java
  Issue Type: Bug
 Environment: trunk
Reporter: Robert Muir
Assignee: Michael McCandless

 hit a fail in TestNRTThreads running tests over and over:




[jira] [Commented] (LUCENE-3176) TestNRTThreads test failure

2011-06-06 Thread Jason Rutherglen (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-3176?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13044981#comment-13044981
 ] 

Jason Rutherglen commented on LUCENE-3176:
--

It's probably the new DWPT code.  There was a specific issue to fix this 
problem LUCENE-2956.




[jira] [Commented] (SOLR-1844) CommonGramsQueryFilterFactory should read words in a comma-delimited format

2011-06-06 Thread David Smiley (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-1844?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13044988#comment-13044988
 ] 

David Smiley commented on SOLR-1844:


On second thought, I think the current behavior is fine because it's consistent 
with the other filters that need lists of words since they all share the same 
code to do it -- BaseTokenStreamFactory.getWordSet(...). If any change should 
happen, it should happen there. I'm fine with this issue being closed as 
Won't-Fix.  It was easy enough for me to simply replace the commas in Hathi's 
file with a carriage return.
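
The manual conversion described above could be scripted along these lines. This is only a sketch under assumptions: the class and method names are made up, the input is assumed to be a plain comma-delimited string with no quoting or comments, and the output uses one word per line.

```java
// Hypothetical helper; not part of Solr.
final class CommaListToLines {

    // Splits on commas, trims whitespace, skips empty entries,
    // and writes one word per line.
    static String toOnePerLine(String commaDelimited) {
        StringBuilder out = new StringBuilder();
        for (String word : commaDelimited.split(",")) {
            String trimmed = word.trim();
            if (!trimmed.isEmpty()) {
                out.append(trimmed).append('\n');
            }
        }
        return out.toString();
    }

    public static void main(String[] args) {
        System.out.print(toOnePerLine("the, of, and,to"));
    }
}
```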




[jira] [Resolved] (SOLR-1844) CommonGramsQueryFilterFactory should read words in a comma-delimited format

2011-06-06 Thread Steven Rowe (JIRA)

 [ 
https://issues.apache.org/jira/browse/SOLR-1844?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Steven Rowe resolved SOLR-1844.
---

Resolution: Won't Fix
  Assignee: Steven Rowe

Thanks David.




[jira] [Commented] (SOLR-2564) Integrating grouping module into Solr 4.0

2011-06-06 Thread Yonik Seeley (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-2564?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13045002#comment-13045002
 ] 

Yonik Seeley commented on SOLR-2564:


Browsing around this a bit more... the existing solr code selected the string 
based collectors for any ValueSource of StrFieldSource.  This patch resorts to 
exact getClass() checks against string and text fields, which won't match in as 
many cases (either derived fields, or user custom fields that don't derive from 
either of these field types).
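
A small sketch of why an exact getClass() check matches fewer cases than a type test. The nested classes are stand-ins named for illustration only; they are not Solr's field type classes, and MyCustomStrField is a hypothetical user-defined subclass.

```java
// Hypothetical stand-ins; these are not Solr's classes.
final class FieldTypeCheck {

    static class FieldType {}
    static class StrField extends FieldType {}
    static class MyCustomStrField extends StrField {} // user-defined subclass
    static class UnrelatedField extends FieldType {}  // does not derive from StrField

    // Exact class comparison: subclasses do not match.
    static boolean exactMatch(FieldType ft) {
        return ft.getClass() == StrField.class;
    }

    // Type test: subclasses match as well.
    static boolean instanceMatch(FieldType ft) {
        return ft instanceof StrField;
    }

    public static void main(String[] args) {
        FieldType custom = new MyCustomStrField();
        System.out.println("exact match:    " + exactMatch(custom));
        System.out.println("instance match: " + instanceMatch(custom));
    }
}
```

Note that a type test only covers derived fields. Custom fields that do not derive from the string or text types, the second case mentioned above, are missed by both checks in this sketch.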




[jira] [Updated] (SOLR-2399) Solr Admin Interface, reworked

2011-06-06 Thread Stefan Matheis (steffkes) (JIRA)

 [ 
https://issues.apache.org/jira/browse/SOLR-2399?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Stefan Matheis (steffkes) updated SOLR-2399:


Attachment: SOLR-2399-sorting-fields.patch
SOLR-2399-analysis-stopwords.patch

Erick,

bq. This one is odd, adding stopwords seems to break analysis...

Hm, did not handle removed tokens correctly. Patch attached

bq. Oh, for the drop-down for choosing fields or types, would it be possible to 
order them alphabetically like the schema browser?

Yes, Patch attached too.

bq. BTW, this whole effort is a long-needed makeover, I'm glad you've taken it 
on. Can I do something other than complain?

Thanks :) For me, it's just fine .. continue using the interface, as you 
normally would use it. We need just more feedback, other usecases, other 
schema-/field-definitions, ... -- to see where things are not working as 
expected!

 Solr Admin Interface, reworked
 --

 Key: SOLR-2399
 URL: https://issues.apache.org/jira/browse/SOLR-2399
 Project: Solr
  Issue Type: Improvement
  Components: web gui
Reporter: Stefan Matheis (steffkes)
Assignee: Ryan McKinley
Priority: Minor
 Fix For: 4.0

 Attachments: SOLR-2399-110603-2.patch, SOLR-2399-110603.patch, 
 SOLR-2399-admin-interface.patch, SOLR-2399-analysis-stopwords.patch, 
 SOLR-2399-fluid-width.patch, SOLR-2399-sorting-fields.patch, 
 SOLR-2399-wip-notice.patch, SOLR-2399.patch


 *The idea was to create a new, fresh (and hopefully clean) Solr Admin 
 Interface.* [Based on this 
 [ML-Thread|http://www.lucidimagination.com/search/document/ae35e236d29d225e/solr_admin_interface_reworked_go_on_go_away]]
 *Features:*
 * [Dashboard|http://files.mathe.is/solr-admin/01_dashboard.png]
 * [Query-Form|http://files.mathe.is/solr-admin/02_query.png]
 * [Plugins|http://files.mathe.is/solr-admin/05_plugins.png]
 * [Analysis|http://files.mathe.is/solr-admin/04_analysis.png] (SOLR-2476, 
 SOLR-2400)
 * [Schema-Browser|http://files.mathe.is/solr-admin/06_schema-browser.png]
 * [Dataimport|http://files.mathe.is/solr-admin/08_dataimport.png] (SOLR-2482)
 * [Core-Admin|http://files.mathe.is/solr-admin/09_coreadmin.png]
 * [Replication|http://files.mathe.is/solr-admin/10_replication.png]
 * [Zookeeper|http://files.mathe.is/solr-admin/11_cloud.png]
 * [Logging|http://files.mathe.is/solr-admin/07_logging.png] (SOLR-2459)
 ** Stub (using static data)
 (!) As Erick pointed out .. Chrome's XML-Capabilities are a bit odd, so it 
 does not render Raw-XML-Data (like we're using for displaying the Schema and 
 Config-File) -- instead it looks like this: 
 http://files.mathe.is/solr-admin/00_chrome-xml.png ; so it would be really 
 nice, to see the 
 [xinclude-Interface|http://files.mathe.is/solr-admin/xinclude/] there :)
 Newly created Wiki-Page: http://wiki.apache.org/solr/ReworkedSolrAdminGUI
 I've quickly created a Github-Repository (Just for me, to keep track of the 
 changes)
 » https://github.com/steffkes/solr-admin




[jira] [Commented] (SOLR-2399) Solr Admin Interface, reworked

2011-06-06 Thread Stefan Matheis (steffkes) (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-2399?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13045017#comment-13045017
 ] 

Stefan Matheis (steffkes) commented on SOLR-2399:
-

Joan,

bq. A: Drill down, so you can select a word in the list of most common words 
and perform a query.
Added to the list, will extend the Query-Form so that's possible to predefine 
Field-Values

bq. B. Select the list of fields to be the output of the query.
Regarding your patch, that's directly related, right? Selected Field will be 
used for the query and is the only listed field for the {{fl=}} param. Never 
thought about that, thanks - will add this also.

bq. apart from sorting and showing the field names in alphabetical order and 
not capitalised.
That's the current state, the values are just taken from the /admin/luke-Handler




[jira] [Commented] (SOLR-2399) Solr Admin Interface, reworked

2011-06-06 Thread Ryan McKinley (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-2399?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13045019#comment-13045019
 ] 

Ryan McKinley commented on SOLR-2399:
-

check revision: 1132724




[jira] [Commented] (SOLR-2564) Integrating grouping module into Solr 4.0

2011-06-06 Thread Yonik Seeley (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-2564?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13045027#comment-13045027
 ] 

Yonik Seeley commented on SOLR-2564:


I've been checking out the performance, and it generally seems fine.  But of 
course we normally short circuit based on comparators and often don't get 
beyond that... so to exercise & isolate the rest of the code, I tried a 
worst-case scenario where the short circuit wouldn't work (sort=_docid_ desc) 
and solr trunk with this patch is ~16% slower than without it.  Any ideas what 
the problem might be?

{code}
http://localhost:8983/solr/select?q=*:*&sort=_docid_ desc&group=true&group.cacheMB=0&group.field=single1000_i
{code}

Note: the single1000_i field is a single valued int field with 1000 unique 
values




[jira] [Commented] (SOLR-2462) Using spellcheck.collate can result in extremely high memory usage

2011-06-06 Thread Robert Muir (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-2462?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13045028#comment-13045028
 ] 

Robert Muir commented on SOLR-2462:
---

Thanks for the explanation and updated patch James... I'll test this out 
shortly!

 Using spellcheck.collate can result in extremely high memory usage
 --

 Key: SOLR-2462
 URL: https://issues.apache.org/jira/browse/SOLR-2462
 Project: Solr
  Issue Type: Bug
  Components: spellchecker
Affects Versions: 3.1
Reporter: James Dyer
Assignee: Robert Muir
Priority: Critical
 Fix For: 3.1.1, 4.0

 Attachments: SOLR-2462.patch, SOLR-2462.patch, SOLR-2462.patch, 
 SOLR-2462.patch, SOLR-2462.patch, SOLR-2462.patch, SOLR-2462.patch, 
 SOLR-2462.patch, SOLR-2462.patch, SOLR-2462_3_1.patch


 When using spellcheck.collate, class SpellPossibilityIterator creates a 
 ranked list of *every* possible correction combination.  But if returning 
 several corrections per term, and if several words are misspelled, the 
 existing algorithm uses a huge amount of memory.
 This bug was introduced with SOLR-2010.  However, it is triggered anytime 
 spellcheck.collate is used.  It is not necessary to use any features that 
 were added with SOLR-2010.
 We were in Production with Solr for 1 1/2 days and this bug started taking 
 our Solr servers down with infinite GC loops.  It was pretty easy for this 
 to happen as occasionally a user will accidentally paste the URL into the 
 Search box on our app.  This URL results in a search with ~12 misspelled 
 words.  We have spellcheck.count set to 15. 
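
 A rough feel for the size of the problem, as a sketch: it assumes every misspelled word yields the full spellcheck.count candidates and that every combination is enumerated, as described above. The class and method names are hypothetical.

```java
// Hypothetical back-of-the-envelope calculation; not Solr code.
final class CollationCombinations {

    // Number of correction combinations when each misspelled word
    // has the same number of candidate suggestions.
    static long combinations(int suggestionsPerWord, int misspelledWords) {
        long total = 1;
        for (int i = 0; i < misspelledWords; i++) {
            total = Math.multiplyExact(total, (long) suggestionsPerWord); // throws on overflow
        }
        return total;
    }

    public static void main(String[] args) {
        // 15 suggestions for each of 12 misspelled words
        System.out.println(combinations(15, 12)); // prints 129746337890625
    }
}
```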




[jira] [Commented] (SOLR-2571) IndexBasedSpellChecker thresholdTokenFrequency fails with a ClassCastException on startup

2011-06-06 Thread Robert Muir (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-2571?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13045029#comment-13045029
 ] 

Robert Muir commented on SOLR-2571:
---

{quote}
This version takes all of DirectSolrSpellChecker's parameters as Integer and 
Float objects rather than Strings, as appropriate. 
{quote}

Did you maybe upload an older patch? I took a look and it only seems to cutover 
the threshold param.

{quote}
I'm not sure if this would have validated any unit tests (I didn't see any 
tests that use DirectSolrSpellChecker).
{quote}

There is a test (DirectSolrSpellCheckerTest), but it's probably not that great :)


 IndexBasedSpellChecker thresholdTokenFrequency fails with a 
 ClassCastException on startup
 ---

 Key: SOLR-2571
 URL: https://issues.apache.org/jira/browse/SOLR-2571
 Project: Solr
  Issue Type: Bug
  Components: spellchecker
Affects Versions: 1.4.1, 3.1, 4.0
Reporter: James Dyer
Priority: Minor
  Labels: whereIsHossManWhenYouNeedHim
 Fix For: 3.3, 4.0

 Attachments: SOLR-2571.patch, SOLR-2571.patch, SOLR-2571.patch, 
 SOLR-2571.solr3.2.patch


 When parsing the configuration for thresholdTokenFrequency, the 
 IndexBasedSpellChecker tries to pull a Float from the DataConfig.xml-derived 
 NamedList.  However, this comes through as a String.  Therefore, a 
 ClassCastException is always thrown whenever this parameter is specified.  
 The code ought to be doing Float.parseFloat(...) on the value.
 This looks like a nice feature to use in cases the data contains misspelled 
 or rare words leading to spurious correct queries.  I would have liked to 
 have used this with a project we just completed; however, this bug prevented 
 that.  This issue came up recently in the User's mailing list so I am raising 
 an issue now.
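
 The failure and the suggested fix can be reproduced in isolation. This sketch uses a plain java.util.Map as a stand-in for the NamedList, and the class name, method names, and the 0.0f default are assumptions for illustration, not the IndexBasedSpellChecker code.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical sketch; a Map stands in for the config-derived NamedList.
final class ThresholdParsing {

    // Mimics the failing pattern: casting a String value to Float.
    static boolean castFails(Map<String, Object> params) {
        try {
            Float ignored = (Float) params.get("thresholdTokenFrequency");
            return false;
        } catch (ClassCastException e) {
            return true;
        }
    }

    // Tolerant read: accepts a Number or a String; the default is an assumption.
    static float readThreshold(Map<String, Object> params) {
        Object raw = params.get("thresholdTokenFrequency");
        if (raw == null) {
            return 0.0f;
        }
        if (raw instanceof Number) {
            return ((Number) raw).floatValue();
        }
        return Float.parseFloat(raw.toString());
    }

    public static void main(String[] args) {
        Map<String, Object> params = new HashMap<>();
        params.put("thresholdTokenFrequency", "0.01"); // arrives as a String
        System.out.println("cast fails: " + castFails(params));
        System.out.println("parsed:     " + readThreshold(params));
    }
}
```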




[jira] [Commented] (SOLR-2571) IndexBasedSpellChecker thresholdTokenFrequency fails with a ClassCastException on startup

2011-06-06 Thread Robert Muir (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-2571?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13045031#comment-13045031
 ] 

Robert Muir commented on SOLR-2571:
---

{quote}
Possibly this isn't fully accurate but I'm at least mostly correct here. Seems 
like the discrepancy with DirectSolrSpellChecker is because it isn't returning 
Frequency info?
{quote}

This sounds like a bug, care to open a separate issue on it? (we can resolve 
the int/float stuff here on this one).

The thing certainly intends to return freq info...
{noformat}
SuggestWord[] suggestions = checker.suggestSimilar(new Term(field, token.toString()),
    options.count, options.reader, options.onlyMorePopular, accuracy);
for (SuggestWord suggestion : suggestions)
  result.add(token, suggestion.string, suggestion.freq);
{noformat}

 IndexBasedSpellChecker thresholdTokenFrequency fails with a 
 ClassCastException on startup
 ---

 Key: SOLR-2571
 URL: https://issues.apache.org/jira/browse/SOLR-2571
 Project: Solr
  Issue Type: Bug
  Components: spellchecker
Affects Versions: 1.4.1, 3.1, 4.0
Reporter: James Dyer
Priority: Minor
  Labels: whereIsHossManWhenYouNeedHim
 Fix For: 3.3, 4.0

 Attachments: SOLR-2571.patch, SOLR-2571.patch, SOLR-2571.patch, 
 SOLR-2571.solr3.2.patch


 When parsing the configuration for thresholdTokenFrequency, the 
 IndexBasedSpellChecker tries to pull a Float from the DataConfig.xml-derived 
 NamedList.  However, this comes through as a String.  Therefore, a 
 ClassCastException is always thrown whenever this parameter is specified.  
 The code ought to be doing Float.parseFloat(...) on the value.
 This looks like a nice feature to use in cases where the data contains 
 misspelled or rare words leading to spurious correct queries.  I would have 
 liked to have used this with a project we just completed; however, this bug 
 prevented that.  This issue came up recently in the User's mailing list, so I 
 am raising an issue now.




[jira] [Updated] (SOLR-2571) IndexBasedSpellChecker thresholdTokenFrequency fails with a ClassCastException on startup

2011-06-06 Thread James Dyer (JIRA)

 [ 
https://issues.apache.org/jira/browse/SOLR-2571?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

James Dyer updated SOLR-2571:
-

Attachment: SOLR-2571.patch

Here is that patch with Ints/Floats instead of Strings.  I made a tiny 
adjustment to the unit test also.

 IndexBasedSpellChecker thresholdTokenFrequency fails with a 
 ClassCastException on startup
 ---

 Key: SOLR-2571
 URL: https://issues.apache.org/jira/browse/SOLR-2571
 Project: Solr
  Issue Type: Bug
  Components: spellchecker
Affects Versions: 1.4.1, 3.1, 4.0
Reporter: James Dyer
Priority: Minor
  Labels: whereIsHossManWhenYouNeedHim
 Fix For: 3.3, 4.0

 Attachments: SOLR-2571.patch, SOLR-2571.patch, SOLR-2571.patch, 
 SOLR-2571.patch, SOLR-2571.solr3.2.patch


 When parsing the configuration for thresholdTokenFrequency, the 
 IndexBasedSpellChecker tries to pull a Float from the DataConfig.xml-derived 
 NamedList.  However, this comes through as a String.  Therefore, a 
 ClassCastException is always thrown whenever this parameter is specified.  
 The code ought to be doing Float.parseFloat(...) on the value.
 This looks like a nice feature to use in cases where the data contains 
 misspelled or rare words leading to spurious correct queries.  I would have 
 liked to have used this with a project we just completed; however, this bug 
 prevented that.  This issue came up recently in the User's mailing list, so I 
 am raising an issue now.




[jira] [Updated] (SOLR-2399) Solr Admin Interface, reworked

2011-06-06 Thread Stefan Matheis (steffkes) (JIRA)

 [ 
https://issues.apache.org/jira/browse/SOLR-2399?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Stefan Matheis (steffkes) updated SOLR-2399:


Description: 
*The idea was to create a new, fresh (and hopefully clean) Solr Admin 
Interface.* [Based on this 
[ML-Thread|http://www.lucidimagination.com/search/document/ae35e236d29d225e/solr_admin_interface_reworked_go_on_go_away]]

*Features:*
* [Dashboard|http://files.mathe.is/solr-admin/01_dashboard.png]
* [Query-Form|http://files.mathe.is/solr-admin/02_query.png]
* [Plugins|http://files.mathe.is/solr-admin/05_plugins.png]
* [Analysis|http://files.mathe.is/solr-admin/04_analysis.png] (SOLR-2476, 
SOLR-2400)
* [Schema-Browser|http://files.mathe.is/solr-admin/06_schema-browser.png]
* [Dataimport|http://files.mathe.is/solr-admin/08_dataimport.png] (SOLR-2482)
* [Core-Admin|http://files.mathe.is/solr-admin/09_coreadmin.png]
* [Replication|http://files.mathe.is/solr-admin/10_replication.png]
* [Zookeeper|http://files.mathe.is/solr-admin/11_cloud.png]
* [Logging|http://files.mathe.is/solr-admin/07_logging.png] (SOLR-2459)
** Stub (using static data)

Newly created Wiki-Page: http://wiki.apache.org/solr/ReworkedSolrAdminGUI

I've quickly created a Github-Repository (Just for me, to keep track of the 
changes)
» https://github.com/steffkes/solr-admin

  was:
*The idea was to create a new, fresh (and hopefully clean) Solr Admin 
Interface.* [Based on this 
[ML-Thread|http://www.lucidimagination.com/search/document/ae35e236d29d225e/solr_admin_interface_reworked_go_on_go_away]]

*Features:*
* [Dashboard|http://files.mathe.is/solr-admin/01_dashboard.png]
* [Query-Form|http://files.mathe.is/solr-admin/02_query.png]
* [Plugins|http://files.mathe.is/solr-admin/05_plugins.png]
* [Analysis|http://files.mathe.is/solr-admin/04_analysis.png] (SOLR-2476, 
SOLR-2400)
* [Schema-Browser|http://files.mathe.is/solr-admin/06_schema-browser.png]
* [Dataimport|http://files.mathe.is/solr-admin/08_dataimport.png] (SOLR-2482)
* [Core-Admin|http://files.mathe.is/solr-admin/09_coreadmin.png]
* [Replication|http://files.mathe.is/solr-admin/10_replication.png]
* [Zookeeper|http://files.mathe.is/solr-admin/11_cloud.png]
* [Logging|http://files.mathe.is/solr-admin/07_logging.png] (SOLR-2459)
** Stub (using static data)

(!) As Erick pointed out .. Chrome's XML-Capabilities are a bit odd, so it does 
not render Raw-XML-Data (like we're using for displaying the Schema and 
Config-File) -- instead it looks like this: 
http://files.mathe.is/solr-admin/00_chrome-xml.png ; so it would be really 
nice, to see the 
[xinclude-Interface|http://files.mathe.is/solr-admin/xinclude/] there :)

Newly created Wiki-Page: http://wiki.apache.org/solr/ReworkedSolrAdminGUI

I've quickly created a Github-Repository (Just for me, to keep track of the 
changes)
» https://github.com/steffkes/solr-admin


 Solr Admin Interface, reworked
 --

 Key: SOLR-2399
 URL: https://issues.apache.org/jira/browse/SOLR-2399
 Project: Solr
  Issue Type: Improvement
  Components: web gui
Reporter: Stefan Matheis (steffkes)
Assignee: Ryan McKinley
Priority: Minor
 Fix For: 4.0

 Attachments: SOLR-2399-110603-2.patch, SOLR-2399-110603.patch, 
 SOLR-2399-110606.patch, SOLR-2399-admin-interface.patch, 
 SOLR-2399-analysis-stopwords.patch, SOLR-2399-fluid-width.patch, 
 SOLR-2399-sorting-fields.patch, SOLR-2399-wip-notice.patch, SOLR-2399.patch


 *The idea was to create a new, fresh (and hopefully clean) Solr Admin 
 Interface.* [Based on this 
 [ML-Thread|http://www.lucidimagination.com/search/document/ae35e236d29d225e/solr_admin_interface_reworked_go_on_go_away]]
 *Features:*
 * [Dashboard|http://files.mathe.is/solr-admin/01_dashboard.png]
 * [Query-Form|http://files.mathe.is/solr-admin/02_query.png]
 * [Plugins|http://files.mathe.is/solr-admin/05_plugins.png]
 * [Analysis|http://files.mathe.is/solr-admin/04_analysis.png] (SOLR-2476, 
 SOLR-2400)
 * [Schema-Browser|http://files.mathe.is/solr-admin/06_schema-browser.png]
 * [Dataimport|http://files.mathe.is/solr-admin/08_dataimport.png] (SOLR-2482)
 * [Core-Admin|http://files.mathe.is/solr-admin/09_coreadmin.png]
 * [Replication|http://files.mathe.is/solr-admin/10_replication.png]
 * [Zookeeper|http://files.mathe.is/solr-admin/11_cloud.png]
 * [Logging|http://files.mathe.is/solr-admin/07_logging.png] (SOLR-2459)
 ** Stub (using static data)
 Newly created Wiki-Page: http://wiki.apache.org/solr/ReworkedSolrAdminGUI
 I've quickly created a Github-Repository (Just for me, to keep track of the 
 changes)
 » https://github.com/steffkes/solr-admin


[jira] [Updated] (SOLR-2399) Solr Admin Interface, reworked

2011-06-06 Thread Stefan Matheis (steffkes) (JIRA)

 [ 
https://issues.apache.org/jira/browse/SOLR-2399?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Stefan Matheis (steffkes) updated SOLR-2399:


Attachment: SOLR-2399-110606.patch

bq. check revision: 1132724
Yes, works.



Attached Patch fixes a few smaller Things:

* bq. Ryan: schema browser has a funny character to the right of Please Select

* bq. Ryan: sometimes the schema-browser page does not load when i click on it 
-- perhaps because it has a '-' in the name?

* Also on the Schema/Config Page, I've replaced the iframe (which just shows 
the raw XML files) with the JavaScript highlighter (already used for 
dataimport-config), so it now also works in Chrome (without extensions). The 
xinclude feature is still missing -- feedback, anyone?

* New Style for 'Ping' in Navigation, if the {{/admin/ping}} handler is not 
available - like in example + multicore-mode.

* Core-Admin is now also fluid-width aware

 Solr Admin Interface, reworked
 --

 Key: SOLR-2399
 URL: https://issues.apache.org/jira/browse/SOLR-2399
 Project: Solr
  Issue Type: Improvement
  Components: web gui
Reporter: Stefan Matheis (steffkes)
Assignee: Ryan McKinley
Priority: Minor
 Fix For: 4.0

 Attachments: SOLR-2399-110603-2.patch, SOLR-2399-110603.patch, 
 SOLR-2399-110606.patch, SOLR-2399-admin-interface.patch, 
 SOLR-2399-analysis-stopwords.patch, SOLR-2399-fluid-width.patch, 
 SOLR-2399-sorting-fields.patch, SOLR-2399-wip-notice.patch, SOLR-2399.patch


 *The idea was to create a new, fresh (and hopefully clean) Solr Admin 
 Interface.* [Based on this 
 [ML-Thread|http://www.lucidimagination.com/search/document/ae35e236d29d225e/solr_admin_interface_reworked_go_on_go_away]]
 *Features:*
 * [Dashboard|http://files.mathe.is/solr-admin/01_dashboard.png]
 * [Query-Form|http://files.mathe.is/solr-admin/02_query.png]
 * [Plugins|http://files.mathe.is/solr-admin/05_plugins.png]
 * [Analysis|http://files.mathe.is/solr-admin/04_analysis.png] (SOLR-2476, 
 SOLR-2400)
 * [Schema-Browser|http://files.mathe.is/solr-admin/06_schema-browser.png]
 * [Dataimport|http://files.mathe.is/solr-admin/08_dataimport.png] (SOLR-2482)
 * [Core-Admin|http://files.mathe.is/solr-admin/09_coreadmin.png]
 * [Replication|http://files.mathe.is/solr-admin/10_replication.png]
 * [Zookeeper|http://files.mathe.is/solr-admin/11_cloud.png]
 * [Logging|http://files.mathe.is/solr-admin/07_logging.png] (SOLR-2459)
 ** Stub (using static data)
 (!) As Erick pointed out .. Chrome's XML-Capabilities are a bit odd, so it 
 does not render Raw-XML-Data (like we're using for displaying the Schema and 
 Config-File) -- instead it looks like this: 
 http://files.mathe.is/solr-admin/00_chrome-xml.png ; so it would be really 
 nice, to see the 
 [xinclude-Interface|http://files.mathe.is/solr-admin/xinclude/] there :)
 Newly created Wiki-Page: http://wiki.apache.org/solr/ReworkedSolrAdminGUI
 I've quickly created a Github-Repository (Just for me, to keep track of the 
 changes)
 » https://github.com/steffkes/solr-admin




[jira] [Assigned] (SOLR-2571) IndexBasedSpellChecker thresholdTokenFrequency fails with a ClassCastException on startup

2011-06-06 Thread Robert Muir (JIRA)

 [ 
https://issues.apache.org/jira/browse/SOLR-2571?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Robert Muir reassigned SOLR-2571:
-

Assignee: Robert Muir

 IndexBasedSpellChecker thresholdTokenFrequency fails with a 
 ClassCastException on startup
 ---

 Key: SOLR-2571
 URL: https://issues.apache.org/jira/browse/SOLR-2571
 Project: Solr
  Issue Type: Bug
  Components: spellchecker
Affects Versions: 1.4.1, 3.1, 4.0
Reporter: James Dyer
Assignee: Robert Muir
Priority: Minor
  Labels: whereIsHossManWhenYouNeedHim
 Fix For: 3.3, 4.0

 Attachments: SOLR-2571.patch, SOLR-2571.patch, SOLR-2571.patch, 
 SOLR-2571.patch, SOLR-2571.solr3.2.patch


 When parsing the configuration for thresholdTokenFrequency, the 
 IndexBasedSpellChecker tries to pull a Float from the DataConfig.xml-derived 
 NamedList.  However, this comes through as a String.  Therefore, a 
 ClassCastException is always thrown whenever this parameter is specified.  
 The code ought to be doing Float.parseFloat(...) on the value.
 This looks like a nice feature to use in cases where the data contains 
 misspelled or rare words leading to spurious correct queries.  I would have 
 liked to have used this with a project we just completed; however, this bug 
 prevented that.  This issue came up recently in the User's mailing list, so I 
 am raising an issue now.




[jira] [Commented] (SOLR-2564) Integrating grouping module into Solr 4.0

2011-06-06 Thread Michael McCandless (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-2564?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13045079#comment-13045079
 ] 

Michael McCandless commented on SOLR-2564:
--

Hmmm.  Was this with or without caching?

 Integrating grouping module into Solr 4.0
 -

 Key: SOLR-2564
 URL: https://issues.apache.org/jira/browse/SOLR-2564
 Project: Solr
  Issue Type: Improvement
Reporter: Martijn van Groningen
Assignee: Martijn van Groningen
 Fix For: 4.0

 Attachments: LUCENE-2564.patch, SOLR-2564.patch, SOLR-2564.patch, 
 SOLR-2564.patch, SOLR-2564.patch


 Since work on the grouping module is going well, I think it is time to wire 
 this up in Solr.
 Besides the current grouping features Solr provides, Solr will then also 
 support second pass caching and total count based on groups.




[jira] [Commented] (SOLR-2462) Using spellcheck.collate can result in extremely high memory usage

2011-06-06 Thread James Dyer (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-2462?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13045082#comment-13045082
 ] 

James Dyer commented on SOLR-2462:
--

I added spellcheck.maxCollationEvaluations to the wiki.  Thanks, Robert, for 
taking the time to help get this fixed!

 Using spellcheck.collate can result in extremely high memory usage
 --

 Key: SOLR-2462
 URL: https://issues.apache.org/jira/browse/SOLR-2462
 Project: Solr
  Issue Type: Bug
  Components: spellchecker
Affects Versions: 3.1
Reporter: James Dyer
Assignee: Robert Muir
Priority: Critical
 Fix For: 3.3, 4.0

 Attachments: SOLR-2462.patch, SOLR-2462.patch, SOLR-2462.patch, 
 SOLR-2462.patch, SOLR-2462.patch, SOLR-2462.patch, SOLR-2462.patch, 
 SOLR-2462.patch, SOLR-2462.patch, SOLR-2462_3_1.patch


 When using spellcheck.collate, class SpellPossibilityIterator creates a 
 ranked list of *every* possible correction combination.  But if returning 
 several corrections per term, and if several words are misspelled, the 
 existing algorithm uses a huge amount of memory.
 This bug was introduced with SOLR-2010.  However, it is triggered anytime 
 spellcheck.collate is used.  It is not necessary to use any features that 
 were added with SOLR-2010.
 We were in Production with Solr for 1 1/2 days and this bug started taking 
 our Solr servers down with infinite GC loops.  It was pretty easy for this 
 to happen as occasionally a user will accidentally paste the URL into the 
 Search box on our app.  This URL results in a search with ~12 misspelled 
 words.  We have spellcheck.count set to 15. 




[jira] [Commented] (SOLR-2564) Integrating grouping module into Solr 4.0

2011-06-06 Thread Yonik Seeley (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-2564?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13045087#comment-13045087
 ] 

Yonik Seeley commented on SOLR-2564:


This was without caching to put them on an even footing (and given that the 
base query was all docs, caching would be slower anyway).  The URL above was 
the actual one used to test.

 Integrating grouping module into Solr 4.0
 -

 Key: SOLR-2564
 URL: https://issues.apache.org/jira/browse/SOLR-2564
 Project: Solr
  Issue Type: Improvement
Reporter: Martijn van Groningen
Assignee: Martijn van Groningen
 Fix For: 4.0

 Attachments: LUCENE-2564.patch, SOLR-2564.patch, SOLR-2564.patch, 
 SOLR-2564.patch, SOLR-2564.patch


 Since work on the grouping module is going well, I think it is time to wire 
 this up in Solr.
 Besides the current grouping features Solr provides, Solr will then also 
 support second pass caching and total count based on groups.




[jira] [Commented] (SOLR-2399) Solr Admin Interface, reworked

2011-06-06 Thread Ryan McKinley (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-2399?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13045094#comment-13045094
 ] 

Ryan McKinley commented on SOLR-2399:
-

Trying to apply this patch, I get:
the chunk size did not match the number of added /removed lines!

Any ideas? Did you make this patch differently than before?

 Solr Admin Interface, reworked
 --

 Key: SOLR-2399
 URL: https://issues.apache.org/jira/browse/SOLR-2399
 Project: Solr
  Issue Type: Improvement
  Components: web gui
Reporter: Stefan Matheis (steffkes)
Assignee: Ryan McKinley
Priority: Minor
 Fix For: 4.0

 Attachments: SOLR-2399-110603-2.patch, SOLR-2399-110603.patch, 
 SOLR-2399-110606.patch, SOLR-2399-admin-interface.patch, 
 SOLR-2399-analysis-stopwords.patch, SOLR-2399-fluid-width.patch, 
 SOLR-2399-sorting-fields.patch, SOLR-2399-wip-notice.patch, SOLR-2399.patch


 *The idea was to create a new, fresh (and hopefully clean) Solr Admin 
 Interface.* [Based on this 
 [ML-Thread|http://www.lucidimagination.com/search/document/ae35e236d29d225e/solr_admin_interface_reworked_go_on_go_away]]
 *Features:*
 * [Dashboard|http://files.mathe.is/solr-admin/01_dashboard.png]
 * [Query-Form|http://files.mathe.is/solr-admin/02_query.png]
 * [Plugins|http://files.mathe.is/solr-admin/05_plugins.png]
 * [Analysis|http://files.mathe.is/solr-admin/04_analysis.png] (SOLR-2476, 
 SOLR-2400)
 * [Schema-Browser|http://files.mathe.is/solr-admin/06_schema-browser.png]
 * [Dataimport|http://files.mathe.is/solr-admin/08_dataimport.png] (SOLR-2482)
 * [Core-Admin|http://files.mathe.is/solr-admin/09_coreadmin.png]
 * [Replication|http://files.mathe.is/solr-admin/10_replication.png]
 * [Zookeeper|http://files.mathe.is/solr-admin/11_cloud.png]
 * [Logging|http://files.mathe.is/solr-admin/07_logging.png] (SOLR-2459)
 ** Stub (using static data)
 Newly created Wiki-Page: http://wiki.apache.org/solr/ReworkedSolrAdminGUI
 I've quickly created a Github-Repository (Just for me, to keep track of the 
 changes)
 » https://github.com/steffkes/solr-admin




[jira] [Updated] (LUCENE-1736) DateTools.java general improvements

2011-06-06 Thread David Smiley (JIRA)

 [ 
https://issues.apache.org/jira/browse/LUCENE-1736?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

David Smiley updated LUCENE-1736:
-

Attachment: LUCENE-1736_DateTools_improvements.patch

This is an updated patch.
* The former DateFormats class was used as a value in a ThreadLocal, which 
isn't a good idea as it hampers class reloading.
* Improvements to a switch statement to benefit from fall-through.
* Removed a pointless conversion to Calendar in timeToString()
* Moved functionality to the Resolution enum, and used arrays of Resolutions 
indexed by format length instead of large if-else or switch blocks for format 
parsing. The net result is 48 fewer lines of code.
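
The "arrays of Resolutions indexed by format length" idea can be sketched roughly as follows. This is not the code from the patch; the class name and the forDateString method are hypothetical, and the lengths assume the usual DateTools string forms (yyyy through yyyyMMddHHmmssSSS).

```java
/** Hypothetical sketch: look up a resolution by the length of a formatted date string. */
class ResolutionLookupSketch {

    enum Resolution {
        YEAR(4), MONTH(6), DAY(8), HOUR(10), MINUTE(12), SECOND(14), MILLISECOND(17);

        /** Length of the date string produced at this resolution. */
        final int formatLen;

        Resolution(int formatLen) {
            this.formatLen = formatLen;
        }
    }

    /** Sparse table: index is the string length, value is the matching resolution or null. */
    private static final Resolution[] BY_LENGTH =
            new Resolution[Resolution.MILLISECOND.formatLen + 1];

    static {
        for (Resolution r : Resolution.values()) {
            BY_LENGTH[r.formatLen] = r;
        }
    }

    /** Replaces an if-else or switch chain over the possible lengths. */
    static Resolution forDateString(String dateString) {
        int len = dateString.length();
        Resolution r = (len < BY_LENGTH.length) ? BY_LENGTH[len] : null;
        if (r == null) {
            throw new IllegalArgumentException("Unrecognized date string length: " + len);
        }
        return r;
    }

    public static void main(String[] args) {
        System.out.println(forDateString("20110606")); // DAY
    }
}
```

The table costs a few unused slots but removes the need for separate length constants, which is the trade-off weighed in the issue description.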

 DateTools.java general improvements
 ---

 Key: LUCENE-1736
 URL: https://issues.apache.org/jira/browse/LUCENE-1736
 Project: Lucene - Java
  Issue Type: Improvement
  Components: core/index
Affects Versions: 2.9
Reporter: David Smiley
Priority: Minor
 Fix For: 4.0

 Attachments: LUCENE-1736_DateTools_improvements.patch, 
 cleanerDateTools.patch


 Applying the attached patch shows the improvements to DateTools.java that I 
 think should be done. All logic that does anything at all is moved to 
 instance methods of the inner class Resolution. I argue this is more 
 object-oriented.
 1. In cases where Resolution is an argument to the method, I can simply 
 invoke the appropriate call on the Resolution object. Formerly there was a 
 big branch if/else.
 2. Instead of synchronized being used seemingly everywhere, synchronized is 
 used to sync on the object that is not threadsafe, be it a DateFormat or 
 Calendar instance.
 3. Since different DateFormat and Calendar instances are created 
 per-Resolution, there is now less lock contention since threads using 
 different resolutions will not use the same locks.
 4. The old implementation of timeToString rounded the time before formatting 
 it. That's unnecessary since the format only includes the resolution desired.
 5. round() now uses a switch statement that benefits from fall-through (no 
 break).
 Another debatable improvement that could be made is putting the resolution 
 instances into an array indexed by format length. This would mean I could 
 remove the switch in lookupResolutionByLength() and avoid the length 
 constants there. Maybe that would be a bit too over-engineered when the 
 switch is fine.
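
Point 5 above (a switch that benefits from fall-through) might look something like the sketch below. It is an illustration under assumptions rather than the patched DateTools code: the class name is hypothetical, and it rounds in GMT using a GregorianCalendar.

```java
import java.util.Calendar;
import java.util.GregorianCalendar;
import java.util.Locale;
import java.util.TimeZone;

/** Hypothetical sketch of rounding a timestamp down to a resolution via fall-through. */
class RoundSketch {

    enum Resolution { YEAR, MONTH, DAY, HOUR, MINUTE, SECOND, MILLISECOND }

    static long round(long time, Resolution resolution) {
        Calendar cal = new GregorianCalendar(TimeZone.getTimeZone("GMT"), Locale.ROOT);
        cal.setTimeInMillis(time);
        // Each case clears one field and deliberately falls through, so a coarser
        // resolution also clears every finer-grained field below it.
        switch (resolution) {
            case YEAR:
                cal.set(Calendar.MONTH, 0);
            case MONTH:
                cal.set(Calendar.DAY_OF_MONTH, 1);
            case DAY:
                cal.set(Calendar.HOUR_OF_DAY, 0);
            case HOUR:
                cal.set(Calendar.MINUTE, 0);
            case MINUTE:
                cal.set(Calendar.SECOND, 0);
            case SECOND:
                cal.set(Calendar.MILLISECOND, 0);
            case MILLISECOND:
                break; // nothing to clear at full precision
            default:
                throw new IllegalArgumentException("unknown resolution " + resolution);
        }
        return cal.getTimeInMillis();
    }

    public static void main(String[] args) {
        System.out.println(round(1500L, Resolution.SECOND)); // 1000
    }
}
```

Compared with one independent branch per resolution, the fall-through form states each field reset exactly once.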




[jira] [Commented] (SOLR-2399) Solr Admin Interface, reworked

2011-06-06 Thread Stefan Matheis (steffkes) (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-2399?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13045147#comment-13045147
 ] 

Stefan Matheis (steffkes) commented on SOLR-2399:
-

bq. any ideas? Did you make this patch differently than before?
Hm, not really :/ Same commands as for the patches before. Applying it locally 
works as expected:

{code}$ patch -p0 < SOLR-2399-110606.patch
patching file solr/src/webapp/web/tpl/schema-browser.html
patching file solr/src/webapp/web/tpl/dataimport.html
patching file solr/src/webapp/web/tpl/cores.html
patching file solr/src/webapp/web/css/screen.css
patching file solr/src/webapp/web/css/syntax.css
patching file solr/src/webapp/web/js/script.js{code}

I'll have a look at this tomorrow, sorry Ryan

 Solr Admin Interface, reworked
 --

 Key: SOLR-2399
 URL: https://issues.apache.org/jira/browse/SOLR-2399
 Project: Solr
  Issue Type: Improvement
  Components: web gui
Reporter: Stefan Matheis (steffkes)
Assignee: Ryan McKinley
Priority: Minor
 Fix For: 4.0

 Attachments: SOLR-2399-110603-2.patch, SOLR-2399-110603.patch, 
 SOLR-2399-110606.patch, SOLR-2399-admin-interface.patch, 
 SOLR-2399-analysis-stopwords.patch, SOLR-2399-fluid-width.patch, 
 SOLR-2399-sorting-fields.patch, SOLR-2399-wip-notice.patch, SOLR-2399.patch


 *The idea was to create a new, fresh (and hopefully clean) Solr Admin 
 Interface.* [Based on this 
 [ML-Thread|http://www.lucidimagination.com/search/document/ae35e236d29d225e/solr_admin_interface_reworked_go_on_go_away]]
 *Features:*
 * [Dashboard|http://files.mathe.is/solr-admin/01_dashboard.png]
 * [Query-Form|http://files.mathe.is/solr-admin/02_query.png]
 * [Plugins|http://files.mathe.is/solr-admin/05_plugins.png]
 * [Analysis|http://files.mathe.is/solr-admin/04_analysis.png] (SOLR-2476, 
 SOLR-2400)
 * [Schema-Browser|http://files.mathe.is/solr-admin/06_schema-browser.png]
 * [Dataimport|http://files.mathe.is/solr-admin/08_dataimport.png] (SOLR-2482)
 * [Core-Admin|http://files.mathe.is/solr-admin/09_coreadmin.png]
 * [Replication|http://files.mathe.is/solr-admin/10_replication.png]
 * [Zookeeper|http://files.mathe.is/solr-admin/11_cloud.png]
 * [Logging|http://files.mathe.is/solr-admin/07_logging.png] (SOLR-2459)
 ** Stub (using static data)
 Newly created Wiki-Page: http://wiki.apache.org/solr/ReworkedSolrAdminGUI
 I've quickly created a Github-Repository (Just for me, to keep track of the 
 changes)
 » https://github.com/steffkes/solr-admin




[jira] [Commented] (SOLR-2136) Function Queries: if() function

2011-06-06 Thread JIRA

[ 
https://issues.apache.org/jira/browse/SOLR-2136?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13045148#comment-13045148
 ] 

Jan Høydahl commented on SOLR-2136:
---

Great, Yonik!
Is it possible to have exists() work on multi-valued fields too without 
crashing?

 Function Queries: if() function
 ---

 Key: SOLR-2136
 URL: https://issues.apache.org/jira/browse/SOLR-2136
 Project: Solr
  Issue Type: New Feature
  Components: search
Affects Versions: 1.4.1
Reporter: Jan Høydahl
 Fix For: 4.0

 Attachments: SOLR-2136.patch, SOLR-2136.patch


 Add an if() function which will enable conditional function queries.
 The function could be modeled after a spreadsheet if function (e.g: 
 http://wiki.services.openoffice.org/wiki/Documentation/How_Tos/Calc:_IF_function)
 IF(test; value1; value2) where:
 test is or refers to a logical value or expression that returns a logical 
 value (TRUE or FALSE).
 value1 is the value that is returned by the function if test yields TRUE.
 value2 is the value that is returned by the function if test yields FALSE.
 If value2 is omitted it is assumed to be FALSE; if value1 is also omitted it 
 is assumed to be TRUE.
 Example use:
 if(color==red; 100; if(color==green; 50; 25))
 This function will check the document field color, and if it is red 
 return 100, if it is green return 50, else return 25.
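
The nested example above can be mirrored in a small sketch. This is not Solr's function query API; documents are modelled as plain maps, and the names ifFunc, constant and fieldEquals are hypothetical helpers invented for illustration.

```java
import java.util.Collections;
import java.util.Map;
import java.util.function.Function;
import java.util.function.Predicate;

/** Hypothetical sketch of a composable if(test; value1; value2) over simple documents. */
class IfFunctionSketch {

    /** Returns value1 for documents where test holds, otherwise value2. */
    static Function<Map<String, String>, Double> ifFunc(
            Predicate<Map<String, String>> test,
            Function<Map<String, String>, Double> value1,
            Function<Map<String, String>, Double> value2) {
        return doc -> test.test(doc) ? value1.apply(doc) : value2.apply(doc);
    }

    /** A function that ignores the document and returns a fixed value. */
    static Function<Map<String, String>, Double> constant(double value) {
        return doc -> value;
    }

    /** True when the named field is present and equal to the expected value. */
    static Predicate<Map<String, String>> fieldEquals(String field, String expected) {
        return doc -> expected.equals(doc.get(field));
    }

    /** if(color==red; 100; if(color==green; 50; 25)) */
    static final Function<Map<String, String>, Double> COLOR_SCORE =
            ifFunc(fieldEquals("color", "red"), constant(100),
                    ifFunc(fieldEquals("color", "green"), constant(50), constant(25)));

    public static void main(String[] args) {
        System.out.println(COLOR_SCORE.apply(Collections.singletonMap("color", "green"))); // 50.0
    }
}
```

Because each branch is itself a function, nesting works without any special handling, which matches the spreadsheet-style semantics proposed in the description.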




RE: [jira] [Resolved] (SOLR-1844) CommonGramsQueryFilterFactory should read words in a comma-delimited format

2011-06-06 Thread Burton-West, Tom
Hi David,

Just curious about your use of the HathiTrust list.  I usually explain to 
people that it's customized to our index and they are probably better off 
making their own list based on the lists of stop words appropriate for the 
languages in their index (sources listed in the blog post 
http://www.hathitrust.org/blogs/large-scale-search/tuning-search-performance).  
If you already have an index built and are re-indexing with CommonGrams, you 
can also use the -t flag with HighFreqTerms.java in Lucene contrib to determine 
the words that have the largest position lists and are therefore candidates to 
be added to your CommonGrams word list.  We recently ran HighFreqTerms.java 
against our indexes and discovered that it would be better to remove some of 
the less frequent foreign language stopwords and instead use some very frequent 
words from the index.

Tom Burton-West
www.hathitrust.org/blogs

From: Steven Rowe (JIRA) [j...@apache.org]
Sent: Monday, June 06, 2011 2:08 PM
To: dev@lucene.apache.org
Subject: [jira] [Resolved] (SOLR-1844) CommonGramsQueryFilterFactory should 
read words in a comma-delimited format

 [ 
https://issues.apache.org/jira/browse/SOLR-1844?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Steven Rowe resolved SOLR-1844.
---

Resolution: Won't Fix
  Assignee: Steven Rowe

Thanks David.

 CommonGramsQueryFilterFactory should read words in a comma-delimited format
 ---

 Key: SOLR-1844
 URL: https://issues.apache.org/jira/browse/SOLR-1844
 Project: Solr
  Issue Type: Improvement
  Components: Schema and Analysis
Affects Versions: 1.4
Reporter: David Smiley
Assignee: Steven Rowe
Priority: Minor

 CommonGramsQueryFilterFactory expects that the file(s) given to the words 
 argument is a carriage-return delimited list of words.  It doesn't support 
 comments either.  This file format should be more flexible to support comma 
 delimited values.  I came across this because I was trying to use the sample 
 file provided by HathiTrust:
 http://www.hathitrust.org/node/180 (named in a file new400common.txt)
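
For illustration only, a more tolerant word-list reader of the kind requested might look like the sketch below. The issue was resolved as Won't Fix, so this does not reflect Solr's behaviour; the class name is hypothetical and the use of '#' as a comment marker is an assumption.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical sketch: read words separated by line breaks and/or commas. */
class WordListSketch {

    static List<String> parseWords(String content) {
        List<String> words = new ArrayList<>();
        for (String line : content.split("\\r?\\n|\\r")) {
            // Assumed convention: everything after '#' on a line is a comment.
            int hash = line.indexOf('#');
            if (hash >= 0) {
                line = line.substring(0, hash);
            }
            for (String token : line.split(",")) {
                String word = token.trim();
                if (!word.isEmpty()) {
                    words.add(word);
                }
            }
        }
        return words;
    }

    public static void main(String[] args) {
        System.out.println(parseWords("the, of\n# comment\nand\n\nto,in # trailing"));
        // [the, of, and, to, in]
    }
}
```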




[jira] [Assigned] (LUCENE-1736) DateTools.java general improvements

2011-06-06 Thread Steven Rowe (JIRA)

 [ 
https://issues.apache.org/jira/browse/LUCENE-1736?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Steven Rowe reassigned LUCENE-1736:
---

Assignee: Steven Rowe

 DateTools.java general improvements
 ---

 Key: LUCENE-1736
 URL: https://issues.apache.org/jira/browse/LUCENE-1736
 Project: Lucene - Java
  Issue Type: Improvement
  Components: core/index
Affects Versions: 2.9
Reporter: David Smiley
Assignee: Steven Rowe
Priority: Minor
 Fix For: 4.0

 Attachments: LUCENE-1736_DateTools_improvements.patch, 
 cleanerDateTools.patch


 Applying the attached patch shows the improvements to DateTools.java that I 
 think should be done. All logic that does anything at all is moved to 
 instance methods of the inner class Resolution. I argue this is more 
 object-oriented.
 1. In cases where Resolution is an argument to the method, I can simply 
 invoke the appropriate call on the Resolution object. Formerly there was a 
 big branch if/else.
 2. Instead of synchronized being used seemingly everywhere, synchronized is 
 used to sync on the object that is not threadsafe, be it a DateFormat or 
 Calendar instance.
 3. Since different DateFormat and Calendar instances are created 
 per-Resolution, there is now less lock contention since threads using 
 different resolutions will not use the same locks.
 4. The old implementation of timeToString rounded the time before formatting 
 it. That's unnecessary since the format only includes the resolution desired.
 5. round() now uses a switch statement that benefits from fall-through (no 
 break).
 Another debatable improvement that could be made is putting the resolution 
 instances into an array indexed by format length. This would mean I could 
 remove the switch in lookupResolutionByLength() and avoid the length 
 constants there. Maybe that would be a bit too over-engineered when the 
 switch is fine.
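
 The locking scheme in points 2 to 4 can be sketched as follows. This is a 
 simplified illustration, not the attached patch: the enum below only borrows 
 the Resolution name, lists four resolutions, and byLength is a hypothetical 
 stand-in for lookupResolutionByLength().

```java
import java.text.SimpleDateFormat;
import java.util.Date;
import java.util.Locale;
import java.util.TimeZone;

// Illustrative stand-in for DateTools.Resolution, not the committed code.
enum Resolution {
    YEAR("yyyy"), MONTH("yyyyMM"), DAY("yyyyMMdd"), SECOND("yyyyMMddHHmmss");

    // One format per resolution, so threads that use different resolutions
    // never contend for the same lock.
    private final SimpleDateFormat format;
    private final int formatLen;

    Resolution(String pattern) {
        this.formatLen = pattern.length();
        this.format = new SimpleDateFormat(pattern, Locale.ROOT);
        this.format.setTimeZone(TimeZone.getTimeZone("GMT"));
    }

    // SimpleDateFormat is not thread-safe, so lock on that instance only.
    // No rounding is needed first: the pattern drops the finer fields.
    String timeToString(long time) {
        synchronized (format) {
            return format.format(new Date(time));
        }
    }

    // Lookup by formatted length; a linear scan is used here instead of a switch.
    static Resolution byLength(int len) {
        for (Resolution r : values()) {
            if (r.formatLen == len) {
                return r;
            }
        }
        throw new IllegalArgumentException("no resolution for length " + len);
    }

    public static void main(String[] args) {
        System.out.println(DAY.timeToString(0L));
    }
}
```

 Locking on the format itself, rather than on the enclosing class, is what 
 keeps threads that work at different resolutions from blocking each other.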




[jira] [Updated] (LUCENE-1736) DateTools.java general improvements

2011-06-06 Thread Steven Rowe (JIRA)

 [ 
https://issues.apache.org/jira/browse/LUCENE-1736?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Steven Rowe updated LUCENE-1736:


Attachment: LUCENE-1736.patch

David, this is your patch with a CHANGES.txt entry and a couple of comments 
added (for javadocs next to the two imports that are javadocs-only; and 
formatLen spelled out over the shared format string).

Nice improvements.  All tests pass.

I plan on committing shortly.

 DateTools.java general improvements
 ---

 Key: LUCENE-1736
 URL: https://issues.apache.org/jira/browse/LUCENE-1736
 Project: Lucene - Java
  Issue Type: Improvement
  Components: core/index
Affects Versions: 2.9
Reporter: David Smiley
Assignee: Steven Rowe
Priority: Minor
 Fix For: 4.0

 Attachments: LUCENE-1736.patch, 
 LUCENE-1736_DateTools_improvements.patch, cleanerDateTools.patch





[jira] [Resolved] (LUCENE-1736) DateTools.java general improvements

2011-06-06 Thread Steven Rowe (JIRA)

 [ 
https://issues.apache.org/jira/browse/LUCENE-1736?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Steven Rowe resolved LUCENE-1736.
-

   Resolution: Fixed
Fix Version/s: 3.3

Committed:
- r1132806: trunk
- r1132812: branch_3x

Thanks David!

 DateTools.java general improvements
 ---

 Key: LUCENE-1736
 URL: https://issues.apache.org/jira/browse/LUCENE-1736
 Project: Lucene - Java
  Issue Type: Improvement
  Components: core/index
Affects Versions: 2.9
Reporter: David Smiley
Assignee: Steven Rowe
Priority: Minor
 Fix For: 3.3, 4.0

 Attachments: LUCENE-1736.patch, 
 LUCENE-1736_DateTools_improvements.patch, cleanerDateTools.patch





[jira] [Commented] (SOLR-2399) Solr Admin Interface, reworked

2011-06-06 Thread Ryan McKinley (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-2399?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13045192#comment-13045192
 ] 

Ryan McKinley commented on SOLR-2399:
-

OK, I tried on Linux and it applied fine.  TortoiseSVN sometimes barfs when it 
shouldn't.

Committed in revision 1132826.

 Solr Admin Interface, reworked
 --

 Key: SOLR-2399
 URL: https://issues.apache.org/jira/browse/SOLR-2399
 Project: Solr
  Issue Type: Improvement
  Components: web gui
Reporter: Stefan Matheis (steffkes)
Assignee: Ryan McKinley
Priority: Minor
 Fix For: 4.0

 Attachments: SOLR-2399-110603-2.patch, SOLR-2399-110603.patch, 
 SOLR-2399-110606.patch, SOLR-2399-admin-interface.patch, 
 SOLR-2399-analysis-stopwords.patch, SOLR-2399-fluid-width.patch, 
 SOLR-2399-sorting-fields.patch, SOLR-2399-wip-notice.patch, SOLR-2399.patch


 *The idea was to create a new, fresh (and hopefully clean) Solr Admin 
 Interface.* [Based on this 
 [ML-Thread|http://www.lucidimagination.com/search/document/ae35e236d29d225e/solr_admin_interface_reworked_go_on_go_away]]
 *Features:*
 * [Dashboard|http://files.mathe.is/solr-admin/01_dashboard.png]
 * [Query-Form|http://files.mathe.is/solr-admin/02_query.png]
 * [Plugins|http://files.mathe.is/solr-admin/05_plugins.png]
 * [Analysis|http://files.mathe.is/solr-admin/04_analysis.png] (SOLR-2476, 
 SOLR-2400)
 * [Schema-Browser|http://files.mathe.is/solr-admin/06_schema-browser.png]
 * [Dataimport|http://files.mathe.is/solr-admin/08_dataimport.png] (SOLR-2482)
 * [Core-Admin|http://files.mathe.is/solr-admin/09_coreadmin.png]
 * [Replication|http://files.mathe.is/solr-admin/10_replication.png]
 * [Zookeeper|http://files.mathe.is/solr-admin/11_cloud.png]
 * [Logging|http://files.mathe.is/solr-admin/07_logging.png] (SOLR-2459)
 ** Stub (using static data)
 Newly created Wiki-Page: http://wiki.apache.org/solr/ReworkedSolrAdminGUI
 I've quickly created a Github-Repository (Just for me, to keep track of the 
 changes)
 » https://github.com/steffkes/solr-admin




[jira] [Commented] (SOLR-2399) Solr Admin Interface, reworked

2011-06-06 Thread Ryan McKinley (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-2399?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13045194#comment-13045194
 ] 

Ryan McKinley commented on SOLR-2399:
-

A few minor comments...

* On http://localhost:8983/solr/#/singlecore/schema-browser/field/text, in the 
"Top 10/405 Terms" list with the more/less links: I'm not sure adding 10 at a 
time is really useful.  I would rather click 'more' and get all 50, and have 
'less' just go back to 10.

* Navigation to Schema Browser works great -- thanks

Again, this is great stuff.  Thanks!





[jira] [Commented] (SOLR-2136) Function Queries: if() function

2011-06-06 Thread Yonik Seeley (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-2136?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13045230#comment-13045230
 ] 

Yonik Seeley commented on SOLR-2136:


bq. Is it possible to have exists() work on multi valued fields too without 
crashing?

Not currently... but note that exists() works on subqueries too, not just 
fields.

So a slow way to do it would be
{code}
  ...exists(query($qq))&qq=myfield:[* TO *]
{code}

Or a faster workaround could be to index a special EXISTS token or EMPTY token 
and do
{code}
  ...exists(query($qq))&qq=myfield:EXISTS
{code}

See the test code in TestFunctionQuery for an easy way to use pseudo-fields to 
test this stuff.




 Function Queries: if() function
 ---

 Key: SOLR-2136
 URL: https://issues.apache.org/jira/browse/SOLR-2136
 Project: Solr
  Issue Type: New Feature
  Components: search
Affects Versions: 1.4.1
Reporter: Jan Høydahl
 Fix For: 4.0

 Attachments: SOLR-2136.patch, SOLR-2136.patch


 Add an if() function which will enable conditional function queries.
 The function could be modeled after a spreadsheet if function (e.g: 
 http://wiki.services.openoffice.org/wiki/Documentation/How_Tos/Calc:_IF_function)
 IF(test; value1; value2) where:
 test is or refers to a logical value or expression that returns a logical 
 value (TRUE or FALSE).
 value1 is the value that is returned by the function if test yields TRUE.
 value2 is the value that is returned by the function if test yields FALSE.
 If value2 is omitted it is assumed to be FALSE; if value1 is also omitted it 
 is assumed to be TRUE.
 Example use:
 if(color=="red"; 100; if(color=="green"; 50; 25))
 This function will check the document field "color", and if it is "red" 
 return 100, if it is "green" return 50, else return 25.
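
 The proposed semantics can be sketched in plain Java. This only models the 
 behaviour described above and is not Solr's function-query implementation; 
 IfFunctionSketch, ifFunc and boost are hypothetical names.

```java
// Models the proposed if(test; value1; value2) semantics; not Solr code.
final class IfFunctionSketch {

    // Returns value1 when test is true, otherwise value2.
    // Note: unlike a lazy query function, Java evaluates both arguments eagerly.
    static float ifFunc(boolean test, float value1, float value2) {
        return test ? value1 : value2;
    }

    // Equivalent of the nested example: red gives 100, green gives 50, anything else 25.
    static float boost(String color) {
        return ifFunc("red".equals(color), 100f,
                ifFunc("green".equals(color), 50f, 25f));
    }

    public static void main(String[] args) {
        System.out.println(boost("red") + " " + boost("green") + " " + boost("blue"));
    }
}
```

 Nesting a second ifFunc call in the value2 position is what gives the 
 "else if" chain in the example.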




[jira] [Resolved] (SOLR-2571) IndexBasedSpellChecker thresholdTokenFrequency fails with a ClassCastException on startup

2011-06-06 Thread Robert Muir (JIRA)

 [ 
https://issues.apache.org/jira/browse/SOLR-2571?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Robert Muir resolved SOLR-2571.
---

Resolution: Fixed

Committed revision 1132855 (trunk).
I organized the constants in DirectSolrSpellChecker a bit, so it's easy to see 
which ones are 'shared' with the others and which ones are unique to it.

Committed revision 1132856 (branch_3x).
I backported the test and example here. In the case of this test, it needed to 
clearIndex() in setup() like trunk does, so I merged these bits also.

Thanks James!

 IndexBasedSpellChecker thresholdTokenFrequency fails with a 
 ClassCastException on startup
 ---

 Key: SOLR-2571
 URL: https://issues.apache.org/jira/browse/SOLR-2571
 Project: Solr
  Issue Type: Bug
  Components: spellchecker
Affects Versions: 1.4.1, 3.1, 4.0
Reporter: James Dyer
Assignee: Robert Muir
Priority: Minor
  Labels: whereIsHossManWhenYouNeedHim
 Fix For: 3.3, 4.0

 Attachments: SOLR-2571.patch, SOLR-2571.patch, SOLR-2571.patch, 
 SOLR-2571.patch, SOLR-2571.solr3.2.patch


 When parsing the configuration for thresholdTokenFrequency, the 
 IndexBasedSpellChecker tries to pull a Float from the DataConfig.xml-derived 
 NamedList.  However, this comes through as a String.  Therefore, a 
 ClassCastException is always thrown whenever this parameter is specified.  
 The code ought to be doing Float.parseFloat(...) on the value.
 This looks like a nice feature to use in cases where the data contains 
 misspelled or rare words leading to spurious correct queries.  I would have 
 liked to have used this with a project we just completed; however, this bug 
 prevented that.  This issue came up recently on the users' mailing list, so I 
 am raising an issue now.
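
 The failure and the suggested fix can be reproduced in isolation. In this 
 sketch a plain Map stands in for Solr's NamedList, the class and method 
 names are hypothetical, and the 0.0 default for a missing value is an 
 assumption rather than Solr's documented behaviour.

```java
import java.util.HashMap;
import java.util.Map;

// A plain Map stands in for Solr's NamedList; all names here are hypothetical.
final class ThresholdConfigSketch {

    static final String KEY = "thresholdTokenFrequency";

    // Pattern from the bug report: the XML-derived value is a String,
    // so casting it to Float throws ClassCastException.
    static float readByCast(Map<String, Object> config) {
        return (Float) config.get(KEY);
    }

    // Suggested pattern: parse the value instead of casting it.
    static float readByParse(Map<String, Object> config) {
        Object raw = config.get(KEY);
        if (raw == null) {
            return 0.0f; // assumed default: no threshold
        }
        if (raw instanceof Number) {
            return ((Number) raw).floatValue();
        }
        return Float.parseFloat(raw.toString().trim());
    }

    public static void main(String[] args) {
        Map<String, Object> config = new HashMap<>();
        config.put(KEY, ".01"); // configuration values arrive as text
        System.out.println(readByParse(config));
        try {
            readByCast(config);
        } catch (ClassCastException e) {
            System.out.println("cast failed: " + e.getClass().getSimpleName());
        }
    }
}
```

 Accepting both Number and String keeps the reader working whether the value 
 comes from XML or is set programmatically.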




[jira] [Commented] (SOLR-2491) spellcheck.maxCollationTries breaks when using FieldCollapsing

2011-06-06 Thread Robert Muir (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-2491?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13045248#comment-13045248
 ] 

Robert Muir commented on SOLR-2491:
---

James: any opinion on this with regard to SOLR-2564?

I'm totally lost when it comes to grouping, but do you still think collation 
should use ungrouped queries or should we wait on SOLR-2564, which seems to 
suggest it can return this count... I could be confused here and haven't looked 
in detail though.



 spellcheck.maxCollationTries breaks when using FieldCollapsing
 --

 Key: SOLR-2491
 URL: https://issues.apache.org/jira/browse/SOLR-2491
 Project: Solr
  Issue Type: Bug
  Components: spellchecker
Affects Versions: 4.0
Reporter: James Dyer
Priority: Minor
 Fix For: 4.0

 Attachments: SOLR-2491.patch


 If you specify spellcheck.maxCollationTries and group=true on the same 
 query, you never get any Spell Check Collations back.  The problem is that 
 SpellCheckCollator relies on ResponseBuilder.getToLog().get("hits") to see 
 how many results each test query returns.  When group=true, the toLog 
 isn't populated, so SpellCheckCollator is unable to find a collation that can 
 return results.




[jira] [Commented] (LUCENE-3176) TestNRTThreads test failure

2011-06-06 Thread Simon Willnauer (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-3176?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13045283#comment-13045283
 ] 

Simon Willnauer commented on LUCENE-3176:
-

Phew! This seems like a delete issue. So far I have only looked at the output 
Robert posted, but it seems that a FrozenDelPackage gets lost somewhere here.

I will look into it after Buzzwords.

 TestNRTThreads test failure
 ---

 Key: LUCENE-3176
 URL: https://issues.apache.org/jira/browse/LUCENE-3176
 Project: Lucene - Java
  Issue Type: Bug
 Environment: trunk
Reporter: Robert Muir
Assignee: Michael McCandless

 hit a fail in TestNRTThreads running tests over and over:
