Re: [JENKINS] Lucene-Solr-4.x-Linux (64bit/jdk1.8.0-fcs-b129) - Build # 9505 - Still Failing!
I committed a fix for this.

On Thu, Feb 27, 2014 at 2:39 AM, Policeman Jenkins Server jenk...@thetaphi.de wrote:

Build: http://jenkins.thetaphi.de/job/Lucene-Solr-4.x-Linux/9505/
Java: 64bit/jdk1.8.0-fcs-b129 -XX:-UseCompressedOops -XX:+UseParallelGC

All tests passed

Build Log:
[...truncated 57285 lines...]
BUILD FAILED
/mnt/ssd/jenkins/workspace/Lucene-Solr-4.x-Linux/build.xml:471: The following error occurred while executing this line:
/mnt/ssd/jenkins/workspace/Lucene-Solr-4.x-Linux/build.xml:410: The following error occurred while executing this line:
/mnt/ssd/jenkins/workspace/Lucene-Solr-4.x-Linux/extra-targets.xml:87: The following error occurred while executing this line:
/mnt/ssd/jenkins/workspace/Lucene-Solr-4.x-Linux/extra-targets.xml:187: Source checkout is dirty after running tests!!! Offending files:
* ./solr/licenses/jcl-over-slf4j-1.6.6.jar.sha1
* ./solr/licenses/jul-to-slf4j-1.6.6.jar.sha1
* ./solr/licenses/slf4j-api-1.6.6.jar.sha1
* ./solr/licenses/slf4j-log4j12-1.6.6.jar.sha1

Total time: 58 minutes 34 seconds
Build step 'Invoke Ant' marked build as failure
Description set: Java: 64bit/jdk1.8.0-fcs-b129 -XX:-UseCompressedOops -XX:+UseParallelGC
Archiving artifacts
Recording test results
Email was triggered for: Failure
Sending email for trigger: Failure

- To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Commented] (SOLR-5762) SOLR-5658 broke backward compatibility of Javabin format
[ https://issues.apache.org/jira/browse/SOLR-5762?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13914197#comment-13914197 ]

Shawn Heisey commented on SOLR-5762:

Just adding a comment to indicate that the fix for this bug addresses problems with clients whose error message says "Unknown type 19". It's mentioned in SOLR-5658 by Yonik. This will help users who search the Jira project for this error message.

SOLR-5658 broke backward compatibility of Javabin format

Key: SOLR-5762
URL: https://issues.apache.org/jira/browse/SOLR-5762
Project: Solr
Issue Type: Bug
Affects Versions: 4.6.1, 4.7
Reporter: Noble Paul
Fix For: 4.7, 4.8, 5.0
Attachments: SOLR-5672.patch, SOLR-5762-test.patch, SOLR-5762.patch, updateReq_4_5.bin

In SOLR-5658 the docsMap entry was changed from a Map to ListMap; this broke back-compat of older clients with 4.6.1 and later:

{noformat}
ERROR - 2014-02-20 21:28:36.332; org.apache.solr.common.SolrException; java.lang.ClassCastException: java.util.LinkedHashMap cannot be cast to java.util.List
	at org.apache.solr.client.solrj.request.JavaBinUpdateRequestCodec.unmarshal(JavaBinUpdateRequestCodec.java:188)
	at org.apache.solr.handler.loader.JavabinLoader.parseAndLoadDocs(JavabinLoader.java:106)
	at org.apache.solr.handler.loader.JavabinLoader.load(JavabinLoader.java:58)
	at org.apache.solr.handler.UpdateRequestHandler$1.load(UpdateRequestHandler.java:92)
	at org.apache.solr.handler.ContentStreamHandlerBase.handleRequestBody(ContentStreamHandlerBase.java:74)
	at org.apache.solr.handler.RequestHandlerBase.handleRequest(RequestHandlerBase.java:135)
	at org.apache.solr.core.SolrCore.execute(SolrCore.java:1859)
	at org.apache.solr.servlet.SolrDispatchFilter.execute(SolrDispatchFilter.java:721)
	at org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:417)
	at org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:201)
	at org.eclipse.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1419)
	at org.eclipse.jetty.servlet.ServletHandler.doHandle(ServletHandler.java:455)
	at org.eclipse.jetty.server.handler.ScopedHandler.handle(ScopedHandler.java:137)
	at org.eclipse.jetty.security.SecurityHandler.handle(SecurityHandler.java:557)
	at org.eclipse.jetty.server.session.SessionHandler.doHandle(SessionHandler.java:231)
	at org.eclipse.jetty.server.handler.ContextHandler.doHandle(ContextHandler.java:1075)
	at org.eclipse.jetty.servlet.ServletHandler.doScope(ServletHandler.java:384)
	at org.eclipse.jetty.server.session.SessionHandler.doScope(SessionHandler.java:193)
	at org.eclipse.jetty.server.handler.ContextHandler.doScope(ContextHandler.java:1009)
	at org.eclipse.jetty.server.handler.ScopedHandler.handle(ScopedHandler.java:135)
	at org.eclipse.jetty.server.handler.ContextHandlerCollection.handle(ContextHandlerCollection.java:255)
	at org.eclipse.jetty.server.handler.HandlerCollection.handle(HandlerCollection.java:154)
	at org.eclipse.jetty.server.handler.HandlerWrapper.handle(HandlerWrapper.java:116)
	at org.eclipse.jetty.server.Server.handle(Server.java:368)
	at org.eclipse.jetty.server.AbstractHttpConnection.handleRequest(AbstractHttpConnection.java:489)
	at org.eclipse.jetty.server.BlockingHttpConnection.handleRequest(BlockingHttpConnection.java:53)
	at org.eclipse.jetty.server.AbstractHttpConnection.content(AbstractHttpConnection.java:953)
	at org.eclipse.jetty.server.AbstractHttpConnection$RequestHandler.content(AbstractHttpConnection.java:1014)
	at org.eclipse.jetty.http.HttpParser.parseNext(HttpParser.java:953)
	at org.eclipse.jetty.http.HttpParser.parseAvailable(HttpParser.java:240)
	at org.eclipse.jetty.server.BlockingHttpConnection.handle(BlockingHttpConnection.java:72)
	at org.eclipse.jetty.server.bio.SocketConnector$ConnectorEndPoint.run(SocketConnector.java:264)
	at org.eclipse.jetty.util.thread.QueuedThreadPool.runJob(QueuedThreadPool.java:608)
	at org.eclipse.jetty.util.thread.QueuedThreadPool$3.run(QueuedThreadPool.java:543)
	at java.lang.Thread.run(Thread.java:744)
{noformat}

-- This message was sent by Atlassian JIRA (v6.1.5#6160)
- To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org
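The ClassCastException above comes from an unconditional cast to java.util.List on data that an older client still sends as a Map. As a sketch of the tolerant-decoding idea (this is illustrative only, not the actual SOLR-5762 patch or the real JavaBinUpdateRequestCodec API; the method and class names below are invented for the example), a decoder can branch on the runtime type instead of casting blindly:

```java
import java.util.*;

public class JavabinCompat {
    // Accept both wire shapes: the newer List form and the older Map form.
    // The unconditional cast `(List) entry` is what throws in the trace above.
    @SuppressWarnings("unchecked")
    static List<Object> docsAsList(Object entry) {
        if (entry instanceof List) {
            return (List<Object>) entry;                     // newer format
        }
        if (entry instanceof Map) {
            // older format: documents were the values of a LinkedHashMap
            return new ArrayList<>(((Map<String, Object>) entry).values());
        }
        throw new IllegalArgumentException("unexpected docs type: " + entry.getClass());
    }

    public static void main(String[] args) {
        System.out.println(docsAsList(Arrays.asList("doc1", "doc2")).size()); // 2
        Map<String, Object> legacy = new LinkedHashMap<>();
        legacy.put("doc1", Collections.singletonMap("id", "1"));
        System.out.println(docsAsList(legacy).size()); // 1
    }
}
```

The design point is that back-compat breaks of this kind are usually fixed on the receiving side, since old clients in the field cannot be upgraded.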
[jira] [Updated] (LUCENE-5460) Allow driving a query by sparse filters
[ https://issues.apache.org/jira/browse/LUCENE-5460?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Mikhail Khludnev updated LUCENE-5460:

Attachment: TestSlowQuery.java

See TestSlowQuery.java attached. SampleSlowQuery verifies documents by checking a stored field in SlowQueryScorer.confirm(int). The key thing is to prohibit advance(), simply because it is inefficient per se:

{code}
SlowQueryScorer.advance(int) {
  throw new UnsupportedOperationException(this + " doesn't support advancing");
}
{code}

So far, nothing special. The tricky thing is to handle filtering. I propose to make FilteredQuery.rewrite() aware of such 'slow' queries; see SlowQuery.rewriteFilteredQuery(IndexReader, FilteredQuery):

FilteredQuery(SlowQuery(coreQuery)) = SlowQuery(FilteredQuery(coreQuery))

I suppose we can introduce this sort of 'slow' query in Lucene and make FilteredQuery.rewrite aware of them, as well as BooleanQuery.rewrite (I can provide the prototype, if you wish to look at it).

Allow driving a query by sparse filters

Key: LUCENE-5460
URL: https://issues.apache.org/jira/browse/LUCENE-5460
Project: Lucene - Core
Issue Type: Improvement
Components: core/search
Reporter: Shai Erera
Attachments: TestSlowQuery.java

Today if a filter is very sparse we execute the query in a sort of leap-frog manner between the query and the filter. If the query is very expensive to compute, and/or itself matches only a few docs, calling scorer.advance(doc) just to discover that the doc it landed on isn't accepted by the filter is a waste of time. Since the Filter is always the final ruler, I wonder whether, if we had something like {{boolean DISI.advanceExact(doc)}}, we could use it instead in some cases. There are many combinations in which I think we'd want to use or not use this API, and they depend on: the Filter's complexity, Filter.cost(), Scorer.cost(), query complexity (span-near, many clauses), etc. I opened this issue so we can discuss.

DISI.advanceExact(doc) is just a preliminary proposal, to get an API we can experiment with. The default implementation should be fairly easy and straightforward, and we could override it where we can offer a more optimized implementation.

-- This message was sent by Atlassian JIRA (v6.1.5#6160)
- To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org
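The default-vs-override idea in the proposal can be sketched against a plain sorted-int "iterator". Everything here is illustrative: the class names and the exact advanceExact semantics are invented for the example, not the Lucene API under discussion. The default simply delegates to advance() and compares, while a specialized impl (e.g. a bitset-backed filter) could answer with a single probe instead.

```java
// Minimal stand-in for a DocIdSetIterator over a sorted doc-id array.
class SortedDocs {
    private final int[] docs;
    private int pos = -1;

    SortedDocs(int[] sortedDocs) { this.docs = sortedDocs; }

    // DISI-style advance: move to the first doc >= target
    int advance(int target) {
        while (++pos < docs.length) {
            if (docs[pos] >= target) return docs[pos];
        }
        return Integer.MAX_VALUE; // stands in for NO_MORE_DOCS
    }

    // proposed API, default form: report only whether target matches, so the
    // caller never needs to act on whichever doc advance() landed on
    boolean advanceExact(int target) { return advance(target) == target; }
}

public class AdvanceExactSketch {
    public static void main(String[] args) {
        SortedDocs it = new SortedDocs(new int[]{2, 5, 9});
        System.out.println(it.advanceExact(5)); // true: 5 is in the set
        System.out.println(it.advanceExact(7)); // false: iterator lands on 9
    }
}
```

The second call illustrates the waste the issue describes: with only advance(), the scorer does the full positioning work to land on 9 even though the filter only needed a yes/no answer about 7.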
[JENKINS] Lucene-Solr-trunk-Linux (64bit/jdk1.7.0_51) - Build # 9613 - Still Failing!
Build: http://jenkins.thetaphi.de/job/Lucene-Solr-trunk-Linux/9613/
Java: 64bit/jdk1.7.0_51 -XX:-UseCompressedOops -XX:+UseParallelGC -XX:-UseSuperWord

All tests passed

Build Log:
[...truncated 57614 lines...]
BUILD FAILED
/mnt/ssd/jenkins/workspace/Lucene-Solr-trunk-Linux/build.xml:465: The following error occurred while executing this line:
/mnt/ssd/jenkins/workspace/Lucene-Solr-trunk-Linux/build.xml:404: The following error occurred while executing this line:
/mnt/ssd/jenkins/workspace/Lucene-Solr-trunk-Linux/extra-targets.xml:87: The following error occurred while executing this line:
/mnt/ssd/jenkins/workspace/Lucene-Solr-trunk-Linux/extra-targets.xml:187: Source checkout is dirty after running tests!!! Offending files:
* ./solr/licenses/jcl-over-slf4j-1.6.6.jar.sha1
* ./solr/licenses/jul-to-slf4j-1.6.6.jar.sha1
* ./solr/licenses/slf4j-api-1.6.6.jar.sha1
* ./solr/licenses/slf4j-log4j12-1.6.6.jar.sha1

Total time: 64 minutes 23 seconds
Build step 'Invoke Ant' marked build as failure
Description set: Java: 64bit/jdk1.7.0_51 -XX:-UseCompressedOops -XX:+UseParallelGC -XX:-UseSuperWord
Archiving artifacts
Recording test results
Email was triggered for: Failure
Sending email for trigger: Failure

- To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Updated] (LUCENE-5474) Add example for retrieving facet counts without retrieving documents
[ https://issues.apache.org/jira/browse/LUCENE-5474?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Rob Audenaerde updated LUCENE-5474:

Attachment: SimpleFacetsExample.java

Yes, that prevents some duplicate stuff. Here is the modified file.

Add example for retrieving facet counts without retrieving documents

Key: LUCENE-5474
URL: https://issues.apache.org/jira/browse/LUCENE-5474
Project: Lucene - Core
Issue Type: Improvement
Components: modules/facet
Affects Versions: 4.7
Reporter: Rob Audenaerde
Attachments: FacetOnlyExample.java, SimpleFacetsExample.java

In the faceting examples, {{FacetsCollector.search()}} is used. There are use cases where you do not need the documents that match the search. It would be nice if there were an example showing this. Basically, it comes down to using {{searcher.search(query, null /* Filter */, facetCollector)}}

-- This message was sent by Atlassian JIRA (v6.1.5#6160)
- To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Commented] (SOLR-5781) Make the Collections API timeout configurable.
[ https://issues.apache.org/jira/browse/SOLR-5781?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13914315#comment-13914315 ]

Noble Paul commented on SOLR-5781:

How do we plan to do this? On a per-call basis, or as a cluster-wide property?

Make the Collections API timeout configurable.

Key: SOLR-5781
URL: https://issues.apache.org/jira/browse/SOLR-5781
Project: Solr
Issue Type: Improvement
Components: SolrCloud
Reporter: Mark Miller
Fix For: 4.8, 5.0

This would also help with tests - nightlies can be quite intensive and need a very high timeout.

-- This message was sent by Atlassian JIRA (v6.1.5#6160)
- To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Commented] (LUCENE-5474) Add example for retrieving facet counts without retrieving documents
[ https://issues.apache.org/jira/browse/LUCENE-5474?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13914322#comment-13914322 ]

Shai Erera commented on LUCENE-5474:

Looks good. Could you please:
* Create a .patch (diff) file, as it's easier to note what you modified/added?
* Add a test to TestSimpleFacetsExample, along the lines of testSimple?

Add example for retrieving facet counts without retrieving documents

Key: LUCENE-5474
URL: https://issues.apache.org/jira/browse/LUCENE-5474

-- This message was sent by Atlassian JIRA (v6.1.5#6160)
- To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Created] (LUCENE-5476) Facet sampling
Rob Audenaerde created LUCENE-5476:

Summary: Facet sampling
Key: LUCENE-5476
URL: https://issues.apache.org/jira/browse/LUCENE-5476
Project: Lucene - Core
Issue Type: Improvement
Reporter: Rob Audenaerde

With LUCENE-5339, facet sampling disappeared. When trying to display facet counts on large datasets (10M documents), counting facets is rather expensive, as all the hits are collected and processed. Sampling greatly reduced this and thus provided a nice speedup. Could it be brought back?

-- This message was sent by Atlassian JIRA (v6.1.5#6160)
- To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org
Re: VOTE: Release apache-solr-ref-guide-4.7.pdf (RC1)
I regularized the preformatted blocks, so that none now have an extra blank line at the top - for some reason, all of the boxes on about half of the cwiki pages had this issue, but the other half didn't. (I had to edit the source format to achieve this, and even there I couldn't see a difference between preformatted boxes with an extra leading blank line and those without - must have been some invisible whitespace, not sure what.)

Once I'd done that, the content still wasn't vertically centered, so I edited the Solr space's export PDF stylesheet and got rid of the negative margin-top thing I'd put in place for a previous release, which didn't appear to be having any effect anymore, and instead overrode the default CSS to adjust the padding on preformatted blocks and their containing divs. Content in preformatted blocks now appears to be vertically centered, with no extra vertical space.

I also tried to apply "page-break-inside: avoid" in several places to see if it would help with the few poorly distributed multi-page preformatted boxes, but it didn't seem to help.

I noticed that a couple of "Topics covered in this section" boxes are too narrow to allow their content to be legible - the ones on pages 251 and 300 look really bad, and some others are only marginally legible. I don't know if these issues are worthy of a respin - I didn't address the latter one.

Steve

On Feb 26, 2014, at 11:59 AM, Cassandra Targett casstarg...@gmail.com wrote:

I generated a new release candidate for the Solr Reference Guide. This fixes the page numbering problem and a few other minor edits folks made yesterday after I generated RC0.

https://dist.apache.org/repos/dist/dev/lucene/solr/ref-guide/apache-solr-ref-guide-4.7-RC1/

+1 from me.

Cassandra

- To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Commented] (LUCENE-5476) Facet sampling
[ https://issues.apache.org/jira/browse/LUCENE-5476?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13914415#comment-13914415 ]

Michael McCandless commented on LUCENE-5476:

+1 to bring it back. I think we could expose methods that take an FBS (FixedBitSet) and either sub-sample it in place, or return a new FBS?

Facet sampling

Key: LUCENE-5476
URL: https://issues.apache.org/jira/browse/LUCENE-5476

-- This message was sent by Atlassian JIRA (v6.1.5#6160)
- To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org
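The in-place sub-sampling idea can be sketched quite compactly. To keep the sketch self-contained it uses java.util.BitSet rather than Lucene's FixedBitSet, and the class and method names are invented for illustration; the real implementation would differ. Facet counts computed from the sampled set would then be scaled up by roughly 1/rate to approximate the full counts.

```java
import java.util.BitSet;
import java.util.Random;

public class SampleHits {
    // Keep each set bit with probability `rate`, clearing the rest in place.
    // A seeded Random makes the sample reproducible for a given query.
    static void subSample(BitSet hits, double rate, long seed) {
        Random rnd = new Random(seed);
        for (int doc = hits.nextSetBit(0); doc >= 0; doc = hits.nextSetBit(doc + 1)) {
            if (rnd.nextDouble() >= rate) {
                hits.clear(doc); // drop this hit from the sample
            }
        }
    }

    public static void main(String[] args) {
        BitSet hits = new BitSet();
        hits.set(0, 1000000); // pretend 1M docs matched the query
        subSample(hits, 0.01, 42L);
        System.out.println(hits.cardinality()); // roughly 10,000 hits survive
    }
}
```

Facet counting then walks only the surviving ~1% of hits, which is where the speedup on 10M-document result sets would come from.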
[jira] [Commented] (SOLR-5733) Solr 4.5.0, 4.5.1, and 4.6.1 spontaneously crashes within first 10min of their life
[ https://issues.apache.org/jira/browse/SOLR-5733?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13914417#comment-13914417 ]

Marcus Engene commented on SOLR-5733:

Hi,

Going from

$ java -version
java version 1.6.0_18
OpenJDK Runtime Environment (IcedTea6 1.8.13) (6b18-1.8.13-0+squeeze2)
OpenJDK 64-Bit Server VM (build 14.0-b16, mixed mode)

...to...

$ java -version
java version 1.6.0_27
OpenJDK Runtime Environment (IcedTea6 1.12.6) (6b27-1.12.6-1~deb6u1)
OpenJDK 64-Bit Server VM (build 20.0-b12, mixed mode)

...seems to kill off the problem.

Thanks,
Marcus

Solr 4.5.0, 4.5.1, and 4.6.1 spontaneously crashes within first 10min of their life

Key: SOLR-5733
URL: https://issues.apache.org/jira/browse/SOLR-5733
Project: Solr
Issue Type: Bug
Affects Versions: 4.5, 4.5.1, 4.6.1
Environment: Linux solrssd2 2.6.32-5-amd64 #1 SMP Fri May 10 08:43:19 UTC 2013 x86_64 GNU/Linux
java version 1.6.0_18
OpenJDK Runtime Environment (IcedTea6 1.8.13) (6b18-1.8.13-0+squeeze2)
OpenJDK 64-Bit Server VM (build 14.0-b16, mixed mode)
Reporter: Marcus Engene
Fix For: 4.6.1

tien@solrssd2:/solr461stem/example$ cat start.sh
#!/bin/sh
java -Xms9G -Xmx22G -Djetty.host=0.0.0.0 -Djetty.port=9993 -DhostPort=9993 -jar start.jar 2>/dev/null 1>/dev/null

Solr crashes spontaneously about every 2nd start within the first 10min of the process life.

tien@solrssd2:/solr461stem/example/solr/collection1$ du -ks data
5405556 data

Machine is not heavily used:

Tasks: 317 total, 1 running, 316 sleeping, 0 stopped, 0 zombie
Cpu(s): 1.3%us, 0.0%sy, 0.0%ni, 98.7%id, 0.0%wa, 0.0%hi, 0.0%si, 0.0%st
Mem: 264660644k total, 227656492k used, 37004152k free, 544848k buffers
Swap: 4000144k total, 102940k used, 3897204k free, 204332940k cached

PID  USER PR NI VIRT  RES  SHR  S %CPU %MEM TIME+    COMMAND
7700 tien 20 0  32.4g 3.3g 1.2g S 13   1.3  2:23.15  java
8208 tien 20 0  27.6g 3.9g 805m S 10   1.5  0:56.45  java
7785 tien 20 0  26.7g 5.6g 2.2g S 2    2.2  3:42.94  java
6102 tien 20 0  27.6g 9.9g 4.3g S 0    3.9  61:03.26 java
8337 tien 20 0  19204 1552 1016 R 0    0.0  0:00.02  top
1    root 20 0  8356  796  664  S 0    0.0  0:12.90  init
2    root 20 0  0     0    0    S 0    0.0  0:00.00  kthreadd
3    root RT 0  0     0    0    S 0    0.0  0:05.30  migration/0
4    root 20 0  0     0    0    S 0    0.0  0:13.17  ksoftirqd/0
5    root RT 0  0     0    0    S 0    0.0  0:00.00  watchdog/0

I'll try to attach the hs-dump.

-- This message was sent by Atlassian JIRA (v6.1.5#6160)
- To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Closed] (SOLR-5733) Solr 4.5.0, 4.5.1, and 4.6.1 spontaneously crashes within first 10min of their life
[ https://issues.apache.org/jira/browse/SOLR-5733?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Marcus Engene closed SOLR-5733.

Resolution: Done
Fix Version/s: 4.6.1

Solr 4.5.0, 4.5.1, and 4.6.1 spontaneously crashes within first 10min of their life

Key: SOLR-5733
URL: https://issues.apache.org/jira/browse/SOLR-5733
Project: Solr
Issue Type: Bug
Affects Versions: 4.5, 4.5.1, 4.6.1
Reporter: Marcus Engene
Fix For: 4.6.1
-- This message was sent by Atlassian JIRA (v6.1.5#6160) - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Created] (LUCENE-5477) add near-real-time suggest building to AnalyzingInfixSuggester
Michael McCandless created LUCENE-5477:

Summary: add near-real-time suggest building to AnalyzingInfixSuggester
Key: LUCENE-5477
URL: https://issues.apache.org/jira/browse/LUCENE-5477
Project: Lucene - Core
Issue Type: Improvement
Components: modules/spellchecker
Reporter: Michael McCandless
Fix For: 4.8, 5.0

Because this suggester impl. is just a Lucene index under the hood, it should be straightforward to enable near-real-time additions/removals of suggestions.

-- This message was sent by Atlassian JIRA (v6.1.5#6160)
- To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Updated] (LUCENE-5478) Allow CommonTermsQuery to create custom term queries
[ https://issues.apache.org/jira/browse/LUCENE-5478?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Simon Willnauer updated LUCENE-5478:

Attachment: LUCENE-5478.patch

Here is a patch.

Allow CommonTermsQuery to create custom term queries

Key: LUCENE-5478
URL: https://issues.apache.org/jira/browse/LUCENE-5478
Project: Lucene - Core
Issue Type: Improvement
Components: modules/other
Affects Versions: 4.7
Reporter: Simon Willnauer
Fix For: 4.8, 5.0
Attachments: LUCENE-5478.patch

Currently we create term queries with _new TermQuery(..)_ directly in _CommonTermsQuery_. I'd like to extend the creation of the term query, just like you can in the query parser.

-- This message was sent by Atlassian JIRA (v6.1.5#6160)
- To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Commented] (SOLR-5733) Solr 4.5.0, 4.5.1, and 4.6.1 spontaneously crashes within first 10min of their life
[ https://issues.apache.org/jira/browse/SOLR-5733?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13914429#comment-13914429 ]

Uwe Schindler commented on SOLR-5733:

bq. If you do find that you're using one of the known bad Java versions, please come back and close this JIRA.

The list is here: http://wiki.apache.org/lucene-java/JavaBugs

In general, if you really want to use Java 6 (which is no longer supported by Oracle), update to at least 1.6.0 u45 (the latest available). In addition, OpenJDK 1.6 has major performance problems because of missing patches from the official JDK 6. If you want to use OpenJDK, use OpenJDK 7, which is identical to Oracle JDK 7 in patch level and in features for server applications.

Solr 4.5.0, 4.5.1, and 4.6.1 spontaneously crashes within first 10min of their life

Key: SOLR-5733
URL: https://issues.apache.org/jira/browse/SOLR-5733
Project: Solr
Issue Type: Bug
Affects Versions: 4.5, 4.5.1, 4.6.1
Reporter: Marcus Engene
Fix For: 4.6.1

-- This message was sent by Atlassian JIRA (v6.1.5#6160)
- To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Created] (LUCENE-5478) Allow CommonTermsQuery to create custom term queries
Simon Willnauer created LUCENE-5478:

Summary: Allow CommonTermsQuery to create custom term queries
Key: LUCENE-5478
URL: https://issues.apache.org/jira/browse/LUCENE-5478
Project: Lucene - Core
Issue Type: Improvement
Components: modules/other
Affects Versions: 4.7
Reporter: Simon Willnauer
Fix For: 4.8, 5.0

Currently we create term queries with _new TermQuery(..)_ directly in _CommonTermsQuery_. I'd like to extend the creation of the term query, just like you can in the query parser.

-- This message was sent by Atlassian JIRA (v6.1.5#6160)
- To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Commented] (LUCENE-5478) Allow CommonTermsQuery to create custom term queries
[ https://issues.apache.org/jira/browse/LUCENE-5478?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13914438#comment-13914438 ]

Uwe Schindler commented on LUCENE-5478:

Cool, +1

Allow CommonTermsQuery to create custom term queries

Key: LUCENE-5478
URL: https://issues.apache.org/jira/browse/LUCENE-5478

-- This message was sent by Atlassian JIRA (v6.1.5#6160)
- To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org
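The extension point proposed in this thread is a classic protected factory method: route per-term query construction through an overridable method instead of calling the constructor inline. The sketch below uses simplified String-based stand-ins rather than Lucene's actual CommonTermsQuery/TermQuery types, and all names in it are invented for illustration.

```java
// Base class with the overridable factory method the patch suggests.
class CommonTermsQuerySketch {
    // subclasses override this to substitute a custom per-term query
    protected String newTermQuery(String field, String term) {
        return "TermQuery(" + field + ":" + term + ")";
    }

    // builds the composite query, delegating each term to the factory method
    String buildQuery(String field, String... terms) {
        StringBuilder sb = new StringBuilder("BooleanQuery[");
        for (String t : terms) {
            sb.append(newTermQuery(field, t)).append(' ');
        }
        return sb.append(']').toString();
    }
}

public class CustomTermQueryDemo {
    public static void main(String[] args) {
        // override the factory to swap in a different per-term query type
        CommonTermsQuerySketch custom = new CommonTermsQuerySketch() {
            @Override
            protected String newTermQuery(String field, String term) {
                return "PayloadTermQuery(" + field + ":" + term + ")";
            }
        };
        System.out.println(custom.buildQuery("body", "the", "quick"));
    }
}
```

The design appeal, as the issue notes, is that query parsers already expose term-query creation this way, so CommonTermsQuery would gain the same customization point without changing its default behavior.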
[jira] [Commented] (SOLR-5609) Don't let cores create slices/named replicas
[ https://issues.apache.org/jira/browse/SOLR-5609?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13914446#comment-13914446 ]

ASF subversion and git services commented on SOLR-5609:

Commit 1572530 from [~noble.paul] in branch 'dev/branches/branch_4x' [ https://svn.apache.org/r1572530 ]

SOLR-5609: use coreNodeName to compare replicas; CollectionsAPIDistributedZkTest.testCollectionsAPI() randomly switches to legacyCloud=false

Don't let cores create slices/named replicas

Key: SOLR-5609
URL: https://issues.apache.org/jira/browse/SOLR-5609
Project: Solr
Issue Type: Sub-task
Components: SolrCloud
Reporter: Noble Paul
Fix For: 4.8, 5.0
Attachments: SOLR-5609.patch, SOLR-5609.patch, SOLR-5609_5130.patch, SOLR-5609_5130.patch, SOLR-5609_5130.patch, SOLR-5609_5130.patch

In SolrCloud, it is possible for a core to come up on any node and register itself with an arbitrary slice/coreNodeName. This is a legacy requirement, and we would like to make it possible only for the Overseer to initiate creation of slices/replicas. We plan to introduce cluster-level properties at the top level /cluster-props.json

{code:javascript}
{ noSliceOrReplicaByCores: true }
{code}

If this property is set to true, cores won't be able to send STATE commands with an unknown slice/coreNodeName; those commands will fail at the Overseer. This is useful for SOLR-5310 / SOLR-5311, where a core/replica is deleted by a command and then comes up later and tries to create a replica/slice.

-- This message was sent by Atlassian JIRA (v6.1.5#6160)
- To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Updated] (LUCENE-5474) Add example for retrieving facet counts without retrieving documents
[ https://issues.apache.org/jira/browse/LUCENE-5474?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Rob Audenaerde updated LUCENE-5474:

Attachment: LUCENE-5474.patch

Here is the patch.

Add example for retrieving facet counts without retrieving documents

Key: LUCENE-5474
URL: https://issues.apache.org/jira/browse/LUCENE-5474

-- This message was sent by Atlassian JIRA (v6.1.5#6160)
- To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Created] (LUCENE-5479) Make default dimension config in FacetConfig adjustable
Rob Audenaerde created LUCENE-5479:

Summary: Make default dimension config in FacetConfig adjustable
Key: LUCENE-5479
URL: https://issues.apache.org/jira/browse/LUCENE-5479
Project: Lucene - Core
Issue Type: Improvement
Components: modules/facet
Reporter: Rob Audenaerde
Priority: Minor
Attachments: LUCENE-5479.patch

Now it is hardcoded to DEFAULT_DIM_CONFIG. This may be useful for most standard approaches. However, I use lots of facets. These facets can be multivalued; I do not know that beforehand. So what I would like to do is change the default config to {{multiValued = true}}. Currently I have a working, but rather ugly, workaround that subclasses FacetsConfig, like this:

{code:title=CustomFacetsConfig.java|borderStyle=solid}
public class CustomFacetsConfig extends FacetsConfig {
  public final static DimConfig DEFAULT_D2A_DIM_CONFIG = new DimConfig();
  static {
    DEFAULT_D2A_DIM_CONFIG.multiValued = true;
  }

  @Override
  public synchronized DimConfig getDimConfig(String dimName) {
    DimConfig ft = super.getDimConfig(dimName);
    if (DEFAULT_DIM_CONFIG.equals(ft)) {
      return DEFAULT_D2A_DIM_CONFIG;
    }
    return ft;
  }
}
{code}

I created a patch to illustrate what I would like to change. Also, maybe there are better ways to accomplish my goal (an easy default to multivalued?)

-- This message was sent by Atlassian JIRA (v6.1.5#6160)
- To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Updated] (LUCENE-5479) Make default dimension config in FacetConfig adjustable
[ https://issues.apache.org/jira/browse/LUCENE-5479?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Rob Audenaerde updated LUCENE-5479: --- Attachment: LUCENE-5479.patch
[jira] [Updated] (LUCENE-5479) Make default dimension config in FacetConfig adjustable
[ https://issues.apache.org/jira/browse/LUCENE-5479?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Rob Audenaerde updated LUCENE-5479: --- Description: Now it is hardcoded to DEFAULT_DIM_CONFIG. This may be useful for most standard approaches. However, I use lots of facets. These facets can be multivalued; I do not know that beforehand. So what I would like to do is change the default config to {{multiValued = true}}. Currently I have a working, but rather ugly workaround that subclasses FacetsConfig, like this:
{code:title=CustomFacetsConfig.java|borderStyle=solid}
public class CustomFacetsConfig extends FacetsConfig {
    public final static DimConfig DEFAULT_D2A_DIM_CONFIG = new DimConfig();
    static {
        DEFAULT_D2A_DIM_CONFIG.multiValued = true;
    }

    @Override
    public synchronized DimConfig getDimConfig(String dimName) {
        DimConfig ft = super.getDimConfig(dimName);
        if (DEFAULT_DIM_CONFIG.equals(ft)) {
            return DEFAULT_D2A_DIM_CONFIG;
        }
        return ft;
    }
}
{code}
I created a patch to illustrate what I would like to change. By making a protected method it is easier to create a custom subclass of FacetsConfig. Also, maybe there is a better way to accomplish my goal (an easy default to multivalued?)
[jira] [Commented] (LUCENE-5479) Make default dimension config in FacetConfig adjustable
[ https://issues.apache.org/jira/browse/LUCENE-5479?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13914473#comment-13914473 ] Michael McCandless commented on LUCENE-5479: +1, makes sense!
[jira] [Commented] (LUCENE-5479) Make default dimension config in FacetConfig adjustable
[ https://issues.apache.org/jira/browse/LUCENE-5479?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13914474#comment-13914474 ] Shai Erera commented on LUCENE-5479: +1. Can you please document the method?
[jira] [Created] (SOLR-5785) Turn down absurdly verbose test logging (megabytes)
Robert Muir created SOLR-5785: - Summary: Turn down absurdly verbose test logging (megabytes) Key: SOLR-5785 URL: https://issues.apache.org/jira/browse/SOLR-5785 Project: Solr Issue Type: Bug Reporter: Robert Muir I wanted to look at a Solr test failure to see if I could help fix it. Unfortunately, it dumped 26MB of useless logging to the console. This means I cannot even click on the Jenkins console (https://builds.apache.org/job/Lucene-Solr-NightlyTests-4.x/524/consoleText) to look at some details about the failure without totally crashing my browser. This ridiculous amount of verbosity is preventing people from fixing tests, not helping.
[JENKINS-MAVEN] Lucene-Solr-Maven-trunk #1115: POMs out of sync
Build: https://builds.apache.org/job/Lucene-Solr-Maven-trunk/1115/ 4 tests failed. REGRESSION: org.apache.solr.cloud.ChaosMonkeySafeLeaderTest.testDistribSearch Error Message: Captured an uncaught exception in thread: Thread[id=95696, name=qtp1254086174-95696, state=RUNNABLE, group=TGRP-ChaosMonkeySafeLeaderTest] Stack Trace: com.carrotsearch.randomizedtesting.UncaughtExceptionError: Captured an uncaught exception in thread: Thread[id=95696, name=qtp1254086174-95696, state=RUNNABLE, group=TGRP-ChaosMonkeySafeLeaderTest] Caused by: java.lang.OutOfMemoryError: unable to create new native thread at __randomizedtesting.SeedInfo.seed([501CDB0132A418E5]:0) at java.lang.Thread.start0(Native Method) at java.lang.Thread.start(Thread.java:693) at sun.security.ssl.SSLSocketImpl.readRecord(SSLSocketImpl.java:1047) at sun.security.ssl.SSLSocketImpl.performInitialHandshake(SSLSocketImpl.java:1312) at sun.security.ssl.SSLSocketImpl.startHandshake(SSLSocketImpl.java:1339) at sun.security.ssl.SSLSocketImpl.startHandshake(SSLSocketImpl.java:1323) at org.eclipse.jetty.server.ssl.SslSocketConnector$SslConnectorEndPoint.run(SslSocketConnector.java:665) at org.eclipse.jetty.util.thread.QueuedThreadPool.runJob(QueuedThreadPool.java:608) at org.eclipse.jetty.util.thread.QueuedThreadPool$3.run(QueuedThreadPool.java:543) at java.lang.Thread.run(Thread.java:724) REGRESSION: org.apache.solr.cloud.ChaosMonkeySafeLeaderTest.testDistribSearch Error Message: Captured an uncaught exception in thread: Thread[id=95704, name=qtp1254086174-95704, state=RUNNABLE, group=TGRP-ChaosMonkeySafeLeaderTest] Stack Trace: com.carrotsearch.randomizedtesting.UncaughtExceptionError: Captured an uncaught exception in thread: Thread[id=95704, name=qtp1254086174-95704, state=RUNNABLE, group=TGRP-ChaosMonkeySafeLeaderTest] Caused by: java.lang.OutOfMemoryError: unable to create new native thread at __randomizedtesting.SeedInfo.seed([501CDB0132A418E5]:0) at java.lang.Thread.start0(Native Method) at 
java.lang.Thread.start(Thread.java:693) at sun.security.ssl.SSLSocketImpl.readRecord(SSLSocketImpl.java:1047) at sun.security.ssl.SSLSocketImpl.performInitialHandshake(SSLSocketImpl.java:1312) at sun.security.ssl.SSLSocketImpl.startHandshake(SSLSocketImpl.java:1339) at sun.security.ssl.SSLSocketImpl.startHandshake(SSLSocketImpl.java:1323) at org.eclipse.jetty.server.ssl.SslSocketConnector$SslConnectorEndPoint.run(SslSocketConnector.java:665) at org.eclipse.jetty.util.thread.QueuedThreadPool.runJob(QueuedThreadPool.java:608) at org.eclipse.jetty.util.thread.QueuedThreadPool$3.run(QueuedThreadPool.java:543) at java.lang.Thread.run(Thread.java:724) REGRESSION: org.apache.solr.cloud.ChaosMonkeySafeLeaderTest.testDistribSearch Error Message: Captured an uncaught exception in thread: Thread[id=95703, name=qtp1795561555-95703, state=RUNNABLE, group=TGRP-ChaosMonkeySafeLeaderTest] Stack Trace: com.carrotsearch.randomizedtesting.UncaughtExceptionError: Captured an uncaught exception in thread: Thread[id=95703, name=qtp1795561555-95703, state=RUNNABLE, group=TGRP-ChaosMonkeySafeLeaderTest] Caused by: java.lang.OutOfMemoryError: unable to create new native thread at __randomizedtesting.SeedInfo.seed([501CDB0132A418E5]:0) at java.lang.Thread.start0(Native Method) at java.lang.Thread.start(Thread.java:693) at sun.security.ssl.SSLSocketImpl.readRecord(SSLSocketImpl.java:1047) at sun.security.ssl.SSLSocketImpl.performInitialHandshake(SSLSocketImpl.java:1312) at sun.security.ssl.SSLSocketImpl.startHandshake(SSLSocketImpl.java:1339) at sun.security.ssl.SSLSocketImpl.startHandshake(SSLSocketImpl.java:1323) at org.eclipse.jetty.server.ssl.SslSocketConnector$SslConnectorEndPoint.run(SslSocketConnector.java:665) at org.eclipse.jetty.util.thread.QueuedThreadPool.runJob(QueuedThreadPool.java:608) at org.eclipse.jetty.util.thread.QueuedThreadPool$3.run(QueuedThreadPool.java:543) at java.lang.Thread.run(Thread.java:724) REGRESSION: 
org.apache.solr.cloud.ChaosMonkeySafeLeaderTest.testDistribSearch Error Message: Captured an uncaught exception in thread: Thread[id=95714, name=qtp473666576-95714, state=RUNNABLE, group=TGRP-ChaosMonkeySafeLeaderTest] Stack Trace: com.carrotsearch.randomizedtesting.UncaughtExceptionError: Captured an uncaught exception in thread: Thread[id=95714, name=qtp473666576-95714, state=RUNNABLE, group=TGRP-ChaosMonkeySafeLeaderTest] Caused by: java.lang.OutOfMemoryError: unable to create new native thread at __randomizedtesting.SeedInfo.seed([501CDB0132A418E5]:0) at java.lang.Thread.start0(Native Method) at java.lang.Thread.start(Thread.java:693) at sun.security.ssl.SSLSocketImpl.readRecord(SSLSocketImpl.java:1047) at
[jira] [Created] (SOLR-5786) MapReduceIndexerTool --help text is missing large parts of the help text
wolfgang hoschek created SOLR-5786: -- Summary: MapReduceIndexerTool --help text is missing large parts of the help text Key: SOLR-5786 URL: https://issues.apache.org/jira/browse/SOLR-5786 Project: Solr Issue Type: Bug Components: contrib - MapReduce Affects Versions: 4.7 Reporter: wolfgang hoschek Assignee: Mark Miller Fix For: 4.8 As already mentioned repeatedly and at length, this is a regression introduced by the fix in https://issues.apache.org/jira/browse/SOLR-5605 Here is the diff of --help output before SOLR-5605 vs after SOLR-5605: {code} 130,235c130 lucene segments left in this index. Merging segments involves reading and rewriting all data in all these segment files, potentially multiple times, which is very I/O intensive and time consuming. However, an index with fewer segments can later be merged faster, and it can later be queried faster once deployed to a live Solr serving shard. Set maxSegments to 1 to optimize the index for low query latency. In a nutshell, a small maxSegments value trades indexing latency for subsequently improved query latency. This can be a reasonable trade-off for batch indexing systems. (default: 1) --fair-scheduler-pool STRING Optional tuning knob that indicates the name of the fair scheduler pool to submit jobs to. The Fair Scheduler is a pluggable MapReduce scheduler that provides a way to share large clusters. Fair scheduling is a method of assigning resources to jobs such that all jobs get, on average, an equal share of resources over time. When there is a single job running, that job uses the entire cluster. When other jobs are submitted, tasks slots that free up are assigned to the new jobs, so that each job gets roughly the same amount of CPU time. Unlike the default Hadoop scheduler, which forms a queue of jobs, this lets short jobs finish in reasonable time while not starving long jobs. It is also an easy way to share a cluster between multiple of users. 
Fair sharing can also work with job priorities - the priorities are used as weights to determine the fraction of total compute time that each job gets. --dry-run Run in local mode and print documents to stdout instead of loading them into Solr. This executes the morphline in the client process (without submitting a job to MR) for quicker turnaround during early trialdebug sessions. (default: false) --log4j FILE Relative or absolute path to a log4j.properties config file on the local file system. This file will be uploaded to each MR task. Example: /path/to/log4j.properties --verbose, -v Turn on verbose output. (default: false) --show-non-solr-cloud Also show options for Non-SolrCloud mode as part of --help. (default: false) Required arguments: --output-dir HDFS_URI HDFS directory to write Solr indexes to. Inside there one output directory per shard will be generated.Example: hdfs://c2202.mycompany. com/user/$USER/test --morphline-file FILE Relative or absolute path to a local config file that contains one or more morphlines. The file must be UTF-8 encoded. Example: /path/to/morphline.conf Cluster arguments: Arguments that provide information about your Solr cluster. --zk-host STRING The address of a ZooKeeper ensemble being used by a SolrCloud cluster. This ZooKeeper ensemble will be examined to determine the number of output
[jira] [Commented] (LUCENE-5475) add required attribute bugUrl to @BadApple
[ https://issues.apache.org/jira/browse/LUCENE-5475?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13914546#comment-13914546 ] Dawid Weiss commented on LUCENE-5475: - I've added dumping full annotation content (with attribute). Unfortunately there's no way to reference a snapshot build so you'll have to wait for a release (which I'll try to make in a day or two). add required attribute bugUrl to @BadApple -- Key: LUCENE-5475 URL: https://issues.apache.org/jira/browse/LUCENE-5475 Project: Lucene - Core Issue Type: Bug Components: general/test Reporter: Robert Muir Fix For: 4.8, 5.0 Attachments: LUCENE-5475.patch This makes it impossible to tag a test as a badapple without a pointer to a JIRA issue. -- This message was sent by Atlassian JIRA (v6.1.5#6160) - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
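With the attribute made mandatory, tagging a flaky test would look roughly like this (a sketch assuming the randomizedtesting {{@BadApple}} annotation with the new required {{bugUrl}} attribute; the test class and method names are illustrative):

```java
import com.carrotsearch.randomizedtesting.annotations.BadApple;
import org.apache.lucene.util.LuceneTestCase;

public class TestSomething extends LuceneTestCase {

  // bugUrl is required, so a test can no longer be marked as a bad apple
  // without pointing at the JIRA issue that tracks the failure.
  @BadApple(bugUrl = "https://issues.apache.org/jira/browse/LUCENE-5475")
  public void testOccasionallyFails() {
    // ... test body ...
  }
}
```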
[jira] [Commented] (SOLR-5605) MapReduceIndexerTool fails in some locales -- seen in random failures of MapReduceIndexerToolArgumentParserTest
[ https://issues.apache.org/jira/browse/SOLR-5605?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13914549#comment-13914549 ] wolfgang hoschek commented on SOLR-5605: Correspondingly, I filed https://issues.apache.org/jira/browse/SOLR-5786 Look, as you know, I wrote almost all of the original solr-mapreduce contrib, and I know this code inside out. To be honest, this kind of repetitive ignorance is tiresome at best and completely turns me off. MapReduceIndexerTool fails in some locales -- seen in random failures of MapReduceIndexerToolArgumentParserTest --- Key: SOLR-5605 URL: https://issues.apache.org/jira/browse/SOLR-5605 Project: Solr Issue Type: Bug Reporter: Hoss Man Assignee: Mark Miller Fix For: 4.7, 5.0 I noticed a randomized failure in MapReduceIndexerToolArgumentParserTest which is reproducible with any seed -- all that matters is the locale. The problem sounded familiar, and a quick search verified that jenkins has in fact hit this a couple of times in the past -- Uwe commented on the list that this is due to a real problem in one of the third-party dependencies (that does the argument parsing) that will affect usage on some systems. If working around the bug in the arg parsing lib isn't feasible, MapReduceIndexerTool should fail cleanly if the locale isn't one we know is supported -- This message was sent by Atlassian JIRA (v6.1.5#6160) - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Created] (SOLR-5787) Get spellcheck frequency relatively to current query
Hakim created SOLR-5787: --- Summary: Get spellcheck frequency relatively to current query Key: SOLR-5787 URL: https://issues.apache.org/jira/browse/SOLR-5787 Project: Solr Issue Type: Improvement Components: spellchecker Affects Versions: 4.6 Environment: Solr deployed on Jetty 9 Servlet container Reporter: Hakim Priority: Minor I guess that this functionality isn't implemented yet. I'll begin with an example to explain what I'm requesting: I have a Lucene query that gets articles satisfying a certain query. With this same command, I'm getting at the same time suggestions if this query doesn't return any article (so far, nothing unusual). The frequency (count) associated with these suggestions is relative to the whole index (it counts all occurrences of the suggestion in the entire index). What I want is for it to count only the suggestion occurrences satisfying the current Lucene query. P.S.: I'm using Solr's spellcheck component (solr.DirectSolrSpellChecker).
[jira] [Updated] (SOLR-5786) MapReduceIndexerTool --help output is missing large parts of the help text
[ https://issues.apache.org/jira/browse/SOLR-5786?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] wolfgang hoschek updated SOLR-5786: --- Summary: MapReduceIndexerTool --help output is missing large parts of the help text (was: MapReduceIndexerTool --help text is missing large parts of the help text)
[jira] [Updated] (SOLR-5786) MapReduceIndexerTool --help output is missing large parts of the help text
[ https://issues.apache.org/jira/browse/SOLR-5786?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] wolfgang hoschek updated SOLR-5786: --- Description: As already mentioned repeatedly and at length, this is a regression introduced by the fix in https://issues.apache.org/jira/browse/SOLR-5605 Here is the diff of --help output before SOLR-5605 vs after SOLR-5605 (the same {code} diff quoted in the issue description above).
[jira] [Commented] (SOLR-5787) Get spellcheck frequency relatively to current query
[ https://issues.apache.org/jira/browse/SOLR-5787?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13914571#comment-13914571 ] James Dyer commented on SOLR-5787: -- Can you explain why spellcheck.maxCollationTries and spellcheck.collateExtendedResults do not satisfy your needs? This will give you the number of results that the query returns if you take all of the suggestions provided in the collation.
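For reference, the collation parameters can be exercised with a plain HTTP request along these lines (a sketch: the collection name, field name, and misspelled query here are hypothetical, and spellchecking must already be configured on the request handler):

```shell
# Ask Solr for collations and, via collateExtendedResults, a per-collation
# "hits" count: how many documents the corrected query would actually match,
# i.e. a frequency relative to the current query rather than the whole index.
curl 'http://localhost:8983/solr/collection1/select' \
  --data-urlencode 'q=text:articel' \
  --data-urlencode 'wt=json' \
  --data-urlencode 'spellcheck=true' \
  --data-urlencode 'spellcheck.collate=true' \
  --data-urlencode 'spellcheck.maxCollationTries=5' \
  --data-urlencode 'spellcheck.collateExtendedResults=true'
```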
[jira] [Commented] (SOLR-5733) Solr 4.5.0, 4.5.1, and 4.6.1 spontaneously chashes within first 10min of their life
[ https://issues.apache.org/jira/browse/SOLR-5733?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13914609#comment-13914609 ] Erick Erickson commented on SOLR-5733: -- Thanks Uwe! I finally bookmarked that page! Solr 4.5.0, 4.5.1, and 4.6.1 spontaneously chashes within first 10min of their life --- Key: SOLR-5733 URL: https://issues.apache.org/jira/browse/SOLR-5733 Project: Solr Issue Type: Bug Affects Versions: 4.5, 4.5.1, 4.6.1 Environment: Linux solrssd2 2.6.32-5-amd64 #1 SMP Fri May 10 08:43:19 UTC 2013 x86_64 GNU/Linux java version 1.6.0_18 OpenJDK Runtime Environment (IcedTea6 1.8.13) (6b18-1.8.13-0+squeeze2) OpenJDK 64-Bit Server VM (build 14.0-b16, mixed mode) Reporter: Marcus Engene Fix For: 4.6.1 tien@solrssd2:/solr461stem/example$ cat start.sh #!/bin/sh java -Xms9G -Xmx22G -Djetty.host=0.0.0.0 -Djetty.port=9993 -DhostPort=9993 -jar start.jar 2/dev/null 1/dev/null Solr crashes spontaneously about every 2nd start within the first 10min of the process life. tien@solrssd2:/solr461stem/example/solr/collection1$ du -ks data 5405556 data Machine is not heavily used Tasks: 317 total, 1 running, 316 sleeping, 0 stopped, 0 zombie Cpu(s): 1.3%us, 0.0%sy, 0.0%ni, 98.7%id, 0.0%wa, 0.0%hi, 0.0%si, 0.0%st Mem: 264660644k total, 227656492k used, 37004152k free, 544848k buffers Swap: 4000144k total, 102940k used, 3897204k free, 204332940k cached PID USER PR NI VIRT RES SHR S %CPU %MEMTIME+ COMMAND 7700 tien 20 0 32.4g 3.3g 1.2g S 13 1.3 2:23.15 java 8208 tien 20 0 27.6g 3.9g 805m S 10 1.5 0:56.45 java 7785 tien 20 0 26.7g 5.6g 2.2g S2 2.2 3:42.94 java 6102 tien 20 0 27.6g 9.9g 4.3g S0 3.9 61:03.26 java 8337 tien 20 0 19204 1552 1016 R0 0.0 0:00.02 top 1 root 20 0 8356 796 664 S0 0.0 0:12.90 init 2 root 20 0 000 S0 0.0 0:00.00 kthreadd 3 root RT 0 000 S0 0.0 0:05.30 migration/0 4 root 20 0 000 S0 0.0 0:13.17 ksoftirqd/0 5 root RT 0 000 S0 0.0 0:00.00 watchdog/0 I'll try to attach the hs-dump. 
-- This message was sent by Atlassian JIRA (v6.1.5#6160) - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Updated] (SOLR-5788) Document update in case of error doesn't return the error message correctly
[ https://issues.apache.org/jira/browse/SOLR-5788?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Yago Riveiro updated SOLR-5788: --- Summary: Document update in case of error doesn't return the error message correctly (was: Document update in case of error doesn't returns the error message correctly) Document update in case of error doesn't return the error message correctly --- Key: SOLR-5788 URL: https://issues.apache.org/jira/browse/SOLR-5788 Project: Solr Issue Type: Bug Affects Versions: 4.6.1 Reporter: Yago Riveiro I found an issue when updating a document. If for any reason the update can't be done (for example, the schema doesn't match the incoming doc), the error raised to the user is something like: {noformat} curl 'http://localhost:8983/solr/collection1/update?commit=true' --data-binary @doc.json -H 'Content-type:application/json' {"responseHeader":{"status":400,"QTime":52},"error":{"msg":"Bad Request\n\n\n\nrequest: http://localhost:8983/solr/collection1_shard3_replica1/update?update.distrib=TOLEADER&distrib.from=http%3A%2F%2Flocalhost%3A8983%2Fsolr%2Fcollection1_shard1_replica2%2F&wt=javabin&version=2","code":400}} {noformat} In case the update was done on the leader, the error message is (IMHO) the correct one, with valuable info: {noformat} curl 'http://localhost:8983/solr/collection1/update?commit=true' --data-binary @doc.json -H 'Content-type:application/json' {"responseHeader":{"status":400,"QTime":19},"error":{"msg":"ERROR: [doc=01!12967564] Error adding field 'source'='[Direct]' msg=For input string: \"Direct\"","code":400}} {noformat} -- This message was sent by Atlassian JIRA (v6.1.5#6160) - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Created] (SOLR-5788) Document update in case of error doesn't returns the error message correctly
Yago Riveiro created SOLR-5788: -- Summary: Document update in case of error doesn't returns the error message correctly Key: SOLR-5788 URL: https://issues.apache.org/jira/browse/SOLR-5788 Project: Solr Issue Type: Bug Affects Versions: 4.6.1 Reporter: Yago Riveiro I found an issue when updating a document. If for any reason the update can't be done (for example, the schema doesn't match the incoming doc), the error raised to the user is something like: {noformat} curl 'http://localhost:8983/solr/collection1/update?commit=true' --data-binary @doc.json -H 'Content-type:application/json' {"responseHeader":{"status":400,"QTime":52},"error":{"msg":"Bad Request\n\n\n\nrequest: http://localhost:8983/solr/collection1_shard3_replica1/update?update.distrib=TOLEADER&distrib.from=http%3A%2F%2Flocalhost%3A8983%2Fsolr%2Fcollection1_shard1_replica2%2F&wt=javabin&version=2","code":400}} {noformat} In case the update was done on the leader, the error message is (IMHO) the correct one, with valuable info: {noformat} curl 'http://localhost:8983/solr/collection1/update?commit=true' --data-binary @doc.json -H 'Content-type:application/json' {"responseHeader":{"status":400,"QTime":19},"error":{"msg":"ERROR: [doc=01!12967564] Error adding field 'source'='[Direct]' msg=For input string: \"Direct\"","code":400}} {noformat} -- This message was sent by Atlassian JIRA (v6.1.5#6160) - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
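The two responses quoted in the report differ because of where the failure surfaces: the leader that validates the document has the detailed cause, while a node that merely forwarded the update keeps only the HTTP status line. A self-contained sketch of that propagation difference (class and method names are illustrative, not Solr's actual code):

```java
// Sketch of the error-propagation gap SOLR-5788 describes.
// Names here are illustrative; the real logic lives in Solr's
// distributed-update path.
public class ErrorPropagationSketch {
    // What the node that actually validated the doc can report.
    static String leaderError(String docId, String detail) {
        return "ERROR: [doc=" + docId + "] " + detail;
    }

    // What a forwarding node reports when it discards the remote
    // response body and keeps only the status line (the complaint).
    static String forwardedError(int status, String requestUrl) {
        return statusLine(status) + "\n\nrequest: " + requestUrl;
    }

    static String statusLine(int status) {
        return status == 400 ? "Bad Request" : "HTTP " + status;
    }

    public static void main(String[] args) {
        System.out.println(leaderError("01!12967564", "Error adding field 'source'='[Direct]'"));
        System.out.println(forwardedError(400, "http://localhost:8983/solr/update"));
    }
}
```

The fix would be for the forwarding node to propagate the remote error body rather than just the status line.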
[jira] [Commented] (LUCENE-5478) Allow CommonTermsQuery to create custom term queries
[ https://issues.apache.org/jira/browse/LUCENE-5478?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13914631#comment-13914631 ] ASF subversion and git services commented on LUCENE-5478: - Commit 1572613 from [~simonw] in branch 'dev/trunk' [ https://svn.apache.org/r1572613 ] LUCENE-5478: CommonTermsQuery now allows to create custom term queries Allow CommonTermsQuery to create custom term queries Key: LUCENE-5478 URL: https://issues.apache.org/jira/browse/LUCENE-5478 Project: Lucene - Core Issue Type: Improvement Components: modules/other Affects Versions: 4.7 Reporter: Simon Willnauer Fix For: 4.8, 5.0 Attachments: LUCENE-5478.patch Currently we create term queries with _new TermQuery(..)_ directly in _CommonTermsQuery_. I'd like to extend the creation of the term query just like you can in the query parser. -- This message was sent by Atlassian JIRA (v6.1.5#6160) - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
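The extension point this issue adds is the classic factory-method pattern: query construction is routed through a protected method that subclasses can override instead of a hard-wired `new TermQuery(..)`. A generic, self-contained sketch of the pattern (class and method names are illustrative, not Lucene's actual API):

```java
// Factory-method sketch of the LUCENE-5478 extension point: the builder
// never calls the concrete query constructor directly, so a subclass can
// substitute a custom query type. Names are illustrative only.
public class QueryFactorySketch {
    static class BaseQueryBuilder {
        // Default behavior: a plain term query.
        protected String newTermQuery(String term) {
            return "TermQuery(" + term + ")";
        }
        public String build(String term) {
            return newTermQuery(term); // creation goes through the factory
        }
    }

    // A subclass swaps in a custom query without touching build().
    static class BoostingQueryBuilder extends BaseQueryBuilder {
        @Override
        protected String newTermQuery(String term) {
            return "BoostedTermQuery(" + term + ", boost=2.0)";
        }
    }

    public static void main(String[] args) {
        System.out.println(new BaseQueryBuilder().build("lucene"));
        System.out.println(new BoostingQueryBuilder().build("lucene"));
    }
}
```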
[jira] [Commented] (LUCENE-5478) Allow CommonTermsQuery to create custom term queries
[ https://issues.apache.org/jira/browse/LUCENE-5478?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13914652#comment-13914652 ] ASF subversion and git services commented on LUCENE-5478: - Commit 1572624 from [~simonw] in branch 'dev/branches/branch_4x' [ https://svn.apache.org/r1572624 ] LUCENE-5478: CommonTermsQuery now allows to create custom term queries Allow CommonTermsQuery to create custom term queries Key: LUCENE-5478 URL: https://issues.apache.org/jira/browse/LUCENE-5478 Project: Lucene - Core Issue Type: Improvement Components: modules/other Affects Versions: 4.7 Reporter: Simon Willnauer Fix For: 4.8, 5.0 Attachments: LUCENE-5478.patch Currently we create term queries with _new TermQuery(..)_ directly in _CommonTermsQuery_. I'd like to extend the creation of the term query just like you can in the query parser. -- This message was sent by Atlassian JIRA (v6.1.5#6160) - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Closed] (LUCENE-5478) Allow CommonTermsQuery to create custom term queries
[ https://issues.apache.org/jira/browse/LUCENE-5478?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Simon Willnauer closed LUCENE-5478. --- Resolution: Fixed Allow CommonTermsQuery to create custom term queries Key: LUCENE-5478 URL: https://issues.apache.org/jira/browse/LUCENE-5478 Project: Lucene - Core Issue Type: Improvement Components: modules/other Affects Versions: 4.7 Reporter: Simon Willnauer Fix For: 4.8, 5.0 Attachments: LUCENE-5478.patch Currently we create term queries with _new TermQuery(..)_ directly in _CommonTermsQuery_. I'd like to extend the creation of the term query just like you can in the query parser. -- This message was sent by Atlassian JIRA (v6.1.5#6160) - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Commented] (LUCENE-5376) Add a demo search server
[ https://issues.apache.org/jira/browse/LUCENE-5376?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13914680#comment-13914680 ] ASF subversion and git services commented on LUCENE-5376: - Commit 1572637 from [~mikemccand] in branch 'dev/branches/lucene5376' [ https://svn.apache.org/r1572637 ] LUCENE-5376: add factory for SuggestStopFilter; get PostingsHighlighter MTQ highlighting working with block join queries; fix 0.0 score from block join group parent; add explicit label faceting; fix analyzing infix suggester highlighting; allow drill-downs on range facets Add a demo search server Key: LUCENE-5376 URL: https://issues.apache.org/jira/browse/LUCENE-5376 Project: Lucene - Core Issue Type: Improvement Reporter: Michael McCandless Assignee: Michael McCandless Attachments: lucene-demo-server.tgz I think it'd be useful to have a demo search server for Lucene. Rather than being fully featured, like Solr, it would be minimal, just wrapping the existing Lucene modules to show how you can make use of these features in a server setting. The purpose is to demonstrate how one can build a minimal search server on top of APIs like SearcherManager, SearcherLifetimeManager, etc. This is also useful for finding rough edges / issues in Lucene's APIs that make building a server unnecessarily hard. I don't think it should have back compatibility promises (except Lucene's index back compatibility), so it's free to improve as Lucene's APIs change. As a starting point, I'll post what I built for the eating your own dog food search app for Lucene's/Solr's jira issues http://jirasearch.mikemccandless.com (blog: http://blog.mikemccandless.com/2013/05/eating-dog-food-with-lucene.html ). It uses Netty to expose basic indexing/searching APIs via JSON, but it's very rough (lots of nocommits). 
-- This message was sent by Atlassian JIRA (v6.1.5#6160) - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Commented] (SOLR-5605) MapReduceIndexerTool fails in some locales -- seen in random failures of MapReduceIndexerToolArgumentParserTest
[ https://issues.apache.org/jira/browse/SOLR-5605?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13914682#comment-13914682 ] Mark Miller commented on SOLR-5605: --- A few points: * Are you not a committer? At Apache, those who do, decide. * I did not realize Patrick's patch did not include the latest code updates from MapReduce. You were not clear that you had looked at the latest code or the latest build. You have not contributed any real effort to the upstream work, therefore I don't have a lot of trust in your knowledge of the upstream work. * I had and still have bigger concerns around the usability of this code in Solr than this issue. It is very, very far from easy for someone to get started with this contrib right now. Which is why the contrib is marked experimental, which is why none of these smaller issues concern me very much at this point. MapReduceIndexerTool fails in some locales -- seen in random failures of MapReduceIndexerToolArgumentParserTest --- Key: SOLR-5605 URL: https://issues.apache.org/jira/browse/SOLR-5605 Project: Solr Issue Type: Bug Reporter: Hoss Man Assignee: Mark Miller Fix For: 4.7, 5.0 I noticed a randomized failure in MapReduceIndexerToolArgumentParserTest which is reproducible with any seed -- all that matters is the locale. The problem sounded familiar, and a quick search verified that jenkins has in fact hit this a couple of times in the past -- Uwe commented on the list that this is due to a real problem in one of the third-party dependencies (that does the argument parsing) that will affect usage on some systems. If working around the bug in the arg parsing lib isn't feasible, MapReduceIndexerTool should fail cleanly if the locale isn't one we know is supported -- This message was sent by Atlassian JIRA (v6.1.5#6160) - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Resolved] (SOLR-5786) MapReduceIndexerTool --help output is missing large parts of the help text
[ https://issues.apache.org/jira/browse/SOLR-5786?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Mark Miller resolved SOLR-5786. --- Resolution: Duplicate MapReduceIndexerTool --help output is missing large parts of the help text -- Key: SOLR-5786 URL: https://issues.apache.org/jira/browse/SOLR-5786 Project: Solr Issue Type: Bug Components: contrib - MapReduce Affects Versions: 4.7 Reporter: wolfgang hoschek Assignee: Mark Miller Fix For: 4.8 As already mentioned repeatedly and at length, this is a regression introduced by the fix in https://issues.apache.org/jira/browse/SOLR-5605 Here is the diff of --help output before SOLR-5605 vs after SOLR-5605: {code} 130,235c130 lucene segments left in this index. Merging segments involves reading and rewriting all data in all these segment files, potentially multiple times, which is very I/O intensive and time consuming. However, an index with fewer segments can later be merged faster, and it can later be queried faster once deployed to a live Solr serving shard. Set maxSegments to 1 to optimize the index for low query latency. In a nutshell, a small maxSegments value trades indexing latency for subsequently improved query latency. This can be a reasonable trade-off for batch indexing systems. (default: 1) --fair-scheduler-pool STRING Optional tuning knob that indicates the name of the fair scheduler pool to submit jobs to. The Fair Scheduler is a pluggable MapReduce scheduler that provides a way to share large clusters. Fair scheduling is a method of assigning resources to jobs such that all jobs get, on average, an equal share of resources over time. When there is a single job running, that job uses the entire cluster. When other jobs are submitted, task slots that free up are assigned to the new jobs, so that each job gets roughly the same amount of CPU time. Unlike the default Hadoop scheduler, which forms a queue of jobs, this lets short jobs finish in reasonable time while not starving long jobs. 
It is also an easy way to share a cluster between multiple users. Fair sharing can also work with job priorities - the priorities are used as weights to determine the fraction of total compute time that each job gets. --dry-run Run in local mode and print documents to stdout instead of loading them into Solr. This executes the morphline in the client process (without submitting a job to MR) for quicker turnaround during early trial/debug sessions. (default: false) --log4j FILE Relative or absolute path to a log4j.properties config file on the local file system. This file will be uploaded to each MR task. Example: /path/to/log4j.properties --verbose, -v Turn on verbose output. (default: false) --show-non-solr-cloud Also show options for Non-SolrCloud mode as part of --help. (default: false) Required arguments: --output-dir HDFS_URI HDFS directory to write Solr indexes to. Inside there one output directory per shard will be generated. Example: hdfs://c2202.mycompany.com/user/$USER/test --morphline-file FILE Relative or absolute path to a local config file that contains one or more morphlines. The file must be UTF-8 encoded. Example: /path/to/morphline.conf Cluster arguments: Arguments that provide information about your
[jira] [Commented] (SOLR-5733) Solr 4.5.0, 4.5.1, and 4.6.1 spontaneously crashes within first 10min of their life
[ https://issues.apache.org/jira/browse/SOLR-5733?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13914684#comment-13914684 ] Marcus Engene commented on SOLR-5733: - Thanks, I'll try Oracle's too. Sorry, I thought I did close the ticket? I waited until I had some conclusions after testing, which perhaps was bad. Solr 4.5.0, 4.5.1, and 4.6.1 spontaneously crashes within first 10min of their life --- Key: SOLR-5733 URL: https://issues.apache.org/jira/browse/SOLR-5733 Project: Solr Issue Type: Bug Affects Versions: 4.5, 4.5.1, 4.6.1 Environment: Linux solrssd2 2.6.32-5-amd64 #1 SMP Fri May 10 08:43:19 UTC 2013 x86_64 GNU/Linux, java version 1.6.0_18, OpenJDK Runtime Environment (IcedTea6 1.8.13) (6b18-1.8.13-0+squeeze2), OpenJDK 64-Bit Server VM (build 14.0-b16, mixed mode) Reporter: Marcus Engene Fix For: 4.6.1

tien@solrssd2:/solr461stem/example$ cat start.sh
#!/bin/sh
java -Xms9G -Xmx22G -Djetty.host=0.0.0.0 -Djetty.port=9993 -DhostPort=9993 -jar start.jar 2>/dev/null 1>/dev/null

Solr crashes spontaneously about every 2nd start within the first 10min of the process life.

tien@solrssd2:/solr461stem/example/solr/collection1$ du -ks data
5405556 data

Machine is not heavily used:
Tasks: 317 total, 1 running, 316 sleeping, 0 stopped, 0 zombie
Cpu(s): 1.3%us, 0.0%sy, 0.0%ni, 98.7%id, 0.0%wa, 0.0%hi, 0.0%si, 0.0%st
Mem: 264660644k total, 227656492k used, 37004152k free, 544848k buffers
Swap: 4000144k total, 102940k used, 3897204k free, 204332940k cached

 PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND
7700 tien 20 0 32.4g 3.3g 1.2g S 13 1.3 2:23.15 java
8208 tien 20 0 27.6g 3.9g 805m S 10 1.5 0:56.45 java
7785 tien 20 0 26.7g 5.6g 2.2g S 2 2.2 3:42.94 java
6102 tien 20 0 27.6g 9.9g 4.3g S 0 3.9 61:03.26 java
8337 tien 20 0 19204 1552 1016 R 0 0.0 0:00.02 top
   1 root 20 0 8356 796 664 S 0 0.0 0:12.90 init
   2 root 20 0 0 0 0 S 0 0.0 0:00.00 kthreadd
   3 root RT 0 0 0 0 S 0 0.0 0:05.30 migration/0
   4 root 20 0 0 0 0 S 0 0.0 0:13.17 ksoftirqd/0
   5 root RT 0 0 0 0 S 0 0.0 0:00.00 watchdog/0

I'll try to attach the hs-dump. -- This message was sent by Atlassian JIRA (v6.1.5#6160) - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Commented] (SOLR-5781) Make the Collections API timeout configurable.
[ https://issues.apache.org/jira/browse/SOLR-5781?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13914691#comment-13914691 ] Mark Miller commented on SOLR-5781: --- I was initially thinking cluster wide. Make the Collections API timeout configurable. -- Key: SOLR-5781 URL: https://issues.apache.org/jira/browse/SOLR-5781 Project: Solr Issue Type: Improvement Components: SolrCloud Reporter: Mark Miller Fix For: 4.8, 5.0 This would also help with tests - nightlies can be quite intensive and need a very high timeout. -- This message was sent by Atlassian JIRA (v6.1.5#6160) - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Created] (SOLR-5789) Add min/max modifiers to Atomic Updates
Nim Lhûg created SOLR-5789: -- Summary: Add min/max modifiers to Atomic Updates Key: SOLR-5789 URL: https://issues.apache.org/jira/browse/SOLR-5789 Project: Solr Issue Type: New Feature Reporter: Nim Lhûg The Atomic Updates feature currently supports add/inc/set. A min/max modifier would allow for conditional updates: update if the new value is smaller/greater than the current value. This is much more convenient than fetching the document, comparing the values, and then sending an update. The patch seems to work, but probably requires more testing. Note: will add a link to the pull request in a minute. -- This message was sent by Atlassian JIRA (v6.1.5#6160) - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
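The comparison a min/max modifier performs on the server can be sketched in a few lines. This is a minimal illustration of the proposed semantics, not the code from the linked pull request; method names are hypothetical:

```java
// Sketch of the min/max atomic-update semantics SOLR-5789 proposes:
// keep the stored value unless the incoming value is smaller (min)
// or larger (max), avoiding a client-side read-compare-write round trip.
public class MinMaxSketch {
    static long applyMin(long stored, long incoming) {
        return Math.min(stored, incoming);
    }

    static long applyMax(long stored, long incoming) {
        return Math.max(stored, incoming);
    }

    public static void main(String[] args) {
        // e.g. track the lowest price ever seen for a document
        System.out.println(applyMin(100, 42));  // incoming is smaller: field becomes 42
        System.out.println(applyMin(100, 250)); // incoming is larger: field stays 100
    }
}
```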
[GitHub] lucene-solr pull request: SOLR-5789 Add min/max modifiers to Atomi...
GitHub user codematters opened a pull request: https://github.com/apache/lucene-solr/pull/39 SOLR-5789 Add min/max modifiers to Atomic Updates Allows for conditional atomic updates -- if new value is smaller or larger than the old value. Jira: https://issues.apache.org/jira/browse/SOLR-5789 You can merge this pull request into a Git repository by running: $ git pull https://github.com/INTIXnv/lucene-solr trunk Alternatively you can review and apply these changes as the patch at: https://github.com/apache/lucene-solr/pull/39.patch To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message: This closes #39 commit 901a810c98eae381862f79f7cf4f2c10bffb8730 Author: Bram gitb...@codematters.be Date: 2014-02-27T16:17:45Z SOLR-5789 Add min/max modifiers to Atomic Updates Allows for conditional atomic updates -- if new value is smaller or larger than the old value. --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. --- - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Commented] (SOLR-5789) Add min/max modifiers to Atomic Updates
[ https://issues.apache.org/jira/browse/SOLR-5789?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13914697#comment-13914697 ] ASF GitHub Bot commented on SOLR-5789: -- GitHub user codematters opened a pull request: https://github.com/apache/lucene-solr/pull/39 SOLR-5789 Add min/max modifiers to Atomic Updates Allows for conditional atomic updates -- if new value is smaller or larger than the old value. Jira: https://issues.apache.org/jira/browse/SOLR-5789 You can merge this pull request into a Git repository by running: $ git pull https://github.com/INTIXnv/lucene-solr trunk Alternatively you can review and apply these changes as the patch at: https://github.com/apache/lucene-solr/pull/39.patch To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message: This closes #39 commit 901a810c98eae381862f79f7cf4f2c10bffb8730 Author: Bram gitb...@codematters.be Date: 2014-02-27T16:17:45Z SOLR-5789 Add min/max modifiers to Atomic Updates Allows for conditional atomic updates -- if new value is smaller or larger than the old value. Add min/max modifiers to Atomic Updates --- Key: SOLR-5789 URL: https://issues.apache.org/jira/browse/SOLR-5789 Project: Solr Issue Type: New Feature Reporter: Nim Lhûg The Atomic Updates feature currently supports add/inc/set. A min/max modifier would allow for conditional updates: update if the new value is smaller/greater than the current value. This is much more convenient than fetching the document, comparing the values, and then sending an update. The patch seems to work, but probably requires more testing. Note: will add a link to the pull request in a minute. -- This message was sent by Atlassian JIRA (v6.1.5#6160) - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Commented] (LUCENE-5468) Hunspell very high memory use when loading dictionary
[ https://issues.apache.org/jira/browse/LUCENE-5468?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13914696#comment-13914696 ] ASF subversion and git services commented on LUCENE-5468: - Commit 1572643 from [~rcmuir] in branch 'dev/branches/lucene5468' [ https://svn.apache.org/r1572643 ] LUCENE-5468: don't create unnecessary objects Hunspell very high memory use when loading dictionary - Key: LUCENE-5468 URL: https://issues.apache.org/jira/browse/LUCENE-5468 Project: Lucene - Core Issue Type: Bug Affects Versions: 3.5 Reporter: Maciej Lisiewski Priority: Minor Attachments: patch.txt Hunspell stemmer requires gigantic (for the task) amounts of memory to load dictionary/rules files. For example loading a 4.5 MB polish dictionary (with empty index!) will cause whole core to crash with various out of memory errors unless you set max heap size close to 2GB or more. By comparison Stempel using the same dictionary file works just fine with 1/8 of that (and possibly lower values as well). Sample error log entries: http://pastebin.com/fSrdd5W1 http://pastebin.com/Lmi0re7Z -- This message was sent by Atlassian JIRA (v6.1.5#6160) - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Updated] (SOLR-5789) Add min/max modifiers to Atomic Updates
[ https://issues.apache.org/jira/browse/SOLR-5789?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Nim Lhûg updated SOLR-5789: --- Description: The Atomic Updates feature currently supports add/inc/set. A min/max modifier would allow for conditional updates: update if the new value is smaller/greater than the current value. This is much more convenient than fetching the document, comparing the values, and then sending an update. The patch seems to work, but probably requires more testing. Pull request: https://github.com/apache/lucene-solr/pull/39 was: The Atomic Updates feature currently supports add/inc/set. A min/max modifier would allow for conditional updates: update if the new value is smaller/greater than the current value. This is much more convenient than fetching the document, comparing the values, and then sending an update. The patch seems to work, but probably requires more testing. Note: will add a link to the pull request in a minute. Add min/max modifiers to Atomic Updates --- Key: SOLR-5789 URL: https://issues.apache.org/jira/browse/SOLR-5789 Project: Solr Issue Type: New Feature Reporter: Nim Lhûg The Atomic Updates feature currently supports add/inc/set. A min/max modifier would allow for conditional updates: update if the new value is smaller/greater than the current value. This is much more convenient than fetching the document, comparing the values, and then sending an update. The patch seems to work, but probably requires more testing. Pull request: https://github.com/apache/lucene-solr/pull/39 -- This message was sent by Atlassian JIRA (v6.1.5#6160) - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[JENKINS] Lucene-Solr-Tests-4.x-Java7 - Build # 1913 - Still Failing
Build: https://builds.apache.org/job/Lucene-Solr-Tests-4.x-Java7/1913/ All tests passed Build Log: [...truncated 28417 lines...] check-licenses: [echo] License check under: /usr/home/hudson/hudson-slave/workspace/Lucene-Solr-Tests-4.x-Java7/solr [licenses] MISSING sha1 checksum file for: /usr/home/hudson/hudson-slave/workspace/Lucene-Solr-Tests-4.x-Java7/solr/example/lib/ext/jcl-over-slf4j-1.6.6.jar [licenses] EXPECTED sha1 checksum file : /usr/home/hudson/hudson-slave/workspace/Lucene-Solr-Tests-4.x-Java7/solr/licenses/jcl-over-slf4j-1.6.6.jar.sha1 [licenses] MISSING sha1 checksum file for: /usr/home/hudson/hudson-slave/workspace/Lucene-Solr-Tests-4.x-Java7/solr/example/lib/ext/jul-to-slf4j-1.6.6.jar [licenses] EXPECTED sha1 checksum file : /usr/home/hudson/hudson-slave/workspace/Lucene-Solr-Tests-4.x-Java7/solr/licenses/jul-to-slf4j-1.6.6.jar.sha1 [licenses] MISSING sha1 checksum file for: /usr/home/hudson/hudson-slave/workspace/Lucene-Solr-Tests-4.x-Java7/solr/example/lib/ext/slf4j-api-1.6.6.jar [licenses] EXPECTED sha1 checksum file : /usr/home/hudson/hudson-slave/workspace/Lucene-Solr-Tests-4.x-Java7/solr/licenses/slf4j-api-1.6.6.jar.sha1 [...truncated 3 lines...] BUILD FAILED /usr/home/hudson/hudson-slave/workspace/Lucene-Solr-Tests-4.x-Java7/build.xml:471: The following error occurred while executing this line: /usr/home/hudson/hudson-slave/workspace/Lucene-Solr-Tests-4.x-Java7/build.xml:64: The following error occurred while executing this line: /usr/home/hudson/hudson-slave/workspace/Lucene-Solr-Tests-4.x-Java7/solr/build.xml:254: The following error occurred while executing this line: /usr/home/hudson/hudson-slave/workspace/Lucene-Solr-Tests-4.x-Java7/lucene/tools/custom-tasks.xml:62: License check failed. Check the logs. 
Total time: 108 minutes 19 seconds Build step 'Invoke Ant' marked build as failure Archiving artifacts Recording test results Email was triggered for: Failure Sending email for trigger: Failure - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
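The license check above fails because each bundled jar must have a sibling `<jar>.sha1` file containing the jar's hex SHA-1 digest. A minimal sketch of computing that digest (the Lucene/Solr build has its own ant task for this; this is only an illustration):

```java
import java.security.MessageDigest;

// Minimal sketch of the digest a ".sha1" checksum file contains:
// the lowercase hex SHA-1 of the jar's bytes.
public class Sha1Util {
    static String sha1Hex(byte[] data) {
        try {
            MessageDigest md = MessageDigest.getInstance("SHA-1");
            StringBuilder sb = new StringBuilder();
            for (byte b : md.digest(data)) {
                sb.append(String.format("%02x", b)); // two lowercase hex chars per byte
            }
            return sb.toString();
        } catch (Exception e) {
            throw new RuntimeException(e); // SHA-1 is always available on the JDK
        }
    }

    public static void main(String[] args) throws Exception {
        // In the real check the input would be the jar file's bytes.
        System.out.println(sha1Hex("abc".getBytes("UTF-8")));
    }
}
```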
Re: [GitHub] lucene-solr pull request: SOLR-5789 Add min/max modifiers to Atomi...
GitHub user codematters opened a pull request: https://github.com/apache/lucene-solr/pull/39 Discussion and tips for improvements are welcome! Thanks - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Created] (SOLR-5790) SolrException: Unknown document router '{name=compositeId}'.
Günther Ruck created SOLR-5790: -- Summary: SolrException: Unknown document router '{name=compositeId}'. Key: SOLR-5790 URL: https://issues.apache.org/jira/browse/SOLR-5790 Project: Solr Issue Type: Bug Components: SolrCloud Affects Versions: 4.6.1 Environment: Windows 7 64 Bit Reporter: Günther Ruck Priority: Minor I tried to use the CloudSolrServer class of the SolrJ API. SolrJ and Solr server both in version 4.6.1. {{serverCloud = new CloudSolrServer(zkHost);}} My JUnit test starts with a deleteByQuery. In DocRouter.java:46 a SolrException is thrown because {{routerMap.get(routerSpec);}} finds no entry. _Hints:_ routerSpec is an instance of LinkedHashMap<K,V> with one entry (key: name, value: compositeId). routerMap is a HashMap<K,V> holding 4 entries; in particular, key compositeId has value org.apache.solr.common.cloud.CompositeIdRouter. Probably there is a type mismatch in the routerMap.get call. -- This message was sent by Atlassian JIRA (v6.1.5#6160) - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
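The lookup mismatch the reporter describes can be reproduced in plain Java: a registry keyed by String can never match when queried with a whole spec map as the key, because `Map.get` accepts any `Object`. A self-contained sketch (names are illustrative, not Solr's code):

```java
import java.util.HashMap;
import java.util.LinkedHashMap;
import java.util.Map;

// Reproduction sketch of the SOLR-5790 symptom: the router registry is keyed
// by String ("compositeId", ...), but the lookup uses the whole router-spec
// map ({name=compositeId}) as the key, so get() can never match.
public class RouterLookupSketch {
    static final Map<String, String> ROUTER_MAP = new HashMap<>();
    static {
        ROUTER_MAP.put("compositeId", "CompositeIdRouter");
    }

    // Buggy lookup: compiles (Map.get takes Object) but never matches.
    static String buggyLookup(Map<String, String> routerSpec) {
        return ROUTER_MAP.get(routerSpec);
    }

    // Fixed lookup: use the "name" entry of the spec as the key.
    static String fixedLookup(Map<String, String> routerSpec) {
        return ROUTER_MAP.get(routerSpec.get("name"));
    }

    public static void main(String[] args) {
        Map<String, String> spec = new LinkedHashMap<>();
        spec.put("name", "compositeId");
        System.out.println(buggyLookup(spec)); // null: wrong key type
        System.out.println(fixedLookup(spec)); // CompositeIdRouter
    }
}
```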
[jira] [Commented] (SOLR-5781) Make the Collections API timeout configurable.
[ https://issues.apache.org/jira/browse/SOLR-5781?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13914716#comment-13914716 ] Noble Paul commented on SOLR-5781: -- At least it should be overridable on a per-call basis Make the Collections API timeout configurable. -- Key: SOLR-5781 URL: https://issues.apache.org/jira/browse/SOLR-5781 Project: Solr Issue Type: Improvement Components: SolrCloud Reporter: Mark Miller Fix For: 4.8, 5.0 This would also help with tests - nightlies can be quite intensive and need a very high timeout. -- This message was sent by Atlassian JIRA (v6.1.5#6160) - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Commented] (LUCENE-5438) add near-real-time replication
[ https://issues.apache.org/jira/browse/LUCENE-5438?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13914721#comment-13914721 ] ASF subversion and git services commented on LUCENE-5438: - Commit 1572653 from [~mikemccand] in branch 'dev/branches/lucene5438' [ https://svn.apache.org/r1572653 ] LUCENE-5438: commit current [broken] state add near-real-time replication -- Key: LUCENE-5438 URL: https://issues.apache.org/jira/browse/LUCENE-5438 Project: Lucene - Core Issue Type: Improvement Components: modules/replicator Reporter: Michael McCandless Assignee: Michael McCandless Fix For: 4.7, 5.0 Attachments: LUCENE-5438.patch, LUCENE-5438.patch Lucene's replication module makes it easy to incrementally sync index changes from a master index to any number of replicas, and it handles/abstracts all the underlying complexity of holding a time-expiring snapshot, finding which files need copying, syncing more than one index (e.g., taxo + index), etc. But today you must first commit on the master, and then again the replica's copied files are fsync'd, because the code operates on commit points. But this isn't technically necessary, and it mixes up durability and fast turnaround time. Long ago we added near-real-time readers to Lucene, for the same reason: you shouldn't have to commit just to see the new index changes. I think we should do the same for replication: allow the new segments to be copied out to replica(s), and new NRT readers to be opened, to fully decouple committing from visibility. This way apps can then separately choose when to replicate (for freshness), and when to commit (for durability). I think for some apps this could be a compelling alternative to the re-index all documents on each shard approach that Solr Cloud / ElasticSearch implement today, and it may also mean that the transaction log can remain external to / above the cluster. 
-- This message was sent by Atlassian JIRA (v6.1.5#6160) - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Commented] (LUCENE-5438) add near-real-time replication
[ https://issues.apache.org/jira/browse/LUCENE-5438?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13914724#comment-13914724 ] Michael McCandless commented on LUCENE-5438: I've committed my current [broken] state here, but I'm gonna moth ball this for now. I had made the test case more evil, by adding randomly shutting down a master and moving it to another node (promoting a replica to master). It turns out this is very hard to do properly, because in this case, file names can be re-used (Lucene is no longer write-once) and detecting that is tricky, unless we can rely on some external global reliable storage (e.g. something stored in Zookeeper maybe) to record the last segments gen / segment name that was written on any node ... add near-real-time replication -- Key: LUCENE-5438 URL: https://issues.apache.org/jira/browse/LUCENE-5438 Project: Lucene - Core Issue Type: Improvement Components: modules/replicator Reporter: Michael McCandless Assignee: Michael McCandless Fix For: 4.7, 5.0 Attachments: LUCENE-5438.patch, LUCENE-5438.patch Lucene's replication module makes it easy to incrementally sync index changes from a master index to any number of replicas, and it handles/abstracts all the underlying complexity of holding a time-expiring snapshot, finding which files need copying, syncing more than one index (e.g., taxo + index), etc. But today you must first commit on the master, and then again the replica's copied files are fsync'd, because the code operates on commit points. But this isn't technically necessary, and it mixes up durability and fast turnaround time. Long ago we added near-real-time readers to Lucene, for the same reason: you shouldn't have to commit just to see the new index changes. I think we should do the same for replication: allow the new segments to be copied out to replica(s), and new NRT readers to be opened, to fully decouple committing from visibility. 
This way apps can then separately choose when to replicate (for freshness), and when to commit (for durability). I think for some apps this could be a compelling alternative to the re-index all documents on each shard approach that Solr Cloud / ElasticSearch implement today, and it may also mean that the transaction log can remain external to / above the cluster. -- This message was sent by Atlassian JIRA (v6.1.5#6160) - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Commented] (SOLR-5781) Make the Collections API timeout configurable.
[ https://issues.apache.org/jira/browse/SOLR-5781?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13914727#comment-13914727 ] Mark Miller commented on SOLR-5781: --- That doesn't concern me much for this issue - my motivation is easy adjustment for tests - it's just kind of a side effect that it will also benefit users. If someone wants to make it available per call for users as well, that's fine with me. Though it's not likely they are going to know how they should set it depending on the call they are making. Someone might argue that it's almost just adding confusion to the API more than it helps really. I wouldn't argue though. Make the Collections API timeout configurable. -- Key: SOLR-5781 URL: https://issues.apache.org/jira/browse/SOLR-5781 Project: Solr Issue Type: Improvement Components: SolrCloud Reporter: Mark Miller Fix For: 4.8, 5.0 This would also help with tests - nightlies can be quite intensive and need a very high timeout.
[jira] [Commented] (LUCENE-5471) Classloader issues when running Lucene under a java SecurityManager
[ https://issues.apache.org/jira/browse/LUCENE-5471?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13914731#comment-13914731 ] Rick Hillegas commented on LUCENE-5471: --- Thanks for the help and the discussion so far, Hoss and Uwe. Attaching a second rev of the SecureLucene test program. This version pares back the permissions in order to expose the minimal attack surface which I can configure by myself. Here are the minimal permissions which the test program grants in order to run successfully under a Java Security Manager: {noformat} // permissions granted to Lucene grant codeBase file:/Users/rh161140/derby/derby-590/trunk/tools/java/lucene-core-4.5.0.jar { // permissions for file access, write access only to sandbox: permission java.io.FilePermission ALL FILES, read; permission java.io.FilePermission /Users/rh161140/derby/derby-590/luceneTest, read,write,delete; permission java.io.FilePermission /Users/rh161140/derby/derby-590/luceneTest/-, read,write,delete; // Basic permissions needed for Lucene to work: permission java.util.PropertyPermission user.dir, read; permission java.util.PropertyPermission sun.arch.data.model, read; permission java.lang.reflect.ReflectPermission *; permission java.lang.RuntimePermission *; }; // permissions granted to the application grant codeBase file:/Users/rh161140/src/ { // permissions for file access, write access only to sandbox: permission java.io.FilePermission ALL FILES, read; permission java.io.FilePermission /Users/rh161140/derby/derby-590/luceneTest, read,write; permission java.io.FilePermission /Users/rh161140/derby/derby-590/luceneTest/-, read,write,delete; // Basic permissions needed for Lucene to work: permission java.util.PropertyPermission user.dir, read; permission java.util.PropertyPermission sun.arch.data.model, read; }; {noformat} I have some follow on comments and questions: 1) Is it really necessary to grant Lucene every RuntimePermission and the privilege to read every 
file in the file system? Maybe these grants can be tightened. 2) I don't understand why the calling, application code needs to be granted any permissions. Maybe some more privilege blocks could be added to the Lucene code? In particular, it seems a shame that the application has to be granted the privilege to read every file in the file system. 3) Most of the application permissions are self-revealing. That is, if I omit one of them, then I get an exception telling me that the permission needs to be granted. However, that is not the case for the first permission granted to the application... permission java.io.FilePermission ALL FILES, read; ...Without that permission, I get the original puzzling exception: Caused by: java.lang.IllegalArgumentException: A SPI class of type org.apache.lucene.codecs.Codec..., which doesn't really tell me what the problem is. Maybe the wording of that exception could be improved so that the user can be told that one of its root causes is a failure to grant the application and Lucene read access to every file in the file system. Thanks, -Rick Classloader issues when running Lucene under a java SecurityManager --- Key: LUCENE-5471 URL: https://issues.apache.org/jira/browse/LUCENE-5471 Project: Lucene - Core Issue Type: Bug Affects Versions: 4.5 Reporter: Rick Hillegas Attachments: SecureLucene.java I see the following error when I run Lucene 4.5.0 under a java SecurityManager. I will attach a test program which shows this problem. The program works fine when a SecurityManager is not installed. But the program fails when I install a SecurityManager. Even more puzzling, the program works if I first run it without a SecurityManager, then install a SecurityManager, then re-run the program, all within the lifetime of a single JVM. 
I would appreciate advice about how to work around this problem: Exception in thread main java.lang.ExceptionInInitializerError at org.apache.lucene.index.LiveIndexWriterConfig.init(LiveIndexWriterConfig.java:122) at org.apache.lucene.index.IndexWriterConfig.init(IndexWriterConfig.java:165) at SecureLucene$1.run(SecureLucene.java:129) at SecureLucene$1.run(SecureLucene.java:122) at java.security.AccessController.doPrivileged(Native Method) at SecureLucene.getIndexWriter(SecureLucene.java:120) at SecureLucene.runTest(SecureLucene.java:72) at SecureLucene.main(SecureLucene.java:52) Caused by: java.lang.IllegalArgumentException: A SPI class of type org.apache.lucene.codecs.Codec with name 'Lucene45' does not exist. You need to add the corresponding JAR file supporting this SPI to your classpath. The current classpath supports the following names: []
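Rick's question 2 above (why the calling application needs grants at all) comes down to how the security manager walks the call stack. A minimal stdlib-only sketch, not Lucene code: if library code wraps a sensitive call in AccessController.doPrivileged, the permission check stops at the library's protection domain, so the application itself does not need the grant. The class and method names here are illustrative assumptions.

```java
import java.security.AccessController;
import java.security.PrivilegedAction;

// Illustrative sketch only (not Lucene code): a doPrivileged block makes
// the library, rather than every caller on the stack, responsible for a
// permission. More such blocks inside Lucene could shrink the grants the
// application needs in its policy file.
public class PrivilegedReadSketch {

    // The "library" reads the system property under its own authority.
    static String readUserDir() {
        return AccessController.doPrivileged(
                (PrivilegedAction<String>) () -> System.getProperty("user.dir"));
    }

    public static void main(String[] args) {
        // Without a security manager installed this simply runs the action.
        System.out.println(readUserDir() != null);
    }
}
```

With a security manager installed, only the code source containing `readUserDir` would need `PropertyPermission user.dir, read` in the policy file.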
[jira] [Updated] (SOLR-5753) eliminate blue Topics covered in this section box from ref guide
[ https://issues.apache.org/jira/browse/SOLR-5753?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Cassandra Targett updated SOLR-5753: Component/s: documentation eliminate blue Topics covered in this section box from ref guide -- Key: SOLR-5753 URL: https://issues.apache.org/jira/browse/SOLR-5753 Project: Solr Issue Type: Task Components: documentation Reporter: Hoss Man Assignee: Cassandra Targett Fix For: 4.8 a bunch of pages in the ref guide have a blue box at the top right of the page that says Topics covered in this section and has links down to the major anchors on the page... https://cwiki.apache.org/confluence/dosearchsite.action?queryString=%22Topics+covered+in+this+section%22where=solrtype=lastModified=contributor=contributorUsername= ...this blue box looks great on the webpage, but doesn't look very good in the exported PDF. we should consider eliminating it, or reformatting it, or replacing it with something that makes more sense in the context of the PDF -- This message was sent by Atlassian JIRA (v6.1.5#6160) - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
Re: VOTE: Release apache-solr-ref-guide-4.7.pdf (RC1)
On Thu, Feb 27, 2014 at 4:11 AM, Steve Rowe sar...@gmail.com wrote: I noticed that a couple of Topics covered in this section boxes are too narrow to allow their content to be legible - the ones on pages 251 and 300 look really bad, and some others are only marginally legible. I finally found a fix for this problem (which is a more egregious example of the same issue Hoss filed as SOLR-5753) on this Atlassian issue: https://jira.atlassian.com/browse/CONF-14758. However it states that it's for Confluence 5.3+, so I don't know if it will work with 5.0.3 which is the version CWIKI is on. Maybe worth a try? I've posted the possible solution to SOLR-5753 so maybe if you put it in the PDF CSS we can see how it works.
[jira] [Commented] (SOLR-5753) eliminate blue Topics covered in this section box from ref guide
[ https://issues.apache.org/jira/browse/SOLR-5753?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13914737#comment-13914737 ] Cassandra Targett commented on SOLR-5753: - According to this issue: https://jira.atlassian.com/browse/CONF-14758 (in the description), a fix for this problem should be to add the below to the CSS for the PDF export. However, the same issue says it's for Confluence 5.3+, so may not work with the current CWIKI version, but is maybe worth a try for 4.7? {code} .sectionMacro .columnMacro { border: none; padding: 0; } .columnMacro { display: table-cell; vertical-align: top; } {code} eliminate blue Topics covered in this section box from ref guide -- Key: SOLR-5753 URL: https://issues.apache.org/jira/browse/SOLR-5753 Project: Solr Issue Type: Task Components: documentation Reporter: Hoss Man Assignee: Cassandra Targett Fix For: 4.8 a bunch of pages in the ref guide have a blue box at the top right of the page that says Topics covered in this section and has links down to the major anchors on the page... https://cwiki.apache.org/confluence/dosearchsite.action?queryString=%22Topics+covered+in+this+section%22where=solrtype=lastModified=contributor=contributorUsername= ...this blue box looks great on the webpage, but doesn't look very good in the exported PDF. we should consider eliminating it, or reformatting it, or replacing it with something that makes more sense in the context of the PDF -- This message was sent by Atlassian JIRA (v6.1.5#6160) - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Commented] (LUCENE-5468) Hunspell very high memory use when loading dictionary
[ https://issues.apache.org/jira/browse/LUCENE-5468?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13914755#comment-13914755 ] ASF subversion and git services commented on LUCENE-5468: - Commit 1572660 from [~rcmuir] in branch 'dev/branches/lucene5468' [ https://svn.apache.org/r1572660 ] LUCENE-5468: encode affix data as 8 bytes per affix, before cutting over to FST Hunspell very high memory use when loading dictionary - Key: LUCENE-5468 URL: https://issues.apache.org/jira/browse/LUCENE-5468 Project: Lucene - Core Issue Type: Bug Affects Versions: 3.5 Reporter: Maciej Lisiewski Priority: Minor Attachments: patch.txt Hunspell stemmer requires gigantic (for the task) amounts of memory to load dictionary/rules files. For example loading a 4.5 MB polish dictionary (with empty index!) will cause whole core to crash with various out of memory errors unless you set max heap size close to 2GB or more. By comparison Stempel using the same dictionary file works just fine with 1/8 of that (and possibly lower values as well). Sample error log entries: http://pastebin.com/fSrdd5W1 http://pastebin.com/Lmi0re7Z
[jira] [Updated] (SOLR-5753) eliminate blue Topics covered in this section box from ref guide
[ https://issues.apache.org/jira/browse/SOLR-5753?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Steve Rowe updated SOLR-5753: - Attachment: solr-ResultClustering-270214-1714-9084.pdf Cassandra, I added the CSS snippet you quoted above to the PDF export stylesheet, and the result is attached for the Result Clustering page - it definitely looks better to me. eliminate blue Topics covered in this section box from ref guide -- Key: SOLR-5753 URL: https://issues.apache.org/jira/browse/SOLR-5753 Project: Solr Issue Type: Task Components: documentation Reporter: Hoss Man Assignee: Cassandra Targett Fix For: 4.8 Attachments: solr-ResultClustering-270214-1714-9084.pdf a bunch of pages in the ref guide have a blue box at the top right of the page that says Topics covered in this section and has links down to the major anchors on the page... https://cwiki.apache.org/confluence/dosearchsite.action?queryString=%22Topics+covered+in+this+section%22where=solrtype=lastModified=contributor=contributorUsername= ...this blue box looks great on the webpage, but doesn't look very good in the exported PDF. we should consider eliminating it, or reformatting it, or replacing it with something that makes more sense in the context of the PDF -- This message was sent by Atlassian JIRA (v6.1.5#6160) - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Commented] (SOLR-5753) eliminate blue Topics covered in this section box from ref guide
[ https://issues.apache.org/jira/browse/SOLR-5753?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13914758#comment-13914758 ] Steve Rowe commented on SOLR-5753: -- I noticed that the attached Result Clustering page says Topics covered on this page, which is confusing in the PDF - probably should change this (and others like it) to say section instead of page eliminate blue Topics covered in this section box from ref guide -- Key: SOLR-5753 URL: https://issues.apache.org/jira/browse/SOLR-5753 Project: Solr Issue Type: Task Components: documentation Reporter: Hoss Man Assignee: Cassandra Targett Fix For: 4.8 Attachments: solr-ResultClustering-270214-1714-9084.pdf a bunch of pages in the ref guide have a blue box at the top right of the page that says Topics covered in this section and has links down to the major anchors on the page... https://cwiki.apache.org/confluence/dosearchsite.action?queryString=%22Topics+covered+in+this+section%22where=solrtype=lastModified=contributor=contributorUsername= ...this blue box looks great on the webpage, but doesn't look very good in the exported PDF. we should consider eliminating it, or reformatting it, or replacing it with something that makes more sense in the context of the PDF -- This message was sent by Atlassian JIRA (v6.1.5#6160) - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Assigned] (SOLR-5616) Make grouping code use response builder needDocList
[ https://issues.apache.org/jira/browse/SOLR-5616?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Erick Erickson reassigned SOLR-5616: Assignee: Erick Erickson Make grouping code use response builder needDocList --- Key: SOLR-5616 URL: https://issues.apache.org/jira/browse/SOLR-5616 Project: Solr Issue Type: Bug Reporter: Steven Bower Assignee: Erick Erickson Attachments: SOLR-5616.patch Right now the grouping code does this to check if it needs to generate a docList for grouped results: {code} if (rb.doHighlights || rb.isDebug() || params.getBool(MoreLikeThisParams.MLT, false) ){ ... } {code} this is ugly because any new component that needs a docList from grouped results will need to modify QueryComponent to add a check to this if. Ideally this should just use the rb.isNeedDocList() flag... Incidentally, this boolean is never really used, as for non-grouped results the docList always gets generated.
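The flag-based alternative described in the issue can be sketched with plain JDK types. This is a hypothetical simplification, not the real Solr API: `ResponseBuilder` and `prepare` here are stand-ins, and the point is only that each component declares its own need instead of QueryComponent enumerating components in an if-statement.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical sketch of the rb.isNeedDocList() approach: components that
// need a doc list set a single flag during prepare, so the query component
// checks one flag instead of hard-coding a test per component.
public class NeedDocListSketch {

    static class ResponseBuilder {
        boolean needDocList;           // stand-in for the real flag
    }

    interface Component {
        void prepare(ResponseBuilder rb);
    }

    public static void main(String[] args) {
        ResponseBuilder rb = new ResponseBuilder();
        List<Component> components = Arrays.asList(
                r -> r.needDocList = true,   // e.g. a highlighting component
                r -> { }                     // a component that doesn't need it
        );
        for (Component c : components) {
            c.prepare(rb);
        }
        // The grouping code would now only consult the flag.
        System.out.println(rb.needDocList);
    }
}
```

Adding a new component that needs a doc list then requires no change to the query component at all.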
[jira] [Updated] (SOLR-5753) eliminate blue Topics covered in this section box from ref guide
[ https://issues.apache.org/jira/browse/SOLR-5753?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Steve Rowe updated SOLR-5753: - Attachment: solr-RequestHandlersandSearchComponentsinSolrConfig-270214-1724-9098.pdf Attaching the Request Handlers and Search Components in SolrConfig PDF export using the modified PDF export CSS - this one is way better, almost looks good! This is the one from page 300 in the Ref Guide RC1 PDF. eliminate blue Topics covered in this section box from ref guide -- Key: SOLR-5753 URL: https://issues.apache.org/jira/browse/SOLR-5753 Project: Solr Issue Type: Task Components: documentation Reporter: Hoss Man Assignee: Cassandra Targett Fix For: 4.8 Attachments: solr-RequestHandlersandSearchComponentsinSolrConfig-270214-1724-9098.pdf, solr-ResultClustering-270214-1714-9084.pdf a bunch of pages in the ref guide have a blue box at the top right of the page that says Topics covered in this section and has links down to the major anchors on the page... https://cwiki.apache.org/confluence/dosearchsite.action?queryString=%22Topics+covered+in+this+section%22where=solrtype=lastModified=contributor=contributorUsername= ...this blue box looks great on the webpage, but doesn't look very good in the exported PDF. we should consider eliminating it, or reformatting it, or replacing it with something that makes more sense in the context of the PDF -- This message was sent by Atlassian JIRA (v6.1.5#6160) - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Commented] (SOLR-5758) need ref guide doc on building indexes with mapreduce (morphlines-cell contrib)
[ https://issues.apache.org/jira/browse/SOLR-5758?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13914760#comment-13914760 ] Mark Miller commented on SOLR-5758: --- That output is affected by SOLR-5782 - I'll make another dump shortly. need ref guide doc on building indexes with mapreduce (morphlines-cell contrib) --- Key: SOLR-5758 URL: https://issues.apache.org/jira/browse/SOLR-5758 Project: Solr Issue Type: Task Components: documentation Reporter: Hoss Man Assignee: Mark Miller Fix For: 4.8 This is marked experimental for 4.7, but we should have a section on it in the ref guide in 4.8
[jira] [Comment Edited] (SOLR-5753) eliminate blue Topics covered in this section box from ref guide
[ https://issues.apache.org/jira/browse/SOLR-5753?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13914754#comment-13914754 ] Steve Rowe edited comment on SOLR-5753 at 2/27/14 5:20 PM: --- Cassandra, I added the CSS snippet you quoted above to the PDF export stylesheet, and the result is attached for the Result Clustering page (corresponding to page 251 in the Solr Ref Guide RC1 PDF) - it definitely looks better to me. was (Author: steve_rowe): Cassandra, I added the CSS snippet you quoted above to the PDF export stylesheet, and the result is attached for the Result Clustering page - it definitely looks better to me. eliminate blue Topics covered in this section box from ref guide -- Key: SOLR-5753 URL: https://issues.apache.org/jira/browse/SOLR-5753 Project: Solr Issue Type: Task Components: documentation Reporter: Hoss Man Assignee: Cassandra Targett Fix For: 4.8 Attachments: solr-ResultClustering-270214-1714-9084.pdf a bunch of pages in the ref guide have a blue box at the top right of the page that says Topics covered in this section and has links down to the major anchors on the page... https://cwiki.apache.org/confluence/dosearchsite.action?queryString=%22Topics+covered+in+this+section%22where=solrtype=lastModified=contributor=contributorUsername= ...this blue box looks great on the webpage, but doesn't look very good in the exported PDF. we should consider eliminating it, or reformatting it, or replacing it with something that makes more sense in the context of the PDF -- This message was sent by Atlassian JIRA (v6.1.5#6160) - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Commented] (SOLR-5787) Get spellcheck frequency relatively to current query
[ https://issues.apache.org/jira/browse/SOLR-5787?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13914770#comment-13914770 ] Hakim commented on SOLR-5787: - Because the frequency returned with each suggested word counts the occurrences in the WHOLE index instead of counting only those in the documents satisfying the current Lucene query. Get spellcheck frequency relatively to current query Key: SOLR-5787 URL: https://issues.apache.org/jira/browse/SOLR-5787 Project: Solr Issue Type: Improvement Components: spellchecker Affects Versions: 4.6 Environment: Solr deployed on Jetty 9 Servlet container Reporter: Hakim Priority: Minor Labels: features, newbie I guess that this functionality isn't implemented yet. I'll begin with an example to explain what I'm requesting: I have a Lucene query that gets articles satisfying a certain query. With this same command, I'm getting at the same time suggestions if this query doesn't return any article (so far, nothing unusual). The frequency (count) associated with these suggestions is relative to the whole index (it counts all occurrences of the suggestion in the whole index). What I want is for it to count only suggestion occurrences satisfying the current Lucene query. P.S: I'm using Solr's spellcheck component (solr.DirectSolrSpellChecker).
[jira] [Commented] (LUCENE-5468) Hunspell very high memory use when loading dictionary
[ https://issues.apache.org/jira/browse/LUCENE-5468?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13914795#comment-13914795 ] ASF subversion and git services commented on LUCENE-5468: - Commit 1572666 from [~rcmuir] in branch 'dev/branches/lucene5468' [ https://svn.apache.org/r1572666 ] LUCENE-5468: convert affixes to FST Hunspell very high memory use when loading dictionary - Key: LUCENE-5468 URL: https://issues.apache.org/jira/browse/LUCENE-5468 Project: Lucene - Core Issue Type: Bug Affects Versions: 3.5 Reporter: Maciej Lisiewski Priority: Minor Attachments: patch.txt Hunspell stemmer requires gigantic (for the task) amounts of memory to load dictionary/rules files. For example loading a 4.5 MB polish dictionary (with empty index!) will cause whole core to crash with various out of memory errors unless you set max heap size close to 2GB or more. By comparison Stempel using the same dictionary file works just fine with 1/8 of that (and possibly lower values as well). Sample error log entries: http://pastebin.com/fSrdd5W1 http://pastebin.com/Lmi0re7Z
[jira] [Commented] (LUCENE-5468) Hunspell very high memory use when loading dictionary
[ https://issues.apache.org/jira/browse/LUCENE-5468?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13914799#comment-13914799 ] Robert Muir commented on LUCENE-5468: - I am finished compressing for now. I think its pretty reasonable across all the languages. I will cleanup and try to add back the multiple dictionary/ignore case stuff and clean up some other things. ||dict||old RAM||new RAM|| |af_ZA.zip|18 MB|917.1 KB| |ak_GH.zip|1.5 MB|103.2 KB| |bg_BG.zip|FAIL|465.7 KB| |ca_ANY.zip|28.9 MB|675.4 KB| |ca_ES.zip|15.1 MB|639.8 KB| |cop_EG.zip|2.1 MB|144.5 KB| |cs_CZ.zip|50.4 MB|1.5 MB| |cy_GB.zip|FAIL|627.4 KB| |da_DK.zip|FAIL|669.8 KB| |de_AT.zip|1.3 MB|123.9 KB| |de_CH.zip|12.6 MB|725.4 KB| |de_DE.zip|12.6 MB|726 KB| |de_DE_comb.zip|102.2 MB|4.2 MB| |de_DE_frami.zip|20.9 MB|1023.5 KB| |de_DE_neu.zip|101.5 MB|4.2 MB| |el_GR.zip|74.3 MB|1 MB| |en_AU.zip|8.1 MB|521 KB| |en_CA.zip|9.8 MB|450.5 KB| |en_GB-oed.zip|8.2 MB|526.6 KB| |en_GB.zip|8.3 MB|527.3 KB| |en_NZ.zip|8.4 MB|532.4 KB| |eo.zip|4.9 MB|310.5 KB| |eo_EO.zip|4.9 MB|310.5 KB| |es_AR.zip|14.8 MB|734.9 KB| |es_BO.zip|14.8 MB|735 KB| |es_CL.zip|14.7 MB|734.9 KB| |es_CO.zip|14.3 MB|722.1 KB| |es_CR.zip|14.8 MB|733.9 KB| |es_CU.zip|14.7 MB|732.8 KB| |es_DO.zip|14.7 MB|731.9 KB| |es_EC.zip|14.8 MB|733.5 KB| |es_ES.zip|15.1 MB|743 KB| |es_GT.zip|14.8 MB|734.5 KB| |es_HN.zip|14.8 MB|735.2 KB| |es_MX.zip|14.3 MB|723.8 KB| |es_NEW.zip|15.5 MB|768.5 KB| |es_NI.zip|14.8 MB|734.5 KB| |es_PA.zip|14.8 MB|733.8 KB| |es_PE.zip|14.2 MB|721.3 KB| |es_PR.zip|14.7 MB|732.4 KB| |es_PY.zip|14.8 MB|734.1 KB| |es_SV.zip|14.8 MB|733.6 KB| |es_UY.zip|14.8 MB|736.9 KB| |es_VE.zip|14.3 MB|722.7 KB| |et_EE.zip|53.6 MB|473.6 KB| |fo_FO.zip|18.6 MB|517.9 KB| |fr_FR-1990_1-3-2.zip|14 MB|526.7 KB| |fr_FR-classique_1-3-2.zip|14 MB|539.2 KB| |fr_FR_1-3-2.zip|14.5 MB|550.4 KB| |fy_NL.zip|4.2 MB|265.6 KB| |ga_IE.zip|14 MB|460.6 KB| |gd_GB.zip|2.7 MB|143.1 KB| |gl_ES.zip|FAIL|479.4 KB| |gsc_FR.zip|FAIL|1.3 MB| 
|gu_IN.zip|20.3 MB|947 KB| |he_IL.zip|53.3 MB|539.2 KB| |hi_IN.zip|2.7 MB|169 KB| |hil_PH.zip|3.4 MB|197 KB| |hr_HR.zip|29.7 MB|573 KB| |hu_HU.zip|FAIL|1.2 MB| |hu_HU_comb.zip|FAIL|5.4 MB| |ia.zip|4.9 MB|222.9 KB| |id_ID.zip|3.9 MB|226.3 KB| |it_IT.zip|15.3 MB|612.9 KB| |ku_TR.zip|1.6 MB|118.7 KB| |la.zip|5.1 MB|199.3 KB| |lt_LT.zip|15 MB|682.5 KB| |lv_LV.zip|36.3 MB|763.9 KB| |mg_MG.zip|2.9 MB|163.8 KB| |mi_NZ.zip|FAIL|191.4 KB| |mk_MK.zip|FAIL|469.1 KB| |mos_BF.zip|13.3 MB|242.2 KB| |mr_IN.zip|FAIL|147.7 KB| |ms_MY.zip|4.1 MB|226.9 KB| |nb_NO.zip|22.9 MB|1.2 MB| |ne_NP.zip|5.5 MB|328.1 KB| |nl_NL.zip|22.9 MB|1.1 MB| |nl_med.zip|1.2 MB|92.3 KB| |nn_NO.zip|16.5 MB|914 KB| |nr_ZA.zip|3.1 MB|203.3 KB| |ns_ZA.zip|1.7 MB|118 KB| |ny_MW.zip|FAIL|101.8 KB| |oc_FR.zip|9.1 MB|401.5 KB| |pl_PL.zip|43.9 MB|1.7 MB| |pt_BR.zip|FAIL|2.1 MB| |pt_PT.zip|5.8 MB|379.4 KB| |ro_RO.zip|5.1 MB|256.3 KB| |ru_RU.zip|21.7 MB|882 KB| |ru_RU_ye.zip|43.7 MB|1.5 MB| |ru_RU_yo.zip|21.7 MB|897.3 KB| |rw_RW.zip|1.6 MB|102.3 KB| |sk_SK.zip|25.1 MB|1.2 MB| |sl_SI.zip|38.3 MB|604 KB|
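The scale of the savings in the table above can be sanity-checked with a quick back-of-the-envelope calculation. The values are copied from the af_ZA row of the table; this is purely illustrative and not part of the patch.

```java
// Rough ratio for the af_ZA row from the table above:
// old RAM 18 MB vs. new RAM 917.1 KB after the FST conversion.
public class RamRatio {
    public static void main(String[] args) {
        double oldKb = 18 * 1024.0; // 18 MB expressed in KB
        double newKb = 917.1;       // new footprint in KB
        System.out.println(Math.round(oldKb / newKb));
    }
}
```

That works out to roughly a 20x reduction for that dictionary, and most other rows in the table show reductions of the same order.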
Re: VOTE: Release apache-solr-ref-guide-4.7.pdf (RC1)
I posted some problematic pages exported to PDF using the revised CSS on SOLR-5753 - looks better to me. Nevertheless, I vote +1 for the RC1 Solr Reference Guide. The issues I brought up are cosmetic ones, and will be addressed in the next version. Let’s get it out the door! Steve On Feb 27, 2014, at 12:06 PM, Cassandra Targett casstarg...@gmail.com wrote: On Thu, Feb 27, 2014 at 4:11 AM, Steve Rowe sar...@gmail.com wrote: I noticed that a couple of “Topics covered in this section” boxes are too narrow to allow their content to be legible - the ones on pages 251 and 300 look really bad, and some others are only marginally legible. I finally found a fix for this problem (which is a more egregious example of the same issue Hoss filed as SOLR-5753) on this Atlassian issue: https://jira.atlassian.com/browse/CONF-14758. However it states that it's for Confluence 5.3+, so I don't know if it will work with 5.0.3 which is the version CWIKI is on. Maybe worth a try? I've posted the possible solution to SOLR-5753 so maybe if you put it in the PDF CSS we can see how it works. - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Created] (SOLR-5791) DistributedQueryElevationComponentTest routinely fails on J9
Hoss Man created SOLR-5791: -- Summary: DistributedQueryElevationComponentTest routinely fails on J9 Key: SOLR-5791 URL: https://issues.apache.org/jira/browse/SOLR-5791 Project: Solr Issue Type: Bug Reporter: Hoss Man Either there is a bug in how the params are handled that only manifests itself in J9, or the test needs to be fixed to not expect the params in a certain order {noformat} REGRESSION: org.apache.solr.handler.component.DistributedQueryElevationComponentTest.testDistribSearch Error Message: .responseHeader.params.fl!=version (unordered or missing) Stack Trace: junit.framework.AssertionFailedError: .responseHeader.params.fl!=version (unordered or missing) at __randomizedtesting.SeedInfo.seed([C6763A182C2489BA:4790B4005B7BE986]:0) at junit.framework.Assert.fail(Assert.java:50) at org.apache.solr.BaseDistributedSearchTestCase.compareSolrResponses(BaseDistributedSearchTestCase.java:843) at org.apache.solr.BaseDistributedSearchTestCase.compareResponses(BaseDistributedSearchTestCase.java:862) at org.apache.solr.BaseDistributedSearchTestCase.query(BaseDistributedSearchTestCase.java:565) at org.apache.solr.BaseDistributedSearchTestCase.query(BaseDistributedSearchTestCase.java:545) at org.apache.solr.BaseDistributedSearchTestCase.query(BaseDistributedSearchTestCase.java:524) at org.apache.solr.handler.component.DistributedQueryElevationComponentTest.doTest(DistributedQueryElevationComponentTest.java:81) at org.apache.solr.BaseDistributedSearchTestCase.testDistribSearch(BaseDistributedSearchTestCase.java:870) {noformat}
[jira] [Created] (SOLR-5792) TermVectorComponentDistributedTest routinely fails on J9
Hoss Man created SOLR-5792: -- Summary: TermVectorComponentDistributedTest routinely fails on J9 Key: SOLR-5792 URL: https://issues.apache.org/jira/browse/SOLR-5792 Project: Solr Issue Type: Bug Reporter: Hoss Man Perhaps the code is using a Map when it should be using a NamedList? Or perhaps the test should be configured not to care about the order ... is the order meaningful in this part of the output? {noformat} REGRESSION: org.apache.solr.handler.component.TermVectorComponentDistributedTest.testDistribSearch Error Message: .termVectors.0.test_basictv!=test_postv (unordered or missing) Stack Trace: junit.framework.AssertionFailedError: .termVectors.0.test_basictv!=test_postv (unordered or missing) at __randomizedtesting.SeedInfo.seed([C6763A182C2489BA:4790B4005B7BE986]:0) at junit.framework.Assert.fail(Assert.java:50) at org.apache.solr.BaseDistributedSearchTestCase.compareSolrResponses(BaseDistributedSearchTestCase.java:843) at org.apache.solr.BaseDistributedSearchTestCase.compareResponses(BaseDistributedSearchTestCase.java:862) at org.apache.solr.BaseDistributedSearchTestCase.query(BaseDistributedSearchTestCase.java:565) at org.apache.solr.BaseDistributedSearchTestCase.query(BaseDistributedSearchTestCase.java:545) at org.apache.solr.BaseDistributedSearchTestCase.query(BaseDistributedSearchTestCase.java:524) at org.apache.solr.handler.component.TermVectorComponentDistributedTest.doTest(TermVectorComponentDistributedTest.java:164) at org.apache.solr.BaseDistributedSearchTestCase.testDistribSearch(BaseDistributedSearchTestCase.java:876) {noformat}
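The "unordered or missing" failures in SOLR-5791/SOLR-5792 both come down to comparing sections of the response as ordered sequences when the order may not be meaningful. A small stdlib-only sketch (plain JDK types, not the Solr test classes; keys borrowed from the failure message for flavor) shows why an order-insensitive comparison avoids this class of failure:

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.Map;

// Sketch: the same entries inserted in a different order compare unequal
// as ordered lists, but equal under Map.equals, which ignores iteration
// order. A JVM (like J9) that changes iteration order only breaks the
// ordered comparison.
public class UnorderedCompareSketch {
    public static void main(String[] args) {
        Map<String, String> a = new LinkedHashMap<>();
        a.put("test_basictv", "x");
        a.put("test_postv", "y");

        Map<String, String> b = new LinkedHashMap<>();
        b.put("test_postv", "y");
        b.put("test_basictv", "x");

        // Ordered key views differ...
        System.out.println(new ArrayList<>(a.keySet()).equals(new ArrayList<>(b.keySet())));
        // ...but map equality is order-insensitive.
        System.out.println(a.equals(b));
    }
}
```

If the order in these response sections is not meaningful, comparing them this way (or normalizing before comparison) would make the tests JVM-independent.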
[jira] [Updated] (LUCENE-5477) add near-real-time suggest building to AnalyzingInfixSuggester
[ https://issues.apache.org/jira/browse/LUCENE-5477?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Michael McCandless updated LUCENE-5477: --- Attachment: LUCENE-5477.patch Initial patch, with new add/update/refresh methods added to AnalyzingInfixSuggester. I added a testBasicNRT and it seems to pass, but I still need to add a randomized test. I think the approach will work well: I'm just using SortingMergePolicy and EarlyTerminatingSortingCollector, and I switched to SearcherManager to pull the current searcher. add near-real-time suggest building to AnalyzingInfixSuggester -- Key: LUCENE-5477 URL: https://issues.apache.org/jira/browse/LUCENE-5477 Project: Lucene - Core Issue Type: Improvement Components: modules/spellchecker Reporter: Michael McCandless Fix For: 4.8, 5.0 Attachments: LUCENE-5477.patch Because this suggester impl. is just a Lucene index under-the-hood, it should be straightforward to enable near-real-time additions/removals of suggestions. -- This message was sent by Atlassian JIRA (v6.1.5#6160) - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Created] (SOLR-5793) SignatureUpdateProcessorFactoryTest routinely fails on J9
Hoss Man created SOLR-5793: -- Summary: SignatureUpdateProcessorFactoryTest routinely fails on J9 Key: SOLR-5793 URL: https://issues.apache.org/jira/browse/SOLR-5793 Project: Solr Issue Type: Bug Reporter: Hoss Man Two very similar looking failures pop up frequently, but not always together... {noformat} REGRESSION: org.apache.solr.update.processor.SignatureUpdateProcessorFactoryTest.testMultiThreaded Error Message: expected:1 but was:2 Stack Trace: java.lang.AssertionError: expected:1 but was:2 at __randomizedtesting.SeedInfo.seed([791041A112471F1D:18859B41FA9615EB]:0) at org.junit.Assert.fail(Assert.java:93) at org.junit.Assert.failNotEquals(Assert.java:647) at org.junit.Assert.assertEquals(Assert.java:128) at org.junit.Assert.assertEquals(Assert.java:472) at org.junit.Assert.assertEquals(Assert.java:456) at org.apache.solr.update.processor.SignatureUpdateProcessorFactoryTest.checkNumDocs(SignatureUpdateProcessorFactoryTest.java:71) at org.apache.solr.update.processor.SignatureUpdateProcessorFactoryTest.testMultiThreaded(SignatureUpdateProcessorFactoryTest.java:222) {noformat} {noformat} REGRESSION: org.apache.solr.update.processor.SignatureUpdateProcessorFactoryTest.testDupeDetection Error Message: expected:1 but was:2 Stack Trace: java.lang.AssertionError: expected:1 but was:2 at __randomizedtesting.SeedInfo.seed([16A8922439B48E61:4D9869EC3AF32D1D]:0) at org.junit.Assert.fail(Assert.java:93) at org.junit.Assert.failNotEquals(Assert.java:647) at org.junit.Assert.assertEquals(Assert.java:128) at org.junit.Assert.assertEquals(Assert.java:472) at org.junit.Assert.assertEquals(Assert.java:456) at org.apache.solr.update.processor.SignatureUpdateProcessorFactoryTest.checkNumDocs(SignatureUpdateProcessorFactoryTest.java:71) at org.apache.solr.update.processor.SignatureUpdateProcessorFactoryTest.testDupeDetection(SignatureUpdateProcessorFactoryTest.java:119) {noformat} -- This message was sent by Atlassian JIRA (v6.1.5#6160) - To unsubscribe, e-mail: 
dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
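The duplicate-count failures above ("expected:1 but was:2") come down to whether two identical documents collapse to one signature. A minimal, hypothetical sketch of signature-based de-duplication in the spirit of SignatureUpdateProcessorFactory (class and method names are illustrative, and fields are hashed by simple concatenation, a simplification of the real processor):

```java
import java.math.BigInteger;
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.util.LinkedHashMap;
import java.util.Map;

public class SignatureDedupSketch {
    // Signature -> document; adding a doc with an existing signature
    // overwrites it, so duplicates never increase the doc count.
    private final Map<String, String> index = new LinkedHashMap<>();

    static String signature(String... fields) {
        try {
            MessageDigest md5 = MessageDigest.getInstance("MD5");
            for (String f : fields) {
                md5.update(f.getBytes(StandardCharsets.UTF_8));
            }
            return new BigInteger(1, md5.digest()).toString(16);
        } catch (NoSuchAlgorithmException e) {
            throw new RuntimeException(e); // MD5 is always available
        }
    }

    void add(String... fields) {
        index.put(signature(fields), String.join("|", fields));
    }

    int numDocs() {
        return index.size();
    }

    public static void main(String[] args) {
        SignatureDedupSketch idx = new SignatureDedupSketch();
        idx.add("Hello", "world");
        idx.add("Hello", "world");   // exact duplicate, collapses
        idx.add("Goodbye", "world"); // distinct content
        System.out.println(idx.numDocs()); // prints 2
    }
}
```

Under this model the test's checkNumDocs expectation holds only if every add of a duplicate lands on the same signature; a race between two threads adding concurrently (as in testMultiThreaded) is where a non-thread-safe implementation could let both copies through.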
[jira] [Commented] (SOLR-5791) DistributedQueryElevationComponentTest routinely fails on J9
[ https://issues.apache.org/jira/browse/SOLR-5791?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13914917#comment-13914917 ] ASF subversion and git services commented on SOLR-5791: --- Commit 1572706 from hoss...@apache.org in branch 'dev/trunk' [ https://svn.apache.org/r1572706 ] SOLR-5793, SOLR-5792, SOLR-5791: disable these three tests on J9 JVM DistributedQueryElevationComponentTest routinely fails on J9 Key: SOLR-5791 URL: https://issues.apache.org/jira/browse/SOLR-5791 Project: Solr Issue Type: Bug Reporter: Hoss Man Either there is a bug in how the params are handled that only manifests itself in J9, or the test needs to be fixed to not expect the params in a certain order {noformat} REGRESSION: org.apache.solr.handler.component.DistributedQueryElevationComponentTest.testDistribSearch Error Message: .responseHeader.params.fl!=version (unordered or missing) Stack Trace: junit.framework.AssertionFailedError: .responseHeader.params.fl!=version (unordered or missing) at __randomizedtesting.SeedInfo.seed([C6763A182C2489BA:4790B4005B7BE986]:0) at junit.framework.Assert.fail(Assert.java:50) at org.apache.solr.BaseDistributedSearchTestCase.compareSolrResponses(BaseDistributedSearchTestCase.java:843) at org.apache.solr.BaseDistributedSearchTestCase.compareResponses(BaseDistributedSearchTestCase.java:862) at org.apache.solr.BaseDistributedSearchTestCase.query(BaseDistributedSearchTestCase.java:565) at org.apache.solr.BaseDistributedSearchTestCase.query(BaseDistributedSearchTestCase.java:545) at org.apache.solr.BaseDistributedSearchTestCase.query(BaseDistributedSearchTestCase.java:524) at org.apache.solr.handler.component.DistributedQueryElevationComponentTest.doTest(DistributedQueryElevationComponentTest.java:81) at org.apache.solr.BaseDistributedSearchTestCase.testDistribSearch(BaseDistributedSearchTestCase.java:870) {noformat} -- This message was sent by Atlassian JIRA (v6.1.5#6160) - To unsubscribe, e-mail:
dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Commented] (SOLR-5793) SignatureUpdateProcessorFactoryTest routinely fails on J9
[ https://issues.apache.org/jira/browse/SOLR-5793?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13914915#comment-13914915 ] ASF subversion and git services commented on SOLR-5793: --- Commit 1572706 from hoss...@apache.org in branch 'dev/trunk' [ https://svn.apache.org/r1572706 ] SOLR-5793, SOLR-5792, SOLR-5791: disable these three tests on J9 JVM SignatureUpdateProcessorFactoryTest routinely fails on J9 - Key: SOLR-5793 URL: https://issues.apache.org/jira/browse/SOLR-5793 Project: Solr Issue Type: Bug Reporter: Hoss Man Two very similar looking failures pop up frequently, but not always together... {noformat} REGRESSION: org.apache.solr.update.processor.SignatureUpdateProcessorFactoryTest.testMultiThreaded Error Message: expected:1 but was:2 Stack Trace: java.lang.AssertionError: expected:1 but was:2 at __randomizedtesting.SeedInfo.seed([791041A112471F1D:18859B41FA9615EB]:0) at org.junit.Assert.fail(Assert.java:93) at org.junit.Assert.failNotEquals(Assert.java:647) at org.junit.Assert.assertEquals(Assert.java:128) at org.junit.Assert.assertEquals(Assert.java:472) at org.junit.Assert.assertEquals(Assert.java:456) at org.apache.solr.update.processor.SignatureUpdateProcessorFactoryTest.checkNumDocs(SignatureUpdateProcessorFactoryTest.java:71) at org.apache.solr.update.processor.SignatureUpdateProcessorFactoryTest.testMultiThreaded(SignatureUpdateProcessorFactoryTest.java:222) {noformat} {noformat} REGRESSION: org.apache.solr.update.processor.SignatureUpdateProcessorFactoryTest.testDupeDetection Error Message: expected:1 but was:2 Stack Trace: java.lang.AssertionError: expected:1 but was:2 at __randomizedtesting.SeedInfo.seed([16A8922439B48E61:4D9869EC3AF32D1D]:0) at org.junit.Assert.fail(Assert.java:93) at org.junit.Assert.failNotEquals(Assert.java:647) at org.junit.Assert.assertEquals(Assert.java:128) at org.junit.Assert.assertEquals(Assert.java:472) at org.junit.Assert.assertEquals(Assert.java:456) at 
org.apache.solr.update.processor.SignatureUpdateProcessorFactoryTest.checkNumDocs(SignatureUpdateProcessorFactoryTest.java:71) at org.apache.solr.update.processor.SignatureUpdateProcessorFactoryTest.testDupeDetection(SignatureUpdateProcessorFactoryTest.java:119) {noformat} -- This message was sent by Atlassian JIRA (v6.1.5#6160) - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Commented] (SOLR-5792) TermVectorComponentDistributedTest routinely fails on J9
[ https://issues.apache.org/jira/browse/SOLR-5792?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13914916#comment-13914916 ] ASF subversion and git services commented on SOLR-5792: --- Commit 1572706 from hoss...@apache.org in branch 'dev/trunk' [ https://svn.apache.org/r1572706 ] SOLR-5793, SOLR-5792, SOLR-5791: disable these three tests on J9 JVM TermVectorComponentDistributedTest routinely fails on J9 Key: SOLR-5792 URL: https://issues.apache.org/jira/browse/SOLR-5792 Project: Solr Issue Type: Bug Reporter: Hoss Man Perhaps the code is using a Map when it should be using a NamedList? Or perhaps the test should be configured not to care about the order... Is the order meaningful in this part of the output? {noformat} REGRESSION: org.apache.solr.handler.component.TermVectorComponentDistributedTest.testDistribSearch Error Message: .termVectors.0.test_basictv!=test_postv (unordered or missing) Stack Trace: junit.framework.AssertionFailedError: .termVectors.0.test_basictv!=test_postv (unordered or missing) at __randomizedtesting.SeedInfo.seed([C6763A182C2489BA:4790B4005B7BE986]:0) at junit.framework.Assert.fail(Assert.java:50) at org.apache.solr.BaseDistributedSearchTestCase.compareSolrResponses(BaseDistributedSearchTestCase.java:843) at org.apache.solr.BaseDistributedSearchTestCase.compareResponses(BaseDistributedSearchTestCase.java:862) at org.apache.solr.BaseDistributedSearchTestCase.query(BaseDistributedSearchTestCase.java:565) at org.apache.solr.BaseDistributedSearchTestCase.query(BaseDistributedSearchTestCase.java:545) at org.apache.solr.BaseDistributedSearchTestCase.query(BaseDistributedSearchTestCase.java:524) at org.apache.solr.handler.component.TermVectorComponentDistributedTest.doTest(TermVectorComponentDistributedTest.java:164) at org.apache.solr.BaseDistributedSearchTestCase.testDistribSearch(BaseDistributedSearchTestCase.java:876) {noformat} -- This message was sent by Atlassian JIRA (v6.1.5#6160) - To 
unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Commented] (SOLR-5791) DistributedQueryElevationComponentTest routinely fails on J9
[ https://issues.apache.org/jira/browse/SOLR-5791?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13914924#comment-13914924 ] ASF subversion and git services commented on SOLR-5791: --- Commit 1572709 from hoss...@apache.org in branch 'dev/branches/branch_4x' [ https://svn.apache.org/r1572709 ] SOLR-5793, SOLR-5792, SOLR-5791: disable these three tests on J9 JVM (merge r1572706) DistributedQueryElevationComponentTest routinely fails on J9 Key: SOLR-5791 URL: https://issues.apache.org/jira/browse/SOLR-5791 Project: Solr Issue Type: Bug Reporter: Hoss Man Either there is a bug in how the params are handled that only manifests itself in J9, or the test needs to be fixed to not expect the params in a certain order {noformat} REGRESSION: org.apache.solr.handler.component.DistributedQueryElevationComponentTest.testDistribSearch Error Message: .responseHeader.params.fl!=version (unordered or missing) Stack Trace: junit.framework.AssertionFailedError: .responseHeader.params.fl!=version (unordered or missing) at __randomizedtesting.SeedInfo.seed([C6763A182C2489BA:4790B4005B7BE986]:0) at junit.framework.Assert.fail(Assert.java:50) at org.apache.solr.BaseDistributedSearchTestCase.compareSolrResponses(BaseDistributedSearchTestCase.java:843) at org.apache.solr.BaseDistributedSearchTestCase.compareResponses(BaseDistributedSearchTestCase.java:862) at org.apache.solr.BaseDistributedSearchTestCase.query(BaseDistributedSearchTestCase.java:565) at org.apache.solr.BaseDistributedSearchTestCase.query(BaseDistributedSearchTestCase.java:545) at org.apache.solr.BaseDistributedSearchTestCase.query(BaseDistributedSearchTestCase.java:524) at org.apache.solr.handler.component.DistributedQueryElevationComponentTest.doTest(DistributedQueryElevationComponentTest.java:81) at org.apache.solr.BaseDistributedSearchTestCase.testDistribSearch(BaseDistributedSearchTestCase.java:870) {noformat} -- This message was sent by Atlassian JIRA (v6.1.5#6160) - To unsubscribe, 
e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Commented] (SOLR-5792) TermVectorComponentDistributedTest routinely fails on J9
[ https://issues.apache.org/jira/browse/SOLR-5792?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13914923#comment-13914923 ] ASF subversion and git services commented on SOLR-5792: --- Commit 1572709 from hoss...@apache.org in branch 'dev/branches/branch_4x' [ https://svn.apache.org/r1572709 ] SOLR-5793, SOLR-5792, SOLR-5791: disable these three tests on J9 JVM (merge r1572706) TermVectorComponentDistributedTest routinely fails on J9 Key: SOLR-5792 URL: https://issues.apache.org/jira/browse/SOLR-5792 Project: Solr Issue Type: Bug Reporter: Hoss Man Perhaps the code is using a Map when it should be using a NamedList? Or perhaps the test should be configured not to care about the order... Is the order meaningful in this part of the output? {noformat} REGRESSION: org.apache.solr.handler.component.TermVectorComponentDistributedTest.testDistribSearch Error Message: .termVectors.0.test_basictv!=test_postv (unordered or missing) Stack Trace: junit.framework.AssertionFailedError: .termVectors.0.test_basictv!=test_postv (unordered or missing) at __randomizedtesting.SeedInfo.seed([C6763A182C2489BA:4790B4005B7BE986]:0) at junit.framework.Assert.fail(Assert.java:50) at org.apache.solr.BaseDistributedSearchTestCase.compareSolrResponses(BaseDistributedSearchTestCase.java:843) at org.apache.solr.BaseDistributedSearchTestCase.compareResponses(BaseDistributedSearchTestCase.java:862) at org.apache.solr.BaseDistributedSearchTestCase.query(BaseDistributedSearchTestCase.java:565) at org.apache.solr.BaseDistributedSearchTestCase.query(BaseDistributedSearchTestCase.java:545) at org.apache.solr.BaseDistributedSearchTestCase.query(BaseDistributedSearchTestCase.java:524) at org.apache.solr.handler.component.TermVectorComponentDistributedTest.doTest(TermVectorComponentDistributedTest.java:164) at org.apache.solr.BaseDistributedSearchTestCase.testDistribSearch(BaseDistributedSearchTestCase.java:876) {noformat} -- This message was sent by Atlassian JIRA 
(v6.1.5#6160) - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Commented] (SOLR-5793) SignatureUpdateProcessorFactoryTest routinely fails on J9
[ https://issues.apache.org/jira/browse/SOLR-5793?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13914922#comment-13914922 ] ASF subversion and git services commented on SOLR-5793: --- Commit 1572709 from hoss...@apache.org in branch 'dev/branches/branch_4x' [ https://svn.apache.org/r1572709 ] SOLR-5793, SOLR-5792, SOLR-5791: disable these three tests on J9 JVM (merge r1572706) SignatureUpdateProcessorFactoryTest routinely fails on J9 - Key: SOLR-5793 URL: https://issues.apache.org/jira/browse/SOLR-5793 Project: Solr Issue Type: Bug Reporter: Hoss Man Two very similar looking failures pop up frequently, but not always together... {noformat} REGRESSION: org.apache.solr.update.processor.SignatureUpdateProcessorFactoryTest.testMultiThreaded Error Message: expected:1 but was:2 Stack Trace: java.lang.AssertionError: expected:1 but was:2 at __randomizedtesting.SeedInfo.seed([791041A112471F1D:18859B41FA9615EB]:0) at org.junit.Assert.fail(Assert.java:93) at org.junit.Assert.failNotEquals(Assert.java:647) at org.junit.Assert.assertEquals(Assert.java:128) at org.junit.Assert.assertEquals(Assert.java:472) at org.junit.Assert.assertEquals(Assert.java:456) at org.apache.solr.update.processor.SignatureUpdateProcessorFactoryTest.checkNumDocs(SignatureUpdateProcessorFactoryTest.java:71) at org.apache.solr.update.processor.SignatureUpdateProcessorFactoryTest.testMultiThreaded(SignatureUpdateProcessorFactoryTest.java:222) {noformat} {noformat} REGRESSION: org.apache.solr.update.processor.SignatureUpdateProcessorFactoryTest.testDupeDetection Error Message: expected:1 but was:2 Stack Trace: java.lang.AssertionError: expected:1 but was:2 at __randomizedtesting.SeedInfo.seed([16A8922439B48E61:4D9869EC3AF32D1D]:0) at org.junit.Assert.fail(Assert.java:93) at org.junit.Assert.failNotEquals(Assert.java:647) at org.junit.Assert.assertEquals(Assert.java:128) at org.junit.Assert.assertEquals(Assert.java:472) at 
org.junit.Assert.assertEquals(Assert.java:456) at org.apache.solr.update.processor.SignatureUpdateProcessorFactoryTest.checkNumDocs(SignatureUpdateProcessorFactoryTest.java:71) at org.apache.solr.update.processor.SignatureUpdateProcessorFactoryTest.testDupeDetection(SignatureUpdateProcessorFactoryTest.java:119) {noformat} -- This message was sent by Atlassian JIRA (v6.1.5#6160) - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Commented] (LUCENE-5477) add near-real-time suggest building to AnalyzingInfixSuggester
[ https://issues.apache.org/jira/browse/LUCENE-5477?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13914934#comment-13914934 ] Robert Muir commented on LUCENE-5477: - this looks great! add near-real-time suggest building to AnalyzingInfixSuggester -- Key: LUCENE-5477 URL: https://issues.apache.org/jira/browse/LUCENE-5477 Project: Lucene - Core Issue Type: Improvement Components: modules/spellchecker Reporter: Michael McCandless Fix For: 4.8, 5.0 Attachments: LUCENE-5477.patch Because this suggester impl. is just a Lucene index under-the-hood, it should be straightforward to enable near-real-time additions/removals of suggestions. -- This message was sent by Atlassian JIRA (v6.1.5#6160) - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Resolved] (SOLR-5779) REGRESSION: org.apache.solr.update.processor.SignatureUpdateProcessorFactoryTest.testMultiThreaded
[ https://issues.apache.org/jira/browse/SOLR-5779?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Mark Miller resolved SOLR-5779. --- Resolution: Duplicate REGRESSION: org.apache.solr.update.processor.SignatureUpdateProcessorFactoryTest.testMultiThreaded --- Key: SOLR-5779 URL: https://issues.apache.org/jira/browse/SOLR-5779 Project: Solr Issue Type: Bug Reporter: Mark Miller On the face of it, this failure, which started not too long ago, suggests that this is no longer thread-safe. -- This message was sent by Atlassian JIRA (v6.1.5#6160) - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Commented] (LUCENE-5468) Hunspell very high memory use when loading dictionary
[ https://issues.apache.org/jira/browse/LUCENE-5468?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13914954#comment-13914954 ] Chris Male commented on LUCENE-5468: Those are some pretty amazing reductions, well done! Hunspell very high memory use when loading dictionary - Key: LUCENE-5468 URL: https://issues.apache.org/jira/browse/LUCENE-5468 Project: Lucene - Core Issue Type: Bug Affects Versions: 3.5 Reporter: Maciej Lisiewski Priority: Minor Attachments: patch.txt Hunspell stemmer requires gigantic (for the task) amounts of memory to load dictionary/rules files. For example loading a 4.5 MB polish dictionary (with empty index!) will cause whole core to crash with various out of memory errors unless you set max heap size close to 2GB or more. By comparison Stempel using the same dictionary file works just fine with 1/8 of that (and possibly lower values as well). Sample error log entries: http://pastebin.com/fSrdd5W1 http://pastebin.com/Lmi0re7Z -- This message was sent by Atlassian JIRA (v6.1.5#6160) - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Commented] (LUCENE-5468) Hunspell very high memory use when loading dictionary
[ https://issues.apache.org/jira/browse/LUCENE-5468?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13914960#comment-13914960 ] Robert Muir commented on LUCENE-5468: - I have the previous options added back locally too, so I will fix up tests and so on and just copy over the old filter and make a patch. Hunspell very high memory use when loading dictionary - Key: LUCENE-5468 URL: https://issues.apache.org/jira/browse/LUCENE-5468 Project: Lucene - Core Issue Type: Bug Affects Versions: 3.5 Reporter: Maciej Lisiewski Priority: Minor Attachments: patch.txt Hunspell stemmer requires gigantic (for the task) amounts of memory to load dictionary/rules files. For example loading a 4.5 MB polish dictionary (with empty index!) will cause whole core to crash with various out of memory errors unless you set max heap size close to 2GB or more. By comparison Stempel using the same dictionary file works just fine with 1/8 of that (and possibly lower values as well). Sample error log entries: http://pastebin.com/fSrdd5W1 http://pastebin.com/Lmi0re7Z -- This message was sent by Atlassian JIRA (v6.1.5#6160) - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Commented] (LUCENE-5477) add near-real-time suggest building to AnalyzingInfixSuggester
[ https://issues.apache.org/jira/browse/LUCENE-5477?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13914978#comment-13914978 ] Areek Zillur commented on LUCENE-5477: -- Wow, that looks awesome! Thanks for getting rid of the redundant casting of InputIterator too. add near-real-time suggest building to AnalyzingInfixSuggester -- Key: LUCENE-5477 URL: https://issues.apache.org/jira/browse/LUCENE-5477 Project: Lucene - Core Issue Type: Improvement Components: modules/spellchecker Reporter: Michael McCandless Fix For: 4.8, 5.0 Attachments: LUCENE-5477.patch Because this suggester impl. is just a Lucene index under-the-hood, it should be straightforward to enable near-real-time additions/removals of suggestions. -- This message was sent by Atlassian JIRA (v6.1.5#6160) - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Commented] (LUCENE-5468) Hunspell very high memory use when loading dictionary
[ https://issues.apache.org/jira/browse/LUCENE-5468?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13915004#comment-13915004 ] ASF subversion and git services commented on LUCENE-5468: - Commit 1572718 from [~rcmuir] in branch 'dev/branches/lucene5468' [ https://svn.apache.org/r1572718 ] LUCENE-5468: hunspell2 - hunspell (with previous options and tests) Hunspell very high memory use when loading dictionary - Key: LUCENE-5468 URL: https://issues.apache.org/jira/browse/LUCENE-5468 Project: Lucene - Core Issue Type: Bug Affects Versions: 3.5 Reporter: Maciej Lisiewski Priority: Minor Attachments: patch.txt Hunspell stemmer requires gigantic (for the task) amounts of memory to load dictionary/rules files. For example loading a 4.5 MB polish dictionary (with empty index!) will cause whole core to crash with various out of memory errors unless you set max heap size close to 2GB or more. By comparison Stempel using the same dictionary file works just fine with 1/8 of that (and possibly lower values as well). Sample error log entries: http://pastebin.com/fSrdd5W1 http://pastebin.com/Lmi0re7Z -- This message was sent by Atlassian JIRA (v6.1.5#6160) - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Commented] (SOLR-5428) new statistics results to StatsComponent - distinctValues and countDistinct
[ https://issues.apache.org/jira/browse/SOLR-5428?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13915008#comment-13915008 ] Steven Bower commented on SOLR-5428: Does this work on multi-valued fields? new statistics results to StatsComponent - distinctValues and countDistinct --- Key: SOLR-5428 URL: https://issues.apache.org/jira/browse/SOLR-5428 Project: Solr Issue Type: New Feature Reporter: Elran Dvir Assignee: Shalin Shekhar Mangar Fix For: 4.7, 5.0 Attachments: SOLR-5428.patch, SOLR-5428.patch I thought it would be very useful to display the distinct values (and the count) of a field among other statistics. Attached a patch implementing this in StatsComponent. Added results: distinctValues - list of all distinct values; countDistinct - distinct values count. -- This message was sent by Atlassian JIRA (v6.1.5#6160) - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
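The multi-valued question above is worth pinning down with a small sketch of what distinctValues and countDistinct would plausibly compute (illustrative names only, not the patch's actual API): for a multi-valued field, every value of every document contributes to the distinct set.

```java
import java.util.Arrays;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

public class DistinctStatsSketch {
    // Collect the distinct values of a field across documents; each inner
    // list is one document's (possibly multi-valued) field values.
    static Set<String> distinctValues(List<List<String>> docs) {
        Set<String> values = new LinkedHashSet<>();
        for (List<String> fieldValues : docs) {
            values.addAll(fieldValues); // every value of a multi-valued field counts
        }
        return values;
    }

    public static void main(String[] args) {
        List<List<String>> docs = Arrays.asList(
            Arrays.asList("red", "blue"),   // multi-valued document
            Arrays.asList("blue"),
            Arrays.asList("green", "red"));
        Set<String> distinct = distinctValues(docs);
        System.out.println(distinct);        // [red, blue, green]
        System.out.println(distinct.size()); // countDistinct = 3
    }
}
```

Whether the attached patch iterates per-document or per-value for multi-valued fields is exactly what the comment asks; this sketch only shows the per-value interpretation.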
[jira] [Updated] (SOLR-5183) Add block support for JSONLoader
[ https://issues.apache.org/jira/browse/SOLR-5183?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Varun Thacker updated SOLR-5183: Attachment: SOLR-5183.patch New patch which takes into account changes made on SOLR-5777 Add block support for JSONLoader Key: SOLR-5183 URL: https://issues.apache.org/jira/browse/SOLR-5183 Project: Solr Issue Type: Sub-task Reporter: Varun Thacker Fix For: 4.7 Attachments: SOLR-5183.patch, SOLR-5183.patch, SOLR-5183.patch, SOLR-5183.patch, SOLR-5183.patch We should be able to index block documents in JSON format -- This message was sent by Atlassian JIRA (v6.1.5#6160) - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Commented] (LUCENE-5468) Hunspell very high memory use when loading dictionary
[ https://issues.apache.org/jira/browse/LUCENE-5468?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13915025#comment-13915025 ] ASF subversion and git services commented on LUCENE-5468: - Commit 1572724 from [~rcmuir] in branch 'dev/branches/lucene5468' [ https://svn.apache.org/r1572724 ] LUCENE-5468: fix precommit+test Hunspell very high memory use when loading dictionary - Key: LUCENE-5468 URL: https://issues.apache.org/jira/browse/LUCENE-5468 Project: Lucene - Core Issue Type: Bug Affects Versions: 3.5 Reporter: Maciej Lisiewski Priority: Minor Attachments: patch.txt Hunspell stemmer requires gigantic (for the task) amounts of memory to load dictionary/rules files. For example loading a 4.5 MB polish dictionary (with empty index!) will cause whole core to crash with various out of memory errors unless you set max heap size close to 2GB or more. By comparison Stempel using the same dictionary file works just fine with 1/8 of that (and possibly lower values as well). Sample error log entries: http://pastebin.com/fSrdd5W1 http://pastebin.com/Lmi0re7Z -- This message was sent by Atlassian JIRA (v6.1.5#6160) - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Commented] (LUCENE-5468) Hunspell very high memory use when loading dictionary
[ https://issues.apache.org/jira/browse/LUCENE-5468?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13915028#comment-13915028 ] ASF subversion and git services commented on LUCENE-5468: - Commit 1572727 from [~rcmuir] in branch 'dev/branches/lucene5468' [ https://svn.apache.org/r1572727 ] LUCENE-5468: add additional change Hunspell very high memory use when loading dictionary - Key: LUCENE-5468 URL: https://issues.apache.org/jira/browse/LUCENE-5468 Project: Lucene - Core Issue Type: Bug Affects Versions: 3.5 Reporter: Maciej Lisiewski Priority: Minor Attachments: patch.txt Hunspell stemmer requires gigantic (for the task) amounts of memory to load dictionary/rules files. For example loading a 4.5 MB polish dictionary (with empty index!) will cause whole core to crash with various out of memory errors unless you set max heap size close to 2GB or more. By comparison Stempel using the same dictionary file works just fine with 1/8 of that (and possibly lower values as well). Sample error log entries: http://pastebin.com/fSrdd5W1 http://pastebin.com/Lmi0re7Z -- This message was sent by Atlassian JIRA (v6.1.5#6160) - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Updated] (LUCENE-5468) Hunspell very high memory use when loading dictionary
[ https://issues.apache.org/jira/browse/LUCENE-5468?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Robert Muir updated LUCENE-5468: Attachment: LUCENE-5468.patch I think the change is ready. There are other improvements that can be done (for example, maybe an option for the factory to cache these things in case you use the same ones across multiple fields, and more efficient affix handling against the FST, and so on), but it would be better on different issues I think? Here is a patch (from diff-sources), sorry it's not so useful, as I renamed some things. I tried making one from svn diff after reintegration, but it was equally useless. If you want, you can also review my commits on this issue to the branch, too. Here is the CHANGES entry: API Changes: * LUCENE-5468: Move offline Sort (from suggest module) to OfflineSort. (Robert Muir) Optimizations: * LUCENE-5468: HunspellStemFilter uses 10 to 100x less RAM. It also loads all known openoffice dictionaries without error, and supports an additional longestOnly option for a less aggressive approach. (Robert Muir) Hunspell very high memory use when loading dictionary - Key: LUCENE-5468 URL: https://issues.apache.org/jira/browse/LUCENE-5468 Project: Lucene - Core Issue Type: Bug Affects Versions: 3.5 Reporter: Maciej Lisiewski Priority: Minor Attachments: LUCENE-5468.patch, patch.txt Hunspell stemmer requires gigantic (for the task) amounts of memory to load dictionary/rules files. For example loading a 4.5 MB polish dictionary (with empty index!) will cause whole core to crash with various out of memory errors unless you set max heap size close to 2GB or more. By comparison Stempel using the same dictionary file works just fine with 1/8 of that (and possibly lower values as well). 
Sample error log entries: http://pastebin.com/fSrdd5W1 http://pastebin.com/Lmi0re7Z -- This message was sent by Atlassian JIRA (v6.1.5#6160) - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Commented] (SOLR-5605) MapReduceIndexerTool fails in some locales -- seen in random failures of MapReduceIndexerToolArgumentParserTest
[ https://issues.apache.org/jira/browse/SOLR-5605?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13915037#comment-13915037 ] wolfgang hoschek commented on SOLR-5605: bq. Are you not a committer? At Apache, those who do decide. Yes, but you've clearly been assigned to upstream this stuff and I have plenty of other things to attend to these days. bq. I did not realize Patrick's patch did not include the latest code updates from MapReduce. Might be good to pay more attention, also to CDH-14804? bq. I had and still have bigger concerns around the usability of this code in Solr than this issue. It is very, very far from easy for someone to get started with this contrib right now. The usability is fine downstream where Maven automatically builds a job jar that includes the necessary dependency jars inside of the lib dir of the MR job jar. Hence no startup script or extra steps are required downstream, just one (fat) jar. If it's not usable upstream it may be because no corresponding packaging system has been used upstream, for reasons that escape me. bq. which is why none of these smaller issues concern me very much at this point. I'm afraid ignorance never helps. MapReduceIndexerTool fails in some locales -- seen in random failures of MapReduceIndexerToolArgumentParserTest --- Key: SOLR-5605 URL: https://issues.apache.org/jira/browse/SOLR-5605 Project: Solr Issue Type: Bug Reporter: Hoss Man Assignee: Mark Miller Fix For: 4.7, 5.0 I noticed a randomized failure in MapReduceIndexerToolArgumentParserTest which is reproducible with any seed -- all that matters is the locale. The problem sounded familiar, and a quick search verified that Jenkins has in fact hit this a couple of times in the past -- Uwe commented on the list that this is due to a real problem in one of the third-party dependencies (that does the argument parsing) that will affect usage on some systems. 
If working around the bug in the arg parsing lib isn't feasible, MapReduceIndexerTool should fail cleanly if the locale isn't one we know is supported -- This message was sent by Atlassian JIRA (v6.1.5#6160) - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
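The fail-cleanly behavior suggested in the issue could be sketched as below. This is only an illustration of the idea, not the actual MapReduceIndexerTool code: the class name, the whitelist contents, and the suggested JVM flags are all assumptions.

```java
import java.util.Locale;
import java.util.Set;

// Hypothetical guard (illustrative, not the real MapReduceIndexerTool code):
// refuse to start when the locale is not one the argument-parsing library is
// known to handle, instead of failing obscurely later.
public class LocaleGuard {
    // Illustrative whitelist; a real fix would list the locales actually verified.
    private static final Set<String> SUPPORTED_LANGUAGES = Set.of("en", "de", "fr");

    public static void checkLocaleOrFail(Locale locale) {
        if (!SUPPORTED_LANGUAGES.contains(locale.getLanguage())) {
            // Fail fast with an actionable message rather than a confusing
            // downstream parse error.
            throw new IllegalStateException(
                "Unsupported locale '" + locale + "'; re-run the JVM with "
                + "-Duser.language=en -Duser.country=US");
        }
    }

    public static void main(String[] args) {
        checkLocaleOrFail(Locale.getDefault());
        System.out.println("Locale OK: " + Locale.getDefault());
    }
}
```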
[jira] [Commented] (LUCENE-5432) EliasFanoEncoder number of index entry bits is off by 1 for powers of 2 of max index entry
[ https://issues.apache.org/jira/browse/LUCENE-5432?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13915041#comment-13915041 ]

Adrien Grand commented on LUCENE-5432:

Thanks Paul, the fix looks good!

EliasFanoEncoder number of index entry bits is off by 1 for powers of 2 of max index entry
--

Key: LUCENE-5432
URL: https://issues.apache.org/jira/browse/LUCENE-5432
Project: Lucene - Core
Issue Type: Bug
Components: core/other
Reporter: Paul Elschot
Priority: Minor
Fix For: 5.0
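The off-by-one in the issue title can be illustrated with a small sketch. This is a reconstruction of the kind of bug described, not the actual EliasFanoEncoder code: representing any value in [0, max] needs floor(log2(max)) + 1 bits, while a formula based on ceil(log2(max)) agrees everywhere except when max is an exact power of two, where it is one bit short.

```java
public class BitsDemo {
    // Correct: bits needed to represent any value in [0, maxEntry] is
    // floor(log2(maxEntry)) + 1, i.e. the position of the highest set bit plus one.
    static int bitsNeeded(long maxEntry) {
        return maxEntry == 0 ? 1 : 64 - Long.numberOfLeadingZeros(maxEntry);
    }

    // Hypothetical buggy variant: ceil(log2(maxEntry)). It matches bitsNeeded
    // for all non-powers of two, but undercounts by one when maxEntry is an
    // exact power of two (e.g. 8 needs 4 bits, but ceil(log2(8)) is 3).
    static int ceilLog2Bits(long maxEntry) {
        return maxEntry <= 1 ? 1 : 64 - Long.numberOfLeadingZeros(maxEntry - 1);
    }

    public static void main(String[] args) {
        for (long max : new long[] {7, 8, 9, 15, 16, 17}) {
            System.out.println("max=" + max
                + " bitsNeeded=" + bitsNeeded(max)
                + " ceilLog2=" + ceilLog2Bits(max));
        }
    }
}
```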
[jira] [Commented] (LUCENE-5432) EliasFanoEncoder number of index entry bits is off by 1 for powers of 2 of max index entry
[ https://issues.apache.org/jira/browse/LUCENE-5432?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13915042#comment-13915042 ]

ASF subversion and git services commented on LUCENE-5432:

Commit 1572728 from [~jpountz] in branch 'dev/trunk'
[ https://svn.apache.org/r1572728 ]

LUCENE-5432: Fix EliasFanoEncoder's number of bits per index entry.

Close #28
[GitHub] lucene-solr pull request: Correct number of bits for an index entr...
Github user asfgit closed the pull request at:

    https://github.com/apache/lucene-solr/pull/28

---
If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA.
[jira] [Commented] (LUCENE-5432) EliasFanoEncoder number of index entry bits is off by 1 for powers of 2 of max index entry
[ https://issues.apache.org/jira/browse/LUCENE-5432?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13915043#comment-13915043 ]

ASF subversion and git services commented on LUCENE-5432:

Commit 1572729 from [~jpountz] in branch 'dev/branches/branch_4x'
[ https://svn.apache.org/r1572729 ]

LUCENE-5432: Fix EliasFanoEncoder's number of bits per index entry.
[jira] [Commented] (LUCENE-5468) Hunspell very high memory use when loading dictionary
[ https://issues.apache.org/jira/browse/LUCENE-5468?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13915045#comment-13915045 ]

Chris Male commented on LUCENE-5468:

Is the longestOnly option a standard Hunspell thing? (More a question of general interest.)

Hunspell very high memory use when loading dictionary
--

Key: LUCENE-5468
URL: https://issues.apache.org/jira/browse/LUCENE-5468
Project: Lucene - Core
Issue Type: Bug
Affects Versions: 3.5
Reporter: Maciej Lisiewski
Priority: Minor
Attachments: LUCENE-5468.patch, patch.txt

The Hunspell stemmer requires gigantic (for the task) amounts of memory to load dictionary/rules files. For example, loading a 4.5 MB Polish dictionary (with an empty index!) will cause the whole core to crash with various out-of-memory errors unless you set the max heap size close to 2 GB or more. By comparison, Stempel using the same dictionary file works just fine with 1/8 of that (and possibly lower values as well).

Sample error log entries:
http://pastebin.com/fSrdd5W1
http://pastebin.com/Lmi0re7Z
[jira] [Resolved] (LUCENE-5432) EliasFanoEncoder number of index entry bits is off by 1 for powers of 2 of max index entry
[ https://issues.apache.org/jira/browse/LUCENE-5432?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Adrien Grand resolved LUCENE-5432.
--
Resolution: Fixed
Fix Version/s: 4.8
Assignee: Adrien Grand

EliasFanoEncoder number of index entry bits is off by 1 for powers of 2 of max index entry
--

Key: LUCENE-5432
URL: https://issues.apache.org/jira/browse/LUCENE-5432
Project: Lucene - Core
Issue Type: Bug
Components: core/other
Reporter: Paul Elschot
Assignee: Adrien Grand
Priority: Minor
Fix For: 4.8, 5.0
[jira] [Commented] (LUCENE-5376) Add a demo search server
[ https://issues.apache.org/jira/browse/LUCENE-5376?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13915047#comment-13915047 ]

Chris Male commented on LUCENE-5376:

Hey Mike,

What's the endzone here? Any thoughts on it coming back into trunk?

Add a demo search server
--

Key: LUCENE-5376
URL: https://issues.apache.org/jira/browse/LUCENE-5376
Project: Lucene - Core
Issue Type: Improvement
Reporter: Michael McCandless
Assignee: Michael McCandless
Attachments: lucene-demo-server.tgz

I think it'd be useful to have a demo search server for Lucene. Rather than being fully featured, like Solr, it would be minimal, just wrapping the existing Lucene modules to show how you can make use of these features in a server setting. The purpose is to demonstrate how one can build a minimal search server on top of APIs like SearcherManager, SearcherLifetimeManager, etc. This is also useful for finding rough edges / issues in Lucene's APIs that make building a server unnecessarily hard. I don't think it should have back-compatibility promises (except Lucene's index back compatibility), so it's free to improve as Lucene's APIs change. As a starting point, I'll post what I built for the "eating your own dog food" search app for Lucene's and Solr's Jira issues, http://jirasearch.mikemccandless.com (blog: http://blog.mikemccandless.com/2013/05/eating-dog-food-with-lucene.html ). It uses Netty to expose basic indexing and searching APIs via JSON, but it's very rough (lots of nocommits).