[jira] [Commented] (SOLR-4165) Queries blocked when stopping and starting a node
[ https://issues.apache.org/jira/browse/SOLR-4165?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13551878#comment-13551878 ]

Markus Jelsma commented on SOLR-4165:

Hi Mark, this is for standard stops. On shutdown the cluster can stall very briefly, a matter of 1 or 2 seconds at most in our case. On start-up the problem is more serious.

Queries blocked when stopping and starting a node

Key: SOLR-4165
URL: https://issues.apache.org/jira/browse/SOLR-4165
Project: Solr
Issue Type: Bug
Components: search, SolrCloud
Affects Versions: 5.0
Environment: 5.0-SNAPSHOT 1366361:1420056M - markus - 2012-12-11 11:52:06
Reporter: Markus Jelsma
Assignee: Mark Miller
Priority: Critical
Fix For: 4.1, 5.0

Our 10-node test cluster (10 shards, 20 cores) briefly blocks incoming queries when a node is stopped gracefully, and blocks queries again for at least a few seconds when the node is started. We're using siege to send roughly 10 queries per second to a pair of load balancers. Those load balancers ping (admin/ping) each node every few hundred milliseconds. The ping queries continue to operate normally while requests to our main request handler are blocked. A manual request directly to a live Solr node is also blocked for the same duration. There are no errors logged, but it is clear that the entire cluster blocks queries as soon as the starting node is reading its config from ZooKeeper, likely even slightly earlier. The blocking time when stopping a node varies between 1 and 5 seconds. The blocking time when starting a node varies between 10 and 30 seconds. The blocked queries come rushing in again after a queue of ping requests is served. The ping request sets the main request handler via the qt parameter.

--
This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators. For more information on JIRA, see: http://www.atlassian.com/software/jira

To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Updated] (SOLR-3735) Relocate the example mime-to-extension mapping
[ https://issues.apache.org/jira/browse/SOLR-3735?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Erik Hatcher updated SOLR-3735:

Fix Version/s: 4.1

Went ahead and merged this to 4.x (4.1+) in order to minimize diffs (especially for something minor like this) from trunk to 4.x.

Relocate the example mime-to-extension mapping

Key: SOLR-3735
URL: https://issues.apache.org/jira/browse/SOLR-3735
Project: Solr
Issue Type: Improvement
Components: web gui
Affects Versions: 4.0-BETA, 4.0
Reporter: Erik Hatcher
Assignee: Erik Hatcher
Priority: Minor
Fix For: 4.1, 5.0
Attachments: SOLR-3735.patch

A mime-to-extension mapping was added to VelocityResponseWriter recently. This really belongs in the templates themselves, not in VrW, as it is specific to the example search results and not meant for all VrW templates.
[jira] [Commented] (SOLR-3735) Relocate the example mime-to-extension mapping
[ https://issues.apache.org/jira/browse/SOLR-3735?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13551893#comment-13551893 ]

Commit Tag Bot commented on SOLR-3735:

[branch_4x commit] Erik Hatcher
http://svn.apache.org/viewvc?view=revision&revision=1432410
SOLR-3735: Relocate the example mime-to-extension mapping (merge from trunk)
[jira] [Commented] (SOLR-3735) Relocate the example mime-to-extension mapping
[ https://issues.apache.org/jira/browse/SOLR-3735?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13551911#comment-13551911 ]

Commit Tag Bot commented on SOLR-3735:

[trunk commit] Erik Hatcher
http://svn.apache.org/viewvc?view=revision&revision=1432411
SOLR-3735: merged to 4x, so adjust CHANGES
[jira] [Commented] (LUCENE-4678) FST should use paged byte[] instead of single contiguous byte[]
[ https://issues.apache.org/jira/browse/LUCENE-4678?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13551929#comment-13551929 ]

Robert Muir commented on LUCENE-4678:

{quote}
I'll commit only to trunk for now ... and backport to 4.2 once 4.1 branches and once this has baked some in trunk ...
{quote}

+1 ... the copyBytes is frightening though!

What do you think of the FST.BytesReader -> FSTBytesReader change? I'm just thinking it causes a lot of API noise (you can see it in the patch). Unfortunately lots of users have to create this thing to pass to methods on FST (e.g. findTargetArc). So if we kept it as FST.BytesReader they would be largely unaffected?

FST should use paged byte[] instead of single contiguous byte[]

Key: LUCENE-4678
URL: https://issues.apache.org/jira/browse/LUCENE-4678
Project: Lucene - Core
Issue Type: Improvement
Components: core/FSTs
Reporter: Michael McCandless
Assignee: Michael McCandless
Fix For: 4.2, 5.0
Attachments: LUCENE-4678.patch, LUCENE-4678.patch, LUCENE-4678.patch

The single byte[] we use today has several limitations, e.g. it limits us to 2.1 GB FSTs (and suggesters in the wild are getting close to this limit), and it causes big RAM spikes during building when the array has to grow. I took basically the same approach as LUCENE-3298, but I want to break out this patch separately from changing all int -> long for 2.1 GB support.
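The paged-byte[] idea discussed above can be sketched roughly as follows. This is a hypothetical illustration, not Lucene's actual BytesStore: the class and method names are made up, and the real implementation also handles block copying, reverse readers, and more. The point is only that growth allocates one new fixed-size block instead of reallocating and copying a single huge array, and that total capacity is no longer capped by one array's ~2.1 GB limit.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch of a paged byte store: fixed-size blocks addressed by
// a long position, split into (block index, offset) with shifts and masks.
public class PagedBytesSketch {
    private final int blockBits;   // log2 of the block size
    private final int blockSize;
    private final int blockMask;
    private final List<byte[]> blocks = new ArrayList<>();
    private long position = 0;

    PagedBytesSketch(int blockBits) {
        this.blockBits = blockBits;
        this.blockSize = 1 << blockBits;
        this.blockMask = blockSize - 1;
    }

    // Appending a byte only ever allocates one new block; no full-array copy.
    void writeByte(byte b) {
        int blockIndex = (int) (position >>> blockBits);
        if (blockIndex == blocks.size()) {
            blocks.add(new byte[blockSize]);
        }
        blocks.get(blockIndex)[(int) (position & blockMask)] = b;
        position++;
    }

    byte readByte(long pos) {
        return blocks.get((int) (pos >>> blockBits))[(int) (pos & blockMask)];
    }

    public static void main(String[] args) {
        PagedBytesSketch bytes = new PagedBytesSketch(2); // tiny 4-byte blocks for demo
        for (int i = 0; i < 10; i++) {
            bytes.writeByte((byte) i);
        }
        System.out.println(bytes.readByte(7)); // reads across a block boundary
    }
}
```

With a realistic block size (Lucene-style stores use blocks on the order of tens of kilobytes), the per-read shift-and-mask cost is small compared to the copy-free growth this buys during building.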
[jira] [Created] (LUCENE-4682) Reduce wasted bytes in FST due to array arcs
Michael McCandless created LUCENE-4682:

Summary: Reduce wasted bytes in FST due to array arcs
Key: LUCENE-4682
URL: https://issues.apache.org/jira/browse/LUCENE-4682
Project: Lucene - Core
Issue Type: Improvement
Components: core/FSTs
Reporter: Michael McCandless
Priority: Minor

When a node is close to the root, or it has many outgoing arcs, the FST writes the arcs as an array (each arc gets N bytes), so we can e.g. binary-search on lookup. The problem is N is set to the max(numBytesPerArc), so if you have an outlier arc, e.g. with a big output, you can waste many bytes for all the other arcs that didn't need so many bytes.

I generated Kuromoji's FST and found it has 271187 wasted bytes vs total size 1535612 = ~18% wasted. It would be nice to reduce this.

One thing we could do without packing is: in addNode, if we detect that the number of wasted bytes is above some threshold, then don't do the expansion.

Another thing, if we are packing: we could record stats in the first pass about which nodes wasted the most, and then in the second pass (pack) we could set the threshold based on the top X% nodes that waste ...

Another idea is maybe to deref large outputs, so that the numBytesPerArc is more uniform ...
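The waste described above is simple to quantify: every arc slot is padded to the width of the widest arc. A small sketch (hypothetical code, not Lucene's; the method name is made up) makes the outlier effect concrete:

```java
// Illustrates why N = max(numBytesPerArc) wastes bytes: one outlier arc
// forces every other slot in the array to be padded to its width.
public class ArcWasteDemo {
    // Bytes wasted when each arc slot is padded to the widest arc.
    static int wastedBytes(int[] bytesPerArc) {
        int max = 0, total = 0;
        for (int b : bytesPerArc) {
            max = Math.max(max, b);
            total += b;
        }
        return max * bytesPerArc.length - total;
    }

    public static void main(String[] args) {
        // Four small arcs plus one outlier carrying a big output:
        int[] arcs = {2, 2, 3, 2, 10};
        System.out.println(wastedBytes(arcs)); // 10*5 - 19 = 31 bytes wasted
    }
}
```

In this toy node, the single 10-byte arc inflates the array from 19 bytes of payload to 50 bytes of storage, which is exactly the pattern behind the ~18% waste measured for Kuromoji.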
[jira] [Commented] (LUCENE-4682) Reduce wasted bytes in FST due to array arcs
[ https://issues.apache.org/jira/browse/LUCENE-4682?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13551932#comment-13551932 ]

Michael McCandless commented on LUCENE-4682:

A couple more ideas:

* Since the root arc is [usually?] cached ... we [usually] shouldn't make the root node into an array?
* The building process sometimes has freedom in where the outputs are pushed ... so in theory we could push the outputs forwards if it would mean fewer wasted bytes on the prior node ... this would be a tricky optimization problem I think.
[jira] [Commented] (LUCENE-4682) Reduce wasted bytes in FST due to array arcs
[ https://issues.apache.org/jira/browse/LUCENE-4682?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13551934#comment-13551934 ]

Michael McCandless commented on LUCENE-4682:

Maybe we should just tighten up the FST thresholds for when we make an array arc:

{noformat}
/** @see #shouldExpand(UnCompiledNode) */
final static int FIXED_ARRAY_SHALLOW_DISTANCE = 3; // 0 = only root node.

/** @see #shouldExpand(UnCompiledNode) */
final static int FIXED_ARRAY_NUM_ARCS_SHALLOW = 5;

/** @see #shouldExpand(UnCompiledNode) */
final static int FIXED_ARRAY_NUM_ARCS_DEEP = 10;
{noformat}

When I print out the waste, it's generally the smaller nodes that have higher proportional waste:

{noformat}
[java] waste: 44 numArcs=16 perArc=2.75
[java] waste: 20 numArcs=11 perArc=1.8181819
[java] waste: 13 numArcs=5 perArc=2.6
[java] waste: 20 numArcs=12 perArc=1.666
[java] waste: 60 numArcs=20 perArc=3.0
[java] waste: 0 numArcs=5 perArc=0.0
[java] waste: 48 numArcs=15 perArc=3.2
[java] waste: 16 numArcs=5 perArc=3.2
[java] waste: 20 numArcs=6 perArc=3.333
[java] waste: 8 numArcs=6 perArc=1.334
[java] waste: 24 numArcs=8 perArc=3.0
[java] waste: 32 numArcs=9 perArc=3.556
[java] waste: 17 numArcs=7 perArc=2.4285715
[java] waste: 13 numArcs=5 perArc=2.6
[java] waste: 17 numArcs=6 perArc=2.833
[java] waste: 28 numArcs=8 perArc=3.5
[java] waste: 20 numArcs=16 perArc=1.25
[java] waste: 44 numArcs=15 perArc=2.934
[java] waste: 28 numArcs=13 perArc=2.1538463
[java] waste: 28 numArcs=15 perArc=1.867
{noformat}
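The three constants above drive a depth/fan-out decision of roughly the following shape. This is a simplified sketch of the kind of check shouldExpand performs, not the exact Lucene code (the real method inspects the UnCompiledNode and builder state rather than taking two ints):

```java
// Simplified sketch of the array-arc expansion decision the constants control:
// expand when the node is shallow with at least a few arcs, or has many arcs
// at any depth.
public class ExpandHeuristic {
    static final int FIXED_ARRAY_SHALLOW_DISTANCE = 3; // 0 = only root node.
    static final int FIXED_ARRAY_NUM_ARCS_SHALLOW = 5;
    static final int FIXED_ARRAY_NUM_ARCS_DEEP = 10;

    static boolean shouldExpand(int depth, int numArcs) {
        return (depth <= FIXED_ARRAY_SHALLOW_DISTANCE
                    && numArcs >= FIXED_ARRAY_NUM_ARCS_SHALLOW)
                || numArcs >= FIXED_ARRAY_NUM_ARCS_DEEP;
    }

    public static void main(String[] args) {
        System.out.println(shouldExpand(1, 6));  // shallow node, enough arcs
        System.out.println(shouldExpand(8, 6));  // deep node, few arcs
        System.out.println(shouldExpand(8, 12)); // many arcs, any depth
    }
}
```

Tightening the thresholds means raising the arc-count minimums or shrinking the shallow distance, so fewer of the small, high-proportional-waste nodes in the listing above get expanded.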
[jira] [Updated] (LUCENE-4682) Reduce wasted bytes in FST due to array arcs
[ https://issues.apache.org/jira/browse/LUCENE-4682?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Michael McCandless updated LUCENE-4682:

Attachment: kuromoji.wasted.bytes.txt

Shows the wasted bytes ... one line per node whose arcs were turned into an array, sorted by net bytes wasted.
[jira] [Commented] (LUCENE-4682) Reduce wasted bytes in FST due to array arcs
[ https://issues.apache.org/jira/browse/LUCENE-4682?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13551938#comment-13551938 ]

Robert Muir commented on LUCENE-4682:

As an experiment I turned off array arcs for Kuromoji in my trunk checkout.

FST before:
[java] 53645 nodes, 253185 arcs, 1535612 bytes... done
after:
[java] 53645 nodes, 253185 arcs, 1228816 bytes... done

JAR before:
-rw-rw-r-- 1 rmuir rmuir 4581420 Jan 12 09:56 lucene-analyzers-kuromoji-4.1-SNAPSHOT.jar
after:
-rw-rw-r-- 1 rmuir rmuir 4306792 Jan 12 09:56 lucene-analyzers-kuromoji-5.0-SNAPSHOT.jar
[jira] [Commented] (LUCENE-4682) Reduce wasted bytes in FST due to array arcs
[ https://issues.apache.org/jira/browse/LUCENE-4682?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13551939#comment-13551939 ]

Michael McCandless commented on LUCENE-4682:

Even more than the 271,187 I measured (20% smaller FST), I think because the FST is now smaller we use fewer bytes writing the delta-coded node addresses ...
[jira] [Commented] (LUCENE-4682) Reduce wasted bytes in FST due to array arcs
[ https://issues.apache.org/jira/browse/LUCENE-4682?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13551940#comment-13551940 ]

Robert Muir commented on LUCENE-4682:

In the fixedArray case:

{code}
// write a false first arc:
writer.writeByte(ARCS_AS_FIXED_ARRAY);
writer.writeVInt(nodeIn.numArcs);
// placeholder -- we'll come back and write the number
// of bytes per arc (int) here:
// TODO: we could make this a vInt instead
writer.writeInt(0);
fixedArrayStart = writer.getPosition();
{code}

I think we should actually make that TODO line a writeByte. If it turns out the max arc size is over 255, I think we should just not encode as array arcs (just save our position before we write ARCS_AS_FIXED_ARRAY, rewind to that, and encode normally). This would reduce the overhead of array arcs, but also maybe prevent some worst cases causing waste as a side effect.
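The suggestion above can be sketched as follows. Everything here is hypothetical illustration, not Lucene's encoder: the flag value, helper name, and zero-filled arc payloads are made up. It shows only the control flow being proposed: reserve a single byte for bytes-per-arc, and fall back to plain (unpadded) encoding when the widest arc would not fit in that byte.

```java
import java.io.ByteArrayOutputStream;

// Sketch: one-byte bytes-per-arc header with a fallback to linear encoding
// when the widest arc exceeds 255 bytes.
public class OneByteHeaderDemo {
    static final int ARCS_AS_FIXED_ARRAY = 0x20; // made-up flag value

    static byte[] encode(int[] arcSizes) {
        int max = 0;
        for (int s : arcSizes) max = Math.max(max, s);
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        if (max <= 255) {
            out.write(ARCS_AS_FIXED_ARRAY);
            out.write(arcSizes.length);      // numArcs (a vInt in reality)
            out.write(max);                  // one-byte bytes-per-arc header
            for (int s : arcSizes) {
                for (int i = 0; i < max; i++) out.write(0); // padded arc slot
            }
        } else {
            // "rewind and encode normally": no flag, no padding, no header
            for (int s : arcSizes) {
                for (int i = 0; i < s; i++) out.write(0);
            }
        }
        return out.toByteArray();
    }

    public static void main(String[] args) {
        System.out.println(encode(new int[]{2, 3}).length);  // 3 header + 2*3 slots = 9
        System.out.println(encode(new int[]{300}).length);   // fallback: just 300
    }
}
```

The trade-off is the one stated in the comment: a 3-byte saving per array node in the common case, and nodes with a huge outlier arc drop out of array encoding entirely, which also removes their padding waste.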
[jira] [Commented] (LUCENE-4678) FST should use paged byte[] instead of single contiguous byte[]
[ https://issues.apache.org/jira/browse/LUCENE-4678?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13551941#comment-13551941 ]

Michael McCandless commented on LUCENE-4678:

bq. the copyBytes is frightening though!

I know! But hopefully the random test catches any problems w/ it ... jenkins will tell us.

bq. So if we kept it as FST.BytesReader they would be largely unaffected?

+1, I moved back to that ... no more noise ... I'll attach new patch shortly.
[jira] [Updated] (LUCENE-4678) FST should use paged byte[] instead of single contiguous byte[]
[ https://issues.apache.org/jira/browse/LUCENE-4678?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Michael McCandless updated LUCENE-4678:

Attachment: LUCENE-4678.patch

New patch, move BytesReader back under FST. I think it's ready.
[jira] [Commented] (LUCENE-4682) Reduce wasted bytes in FST due to array arcs
[ https://issues.apache.org/jira/browse/LUCENE-4682?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13551944#comment-13551944 ]

Dawid Weiss commented on LUCENE-4682:

bq. Even more than the 271,187 I measured (20% smaller FST), I think because the FST is now smaller we use fewer bytes writing the delta-coded node addresses

Yes, these things are all tightly coupled. Dawid
[jira] [Commented] (LUCENE-4678) FST should use paged byte[] instead of single contiguous byte[]
[ https://issues.apache.org/jira/browse/LUCENE-4678?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13551947#comment-13551947 ]

Commit Tag Bot commented on LUCENE-4678:

[trunk commit] Michael McCandless
http://svn.apache.org/viewvc?view=revision&revision=1432459
LUCENE-4678: use paged byte[] under the hood for FST
[jira] [Commented] (LUCENE-4682) Reduce wasted bytes in FST due to array arcs
[ https://issues.apache.org/jira/browse/LUCENE-4682?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13551950#comment-13551950 ]

Michael McCandless commented on LUCENE-4682:

Another datapoint: the FreeDB suggester (tool in luceneutil to create/test it) is a 1.05 GB FST, and has 87.5 MB wasted bytes (~8%).
[jira] [Commented] (LUCENE-4677) Use vInt to encode node addresses inside FST
[ https://issues.apache.org/jira/browse/LUCENE-4677?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13551954#comment-13551954 ]

Commit Tag Bot commented on LUCENE-4677:

[trunk commit] Michael McCandless
http://svn.apache.org/viewvc?view=revision&revision=1432466
LUCENE-4677: use vInt not int to encode arc's target address in un-packed FSTs

Use vInt to encode node addresses inside FST

Key: LUCENE-4677
URL: https://issues.apache.org/jira/browse/LUCENE-4677
Project: Lucene - Core
Issue Type: Improvement
Reporter: Michael McCandless
Assignee: Michael McCandless
Fix For: 4.2, 5.0
Attachments: LUCENE-4677.patch, LUCENE-4677.patch, LUCENE-4677.patch

Today we use int, but towards enabling 2.1G sized FSTs, I'd like to make this vInt instead.
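The vInt encoding referenced by this commit stores 7 bits of payload per byte, setting the high bit on every byte except the last. A minimal sketch in the style of Lucene's DataOutput.writeVInt (the class and helper names here are made up for illustration) shows why small target addresses shrink from 4 fixed bytes to 1-2 bytes, at the cost of up to 5 bytes for the largest values:

```java
import java.io.ByteArrayOutputStream;

// Minimal vInt writer: 7 payload bits per byte, continuation flag in the
// high bit of every byte except the last.
public class VIntDemo {
    static byte[] writeVInt(int v) {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        while ((v & ~0x7F) != 0) {          // more than 7 bits remain
            out.write((v & 0x7F) | 0x80);   // low 7 bits + continuation bit
            v >>>= 7;
        }
        out.write(v);                       // final byte, high bit clear
        return out.toByteArray();
    }

    public static void main(String[] args) {
        System.out.println(writeVInt(5).length);        // 1 byte instead of 4
        System.out.println(writeVInt(300).length);      // 2 bytes
        System.out.println(writeVInt(1 << 28).length);  // 5 bytes (worst case)
    }
}
```

Since most arc targets in an FST are near the current node, the typical delta is small and the average cost lands well under 4 bytes, which is the saving the commit is after.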
[jira] [Commented] (SOLR-4287) Maven artifact file names do not match dist/ file names
[ https://issues.apache.org/jira/browse/SOLR-4287?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13551975#comment-13551975 ] Commit Tag Bot commented on SOLR-4287:

[trunk commit] Steven Rowe http://svn.apache.org/viewvc?view=revision&revision=1432483 SOLR-4287: Removed apache- prefix from Solr distribution and artifact filenames.

Maven artifact file names do not match dist/ file names
Key: SOLR-4287
URL: https://issues.apache.org/jira/browse/SOLR-4287
Project: Solr
Issue Type: Bug
Components: Build
Affects Versions: 4.0
Reporter: Ryan Ernst
Assignee: Steve Rowe
Priority: Blocker
Fix For: 4.1
Attachments: SOLR-4287_alternative.patch, SOLR-4287.patch

For the solr artifact, the war file name has the format solr-X.Y.Z.war: http://search.maven.org/#artifactdetails%7Corg.apache.solr%7Csolr%7C4.0.0%7Cwar However, when building from source or downloading the dist/ built war file, it is named apache-solr-X.Y.Z.war. This should really be the same... Preferably the apache- could just be removed, since the lucene build does not appear to use the same convention.
[jira] [Commented] (SOLR-4287) Maven artifact file names do not match dist/ file names
[ https://issues.apache.org/jira/browse/SOLR-4287?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13551980#comment-13551980 ] Commit Tag Bot commented on SOLR-4287:

[branch_4x commit] Steven Rowe http://svn.apache.org/viewvc?view=revision&revision=1432486 SOLR-4287: Removed apache- prefix from Solr distribution and artifact filenames. (merged trunk r1432483)
lucene-solr pull request: Fix API web link for IndexDeletionPolicy (against...
GitHub user arafalov opened a pull request: https://github.com/apache/lucene-solr/pull/6 Fix API web link for IndexDeletionPolicy (against solr 4.x branch)

I did it for Lucene 4.0, as I am not sure where 4.1 will live. In any case, this is better than the currently-dead 3.5 link.

You can merge this pull request into a Git repository by running:
$ git pull https://github.com/arafalov/lucene-solr patch-1
Alternatively you can review and apply these changes as the patch at: https://github.com/apache/lucene-solr/pull/6.patch

commit 5991153f26cc92c7cc5c95d6a1774eb3050b0643
Author: Alexandre Rafalovitch arafa...@gmail.com
Date: 2013-01-12T17:57:20Z
Fix API web link for IndexDeletionPolicy
I did it for Lucene 4.0, as I am not sure where 4.1 will live. In any case, this is better than the currently-dead 3.5 link.
lucene-solr pull request: Trivial documentation URL fix
Github user arafalov closed the pull request at: https://github.com/apache/lucene-solr/pull/5
[jira] [Resolved] (SOLR-4287) Maven artifact file names do not match dist/ file names
[ https://issues.apache.org/jira/browse/SOLR-4287?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Steve Rowe resolved SOLR-4287.
Resolution: Fixed
Committed to trunk and branch_4x. Thanks Ryan!
Re: [JENKINS-MAVEN] Lucene-Solr-Maven-4.x #212: POMs out of sync
The POMs really are out of sync:

-validate-maven-dependencies:
[licenses] MISSING sha1 checksum file for: /home/hudson/.m2/repository/org/apache/velocity/velocity/1.6.4/velocity-1.6.4.jar
[licenses] Scanned 32 JAR file(s) for licenses (in 0.14s.), 1 error(s).

I'll make an adjustment shortly. (I should also fix the log trimming regex for the Maven Jenkins jobs so that this error makes it into future failure emails.)

Steve

On Jan 12, 2013, at 12:06 PM, Apache Jenkins Server jenk...@builds.apache.org wrote:
Build: https://builds.apache.org/job/Lucene-Solr-Maven-4.x/212/ No tests ran. Build Log: [...truncated 11125 lines...]
[jira] [Commented] (SOLR-1028) Automatic core loading unloading for multicore
[ https://issues.apache.org/jira/browse/SOLR-1028?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13551991#comment-13551991 ] Steve Rowe commented on SOLR-1028:

Erick, can this issue be resolved?

Automatic core loading unloading for multicore
Key: SOLR-1028
URL: https://issues.apache.org/jira/browse/SOLR-1028
Project: Solr
Issue Type: New Feature
Components: multicore
Affects Versions: 4.0, 5.0
Reporter: Noble Paul
Assignee: Erick Erickson
Fix For: 4.1, 5.0
Attachments: jenkins.jpg, SOLR-1028.patch, SOLR-1028.patch, SOLR-1028_testnoise.patch

Use case: I have many small cores (say one per user) on a single Solr box. All the cores are not always needed, but when I need one I should be able to directly issue a search request, and the core must be STARTED automatically and the request served. This also requires an upper limit on the number of cores that should be loaded at any given point in time. If the limit is crossed, the CoreContainer must unload a core (preferably the least recently used core). There must be a choice of specifying some cores as fixed; these cores must never be unloaded.
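The use case amounts to an LRU cache with pinned entries. A hedged sketch of that policy, not Solr's actual CoreContainer API (class and method names here are illustrative):

```java
import java.util.Iterator;
import java.util.LinkedHashMap;
import java.util.Set;

// Sketch: load cores on demand, evict the least recently used one when a
// cap is exceeded, and never evict cores marked as "fixed".
class CoreCache {
    private final int maxLoaded;
    private final Set<String> fixed;
    // accessOrder = true gives LRU iteration order (oldest first).
    private final LinkedHashMap<String, Object> loaded =
        new LinkedHashMap<>(16, 0.75f, true);

    CoreCache(int maxLoaded, Set<String> fixed) {
        this.maxLoaded = maxLoaded;
        this.fixed = fixed;
    }

    Object getCore(String name) {
        Object core = loaded.get(name);  // touch: moves entry to MRU position
        if (core == null) {
            core = new Object();         // stand-in for actually loading a core
            loaded.put(name, core);
            evictIfNeeded();
        }
        return core;
    }

    private void evictIfNeeded() {
        Iterator<String> it = loaded.keySet().iterator();
        while (loaded.size() > maxLoaded && it.hasNext()) {
            if (!fixed.contains(it.next())) {
                it.remove();             // unload the LRU non-fixed core
            }
        }
    }

    int loadedCount() { return loaded.size(); }
}
```

For example, with a cap of 2 and core "a" fixed, requesting "a", "b", then "c" would unload "b", the least recently used core that is not pinned.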
[jira] [Created] (SOLR-4299) Failed with java.net.BindException Address already in use
Nithin Chacko Ninan created SOLR-4299:
Summary: Failed with java.net.BindException Address already in use
Key: SOLR-4299
URL: https://issues.apache.org/jira/browse/SOLR-4299
Project: Solr
Issue Type: Bug
Reporter: Nithin Chacko Ninan

Hello Team, We have configured Magento Solr search on our stage instance. While testing, we noticed that Solr is not working as expected. We searched the Solr configuration and used java -jar start.jar to check the port status. We noticed the above-mentioned issue (i.e. failed with java.net.BindException: Address already in use). Any comment or help will be appreciated. Thanks!
[jira] [Resolved] (LUCENE-2305) Introduce Version in more places long before 4.0
[ https://issues.apache.org/jira/browse/LUCENE-2305?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Shai Erera resolved LUCENE-2305. Resolution: Won't Fix Fix Version/s: (was: 4.2) (was: 5.0) 4.0 is out long ago :). And I don't think we need that issue if we want to add Version to more places. Introduce Version in more places long before 4.0 Key: LUCENE-2305 URL: https://issues.apache.org/jira/browse/LUCENE-2305 Project: Lucene - Core Issue Type: Improvement Components: core/other Reporter: Shai Erera We need to introduce Version in as many places as we can (wherever it makes sense of course), and preferably long before 4.0 (or shall I say 3.9?) is out. That way, we can have a bunch of deprecated API now, that will be gone in 4.0, rather than doing it one class at a time and never finish :). The purpose is to introduce Version wherever it is mandatory now, and also in places where we think it might be useful in the future (like most of our Analyzers, configured classes and configuration classes). I marked this issue for 3.1, though I don't expect it to end in 3.1. I still think it will be done one step at a time, perhaps for cluster of classes together. But on the other hand I don't want to mark it for 4.0.0 because that needs to be resolved much sooner. So if I had a 3.9 version defined, I'd mark it for 3.9. We can do several commits in one issue right? So this one can live for a while in JIRA, while we gradually convert more and more classes. The first candidate is InstantiatedIndexWriter which probably should take an IndexWriterConfig. While I converted the code to use IWC, I've noticed Instantiated defaults its maxFieldLength to the current default (10,000) which is deprecated. I couldn't change it for back-compat reasons. But we can upgrade it to accept IWC, and set to unlimited if the version is onOrAfter 3.1, otherwise stay w/ the deprecated default. 
if it's acceptable to have several commits in one issue, I can start w/ Instantiated, post a patch and then we can continue to more classes.
[jira] [Updated] (SOLR-4299) Failed with java.net.BindException Address already in use
[ https://issues.apache.org/jira/browse/SOLR-4299?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Nithin Chacko Ninan updated SOLR-4299:

Description: Hello Team, We have configured Magento Solr search on our stage instance. While testing, we noticed that Solr is not working as expected. We searched the Solr configuration and used java -jar start.jar to check the port status. We noticed the below-mentioned issue (i.e. failed with java.net.BindException: Address already in use). Any comment or help will be appreciated.

INFO: [] Registered new searcher Searcher@668db25b main
2013-01-12 19:36:50.223:WARN::failed SocketConnector@0.0.0.0:8983: java.net.BindException: Address already in use
2013-01-12 19:36:50.223:WARN::failed Server@7ca7700a: java.net.BindException: Address already in use
2013-01-12 19:36:50.223:WARN::EXCEPTION java.net.BindException: Address already in use
 at java.net.PlainSocketImpl.socketBind(Native Method)
 at java.net.AbstractPlainSocketImpl.bind(AbstractPlainSocketImpl.java:376)
 at java.net.ServerSocket.bind(ServerSocket.java:376)
 at java.net.ServerSocket.<init>(ServerSocket.java:237)
 at java.net.ServerSocket.<init>(ServerSocket.java:181)
 at org.mortbay.jetty.bio.SocketConnector.newServerSocket(SocketConnector.java:80)
 at org.mortbay.jetty.bio.SocketConnector.open(SocketConnector.java:73)
 at org.mortbay.jetty.AbstractConnector.doStart(AbstractConnector.java:283)
 at org.mortbay.jetty.bio.SocketConnector.doStart(SocketConnector.java:147)
 at org.mortbay.component.AbstractLifeCycle.start(AbstractLifeCycle.java:50)
 at org.mortbay.jetty.Server.doStart(Server.java:235)
 at org.mortbay.component.AbstractLifeCycle.start(AbstractLifeCycle.java:50)
 at org.mortbay.xml.XmlConfiguration.main(XmlConfiguration.java:985)
 at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
 at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
 at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
 at java.lang.reflect.Method.invoke(Method.java:601)
 at org.mortbay.start.Main.invokeMain(Main.java:194)
 at org.mortbay.start.Main.start(Main.java:534)
 at org.mortbay.start.Main.start(Main.java:441)
 at org.mortbay.start.Main.main(Main.java:119)
thanks!
[jira] [Resolved] (SOLR-4299) Failed with java.net.BindException Address already in use
[ https://issues.apache.org/jira/browse/SOLR-4299?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Steve Rowe resolved SOLR-4299.
Resolution: Invalid
Assignee: Steve Rowe

Please post questions about using Solr to the solr-user mailing list, rather than creating JIRA issues - see [http://lucene.apache.org/solr/discussion.html]. You might like the following, which I found by searching the interweb:
* http://stackoverflow.com/questions/6645253/solr-configuration
* http://javarevisited.blogspot.com/2011/12/address-already-use-jvm-bind-exception.html
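The trace in this issue simply means another process (often a previous Solr/Jetty instance that never exited) still holds port 8983. A hedged way to reproduce and check the condition from plain Java, independent of Solr:

```java
import java.io.IOException;
import java.net.ServerSocket;

public class PortCheck {
    // True if nothing currently holds the port (i.e. we could bind it).
    static boolean portFree(int port) {
        try (ServerSocket probe = new ServerSocket(port)) {
            return true;
        } catch (IOException e) {
            // java.net.BindException: Address already in use
            return false;
        }
    }

    public static void main(String[] args) throws IOException {
        // Hold an ephemeral port ourselves, then probe it: this is the same
        // failure mode Jetty hits when a stale process still owns 8983.
        try (ServerSocket holder = new ServerSocket(0)) {
            System.out.println(portFree(holder.getLocalPort()));
        }
    }
}
```

On Linux, `netstat -nlp | grep 8983` or `lsof -i :8983` identifies the process that should be stopped before restarting Solr.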
[jira] [Commented] (SOLR-3735) Relocate the example mime-to-extension mapping
[ https://issues.apache.org/jira/browse/SOLR-3735?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13552040#comment-13552040 ] Commit Tag Bot commented on SOLR-3735:

[branch_4x commit] Steven Rowe http://svn.apache.org/viewvc?view=revision&revision=1432501 SOLR-3735: Maven configuration: upgrade velocity dependency from 1.6.4 to 1.7

Relocate the example mime-to-extension mapping
Key: SOLR-3735
URL: https://issues.apache.org/jira/browse/SOLR-3735
Project: Solr
Issue Type: Improvement
Components: web gui
Affects Versions: 4.0-BETA, 4.0
Reporter: Erik Hatcher
Assignee: Erik Hatcher
Priority: Minor
Fix For: 4.1, 5.0
Attachments: SOLR-3735.patch

A mime-to-extension mapping was added to VelocityResponseWriter recently. This really belongs in the templates themselves, not in VrW, as it is specific to the example search results and not meant for all VrW templates.
[jira] [Updated] (LUCENE-4682) Reduce wasted bytes in FST due to array arcs
[ https://issues.apache.org/jira/browse/LUCENE-4682?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Robert Muir updated LUCENE-4682:
Attachment: LUCENE-4682.patch

Mike can you try this patch on your corpus? It cuts us over to vInt for the maxBytesPerArc (saving 3 bytes for the unpacked case), and adds an acceptable overhead for array arcs (currently 1.25). For the kuromoji packed case, this seems to solve the waste:
[java] 53645 nodes, 253185 arcs, 1309077 bytes... done
[jira] [Commented] (LUCENE-4682) Reduce wasted bytes in FST due to array arcs
[ https://issues.apache.org/jira/browse/LUCENE-4682?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13552057#comment-13552057 ] Michael McCandless commented on LUCENE-4682:

+1 This is much cleaner (write the header at the end). I built the AnalyzingSuggester for FreeDB: trunk is 1.046 GB and with the patch it's 0.917 GB = ~9% smaller!
[jira] [Commented] (LUCENE-4682) Reduce wasted bytes in FST due to array arcs
[ https://issues.apache.org/jira/browse/LUCENE-4682?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13552061#comment-13552061 ] Robert Muir commented on LUCENE-4682:

I can clean up and commit the patch with the heuristic commented out (so we still get the cutover to vInt, which I think is an obvious win?). This way we can benchmark and make sure the heuristic is set appropriately / doesn't hurt performance?
[jira] [Commented] (LUCENE-4682) Reduce wasted bytes in FST due to array arcs
[ https://issues.apache.org/jira/browse/LUCENE-4682?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13552063#comment-13552063 ] Michael McCandless commented on LUCENE-4682:

+1
[jira] [Commented] (LUCENE-4682) Reduce wasted bytes in FST due to array arcs
[ https://issues.apache.org/jira/browse/LUCENE-4682?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13552065#comment-13552065 ] Dawid Weiss commented on LUCENE-4682:

+1. Nice.
[jira] [Commented] (LUCENE-4682) Reduce wasted bytes in FST due to array arcs
[ https://issues.apache.org/jira/browse/LUCENE-4682?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13552067#comment-13552067 ] Uwe Schindler commented on LUCENE-4682:

+1
[jira] [Commented] (LUCENE-4682) Reduce wasted bytes in FST due to array arcs
[ https://issues.apache.org/jira/browse/LUCENE-4682?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13552071#comment-13552071 ] Robert Muir commented on LUCENE-4682: - ok i committed the vInt for maxBytesPerArc, but left out the heuristic (so we still have the waste!!!) Here's the comment i added:
{code}
// TODO: try to avoid wasteful cases: disable doFixedArray in that case
/*
 * LUCENE-4682: what is a fair heuristic here?
 * It could involve some of these:
 *  1. how busy the node is: nodeIn.inputCount relative to frontier[0].inputCount?
 *  2. how much binSearch saves over scan: nodeIn.numArcs
 *  3. waste: numBytes vs numBytesExpanded
 *
 * the one below just looks at #3
if (doFixedArray) {
  // rough heuristic: make this 1.25 waste factor a parameter to the phd ctor
  int numBytes = lastArcStart - startAddress;
  int numBytesExpanded = maxBytesPerArc * nodeIn.numArcs;
  if (numBytesExpanded > numBytes * 1.25) {
    doFixedArray = false;
  }
}
*/
{code}
I think it would just be best to do some performance benchmarks and figure this out. I know all the kuromoji waste is at node.depth=1 exactly. Also I indexed all of geonames with this heuristic and it barely changed the FST size:
trunk: FST: 45296685, packedFST: 39083451
vInt maxBytesPerArc: FST: 45052386, packedFST: 39083451
vInt maxBytesPerArc + heuristic: FST: 44988400, packedFST: 39029108
So the waste and the heuristic don't affect all FSTs, only certain ones.
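The commented-out heuristic quoted in that comment boils down to a single comparison: skip the fixed-array encoding when padding would inflate the node beyond an acceptable waste factor. A standalone sketch of that comparison, with hypothetical names (this is not the committed Lucene code, which works on the builder's internal state):

```java
// Hypothetical standalone form of the waste heuristic: encode arcs as a
// fixed-size array only if the padded size stays within a waste factor of
// the un-padded size. Names are illustrative, not real Lucene APIs.
public class ArcArrayHeuristic {
    static boolean useFixedArray(int numBytes, int numArcs, int maxBytesPerArc,
                                 double acceptableWasteFactor) {
        int numBytesExpanded = maxBytesPerArc * numArcs;
        return numBytesExpanded <= numBytes * acceptableWasteFactor;
    }

    public static void main(String[] args) {
        // 10 arcs totalling 67 bytes; padding to 40 bytes/arc would need 400 bytes:
        System.out.println(useFixedArray(67, 10, 40, 1.25)); // too wasteful
        // 10 arcs totalling 38 bytes; padding to 4 bytes/arc needs only 40 bytes:
        System.out.println(useFixedArray(38, 10, 4, 1.25));  // acceptable
    }
}
```

This also shows why a float knob (as proposed later in the thread) is a natural generalization of the boolean allowArrayArcs: the waste factor becomes the tunable parameter.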
[jira] [Commented] (LUCENE-4682) Reduce wasted bytes in FST due to array arcs
[ https://issues.apache.org/jira/browse/LUCENE-4682?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13552072#comment-13552072 ] Commit Tag Bot commented on LUCENE-4682: [trunk commit] Robert Muir http://svn.apache.org/viewvc?view=revision&revision=1432522 LUCENE-4682: vInt-encode maxBytesPerArc
[jira] [Updated] (LUCENE-4678) FST should use paged byte[] instead of single contiguous byte[]
[ https://issues.apache.org/jira/browse/LUCENE-4678?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Michael McCandless updated LUCENE-4678: --- Attachment: LUCENE-4678.patch Patch, fixing FST.pack to not double-buffer again, using the new BytesStore.truncate method to roll back the last N bytes ... FST should use paged byte[] instead of single contiguous byte[] --- Key: LUCENE-4678 URL: https://issues.apache.org/jira/browse/LUCENE-4678 Project: Lucene - Core Issue Type: Improvement Components: core/FSTs Reporter: Michael McCandless Assignee: Michael McCandless Fix For: 4.2, 5.0 Attachments: LUCENE-4678.patch, LUCENE-4678.patch, LUCENE-4678.patch, LUCENE-4678.patch, LUCENE-4678.patch The single byte[] we use today has several limitations, eg it limits us to 2.1 GB FSTs (and suggesters in the wild are getting close to this limit), and it causes big RAM spikes during building when the array has to grow. I took basically the same approach as LUCENE-3298, but I want to break out this patch separately from changing all int -> long for 2.1 GB support.
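The paged byte[] store this issue proposes can be sketched minimally: a logical address is split into a block index and an offset within the block, so no single array ever needs to grow, and long addresses lift the 2.1 GB ceiling. A toy illustration under those assumptions; this is not the actual BytesStore class.

```java
import java.util.ArrayList;
import java.util.List;

// Toy paged byte[] store (illustrative only, not Lucene's BytesStore):
// addresses are split as blockIndex = addr >> BLOCK_BITS,
// offset = addr & BLOCK_MASK, so growth just appends a fixed-size block.
public class PagedBytes {
    static final int BLOCK_BITS = 15;              // 32 KB blocks
    static final int BLOCK_SIZE = 1 << BLOCK_BITS;
    static final int BLOCK_MASK = BLOCK_SIZE - 1;

    final List<byte[]> blocks = new ArrayList<>();
    int upto = BLOCK_SIZE; // forces allocation on the first write

    void writeByte(byte b) {
        if (upto == BLOCK_SIZE) {       // current block full: allocate the next
            blocks.add(new byte[BLOCK_SIZE]);
            upto = 0;
        }
        blocks.get(blocks.size() - 1)[upto++] = b;
    }

    byte readByte(long address) {
        return blocks.get((int) (address >> BLOCK_BITS))[(int) (address & BLOCK_MASK)];
    }

    public static void main(String[] args) {
        PagedBytes store = new PagedBytes();
        for (int i = 0; i < 100_000; i++) store.writeByte((byte) i);
        System.out.println("blocks: " + store.blocks.size()); // ceil(100000 / 32768)
        System.out.println("last byte: " + store.readByte(99_999));
    }
}
```

Unlike a single contiguous array, growing never copies existing bytes, which is exactly the RAM-spike problem the issue describes.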
[jira] [Commented] (LUCENE-4682) Reduce wasted bytes in FST due to array arcs
[ https://issues.apache.org/jira/browse/LUCENE-4682?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13552075#comment-13552075 ] Robert Muir commented on LUCENE-4682: - Another simple idea: instead of boolean allowArrayArcs we just make this a float: acceptableArrayArcOverhead (or maybe a better name). you would pass 0 to disable array arcs completely (and we'd internally still have our boolean allowArrayArcs and not waste time computing stuff if this is actually <= 0)
[jira] [Updated] (LUCENE-4417) Re-Add the backwards compatibility tests to 4.1 branch
[ https://issues.apache.org/jira/browse/LUCENE-4417?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Steve Rowe updated LUCENE-4417: --- Priority: Blocker (was: Major) We shouldn't release 4.1 until at least lucene-core backwards tests are re-enabled. Re-Add the backwards compatibility tests to 4.1 branch -- Key: LUCENE-4417 URL: https://issues.apache.org/jira/browse/LUCENE-4417 Project: Lucene - Core Issue Type: Task Components: general/test Reporter: Uwe Schindler Assignee: Uwe Schindler Priority: Blocker Fix For: 4.1 In 4.0 we have no backwards compatibility, but in 4.1 we must again ivy-retrieve the 4.0 JAR file and run the core tests again (like in 3.6). We may think about other modules, too, so all modules that must be backwards compatible should be added to this build. I will work on this once we have a release candidate in Maven Central. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Updated] (LUCENE-2125) Ability to store and retrieve attributes in the inverted index
[ https://issues.apache.org/jira/browse/LUCENE-2125?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Steve Rowe updated LUCENE-2125: --- Fix Version/s: (was: 4.1) 4.2 Ability to store and retrieve attributes in the inverted index -- Key: LUCENE-2125 URL: https://issues.apache.org/jira/browse/LUCENE-2125 Project: Lucene - Core Issue Type: New Feature Components: core/index Affects Versions: 4.0-ALPHA Reporter: Michael Busch Assignee: Michael Busch Priority: Minor Fix For: 4.2 Now that we have the cool attribute-based TokenStream API and also the great new flexible indexing features, the next logical step is to allow storing the attributes inline in the posting lists. Currently this is only supported for the PayloadAttribute. The flex search APIs already provide an AttributeSource, so there will be a very clean and performant symmetry. It should be seamlessly possible for the user to define a new attribute, add it to the TokenStream, and then retrieve it from the flex search APIs. What I'm planning to do is to add additional methods to the token attributes (e.g. by adding a new class TokenAttributeImpl, which extends AttributeImpl and is the super class of all impls in o.a.l.a.tokenattributes):
- void serialize(DataOutput)
- void deserialize(DataInput)
- boolean storeInIndex()
The indexer will only call the serialize method of a TokenAttributeImpl in case its storeInIndex() returns true. The big advantage here is the ease-of-use: A user can implement in one place everything necessary to add the attribute to the index. Btw: I'd like to introduce DataOutput and DataInput as super classes of IndexOutput and IndexInput. They will contain methods like readByte(), readVInt(), etc., but methods such as close(), getFilePointer() etc. will stay in the subclasses. Currently the payload concept is hardcoded in TermsHashPerField and FreqProxTermsWriterPerField.
These classes take care of copying the contents of the PayloadAttribute over into the intermediate in-memory postinglist representation and reading it again. Ideally these classes should not know about specific attributes, but only call serialize() on those attributes that shall be stored in the posting list. We also need to change the PositionsEnum and PositionsConsumer APIs to deal with attributes instead of payloads. I think the new codecs should all support storing attributes. Only the preflex one should be hardcoded to only take the PayloadAttribute into account. We'll possibly need another extension point that allows us to influence compression across multiple postings. Today we use the length-compression trick for the payloads: if the previous payload had the same length as the current one, we don't store the length explicitly again, but only set a bit in the shifted position VInt. Since often all payloads of one posting list have the same length, this results in effective compression. Now an advanced user might want to implement a similar encoding, where it's not enough to just control serialization of a single value, but where e.g. the previous position can be taken into account to decide how to encode a value. I'm not sure yet what this extension point should look like. Maybe the flex APIs are actually already sufficient. One major goal of this feature is performance: It ought to be more efficient to e.g. define an attribute that writes and reads a single VInt than storing that VInt as a payload. The payload has the overhead of converting the data into a byte array first. An attribute on the other hand should be able to call 'int value = dataInput.readVInt();' directly without the byte[] indirection. After this part is done I'd like to use a very similar approach for column-stride fields.
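The proposed serialize/deserialize/storeInIndex contract could look roughly like the sketch below. This is purely illustrative: WeightAttribute is a made-up attribute, and java.io's DataOutput/DataInput stand in for the Lucene classes the proposal would introduce, so the example compiles standalone.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInput;
import java.io.DataInputStream;
import java.io.DataOutput;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;

// Sketch of the proposed per-attribute (de)serialization contract.
// WeightAttribute is hypothetical; java.io streams stand in for the
// proposed Lucene DataOutput/DataInput.
public class WeightAttributeDemo {
    static class WeightAttribute {
        int weight;
        // The indexer would only serialize attributes that opt in:
        boolean storeInIndex() { return true; }
        void serialize(DataOutput out) throws IOException { out.writeInt(weight); }
        void deserialize(DataInput in) throws IOException { weight = in.readInt(); }
    }

    // Round-trip a value through bytes, as the indexer/codec would.
    static int roundTrip(int value) {
        try {
            WeightAttribute att = new WeightAttribute();
            att.weight = value;
            ByteArrayOutputStream bytes = new ByteArrayOutputStream();
            att.serialize(new DataOutputStream(bytes));
            WeightAttribute read = new WeightAttribute();
            read.deserialize(new DataInputStream(new ByteArrayInputStream(bytes.toByteArray())));
            return read.weight;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(roundTrip(42)); // prints 42
    }
}
```

The point of the proposal is that the attribute writes directly to the output stream, avoiding the intermediate byte[] a payload would require.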
[jira] [Updated] (LUCENE-1743) MMapDirectory should only mmap large files, small files should be opened using SimpleFS/NIOFS
[ https://issues.apache.org/jira/browse/LUCENE-1743?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Steve Rowe updated LUCENE-1743: --- Fix Version/s: (was: 4.1) 4.2 MMapDirectory should only mmap large files, small files should be opened using SimpleFS/NIOFS - Key: LUCENE-1743 URL: https://issues.apache.org/jira/browse/LUCENE-1743 Project: Lucene - Core Issue Type: Improvement Components: core/store Affects Versions: 2.9 Reporter: Uwe Schindler Assignee: Uwe Schindler Fix For: 4.2 This is a followup to LUCENE-1741: Javadocs state (in FileChannel#map): "For most operating systems, mapping a file into memory is more expensive than reading or writing a few tens of kilobytes of data via the usual read and write methods. From the standpoint of performance it is generally only worth mapping relatively large files into memory." MMapDirectory should get a user-configurable size parameter that is a lower limit for mmapping files. All files with a size < limit should be opened using a conventional IndexInput from SimpleFS or NIO (another configuration option for the fallback?).
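The proposed lower limit amounts to a simple size check at file-open time. A toy sketch with illustrative names (MMapCutoff and chooseInput are not real Lucene APIs; the actual implementation would dispatch between directory implementations rather than return a string):

```java
// Toy sketch of the proposed cutoff: mmap files at or above a configurable
// threshold, fall back to a conventional read for small files.
// Hypothetical names, not the Lucene API.
public class MMapCutoff {
    static String chooseInput(long fileLength, long minMMapSize) {
        return fileLength >= minMMapSize ? "mmap" : "niofs";
    }

    public static void main(String[] args) {
        long threshold = 64 * 1024; // e.g. only mmap files of 64 KB or more
        System.out.println(chooseInput(100, threshold));        // small file: niofs
        System.out.println(chooseInput(10_000_000, threshold)); // large file: mmap
    }
}
```

This matches the FileChannel#map guidance quoted above: the fixed cost of establishing a mapping only pays off for relatively large files.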
[jira] [Commented] (LUCENE-4246) Fix IndexWriter.close() to not commit or wait for pending merges
[ https://issues.apache.org/jira/browse/LUCENE-4246?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13552092#comment-13552092 ] Steve Rowe commented on LUCENE-4246: I'd like to push this to 4.2. Any objections? Fix IndexWriter.close() to not commit or wait for pending merges Key: LUCENE-4246 URL: https://issues.apache.org/jira/browse/LUCENE-4246 Project: Lucene - Core Issue Type: Bug Reporter: Robert Muir Assignee: Robert Muir Fix For: 4.1 -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Resolved] (LUCENE-1689) supplementary character handling
[ https://issues.apache.org/jira/browse/LUCENE-1689?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Steve Rowe resolved LUCENE-1689. Resolution: Fixed Fix Version/s: (was: 4.2) (was: 5.0) Resolving. Any remaining problems can be opened as separate issues. supplementary character handling Key: LUCENE-1689 URL: https://issues.apache.org/jira/browse/LUCENE-1689 Project: Lucene - Core Issue Type: Improvement Components: modules/analysis Reporter: Robert Muir Priority: Minor Attachments: LUCENE-1689_lowercase_example.txt, LUCENE-1689.patch, LUCENE-1689.patch, LUCENE-1689.patch, testCurrentBehavior.txt for Java 5. Java 5 is based on unicode 4, which means variable-width encoding. supplementary character support should be fixed for code that works with char/char[] For example: StandardAnalyzer, SimpleAnalyzer, StopAnalyzer, etc should at least be changed so they don't actually remove suppl characters, or modified to look for surrogates and behave correctly. LowercaseFilter should be modified to lowercase suppl. characters correctly. CharTokenizer should either be deprecated or changed so that isTokenChar() and normalize() use int. in all of these cases code should remain optimized for the BMP case, and suppl characters should be the exception, but still work. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Updated] (LUCENE-3380) enable FileSwitchDirectory randomly in tests and fix compound-file/NoSuchDirectoryException bugs
[ https://issues.apache.org/jira/browse/LUCENE-3380?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Steve Rowe updated LUCENE-3380: --- Fix Version/s: (was: 4.1) 4.2 enable FileSwitchDirectory randomly in tests and fix compound-file/NoSuchDirectoryException bugs Key: LUCENE-3380 URL: https://issues.apache.org/jira/browse/LUCENE-3380 Project: Lucene - Core Issue Type: Bug Reporter: Robert Muir Assignee: Robert Muir Fix For: 4.2 Attachments: LUCENE-3380.patch Looks like FileSwitchDirectory has the same bugs in it as LUCENE-3374. We should randomly enable this guy in tests and flush them all out the same way. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Updated] (LUCENE-3888) split off the spell check word and surface form in spell check dictionary
[ https://issues.apache.org/jira/browse/LUCENE-3888?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Steve Rowe updated LUCENE-3888: --- Fix Version/s: (was: 4.1) 4.2 split off the spell check word and surface form in spell check dictionary - Key: LUCENE-3888 URL: https://issues.apache.org/jira/browse/LUCENE-3888 Project: Lucene - Core Issue Type: Improvement Components: modules/spellchecker Reporter: Koji Sekiguchi Assignee: Koji Sekiguchi Priority: Minor Fix For: 4.2 Attachments: LUCENE-3888.patch, LUCENE-3888.patch, LUCENE-3888.patch, LUCENE-3888.patch, LUCENE-3888.patch, LUCENE-3888.patch The "did you mean?" feature using Lucene's spell checker cannot work well for the Japanese environment, unfortunately, and is a longstanding problem, because the logic needs comparatively long text to check spells, but for some languages (e.g. Japanese), most words are too short to use the spell checker. I think, for at least Japanese, things can be improved if we split off the spell check word and surface form in the spell check dictionary. Then we can use ReadingAttribute for spell checking but CharTermAttribute for suggesting, for example.
[jira] [Updated] (LUCENE-3298) FST has hard limit max size of 2.1 GB
[ https://issues.apache.org/jira/browse/LUCENE-3298?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Michael McCandless updated LUCENE-3298: --- Attachment: LUCENE-3298.patch Initial patch with int -> long in lots of places ... the Test2BFST is still running ... FST has hard limit max size of 2.1 GB - Key: LUCENE-3298 URL: https://issues.apache.org/jira/browse/LUCENE-3298 Project: Lucene - Core Issue Type: Improvement Components: core/FSTs Reporter: Michael McCandless Assignee: Michael McCandless Priority: Minor Attachments: LUCENE-3298.patch, LUCENE-3298.patch, LUCENE-3298.patch The FST uses a single contiguous byte[] under the hood, which in java is indexed by int so we cannot grow this over Integer.MAX_VALUE. It also internally encodes references to this array as vInt. We could switch this to a paged byte[] and make the FST far larger. But I think this is low priority... I'm not going to work on it any time soon.
[jira] [Commented] (SOLR-4217) post.jar ignores -Dparams when -Durl is used
[ https://issues.apache.org/jira/browse/SOLR-4217?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13552103#comment-13552103 ] Alexandre Rafalovitch commented on SOLR-4217: - Would it be possible to fit this into 4.1? I am trying to use this for an example and it is very clunky with the current workaround: java -Dauto -Durl="http://localhost:8983/solr/multivalued/update?f.to.split=true&f.to.separator=;" -jar post.jar multivalued/multivalued.csv The example should be out after 4.1, but it will not wait until 4.2. The change should be trivial, something like:
{code}
urlStr = System.getProperty("url");
if (urlStr == null) {
  urlStr = SimplePostTool.appendParam(DEFAULT_POST_URL, params);
} else {
  urlStr = SimplePostTool.appendParam(urlStr, params);
}
{code}
I just don't have the environment setup to do a full patch myself yet. post.jar ignores -Dparams when -Durl is used Key: SOLR-4217 URL: https://issues.apache.org/jira/browse/SOLR-4217 Project: Solr Issue Type: Bug Components: update Affects Versions: 4.0 Reporter: Alexandre Rafalovitch Priority: Minor Fix For: 4.2, 5.0 When post.jar is used with a custom URL (e.g. for multi-core), it silently ignores the -Dparams flag and requires parameters to be appended directly to the -Durl value. The problem is the following code:
{code}
String params = System.getProperty("params", "");
urlStr = System.getProperty("url", SimplePostTool.appendParam(DEFAULT_POST_URL, params));
{code}
The workaround exists (by using -Durl="http://customurl?param1=value&param2=value"), but it is both undocumented as a special case and clunky as the URL and params may be coming from different places. It would be good to have this consistent.
[JENKINS-MAVEN] Lucene-Solr-Maven-4.x #213: POMs out of sync
Build: https://builds.apache.org/job/Lucene-Solr-Maven-4.x/213/
1 tests failed.
FAILED: org.apache.solr.cloud.SyncSliceTest.testDistribSearch
Error Message: shard1 should have just been set up to be inconsistent - but it's still consistent
Stack Trace:
java.lang.AssertionError: shard1 should have just been set up to be inconsistent - but it's still consistent
	at __randomizedtesting.SeedInfo.seed([400C776269C4BF8E:C1EAF97A1E9BDFB2]:0)
	at org.junit.Assert.fail(Assert.java:93)
	at org.junit.Assert.assertTrue(Assert.java:43)
	at org.junit.Assert.assertNotNull(Assert.java:526)
	at org.apache.solr.cloud.SyncSliceTest.doTest(SyncSliceTest.java:214)
	at org.apache.solr.BaseDistributedSearchTestCase.testDistribSearch(BaseDistributedSearchTestCase.java:794)
	at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
	at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
	at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
	at java.lang.reflect.Method.invoke(Method.java:616)
	at com.carrotsearch.randomizedtesting.RandomizedRunner.invoke(RandomizedRunner.java:1559)
	at com.carrotsearch.randomizedtesting.RandomizedRunner.access$600(RandomizedRunner.java:79)
	at com.carrotsearch.randomizedtesting.RandomizedRunner$6.evaluate(RandomizedRunner.java:737)
	at com.carrotsearch.randomizedtesting.RandomizedRunner$7.evaluate(RandomizedRunner.java:773)
	at com.carrotsearch.randomizedtesting.RandomizedRunner$8.evaluate(RandomizedRunner.java:787)
	at com.carrotsearch.randomizedtesting.rules.SystemPropertiesRestoreRule$1.evaluate(SystemPropertiesRestoreRule.java:53)
	at org.apache.lucene.util.TestRuleSetupTeardownChained$1.evaluate(TestRuleSetupTeardownChained.java:50)
	at org.apache.lucene.util.TestRuleFieldCacheSanity$1.evaluate(TestRuleFieldCacheSanity.java:51)
	at org.apache.lucene.util.AbstractBeforeAfterRule$1.evaluate(AbstractBeforeAfterRule.java:46)
	at com.carrotsearch.randomizedtesting.rules.SystemPropertiesInvariantRule$1.evaluate(SystemPropertiesInvariantRule.java:55)
	at org.apache.lucene.util.TestRuleThreadAndTestName$1.evaluate(TestRuleThreadAndTestName.java:49)
	at org.apache.lucene.util.TestRuleIgnoreAfterMaxFailures$1.evaluate(TestRuleIgnoreAfterMaxFailures.java:70)
	at org.apache.lucene.util.TestRuleMarkFailure$1.evaluate(TestRuleMarkFailure.java:48)
	at com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)
	at com.carrotsearch.randomizedtesting.ThreadLeakControl$StatementRunner.run(ThreadLeakControl.java:358)
	at com.carrotsearch.randomizedtesting.ThreadLeakControl.forkTimeoutingTask(ThreadLeakControl.java:782)
	at com.carrotsearch.randomizedtesting.ThreadLeakControl$3.evaluate(ThreadLeakControl.java:442)
	at com.carrotsearch.randomizedtesting.RandomizedRunner.runSingleTest(RandomizedRunner.java:746)
	at com.carrotsearch.randomizedtesting.RandomizedRunner$3.evaluate(RandomizedRunner.java:648)
	at com.carrotsearch.randomizedtesting.RandomizedRunner$4.evaluate(RandomizedRunner.java:682)
	at com.carrotsearch.randomizedtesting.RandomizedRunner$5.evaluate(RandomizedRunner.java:693)
	at com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)
	at com.carrotsearch.randomizedtesting.rules.SystemPropertiesRestoreRule$1.evaluate(SystemPropertiesRestoreRule.java:53)
	at org.apache.lucene.util.AbstractBeforeAfterRule$1.evaluate(AbstractBeforeAfterRule.java:46)
	at org.apache.lucene.util.TestRuleStoreClassName$1.evaluate(TestRuleStoreClassName.java:42)
	at com.carrotsearch.randomizedtesting.rules.SystemPropertiesInvariantRule$1.evaluate(SystemPropertiesInvariantRule.java:55)
	at com.carrotsearch.randomizedtesting.rules.NoShadowingOrOverridesOnMethodsRule$1.evaluate(NoShadowingOrOverridesOnMethodsRule.java:39)
	at com.carrotsearch.randomizedtesting.rules.NoShadowingOrOverridesOnMethodsRule$1.evaluate(NoShadowingOrOverridesOnMethodsRule.java:39)
	at com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)
	at org.apache.lucene.util.TestRuleAssertionsRequired$1.evaluate(TestRuleAssertionsRequired.java:43)
	at org.apache.lucene.util.TestRuleMarkFailure$1.evaluate(TestRuleMarkFailure.java:48)
	at org.apache.lucene.util.TestRuleIgnoreAfterMaxFailures$1.evaluate(TestRuleIgnoreAfterMaxFailures.java:70)
	at org.apache.lucene.util.TestRuleIgnoreTestSuites$1.evaluate(TestRuleIgnoreTestSuites.java:55)
	at com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)
	at com.carrotsearch.randomizedtesting.ThreadLeakControl$StatementRunner.run(ThreadLeakControl.java:358)
[jira] [Commented] (LUCENE-4417) Re-Add the backwards compatibility tests to 4.1 branch
[ https://issues.apache.org/jira/browse/LUCENE-4417?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13552118#comment-13552118 ] Robert Muir commented on LUCENE-4417: - Seems pretty complicated, core tests depend upon things like test-framework and codecs, which have experimental APIs. (look at the 4.0 codebase if you don't believe me, whole experimental codecs have been folded into core functionality and removed, and so on). Even if we were to do this, i don't think it would be maintainable. For example, take issues that will seriously change the codec API like LUCENE-4547. I'd be the first to simply disable the whole thing rather than waste a bunch of time fixing outdated tests and experimental codecs from a previous release. I think it would be more bang for the buck to integrate an API comparison tool (like jdiff or whatever) that shows the breaks so we know what they are.