[jira] [Updated] (LUCENE-5141) CheckIndex.fixIndex doesn't need a Codec
[ https://issues.apache.org/jira/browse/LUCENE-5141?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Adrien Grand updated LUCENE-5141: - Fix Version/s: 4.5 5.0 CheckIndex.fixIndex doesn't need a Codec Key: LUCENE-5141 URL: https://issues.apache.org/jira/browse/LUCENE-5141 Project: Lucene - Core Issue Type: Task Reporter: Adrien Grand Assignee: Adrien Grand Priority: Trivial Fix For: 5.0, 4.5 Attachments: LUCENE-5141.patch CheckIndex.fixIndex takes a codec as an argument although it doesn't need one. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Updated] (LUCENE-4985) Make it easier to mix different kinds of FacetRequests
[ https://issues.apache.org/jira/browse/LUCENE-4985?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Shai Erera updated LUCENE-4985: --- Attachment: LUCENE-4985.patch

Patch addresses the following:
* Added FacetRequest.createFacetsAggregator(FacetIndexingParams). All requests implement it except RangeFacetRequest, which returns null. The method is abstract and documents that you are allowed to return null.
* TaxonomyFacetsAccumulator: if a FacetRequest returns null from createFacetsAggregator, it throws an exception. Otherwise, it groups the requests into category lists and ensures that categories are not over-counted. It uses MultiFacetsAggregator (new) and PerCategoryListAggregator (existing) to achieve that.
** That allows passing a combination of requests, e.g. Count(A), Count(B), Count(C), SumScore(A), SumScore(F), SumIntAssociation(D)... and works correctly when e.g. A+B were indexed in the same category list, but C, D and F weren't.
* Added FacetsAccumulator.create() variants which support RangeAccumulator and either TaxonomyFacetsAccumulator or SortedSetDocValuesAccumulator. Differences are in the method signatures.
** Renamed RangeFacetsAccumulatorWrapper to MultiFacetsAccumulator. Also, the FacetResults are returned in the order of the given accumulators.
** FacetsAccumulator.create documents that you may receive the List<FacetResult> in a different order than you passed in, guaranteeing that all RangeFacetRequests come last.
* Modified DrillSideways to take either TaxonomyReader or SortedSetDVReaderState because otherwise it cannot be used with SortedSetDV facets. Mike, can you please review it?

These changes simplified e.g. the associations examples, as now FacetsAccumulator.create() takes care of them too, since they implement createFacetsAggregator. Also, any future FacetRequest which supports FacetsAggregator will be supported automatically.
Make it easier to mix different kinds of FacetRequests -- Key: LUCENE-4985 URL: https://issues.apache.org/jira/browse/LUCENE-4985 Project: Lucene - Core Issue Type: Improvement Components: modules/facet Reporter: Michael McCandless Fix For: 5.0, 4.5 Attachments: LUCENE-4985.patch Spinoff from LUCENE-4980, where we added a strange class called RangeFacetsAccumulatorWrapper, which takes an incoming FSP, splits out the FacetRequests into range and non-range, delegates to two accumulators for each set, and then zips the results back together in order. Somehow we should generalize this class and make it work with SortedSetDocValuesAccumulator as well.
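The split/delegate/zip behavior described above can be sketched generically. This is only an illustration: Request, Result, and the Function-based accumulators below are hypothetical stand-ins, not Lucene's FacetRequest/FacetResult API.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.function.Function;

// Hypothetical stand-ins for FacetRequest / FacetResult.
record Request(String dim, boolean isRange) {}
record Result(Request request, long value) {}

public class MultiAccumulatorSketch {
    // Split requests into range and non-range, delegate each group to its own
    // accumulator, then zip the results back into the original request order.
    static List<Result> accumulate(List<Request> requests,
                                   Function<List<Request>, List<Result>> rangeAcc,
                                   Function<List<Request>, List<Result>> otherAcc) {
        List<Request> range = new ArrayList<>(), other = new ArrayList<>();
        for (Request r : requests) (r.isRange() ? range : other).add(r);

        // Key the merged results by request so order can be restored.
        Map<Request, Result> byRequest = new HashMap<>();
        for (Result res : rangeAcc.apply(range)) byRequest.put(res.request(), res);
        for (Result res : otherAcc.apply(other)) byRequest.put(res.request(), res);

        List<Result> ordered = new ArrayList<>();
        for (Request r : requests) ordered.add(byRequest.get(r)); // original order
        return ordered;
    }

    public static void main(String[] args) {
        List<Request> reqs = List.of(new Request("price", true),
                                     new Request("author", false),
                                     new Request("date", true));
        // Dummy accumulators that just produce a value of 1 per request.
        Function<List<Request>, List<Result>> dummy =
            rs -> rs.stream().map(r -> new Result(r, 1L)).toList();
        for (Result r : accumulate(reqs, dummy, dummy)) {
            System.out.println(r.request().dim());
        }
    }
}
```

Keying the merged results by request, rather than concatenating the two result lists, is what lets the wrapper hand results back in the caller's original request order.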
[jira] [Created] (LUCENE-5144) Nuke FacetRequest.createAggregator
Shai Erera created LUCENE-5144: -- Summary: Nuke FacetRequest.createAggregator Key: LUCENE-5144 URL: https://issues.apache.org/jira/browse/LUCENE-5144 Project: Lucene - Core Issue Type: Improvement Components: modules/facet Reporter: Shai Erera Assignee: Shai Erera

Aggregator was replaced by FacetsAggregator. FacetRequest has createAggregator(), which by default throws a UOE. It was left there until we migrated the aggregators to FacetsAggregator -- now all of our requests support FacetsAggregator. Aggregator is used only by StandardFacetsAccumulator, which too needs to vanish at some point. But currently it's the only one which handles sampling, complement aggregation and partitions. What I'd like to do is remove FacetRequest.createAggregator and in StandardFacetsAccumulator support only CountFacetRequest and SumScoreFacetRequest, which are the only ones that make sense for sampling and partitions. SumScore does not even support complements (which only work for counting). I'll also rename StandardFA to OldStandardFA. The plan is to eventually implement a SamplingAccumulator, PartitionsAccumulator/Aggregator and ComplementsAggregator, removing that class entirely. Until then ...
[jira] [Commented] (LUCENE-4985) Make it easier to mix different kinds of FacetRequests
[ https://issues.apache.org/jira/browse/LUCENE-4985?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13722338#comment-13722338 ] Michael McCandless commented on LUCENE-4985: Could you post a patch with --show-copies-as-adds? (The current patch isn't easily applied since there were svn mvs involved...). Thanks.
[jira] [Updated] (LUCENE-4985) Make it easier to mix different kinds of FacetRequests
[ https://issues.apache.org/jira/browse/LUCENE-4985?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Shai Erera updated LUCENE-4985: --- Attachment: LUCENE-4985.patch Patch with --show-copies-as-adds
[jira] [Commented] (LUCENE-4985) Make it easier to mix different kinds of FacetRequests
[ https://issues.apache.org/jira/browse/LUCENE-4985?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13722370#comment-13722370 ] Michael McCandless commented on LUCENE-4985:

This is a nice cleanup!

It's still hard to mix all three kinds of facet requests? E.g. I think it's realistic for an app to use SSDV for flat fields (less RAM usage than taxo, important if there are lots of ords), range for volatile numeric fields (e.g. time-delta based), and taxo for hierarchies. It seems like we could have a FacetsAccumulator.create that took both SSDVReaderState and TaxoReader and created the right FacetsAccumulator ... and I guess we'd need a SSDVFacetRequest. Or I guess I can just create the MultiFacetsAccumulator directly myself ... FA.create is just sugar. This all can wait for a follow-on issue ... these improvements are already great.

Should we move MultiFacetsAccumulator somewhere else (out of the .range package)? It's more generic now?

bq. FacetsAccumulator.create documents that you may receive List<FacetResult> in a different order than you passed in, guaranteeing that all RangeFacetRequests come last.

Hmm, can we fix that? (So that the order of the results matches the order of the requests).

bq. Modified DrillSideways to take either TaxonomyReader or SortedSetDVReaderState because otherwise it cannot be used with SortedSetDV facets. Mike, can you please review it?

Those changes look good! I think we can now simplify TestDrillSideways (previously it had to @Override getDrillDown/SidewaysAccumulator to use sorted set)?
[jira] [Commented] (LUCENE-4985) Make it easier to mix different kinds of FacetRequests
[ https://issues.apache.org/jira/browse/LUCENE-4985?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13722375#comment-13722375 ] Shai Erera commented on LUCENE-4985:

Adding State to .create() does not simplify life for an app I think, because someone (on the app side) will need to figure out if State should be null or not. I'm worried that users will end up creating State even if they don't need it? And since MultiFacetsAccumulator lets you wrap any accumulator yourself, I think it's fine that these are separate methods, as a first step. I'm worried about adding SortedSetDVFacetRequest, because unlike Count/SumScore/SumIntAssociation, this request is solely about the underlying source? And it also implies only counting ...

bq. Should we move MultiFacetsAccumulator somewhere else

You're right! It was left there by mistake because I renamed RangeAccumulatorWrapper. Will move.

{quote} Hmm, can we fix that? (So that the order of the results matches the order of the requests). {quote}

I don't know how important it is ... none of our tests depend on it, and it's not clear to me how to fix it at all. FA.create() is a factory method. If it returns a single Accumulator, then it happens already (order is maintained). MultiFacetsAccumulator loses the order. Maybe if we passed it the list of facet requests it could re-order them after accumulation, but I don't know how important it is ... an app can put the List<FacetResult> in a Map and do lookups? Also, as a generic MultiFA, it's not easy to determine from which FA a source FacetRequest came?

bq. I think we can now simplify TestDrillSideways

You're right. Done.
[jira] [Updated] (LUCENE-4985) Make it easier to mix different kinds of FacetRequests
[ https://issues.apache.org/jira/browse/LUCENE-4985?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Shai Erera updated LUCENE-4985: --- Attachment: LUCENE-4985.patch Patch with fixed comments.
[jira] [Commented] (LUCENE-4985) Make it easier to mix different kinds of FacetRequests
[ https://issues.apache.org/jira/browse/LUCENE-4985?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13722377#comment-13722377 ] Michael McCandless commented on LUCENE-4985:

{quote} I don't know how important it is ... none of our tests depend on it, and it's not clear to me how to fix it at all. FA.create() is a factory method. If it returns a single Accumulator, then it happens already (order is maintained). MultiFacetsAccumulator loses the order. Maybe if we passed it the list of facet requests it could re-order them after accumulation, but I don't know how important it is ... an app can put the List<FacetResult> in a Map and do lookups? Also, as a generic MultiFA, it's not easy to determine from which FA a source FacetRequest came? {quote}

OK ... But, I think we should not document that range facet requests come last? Let's leave it defined as undefined? Maybe we should return Collection not List?
[jira] [Commented] (LUCENE-4985) Make it easier to mix different kinds of FacetRequests
[ https://issues.apache.org/jira/browse/LUCENE-4985?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13722381#comment-13722381 ] Shai Erera commented on LUCENE-4985:

bq. But, I think we should not document that range facet requests come last?

Ok, I will remove that comment. As soon as we add more accumulators, this comment is not important anyway.

bq. Maybe we should return Collection not List?

Why? I prefer that we don't change that, since that will change tests. Many of the tests do results.get(idx). If we don't need to, let's not complicate the users? If an app does pass the requests in a known order, it shouldn't suffer. It's only Multi that loses order.
[jira] [Created] (SOLR-5086) The OR operator works incorrectly in XPathEntityProcessor
shenzhuxi created SOLR-5086: --- Summary: The OR operator works incorrectly in XPathEntityProcessor Key: SOLR-5086 URL: https://issues.apache.org/jira/browse/SOLR-5086 Project: Solr Issue Type: Bug Components: contrib - DataImportHandler Affects Versions: 4.4 Reporter: shenzhuxi

I'm trying to use DataImportHandler to index RSS/ATOM feeds and found bizarre behaviour of the OR operator in XPathEntityProcessor. Here is the configuration:

<?xml version="1.0" encoding="UTF-8"?>
<dataConfig>
  <dataSource type="FileDataSource"/>
  <document>
    <entity name="rss" processor="FileListEntityProcessor" baseDir="${solr.solr.home}/feed/rss"
            fileName="^.*\.xml$" recursive="true" rootEntity="false" dataSource="null">
      <entity name="feed" url="${rss.fileAbsolutePath}" processor="XPathEntityProcessor"
              forEach="/rss/channel/item|/feed/entry" transformer="DateFormatTransformer">
        <field column="link" xpath="/rss/channel/item/link|/feed/entry/link/@href"/>
      </entity>
    </entity>
  </document>
</dataConfig>

The first OR operator, in /rss/channel/item|/feed/entry, works correctly. But the second one, in /rss/channel/item/link|/feed/entry/link/@href, doesn't work. If I rewrite it to either /rss/channel/item/link or /feed/entry/link/@href, it works correctly.
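For reference, the union expression itself is valid XPath 1.0; XPathEntityProcessor implements its own streaming subset of XPath rather than a full engine, which is likely where the second union breaks down. A minimal sketch with the JDK's javax.xml.xpath, using toy documents with namespaces ignored, so this only demonstrates the expected XPath semantics, not DIH behavior:

```java
import java.io.ByteArrayInputStream;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.w3c.dom.NodeList;

public class XPathUnionDemo {
    public static void main(String[] args) throws Exception {
        // Toy stand-ins for an RSS document and an Atom document.
        String rss = "<rss><channel><item><link>http://a</link></item></channel></rss>";
        String atom = "<feed><entry><link href=\"http://b\"/></entry></feed>";
        // The exact union expression from the data-import config.
        String expr = "/rss/channel/item/link|/feed/entry/link/@href";

        XPath xp = XPathFactory.newInstance().newXPath();
        List<String> links = new ArrayList<>();
        for (String xml : new String[] { rss, atom }) {
            Document doc = DocumentBuilderFactory.newInstance().newDocumentBuilder()
                    .parse(new ByteArrayInputStream(xml.getBytes("UTF-8")));
            // With a full XPath 1.0 engine, the union matches the <link> element
            // in the RSS doc and the @href attribute in the Atom doc.
            NodeList nodes = (NodeList) xp.evaluate(expr, doc, XPathConstants.NODESET);
            for (int i = 0; i < nodes.getLength(); i++) {
                links.add(nodes.item(i).getTextContent());
            }
        }
        if (!links.equals(List.of("http://a", "http://b"))) throw new AssertionError(links);
        System.out.println("union matched: " + links);
    }
}
```

Splitting the union into two field definitions, as the reporter did, sidesteps whatever the streaming evaluator cannot handle.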
[jira] [Commented] (LUCENE-4985) Make it easier to mix different kinds of FacetRequests
[ https://issues.apache.org/jira/browse/LUCENE-4985?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13722387#comment-13722387 ] Michael McCandless commented on LUCENE-4985: I just think it's a dangerous API if sometimes the order matches and sometimes it doesn't ... but we can pursue this separately ...
[jira] [Created] (LUCENE-5145) Added AppendingPackedLongBuffer extended AbstractAppendingLongBuffer family (customizable compression ratio + bulk retrieval)
Boaz Leskes created LUCENE-5145: --- Summary: Added AppendingPackedLongBuffer extended AbstractAppendingLongBuffer family (customizable compression ratio + bulk retrieval) Key: LUCENE-5145 URL: https://issues.apache.org/jira/browse/LUCENE-5145 Project: Lucene - Core Issue Type: Improvement Reporter: Boaz Leskes

* Made acceptableOverheadRatio configurable.
* Added bulk get to the AbstractAppendingLongBuffer classes, for faster retrieval.
* Introduced a new variant, AppendingPackedLongBuffer, which solely relies on PackedInts as a back-end. This new class is useful where people have non-negative numbers with a fairly uniform distribution over a fixed (limited) range. Ex. facet ordinals.
* To distinguish it from AppendingPackedLongBuffer, the delta-based AppendingLongBuffer was renamed to AppendingDeltaPackedLongBuffer.
* Fixed an issue with NullReader where it didn't respect its valueCount in bulk gets.
[jira] [Commented] (LUCENE-5145) Added AppendingPackedLongBuffer extended AbstractAppendingLongBuffer family (customizable compression ratio + bulk retrieval)
[ https://issues.apache.org/jira/browse/LUCENE-5145?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13722388#comment-13722388 ] Boaz Leskes commented on LUCENE-5145: -

While making the above changes I did some measurements which I feel are also useful to share. PackedInts trades CPU for memory: tighter packing means less memory and better CPU cache usage. PackedInts gives you an acceptableOverheadRatio parameter to control the trade-off, but it is not exposed in the AbstractAppendingLongBuffer family, which is based on PackedInts. This is especially important when you do not rely on AbstractAppendingLongBuffer.iterator() to extract your data.

Here are some experiments I ran on my laptop, using BenchmarkAppendLongBufferRead, which is included in the patch. The program allows you to play with different read strategies and data sizes and measure reading times.

This is the result of using AppendingDeltaPackedLongBuffer (previously called AppendingLongBuffer) to sequentially read an array of 50 elements, using its get method. The data was uniformly distributed numbers between 0 and 7. The program measures 10,000 such reads. The total time is the time it took to perform all of them. You also see in the output the number of bits used to store the elements and the storage class used.

---
Storage: DELTA_PACKED, Read: SEQUENTIAL, Read size: 1
SINGLE GET: 3 bits ratio 0.00 (i.e., 3 bits) total time: 22.18s avg: 2.22ms, total read: 25 elm (class org.apache.lucene.util.packed.Packed64, 223.76kb)
SINGLE GET: 3 bits ratio 7.00 (i.e., 8 bits) total time: 19.14s avg: 1.91ms, total read: 25 elm (class org.apache.lucene.util.packed.Direct8, 521.13kb)

As you can see, when retrieving elements one by one, the byte-based implementation is slightly faster.
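The acceptableOverheadRatio trade-off above can be illustrated with a toy version of the two storage strategies: tight 3-bit packing in a long[] (a stand-in for Packed64, where a get needs shifts and masks and may span two words) versus rounding up to 8 bits in a byte[] (a stand-in for Direct8, where a get is a single array index). This is a sketch of the idea, not Lucene's actual code.

```java
import java.util.Random;

public class PackedTradeoffDemo {
    static final int BITS = 3; // enough for values 0..7

    // "Packed64"-style read: values packed back-to-back in a long[].
    static long getPacked(long[] blocks, int index) {
        long bitPos = (long) index * BITS;
        int block = (int) (bitPos >>> 6);
        int shift = (int) (bitPos & 63);
        long value = blocks[block] >>> shift;
        if (shift + BITS > 64) { // value spans two longs
            value |= blocks[block + 1] << (64 - shift);
        }
        return value & ((1L << BITS) - 1);
    }

    static void setPacked(long[] blocks, int index, long value) {
        long bitPos = (long) index * BITS;
        int block = (int) (bitPos >>> 6);
        int shift = (int) (bitPos & 63);
        blocks[block] |= value << shift;
        if (shift + BITS > 64) {
            blocks[block + 1] |= value >>> (64 - shift);
        }
    }

    public static void main(String[] args) {
        int n = 1000;
        long[] packed = new long[(n * BITS + 63) / 64 + 1]; // +1 slack for spanning reads
        byte[] direct = new byte[n]; // "Direct8"-style: byte-aligned, 8 bits per value
        long[] expected = new long[n];
        Random r = new Random(42);
        for (int i = 0; i < n; i++) {
            expected[i] = r.nextInt(8); // uniform 0..7, fits in 3 bits
            setPacked(packed, i, expected[i]);
            direct[i] = (byte) expected[i];
        }
        for (int i = 0; i < n; i++) {
            if (getPacked(packed, i) != expected[i] || direct[i] != expected[i])
                throw new AssertionError("mismatch at " + i);
        }
        System.out.println("packed bytes: " + packed.length * 8
                + ", direct bytes: " + direct.length + " -> all reads OK");
    }
}
```

The packed variant uses roughly 3/8 of the memory, while the byte-aligned variant answers each get with one array index; a ratio of 7.00 tells PackedInts the byte-aligned layout's overhead is acceptable.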
For comparison, the new AppendingPackedLongBuffer with the same setup:

---
Storage: PACKED, Read: SEQUENTIAL, Read size: 1
SINGLE GET: 3 bits ratio 0.00 (i.e., 3 bits) total time: 16.69s avg: 1.67ms, total read: 25 elm (class org.apache.lucene.util.packed.Packed64, 219.76kb)
SINGLE GET: 3 bits ratio 7.00 (i.e., 8 bits) total time: 13.47s avg: 1.35ms, total read: 25 elm (class org.apache.lucene.util.packed.Direct8, 517.13kb)

Besides being faster, it shows the same behavior. For random reads, the classes display similar behavior:

---
Storage: DELTA_PACKED, Read: RANDOM, Read size: 1
SINGLE GET: 3 bits ratio 0.00 (i.e., 3 bits) total time: 23.13s avg: 2.31ms, total read: 25 elm (class org.apache.lucene.util.packed.Packed64, 223.76kb)
SINGLE GET: 3 bits ratio 7.00 (i.e., 8 bits) total time: 19.38s avg: 1.94ms, total read: 25 elm (class org.apache.lucene.util.packed.Direct8, 521.13kb)

---
Storage: PACKED, Read: RANDOM, Read size: 1
SINGLE GET: 3 bits ratio 0.00 (i.e., 3 bits) total time: 19.23s avg: 1.92ms, total read: 25 elm (class org.apache.lucene.util.packed.Packed64, 219.76kb)
SINGLE GET: 3 bits ratio 7.00 (i.e., 8 bits) total time: 15.95s avg: 1.60ms, total read: 25 elm (class org.apache.lucene.util.packed.Direct8, 517.13kb)

Next I looked at the effect of exposing the bulk reads offered by the PackedInts structures in the AppendingLongBuffer family. Here are some results from the new packed implementation, this time reading 4 and 16 consecutive elements in a single read.
---
Storage: PACKED, Read: SEQUENTIAL, Read size: 4
SINGLE GET: 3 bits ratio 0.00 (i.e., 3 bits) total time: 11.16s avg: 1.12ms, total read: 25 elm (class org.apache.lucene.util.packed.Packed64, 219.76kb)
BULK GET: 3 bits ratio 0.00 (i.e., 3 bits) total time: 24.22s avg: 2.42ms, total read: 25 elm (class org.apache.lucene.util.packed.Packed64, 219.76kb)
SINGLE GET: 3 bits ratio 7.00 (i.e., 8 bits) total time: 8.35s avg: 0.84ms, total read: 25 elm (class org.apache.lucene.util.packed.Direct8, 517.13kb)
BULK GET: 3 bits ratio 7.00 (i.e., 8 bits) total time: 8.44s avg: 0.84ms, total read: 25 elm (class org.apache.lucene.util.packed.Direct8, 517.13kb)

---
Storage: PACKED, Read: SEQUENTIAL, Read size: 16
SINGLE GET: 3 bits ratio 0.00 (i.e., 3 bits) total time: 9.63s avg: 0.96ms, total read: 25 elm (class org.apache.lucene.util.packed.Packed64, 219.76kb)
BULK GET: 3 bits ratio 0.00 (i.e., 3 bits) total time: 12.52s avg: 1.25ms, total read: 25 elm (class org.apache.lucene.util.packed.Packed64, 219.76kb)
SINGLE GET: 3 bits ratio 7.00 (i.e., 8 bits) total time: 7.46s avg: 0.75ms, total read: 25 elm (class org.apache.lucene.util.packed.Direct8, 517.13kb)
BULK GET: 3 bits ratio 7.00 (i.e., 8 bits) total time: 3.22s avg: 0.32ms, total read: 25 elm (class org.apache.lucene.util.packed.Direct8, 517.13kb)

As you can see the bulk read api for the
[jira] [Updated] (LUCENE-5145) Added AppendingPackedLongBuffer extended AbstractAppendingLongBuffer family (customizable compression ratio + bulk retrieval)
[ https://issues.apache.org/jira/browse/LUCENE-5145?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Boaz Leskes updated LUCENE-5145: Attachment: LUCENE-5145.patch
[jira] [Comment Edited] (LUCENE-5145) Added AppendingPackedLongBuffer extended AbstractAppendingLongBuffer family (customizable compression ratio + bulk retrieval)
[ https://issues.apache.org/jira/browse/LUCENE-5145?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13722388#comment-13722388 ] Boaz Leskes edited comment on LUCENE-5145 at 7/29/13 12:23 PM: ---
[jira] [Comment Edited] (LUCENE-5145) Added AppendingPackedLongBuffer extended AbstractAppendingLongBuffer family (customizable compression ratio + bulk retrieval)
[ https://issues.apache.org/jira/browse/LUCENE-5145?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13722388#comment-13722388 ] Boaz Leskes edited comment on LUCENE-5145 at 7/29/13 12:25 PM: --- While making the above changes I did some measurements which I feel are also useful to share. PackedInts trades CPU for better memory and CPU-cache usage. PackedInts exposes an acceptableOverheadRatio parameter to control that trade-off, but the parameter is not exposed in the AbstractAppendingLongBuffer family, which is based on those classes. This is especially important when you do not rely on AbstractAppendingLongBuffer.iterator() to extract your data. Here are some experiments I ran on my laptop, using BenchmarkAppendLongBufferRead, which is included in the patch. The program lets you play with different read strategies and data sizes and measures reading times. This is the result of using AppendingDeltaPackedLongBuffer (previously called AppendingLongBuffer) to sequentially read an array of 50 elements, using its get method. The data was uniformly distributed numbers between 0 and 7. The program measures 10,000 such reads; the total time is the time it took to perform all of them. The output also shows the number of bits used to store the elements and the storage class used.
--- Storage: DELTA_PACKED, Read: SEQUENTIAL, Read size: 1
SINGLE GET: 3 bits, ratio 0.00 (i.e., 3 bits), total time: 22.18s, avg: 2.22ms, total read: 25 elm (class org.apache.lucene.util.packed.Packed64, 223.76kb)
SINGLE GET: 3 bits, ratio 7.00 (i.e., 8 bits), total time: 19.14s, avg: 1.91ms, total read: 25 elm (class org.apache.lucene.util.packed.Direct8, 521.13kb)
As you can see, when retrieving elements one by one, the byte-based implementation is slightly faster.
For comparison, the new AppendingPackedLongBuffer with the same setup:
--- Storage: PACKED, Read: SEQUENTIAL, Read size: 1
SINGLE GET: 3 bits, ratio 0.00 (i.e., 3 bits), total time: 16.69s, avg: 1.67ms, total read: 25 elm (class org.apache.lucene.util.packed.Packed64, 219.76kb)
SINGLE GET: 3 bits, ratio 7.00 (i.e., 8 bits), total time: 13.47s, avg: 1.35ms, total read: 25 elm (class org.apache.lucene.util.packed.Direct8, 517.13kb)
Besides being faster, it shows the same behavior. For random reads, the classes behave similarly:
--- Storage: DELTA_PACKED, Read: RANDOM, Read size: 1
SINGLE GET: 3 bits, ratio 0.00 (i.e., 3 bits), total time: 23.13s, avg: 2.31ms, total read: 25 elm (class org.apache.lucene.util.packed.Packed64, 223.76kb)
SINGLE GET: 3 bits, ratio 7.00 (i.e., 8 bits), total time: 19.38s, avg: 1.94ms, total read: 25 elm (class org.apache.lucene.util.packed.Direct8, 521.13kb)
--- Storage: PACKED, Read: RANDOM, Read size: 1
SINGLE GET: 3 bits, ratio 0.00 (i.e., 3 bits), total time: 19.23s, avg: 1.92ms, total read: 25 elm (class org.apache.lucene.util.packed.Packed64, 219.76kb)
SINGLE GET: 3 bits, ratio 7.00 (i.e., 8 bits), total time: 15.95s, avg: 1.60ms, total read: 25 elm (class org.apache.lucene.util.packed.Direct8, 517.13kb)
Next I looked at the effect of exposing the bulk reads offered by the PackedInts structures in the AppendingLongBuffer family. Here are some results from the new packed implementation, this time reading 4 or 16 consecutive elements in a single read.
--- Storage: PACKED, Read: SEQUENTIAL, Read size: 4
SINGLE GET: 3 bits, ratio 0.00 (i.e., 3 bits), total time: 11.16s, avg: 1.12ms, total read: 25 elm (class org.apache.lucene.util.packed.Packed64, 219.76kb)
BULK GET: 3 bits, ratio 0.00 (i.e., 3 bits), total time: 24.22s, avg: 2.42ms, total read: 25 elm (class org.apache.lucene.util.packed.Packed64, 219.76kb)
SINGLE GET: 3 bits, ratio 7.00 (i.e., 8 bits), total time: 8.35s, avg: 0.84ms, total read: 25 elm (class org.apache.lucene.util.packed.Direct8, 517.13kb)
BULK GET: 3 bits, ratio 7.00 (i.e., 8 bits), total time: 8.44s, avg: 0.84ms, total read: 25 elm (class org.apache.lucene.util.packed.Direct8, 517.13kb)
--- Storage: PACKED, Read: SEQUENTIAL, Read size: 16
SINGLE GET: 3 bits, ratio 0.00 (i.e., 3 bits), total time: 9.63s, avg: 0.96ms, total read: 25 elm (class org.apache.lucene.util.packed.Packed64, 219.76kb)
BULK GET: 3 bits, ratio 0.00 (i.e., 3 bits), total time: 12.52s, avg: 1.25ms, total read: 25 elm (class org.apache.lucene.util.packed.Packed64, 219.76kb)
SINGLE GET: 3 bits, ratio 7.00 (i.e., 8 bits), total time: 7.46s, avg: 0.75ms, total read: 25 elm (class org.apache.lucene.util.packed.Direct8, 517.13kb)
BULK GET: 3 bits, ratio 7.00 (i.e., 8 bits), total time: 3.22s, avg: 0.32ms, total read: 25 elm (class org.apache.lucene.util.packed.Direct8,
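The Packed64 vs. Direct8 numbers above boil down to one trade-off: tightly packing 3-bit values into long words saves memory but makes every get() do shift-and-mask work, while an acceptableOverheadRatio of 7.00 rounds 3 bits up to a whole byte so a get() becomes a plain array read. The following is a hedged, self-contained sketch of that packing arithmetic; the class and method names are invented for the demo, and this is not Lucene's actual PackedInts code.

```java
// Illustrative only: mimics the idea behind Packed64 (tight packing into
// long blocks) and Direct8 (one byte per value). Not Lucene code.
public class PackingDemo {
    // Tightly pack bitsPerValue-bit values into a long[] (Packed64-style).
    public static long[] pack(int[] values, int bitsPerValue) {
        long[] blocks = new long[(values.length * bitsPerValue + 63) / 64];
        for (int i = 0; i < values.length; i++) {
            long bit = (long) i * bitsPerValue;
            int block = (int) (bit >>> 6), shift = (int) (bit & 63);
            blocks[block] |= ((long) values[i]) << shift;
            if (shift + bitsPerValue > 64) { // value spans two blocks
                blocks[block + 1] |= ((long) values[i]) >>> (64 - shift);
            }
        }
        return blocks;
    }

    // Reading back requires shifts and masks on every call: the CPU cost
    // that the byte-aligned variant avoids.
    public static int get(long[] blocks, int index, int bitsPerValue) {
        long bit = (long) index * bitsPerValue;
        int block = (int) (bit >>> 6), shift = (int) (bit & 63);
        long raw = blocks[block] >>> shift;
        if (shift + bitsPerValue > 64) {
            raw |= blocks[block + 1] << (64 - shift);
        }
        return (int) (raw & ((1L << bitsPerValue) - 1));
    }

    // The ratio-7.00 variant: one byte per 3-bit value (Direct8-style),
    // trading roughly 2.7x the storage for a direct array read.
    public static byte[] packBytes(int[] values) {
        byte[] out = new byte[values.length];
        for (int i = 0; i < values.length; i++) out[i] = (byte) values[i];
        return out;
    }
}
```

This is why the Direct8 runs above are consistently faster per read while using more than twice the storage of the Packed64 runs.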
[jira] [Updated] (LUCENE-5145) Added AppendingPackedLongBuffer extended AbstractAppendingLongBuffer family (customizable compression ratio + bulk retrieval)
[ https://issues.apache.org/jira/browse/LUCENE-5145?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Adrien Grand updated LUCENE-5145: - Assignee: Adrien Grand Added AppendingPackedLongBuffer extended AbstractAppendingLongBuffer family (customizable compression ratio + bulk retrieval) --- Key: LUCENE-5145 URL: https://issues.apache.org/jira/browse/LUCENE-5145 Project: Lucene - Core Issue Type: Improvement Reporter: Boaz Leskes Assignee: Adrien Grand Attachments: LUCENE-5145.patch Made acceptableOverheadRatio configurable. Added bulk get to the AbstractAppendingLongBuffer classes for faster retrieval. Introduced a new variant, AppendingPackedLongBuffer, which solely relies on PackedInts as a back-end. This new class is useful where people have non-negative numbers with a fairly uniform distribution over a fixed (limited) range, e.g. facet ordinals. To distinguish it from AppendingPackedLongBuffer, the delta-based AppendingLongBuffer was renamed to AppendingDeltaPackedLongBuffer. Fixed an issue with NullReader where it didn't respect its valueCount in bulk gets. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
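The split between the two appending variants described here is plain packing versus delta packing against a per-block base. A hedged, self-contained illustration of when each wins (an invented helper, not the real classes):

```java
// Illustrative helper, not Lucene code: computes how many bits per value
// a block would need with and without delta encoding against the block
// minimum. Clustered large values need far fewer bits once the base is
// subtracted; small uniform values gain nothing from the extra step.
public class DeltaPackDemo {
    // Bits per value when packing raw values (plain PACKED strategy).
    public static int bitsRequiredNoDelta(long[] values) {
        long max = 0;
        for (long v : values) max = Math.max(max, v);
        return max == 0 ? 1 : 64 - Long.numberOfLeadingZeros(max);
    }

    // Bits per value when packing deltas against the block minimum
    // (DELTA_PACKED strategy): only the spread within the block matters.
    public static int bitsRequiredDelta(long[] values) {
        long min = Long.MAX_VALUE, max = Long.MIN_VALUE;
        for (long v : values) { min = Math.min(min, v); max = Math.max(max, v); }
        long range = max - min;
        return range == 0 ? 1 : 64 - Long.numberOfLeadingZeros(range);
    }
}
```

For the uniformly distributed 0..7 data in the benchmark above both strategies need 3 bits, which is why the new packed variant wins purely on its simpler read path; delta packing earns its extra add per read when values are large but clustered.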
[jira] [Commented] (LUCENE-4985) Make it easier to mix different kinds of FacetRequests
[ https://issues.apache.org/jira/browse/LUCENE-4985?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13722430#comment-13722430 ] ASF subversion and git services commented on LUCENE-4985: - Commit 1508043 from [~shaie] in branch 'dev/trunk' [ https://svn.apache.org/r1508043 ] LUCENE-4985: Make it easier to mix different kinds of FacetRequests Make it easier to mix different kinds of FacetRequests -- Key: LUCENE-4985 URL: https://issues.apache.org/jira/browse/LUCENE-4985 Project: Lucene - Core Issue Type: Improvement Components: modules/facet Reporter: Michael McCandless Fix For: 5.0, 4.5 Attachments: LUCENE-4985.patch, LUCENE-4985.patch, LUCENE-4985.patch Spinoff from LUCENE-4980, where we added a strange class called RangeFacetsAccumulatorWrapper, which takes an incoming FSP, splits out the FacetRequests into range and non-range, delegates to two accumulators for each set, and then zips the results back together in order. Somehow we should generalize this class and make it work with SortedSetDocValuesAccumulator as well. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Commented] (LUCENE-4985) Make it easier to mix different kinds of FacetRequests
[ https://issues.apache.org/jira/browse/LUCENE-4985?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13722436#comment-13722436 ] ASF subversion and git services commented on LUCENE-4985: - Commit 1508046 from [~shaie] in branch 'dev/branches/branch_4x' [ https://svn.apache.org/r1508046 ] LUCENE-4985: Make it easier to mix different kinds of FacetRequests Make it easier to mix different kinds of FacetRequests -- Key: LUCENE-4985 URL: https://issues.apache.org/jira/browse/LUCENE-4985 Project: Lucene - Core Issue Type: Improvement Components: modules/facet Reporter: Michael McCandless Fix For: 5.0, 4.5 Attachments: LUCENE-4985.patch, LUCENE-4985.patch, LUCENE-4985.patch Spinoff from LUCENE-4980, where we added a strange class called RangeFacetsAccumulatorWrapper, which takes an incoming FSP, splits out the FacetRequests into range and non-range, delegates to two accumulators for each set, and then zips the results back together in order. Somehow we should generalize this class and make it work with SortedSetDocValuesAccumulator as well. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Resolved] (LUCENE-4985) Make it easier to mix different kinds of FacetRequests
[ https://issues.apache.org/jira/browse/LUCENE-4985?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Shai Erera resolved LUCENE-4985. Resolution: Fixed Assignee: Shai Erera Lucene Fields: New,Patch Available (was: New) Committed to trunk and 4x. I think we can change .accumulate to return a Map<FacetRequest,FacetResult>, but this affects many of the tests, so let's do that separately. Make it easier to mix different kinds of FacetRequests -- Key: LUCENE-4985 URL: https://issues.apache.org/jira/browse/LUCENE-4985 Project: Lucene - Core Issue Type: Improvement Components: modules/facet Reporter: Michael McCandless Assignee: Shai Erera Fix For: 5.0, 4.5 Attachments: LUCENE-4985.patch, LUCENE-4985.patch, LUCENE-4985.patch Spinoff from LUCENE-4980, where we added a strange class called RangeFacetsAccumulatorWrapper, which takes an incoming FSP, splits out the FacetRequests into range and non-range, delegates to two accumulators for each set, and then zips the results back together in order. Somehow we should generalize this class and make it work with SortedSetDocValuesAccumulator as well. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[JENKINS-MAVEN] Lucene-Solr-Maven-4.x #401: POMs out of sync
Build: https://builds.apache.org/job/Lucene-Solr-Maven-4.x/401/ 2 tests failed. FAILED: org.apache.solr.cloud.AliasIntegrationTest.org.apache.solr.cloud.AliasIntegrationTest Error Message: 1 thread leaked from SUITE scope at org.apache.solr.cloud.AliasIntegrationTest: 1) Thread[id=8074, name=recoveryCmdExecutor-4930-thread-1, state=RUNNABLE, group=TGRP-AliasIntegrationTest] at java.net.PlainSocketImpl.socketConnect(Native Method) at java.net.AbstractPlainSocketImpl.doConnect(AbstractPlainSocketImpl.java:327) at java.net.AbstractPlainSocketImpl.connectToAddress(AbstractPlainSocketImpl.java:193) at java.net.AbstractPlainSocketImpl.connect(AbstractPlainSocketImpl.java:180) at java.net.SocksSocketImpl.connect(SocksSocketImpl.java:385) at java.net.Socket.connect(Socket.java:546) at org.apache.http.conn.scheme.PlainSocketFactory.connectSocket(PlainSocketFactory.java:127) at org.apache.http.impl.conn.DefaultClientConnectionOperator.openConnection(DefaultClientConnectionOperator.java:180) at org.apache.http.impl.conn.ManagedClientConnectionImpl.open(ManagedClientConnectionImpl.java:294) at org.apache.http.impl.client.DefaultRequestDirector.tryConnect(DefaultRequestDirector.java:645) at org.apache.http.impl.client.DefaultRequestDirector.execute(DefaultRequestDirector.java:480) at org.apache.http.impl.client.AbstractHttpClient.execute(AbstractHttpClient.java:906) at org.apache.http.impl.client.AbstractHttpClient.execute(AbstractHttpClient.java:805) at org.apache.http.impl.client.AbstractHttpClient.execute(AbstractHttpClient.java:784) at org.apache.solr.client.solrj.impl.HttpSolrServer.request(HttpSolrServer.java:365) at org.apache.solr.client.solrj.impl.HttpSolrServer.request(HttpSolrServer.java:180) at org.apache.solr.cloud.SyncStrategy$1.run(SyncStrategy.java:291) at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1146) at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615) at java.lang.Thread.run(Thread.java:679) Stack 
Trace: com.carrotsearch.randomizedtesting.ThreadLeakError: 1 thread leaked from SUITE scope at org.apache.solr.cloud.AliasIntegrationTest: 1) Thread[id=8074, name=recoveryCmdExecutor-4930-thread-1, state=RUNNABLE, group=TGRP-AliasIntegrationTest] at java.net.PlainSocketImpl.socketConnect(Native Method) at java.net.AbstractPlainSocketImpl.doConnect(AbstractPlainSocketImpl.java:327) at java.net.AbstractPlainSocketImpl.connectToAddress(AbstractPlainSocketImpl.java:193) at java.net.AbstractPlainSocketImpl.connect(AbstractPlainSocketImpl.java:180) at java.net.SocksSocketImpl.connect(SocksSocketImpl.java:385) at java.net.Socket.connect(Socket.java:546) at org.apache.http.conn.scheme.PlainSocketFactory.connectSocket(PlainSocketFactory.java:127) at org.apache.http.impl.conn.DefaultClientConnectionOperator.openConnection(DefaultClientConnectionOperator.java:180) at org.apache.http.impl.conn.ManagedClientConnectionImpl.open(ManagedClientConnectionImpl.java:294) at org.apache.http.impl.client.DefaultRequestDirector.tryConnect(DefaultRequestDirector.java:645) at org.apache.http.impl.client.DefaultRequestDirector.execute(DefaultRequestDirector.java:480) at org.apache.http.impl.client.AbstractHttpClient.execute(AbstractHttpClient.java:906) at org.apache.http.impl.client.AbstractHttpClient.execute(AbstractHttpClient.java:805) at org.apache.http.impl.client.AbstractHttpClient.execute(AbstractHttpClient.java:784) at org.apache.solr.client.solrj.impl.HttpSolrServer.request(HttpSolrServer.java:365) at org.apache.solr.client.solrj.impl.HttpSolrServer.request(HttpSolrServer.java:180) at org.apache.solr.cloud.SyncStrategy$1.run(SyncStrategy.java:291) at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1146) at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615) at java.lang.Thread.run(Thread.java:679) at __randomizedtesting.SeedInfo.seed([CB7D2916453BB11E]:0) FAILED: 
org.apache.solr.cloud.AliasIntegrationTest.org.apache.solr.cloud.AliasIntegrationTest Error Message: There are still zombie threads that couldn't be terminated: 1) Thread[id=8074, name=recoveryCmdExecutor-4930-thread-1, state=RUNNABLE, group=TGRP-AliasIntegrationTest] at java.net.PlainSocketImpl.socketConnect(Native Method) at java.net.AbstractPlainSocketImpl.doConnect(AbstractPlainSocketImpl.java:327) at java.net.AbstractPlainSocketImpl.connectToAddress(AbstractPlainSocketImpl.java:193) at java.net.AbstractPlainSocketImpl.connect(AbstractPlainSocketImpl.java:180) at java.net.SocksSocketImpl.connect(SocksSocketImpl.java:385)
[jira] [Created] (LUCENE-5146) AnalyzingSuggester sort order doesn't respect the actual weight
Simon Willnauer created LUCENE-5146: --- Summary: AnalyzingSuggester sort order doesn't respect the actual weight Key: LUCENE-5146 URL: https://issues.apache.org/jira/browse/LUCENE-5146 Project: Lucene - Core Issue Type: Bug Components: modules/spellchecker Affects Versions: 4.4 Reporter: Simon Willnauer Fix For: 5.0, 4.5 Uwe would say: sorry but your code is wrong. We don't actually read the weight value in AnalyzingComparator, which can cause really odd suggestions since we read parts of the input as the weight. None of our tests catch that, so I will go ahead and add some tests for it as well. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Created] (LUCENE-5147) Consider returning a Map<FacetRequest,FacetResult> from FacetsAccumulator
Shai Erera created LUCENE-5147: -- Summary: Consider returning a Map<FacetRequest,FacetResult> from FacetsAccumulator Key: LUCENE-5147 URL: https://issues.apache.org/jira/browse/LUCENE-5147 Project: Lucene - Core Issue Type: Improvement Components: modules/facet Reporter: Shai Erera Today the API returns a List which suggests there's an ordering going on. This may be confusing if one uses FacetsAccumulator.create which results in a MultiFacetsAccumulator, and then the order of the results does not correspond to the order of the requests. Rather than trying to enforce ordering, a simple mapping may be better even for consuming apps since they will be able to easily look up desired results. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
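The proposal can be sketched in a few lines with stub classes (a hedged illustration, not the real facet API): once results are keyed by the request itself, a consumer looks up exactly the result it asked for and never depends on which accumulator produced which list position.

```java
import java.util.*;

// Stub classes for illustration; not the real FacetRequest/FacetResult.
public class FacetMapDemo {
    public static class FacetRequest {
        final String dim;
        public FacetRequest(String dim) { this.dim = dim; }
        // value equality so a request can serve as a map key
        @Override public boolean equals(Object o) {
            return o instanceof FacetRequest && ((FacetRequest) o).dim.equals(dim);
        }
        @Override public int hashCode() { return dim.hashCode(); }
    }

    public static class FacetResult {
        public final int count;
        public FacetResult(int count) { this.count = count; }
    }

    // Keying results by request makes accumulation order irrelevant to
    // the consumer. The "aggregation" here is a dummy stand-in value.
    public static Map<FacetRequest, FacetResult> accumulate(List<FacetRequest> requests) {
        Map<FacetRequest, FacetResult> results = new HashMap<>();
        for (FacetRequest r : requests) {
            results.put(r, new FacetResult(r.dim.length())); // dummy value
        }
        return results;
    }
}
```

Note that this only works if FacetRequest has sensible equals/hashCode, which would become part of its contract under the proposed change.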
[jira] [Commented] (SOLR-4761) add option to plug in mergedsegmentwarmer
[ https://issues.apache.org/jira/browse/SOLR-4761?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13722440#comment-13722440 ] Markus Jelsma commented on SOLR-4761: - This option reduces latency but is not enabled by default. Is there any reason not to enable it (by default)? Thanks add option to plug in mergedsegmentwarmer - Key: SOLR-4761 URL: https://issues.apache.org/jira/browse/SOLR-4761 Project: Solr Issue Type: New Feature Reporter: Robert Muir Fix For: 5.0, 4.4 Attachments: SOLR-4761.patch, SOLR-4761.patch This is pretty expert, but can be useful in some cases. We can also provide a simple minimalist implementation that just ensures datastructures are primed so the first queries aren't e.g. causing norms to be read from disk etc. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Updated] (LUCENE-5146) AnalyzingSuggester sort order doesn't respect the actual weight
[ https://issues.apache.org/jira/browse/LUCENE-5146?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Simon Willnauer updated LUCENE-5146: Attachment: LUCENE-5146.patch Here is a patch. AnalyzingSuggester sort order doesn't respect the actual weight --- Key: LUCENE-5146 URL: https://issues.apache.org/jira/browse/LUCENE-5146 Project: Lucene - Core Issue Type: Bug Components: modules/spellchecker Affects Versions: 4.4 Reporter: Simon Willnauer Fix For: 5.0, 4.5 Attachments: LUCENE-5146.patch Uwe would say: sorry but your code is wrong. We don't actually read the weight value in AnalyzingComparator, which can cause really odd suggestions since we read parts of the input as the weight. None of our tests catch that, so I will go ahead and add some tests for it as well. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Created] (LUCENE-5148) SortedSetDocValues caching / state
Adrien Grand created LUCENE-5148: Summary: SortedSetDocValues caching / state Key: LUCENE-5148 URL: https://issues.apache.org/jira/browse/LUCENE-5148 Project: Lucene - Core Issue Type: Improvement Reporter: Adrien Grand Priority: Minor I just spent some time digging into a bug which was due to the fact that SORTED_SET doc values are stateful (setDocument/nextOrd) and are cached per thread. So if you try to get two instances from the same field in the same thread, you will actually get the same instance and won't be able to iterate over ords of two documents in parallel. This is not necessarily a bug, this behavior can be documented, but I think it would be nice if the API could prevent from such mistakes by storing the state in a separate object or cloning the SortedSetDocValues object in SegmentCoreReaders.getSortedSetDocValues? What do you think? -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Commented] (LUCENE-5148) SortedSetDocValues caching / state
[ https://issues.apache.org/jira/browse/LUCENE-5148?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13722449#comment-13722449 ] Simon Willnauer commented on LUCENE-5148: - +1 on removing the trap. Yet, it would be nice to make this object entirely stateless if possible. I can think of 2 options: {noformat} public LongsRef getOrds(int docId, LongsRef spare) {noformat} This has the advantage that we can easily reuse a LongsRef on top, which is consistent with other APIs in Lucene. Or maybe add an OrdsIterator like this: {noformat} public OrdsIter getOrds(int docId, OrdsIter spare) // Iterate like this: long ord; while ((ord = ordsIter.nextOrd()) != NO_MORE_ORDS) { ... } {noformat} I'm mainly thinking about consistency with other APIs here, but I don't like the stateful API we have right now. SortedSetDocValues caching / state -- Key: LUCENE-5148 URL: https://issues.apache.org/jira/browse/LUCENE-5148 Project: Lucene - Core Issue Type: Improvement Reporter: Adrien Grand Priority: Minor I just spent some time digging into a bug which was due to the fact that SORTED_SET doc values are stateful (setDocument/nextOrd) and are cached per thread. So if you try to get two instances from the same field in the same thread, you will actually get the same instance and won't be able to iterate over ords of two documents in parallel. This is not necessarily a bug, this behavior can be documented, but I think it would be nice if the API could prevent from such mistakes by storing the state in a separate object or cloning the SortedSetDocValues object in SegmentCoreReaders.getSortedSetDocValues? What do you think? -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
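A hedged sketch of the second option, with invented names (the real SortedSetDocValues API differs): the iteration cursor lives in a small iterator object handed back to the caller, so the doc-values instance itself stays stateless and the ords of two documents can be walked in parallel, avoiding the trap described in the issue.

```java
// Illustrative stand-in, not the real SortedSetDocValues: per-lookup
// state lives in the returned OrdsIter, so the enclosing object can be
// shared freely across consumers.
public class OrdsDemo {
    public static final long NO_MORE_ORDS = -1;

    // Reusable cursor: the "spare" pattern lets callers avoid allocation.
    public static class OrdsIter {
        private long[] ords;
        private int pos;
        OrdsIter reset(long[] ords) { this.ords = ords; this.pos = 0; return this; }
        public long nextOrd() { return pos < ords.length ? ords[pos++] : NO_MORE_ORDS; }
    }

    private final long[][] ordsPerDoc; // toy backing store for the demo

    public OrdsDemo(long[][] ordsPerDoc) { this.ordsPerDoc = ordsPerDoc; }

    // All mutable state is in the iterator, never in this object.
    public OrdsIter getOrds(int docId, OrdsIter spare) {
        return (spare == null ? new OrdsIter() : spare).reset(ordsPerDoc[docId]);
    }
}
```

Two iterators obtained from the same instance can now advance independently, which is exactly what a per-thread cached stateful instance cannot offer.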
[jira] [Updated] (LUCENE-5149) CommonTermsQuery should allow minNrShouldMatch for high low freq terms
[ https://issues.apache.org/jira/browse/LUCENE-5149?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Simon Willnauer updated LUCENE-5149: Component/s: modules/other Affects Version/s: 4.4 Fix Version/s: 4.5 5.0 Assignee: Simon Willnauer CommonTermsQuery should allow minNrShouldMatch for high low freq terms Key: LUCENE-5149 URL: https://issues.apache.org/jira/browse/LUCENE-5149 Project: Lucene - Core Issue Type: Improvement Components: modules/other Affects Versions: 4.4 Reporter: Simon Willnauer Assignee: Simon Willnauer Priority: Minor Fix For: 5.0, 4.5 Currently CommonTermsQuery only allows a minShouldMatch for the low-frequency part of the query. Yet, we should also allow this for the high-frequency part to have better control over scoring. Here is a related ES issue: https://github.com/elasticsearch/elasticsearch/issues/3188 -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Created] (LUCENE-5149) CommonTermsQuery should allow minNrShouldMatch for high low freq terms
Simon Willnauer created LUCENE-5149: --- Summary: CommonTermsQuery should allow minNrShouldMatch for high low freq terms Key: LUCENE-5149 URL: https://issues.apache.org/jira/browse/LUCENE-5149 Project: Lucene - Core Issue Type: Improvement Reporter: Simon Willnauer Priority: Minor Currently CommonTermsQuery only allows a minShouldMatch for the low-frequency part of the query. Yet, we should also allow this for the high-frequency part to have better control over scoring. Here is a related ES issue: https://github.com/elasticsearch/elasticsearch/issues/3188 -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Updated] (LUCENE-5149) CommonTermsQuery should allow minNrShouldMatch for high low freq terms
[ https://issues.apache.org/jira/browse/LUCENE-5149?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Simon Willnauer updated LUCENE-5149: Attachment: LUCENE-5149.patch Here is a patch. CommonTermsQuery should allow minNrShouldMatch for high low freq terms Key: LUCENE-5149 URL: https://issues.apache.org/jira/browse/LUCENE-5149 Project: Lucene - Core Issue Type: Improvement Components: modules/other Affects Versions: 4.4 Reporter: Simon Willnauer Assignee: Simon Willnauer Priority: Minor Fix For: 5.0, 4.5 Attachments: LUCENE-5149.patch Currently CommonTermsQuery only allows a minShouldMatch for the low-frequency part of the query. Yet, we should also allow this for the high-frequency part to have better control over scoring. Here is a related ES issue: https://github.com/elasticsearch/elasticsearch/issues/3188 -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
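The essence of the improvement can be sketched with a simplified matcher (invented stand-in logic, not CommonTermsQuery's actual boolean rewrite against an index): query terms are split by document frequency at a cutoff, and each group then enforces its own minimum-should-match.

```java
import java.util.*;

// Simplified stand-in for CommonTermsQuery's matching, illustration only:
// the real query rewrites to boolean clauses against an index. Here the
// doc-frequency map and the matching are toy versions of that logic.
public class CommonTermsDemo {
    public static boolean matches(Set<String> docTerms, List<String> queryTerms,
                                  Map<String, Integer> docFreq, int cutoff,
                                  int lowFreqMinMatch, int highFreqMinMatch) {
        int lowHits = 0, lowTotal = 0, highHits = 0, highTotal = 0;
        for (String t : queryTerms) {
            boolean high = docFreq.getOrDefault(t, 0) > cutoff;
            if (high) { highTotal++; if (docTerms.contains(t)) highHits++; }
            else      { lowTotal++;  if (docTerms.contains(t)) lowHits++; }
        }
        // The improvement: BOTH groups get their own minimum-should-match
        // (capped at the group size), not just the low-frequency group.
        return lowHits >= Math.min(lowFreqMinMatch, lowTotal)
            && highHits >= Math.min(highFreqMinMatch, highTotal);
    }
}
```

With only the existing low-frequency minimum, the highFreqMinMatch parameter above would be fixed at a default, which is the gap the patch closes.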
[jira] [Commented] (LUCENE-5148) SortedSetDocValues caching / state
[ https://issues.apache.org/jira/browse/LUCENE-5148?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13722475#comment-13722475 ] Robert Muir commented on LUCENE-5148: - these other options have downsides too. LongsRef has all the disadvantages of the *Ref APIs (e.g. reuse bugs), also requires reading all the ordinals into RAM at once. Adding an additional iterator just pushes the problem into a different place to me, and makes the api more complex. The current threadlocal + state is at least simple, consistent with all of the other docvalues, and documented that it works this way. If we want to change the API, then I think we need to consider all of these issues. SortedSetDocValues caching / state -- Key: LUCENE-5148 URL: https://issues.apache.org/jira/browse/LUCENE-5148 Project: Lucene - Core Issue Type: Improvement Reporter: Adrien Grand Priority: Minor I just spent some time digging into a bug which was due to the fact that SORTED_SET doc values are stateful (setDocument/nextOrd) and are cached per thread. So if you try to get two instances from the same field in the same thread, you will actually get the same instance and won't be able to iterate over ords of two documents in parallel. This is not necessarily a bug, this behavior can be documented, but I think it would be nice if the API could prevent from such mistakes by storing the state in a separate object or cloning the SortedSetDocValues object in SegmentCoreReaders.getSortedSetDocValues? What do you think? -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Commented] (LUCENE-5148) SortedSetDocValues caching / state
[ https://issues.apache.org/jira/browse/LUCENE-5148?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13722481#comment-13722481 ] Robert Muir commented on LUCENE-5148: - {quote} This is not necessarily a bug, this behavior can be documented, but I think it would be nice if the API could prevent from such mistakes by storing the state in a separate object or cloning the SortedSetDocValues object in SegmentCoreReaders.getSortedSetDocValues? {quote} An auto-clone could also cause traps, e.g. if someone is calling this method multiple times and it's refilling buffers and so on. But adding clone to the API (so someone could do this explicitly for these expert cases) might be a good solution too. SortedSetDocValues caching / state -- Key: LUCENE-5148 URL: https://issues.apache.org/jira/browse/LUCENE-5148 Project: Lucene - Core Issue Type: Improvement Reporter: Adrien Grand Priority: Minor I just spent some time digging into a bug which was due to the fact that SORTED_SET doc values are stateful (setDocument/nextOrd) and are cached per thread. So if you try to get two instances from the same field in the same thread, you will actually get the same instance and won't be able to iterate over ords of two documents in parallel. This is not necessarily a bug, this behavior can be documented, but I think it would be nice if the API could prevent from such mistakes by storing the state in a separate object or cloning the SortedSetDocValues object in SegmentCoreReaders.getSortedSetDocValues? What do you think? -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Updated] (LUCENE-5144) Nuke FacetRequest.createAggregator
[ https://issues.apache.org/jira/browse/LUCENE-5144?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Shai Erera updated LUCENE-5144: --- Attachment: LUCENE-5144.patch Patch removes FacetRequest.createAggregator (NOTE: *not* createFacetsAggregator) and replaces it by StandardFacetsAccumulator.createAggregator(FacetRequest). I also renamed SFA to OldFacetsAccumulator and moved it and all associated classes under o.a.l.facet.old, with the intention of removing them one day. Nuke FacetRequest.createAggregator -- Key: LUCENE-5144 URL: https://issues.apache.org/jira/browse/LUCENE-5144 Project: Lucene - Core Issue Type: Improvement Components: modules/facet Reporter: Shai Erera Assignee: Shai Erera Attachments: LUCENE-5144.patch Aggregator was replaced by FacetsAggregator. FacetRequest has createAggregator(), which by default throws a UOE. It was left there until we migrated the aggregators to FacetsAggregator -- now all of our requests support FacetsAggregator. Aggregator is used only by StandardFacetsAccumulator, which too needs to vanish at some point, but currently it's the only one which handles sampling, complement aggregation and partitions. What I'd like to do is remove FacetRequest.createAggregator and in StandardFacetsAccumulator support only CountFacetRequest and SumScoreFacetRequest, which are the only ones that make sense for sampling and partitions. SumScore does not even support complements (which only work for counting). I'll also rename StandardFA to OldStandardFA. The plan is to eventually implement a SamplingAccumulator, PartitionsAccumulator/Aggregator and ComplementsAggregator, removing that class entirely. Until then ... -- This message is automatically generated by JIRA. 
[jira] [Commented] (LUCENE-5144) Nuke FacetRequest.createAggregator
[ https://issues.apache.org/jira/browse/LUCENE-5144?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13722498#comment-13722498 ] Shai Erera commented on LUCENE-5144: Tests pass; if there are no objections, I intend to commit this shortly. Nuke FacetRequest.createAggregator -- Key: LUCENE-5144 URL: https://issues.apache.org/jira/browse/LUCENE-5144
[jira] [Commented] (LUCENE-5144) Nuke FacetRequest.createAggregator
[ https://issues.apache.org/jira/browse/LUCENE-5144?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13722505#comment-13722505 ] ASF subversion and git services commented on LUCENE-5144: - Commit 1508085 from [~shaie] in branch 'dev/trunk' [ https://svn.apache.org/r1508085 ] LUCENE-5144: remove FacetRequest.createAggregator, rename StandardFacetsAccumulator to OldFA and move it and associated classes under o.a.l.facet.old Nuke FacetRequest.createAggregator -- Key: LUCENE-5144 URL: https://issues.apache.org/jira/browse/LUCENE-5144
[jira] [Commented] (LUCENE-5144) Nuke FacetRequest.createAggregator
[ https://issues.apache.org/jira/browse/LUCENE-5144?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13722509#comment-13722509 ] ASF subversion and git services commented on LUCENE-5144: - Commit 1508087 from [~shaie] in branch 'dev/branches/branch_4x' [ https://svn.apache.org/r1508087 ] LUCENE-5144: remove FacetRequest.createAggregator, rename StandardFacetsAccumulator to OldFA and move it and associated classes under o.a.l.facet.old Nuke FacetRequest.createAggregator -- Key: LUCENE-5144 URL: https://issues.apache.org/jira/browse/LUCENE-5144
[jira] [Resolved] (LUCENE-5144) Nuke FacetRequest.createAggregator
[ https://issues.apache.org/jira/browse/LUCENE-5144?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Shai Erera resolved LUCENE-5144. Resolution: Fixed Fix Version/s: 4.5, 5.0 Committed to trunk and 4x. Nuke FacetRequest.createAggregator -- Key: LUCENE-5144 URL: https://issues.apache.org/jira/browse/LUCENE-5144 Project: Lucene - Core Issue Type: Improvement Components: modules/facet Reporter: Shai Erera Assignee: Shai Erera Fix For: 5.0, 4.5 Attachments: LUCENE-5144.patch
[jira] [Resolved] (SOLR-5086) The OR operator works incorrectly in XPathEntityProcessor
[ https://issues.apache.org/jira/browse/SOLR-5086?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Shalin Shekhar Mangar resolved SOLR-5086. - Resolution: Not A Problem The XPathEntityProcessor does not support the OR operator in field xpaths. The OR operator is supported only in the forEach attribute of the entity. See the supported xpath types here: http://wiki.apache.org/solr/DataImportHandler#Configuration_in_data-config.xml-1 The OR operator works incorrectly in XPathEntityProcessor - Key: SOLR-5086 URL: https://issues.apache.org/jira/browse/SOLR-5086 Project: Solr Issue Type: Bug Components: contrib - DataImportHandler Affects Versions: 4.4 Reporter: shenzhuxi I was trying to use DataImportHandler to index RSS/ATOM feeds and found bizarre behaviour of the OR operator in XPathEntityProcessor. Here is the configuration:

<?xml version="1.0" encoding="UTF-8"?>
<dataConfig>
  <dataSource type="FileDataSource"/>
  <document>
    <entity name="rss" processor="FileListEntityProcessor" baseDir="${solr.solr.home}/feed/rss"
            fileName="^.*\.xml$" recursive="true" rootEntity="false" dataSource="null">
      <entity name="feed" url="${rss.fileAbsolutePath}" processor="XPathEntityProcessor"
              forEach="/rss/channel/item|/feed/entry" transformer="DateFormatTransformer">
        <field column="link" xpath="/rss/channel/item/link|/feed/entry/link/@href"/>
      </entity>
    </entity>
  </document>
</dataConfig>

The first OR operator, in /rss/channel/item|/feed/entry, works correctly. But the second one, in /rss/channel/item/link|/feed/entry/link/@href, doesn't. If I rewrite it as either /rss/channel/item/link or /feed/entry/link/@href alone, it works correctly.
Solr realtime get vs. direct get
The Solr realtime get feature is currently documented as “Realtime-get currently relies on the update log feature”, which is certainly true for the realtime aspect of the operation. But it happens that the /get handler works just fine when the update log feature is turned off, or when a requested ID is not among the uncommitted documents – it simply fetches committed documents rather than uncommitted ones. So, is “direct get” a non-feature or mis-feature that should not be used when the update log is disabled, or should it be a fully advertised first-class feature: a convenient and efficient way to directly access committed documents via a list of IDs? I think at least a couple of committers should weigh in on whether this “apparent feature” is a true feature or a non-feature to be discouraged – and then clearly document it as such. If it is a non-feature, the code should throw a clear exception. My vote is that it be given first-class feature status – that it be advertised as “direct get” (or similar) and that realtime get be treated as a more specialized sub-feature of it. -- Jack Krupansky
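For reference, the behavior described above falls out of how the two pieces are configured independently in solrconfig.xml: the /get handler is registered on its own, while the update log is an option of the update handler. A sketch of the relevant pieces, in Solr 4.x style (exact directives and defaults vary per install):

```xml
<updateHandler class="solr.DirectUpdateHandler2">
  <!-- Remove this block and /get still answers, but only from committed data:
       the "realtime" part is gone while the "direct get" part remains. -->
  <updateLog>
    <str name="dir">${solr.ulog.dir:}</str>
  </updateLog>
</updateHandler>

<!-- Registered independently of the update log. -->
<requestHandler name="/get" class="solr.RealTimeGetHandler">
  <lst name="defaults">
    <str name="omitHeader">true</str>
  </lst>
</requestHandler>
```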
[JENKINS] Lucene-Solr-trunk-Linux (32bit/jdk1.7.0_25) - Build # 6785 - Still Failing!
Build: http://jenkins.thetaphi.de/job/Lucene-Solr-trunk-Linux/6785/ Java: 32bit/jdk1.7.0_25 -client -XX:+UseConcMarkSweepGC All tests passed Build Log: [...truncated 14980 lines...] BUILD FAILED /mnt/ssd/jenkins/workspace/Lucene-Solr-trunk-Linux/build.xml:389: The following error occurred while executing this line: /mnt/ssd/jenkins/workspace/Lucene-Solr-trunk-Linux/build.xml:88: The following files contain @author tags, tabs or nocommits: * lucene/facet/src/java/org/apache/lucene/facet/old/OldFacetsAccumulator.java Total time: 56 minutes 6 seconds Build step 'Invoke Ant' marked build as failure Description set: Java: 32bit/jdk1.7.0_25 -client -XX:+UseConcMarkSweepGC Archiving artifacts Recording test results Email was triggered for: Failure Sending email for trigger: Failure - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[JENKINS] Lucene-Solr-Tests-trunk-Java7 - Build # 4188 - Failure
Build: https://builds.apache.org/job/Lucene-Solr-Tests-trunk-Java7/4188/ All tests passed Build Log: [...truncated 15115 lines...] BUILD FAILED /usr/home/hudson/hudson-slave/workspace/Lucene-Solr-Tests-trunk-Java7/build.xml:389: The following error occurred while executing this line: /usr/home/hudson/hudson-slave/workspace/Lucene-Solr-Tests-trunk-Java7/build.xml:88: The following files contain @author tags, tabs or nocommits: * lucene/facet/src/java/org/apache/lucene/facet/old/OldFacetsAccumulator.java Total time: 68 minutes 33 seconds Build step 'Invoke Ant' marked build as failure Archiving artifacts Recording test results Email was triggered for: Failure Sending email for trigger: Failure - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Updated] (LUCENE-5127) FixedGapTermsIndex should use monotonic compression
[ https://issues.apache.org/jira/browse/LUCENE-5127?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Robert Muir updated LUCENE-5127: Attachment: LUCENE-5127.patch Patch with RAMOutputStream approach (so we don't compress/uncompress/recompress). FixedGapTermsIndex should use monotonic compression --- Key: LUCENE-5127 URL: https://issues.apache.org/jira/browse/LUCENE-5127 Project: Lucene - Core Issue Type: Improvement Reporter: Robert Muir Attachments: LUCENE-5127.patch, LUCENE-5127.patch, LUCENE-5127.patch, LUCENE-5127.patch For the addresses in the big in-memory byte[] and disk blocks, we could save a good deal of RAM here. I think this codec just never got upgraded when we added these new packed improvements, but it might be interesting to try to use it for the terms data of sorted/sortedset DV implementations. The patch works, but has nocommits and currently ignores the divisor. The annoying problem there is that we have the shared interface with get(int) for PackedInts.Mutable/Reader, but no equivalent base class for monotonics' get(long)... Still, it's enough that we could benchmark/compare for now.
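For readers following along: "monotonic compression" here means storing a non-decreasing sequence (such as term-index file addresses) as a linear model plus small packed per-entry deviations, which need far fewer bits than the absolute values. A rough standalone sketch of the idea, not Lucene's actual MonotonicBlockPackedWriter/Reader:

```java
// Sketch of monotonic compression: encode a non-decreasing long[] as
// origin + slope*i plus a small deviation per entry. The deviations, not the
// absolute addresses, are what would be bit-packed, hence the RAM savings.
public class MonotonicSketch {
    final long origin;
    final float slope;
    final long[] deviations; // in real code these would be bit-packed

    MonotonicSketch(long[] values) {
        origin = values[0];
        slope = values.length > 1
            ? (float) (values[values.length - 1] - values[0]) / (values.length - 1)
            : 0f;
        deviations = new long[values.length];
        for (int i = 0; i < values.length; i++) {
            deviations[i] = values[i] - expected(i);
        }
    }

    private long expected(int i) { return origin + (long) (slope * i); }

    long get(int i) { return expected(i) + deviations[i]; }

    /** Bits needed to pack a signed value of magnitude up to maxAbs. */
    static int bitsRequired(long maxAbs) {
        return 64 - Long.numberOfLeadingZeros(Math.max(1, 2 * maxAbs));
    }

    public static void main(String[] args) {
        // Term-index addresses: large absolute values, small jitter around a trend.
        long[] addrs = new long[1000];
        for (int i = 0; i < addrs.length; i++) {
            addrs[i] = 1_000_000L + 130L * i + (i % 7);
        }
        MonotonicSketch m = new MonotonicSketch(addrs);
        long maxDev = 0;
        for (int i = 0; i < addrs.length; i++) {
            if (m.get(i) != addrs[i]) throw new AssertionError("lossy!");
            maxDev = Math.max(maxDev, Math.abs(m.deviations[i]));
        }
        System.out.println("bits/value, raw packed: "
            + bitsRequired(addrs[addrs.length - 1])
            + " vs monotonic deviations: " + bitsRequired(maxDev));
    }
}
```

Reconstruction is exact by construction (deviation = value - model), so the scheme is lossless regardless of how well the linear model fits; the fit only affects how many bits the deviations cost.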
[JENKINS] Lucene-Solr-4.x-Linux (32bit/jdk1.8.0-ea-b99) - Build # 6707 - Still Failing!
Build: http://jenkins.thetaphi.de/job/Lucene-Solr-4.x-Linux/6707/ Java: 32bit/jdk1.8.0-ea-b99 -server -XX:+UseSerialGC All tests passed Build Log: [...truncated 15151 lines...] BUILD FAILED /mnt/ssd/jenkins/workspace/Lucene-Solr-4.x-Linux/build.xml:395: The following error occurred while executing this line: /mnt/ssd/jenkins/workspace/Lucene-Solr-4.x-Linux/build.xml:88: The following files contain @author tags, tabs or nocommits: * lucene/facet/src/java/org/apache/lucene/facet/old/OldFacetsAccumulator.java Total time: 44 minutes 21 seconds Build step 'Invoke Ant' marked build as failure Description set: Java: 32bit/jdk1.8.0-ea-b99 -server -XX:+UseSerialGC Archiving artifacts Recording test results Email was triggered for: Failure Sending email for trigger: Failure - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Commented] (LUCENE-5127) FixedGapTermsIndex should use monotonic compression
[ https://issues.apache.org/jira/browse/LUCENE-5127?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13722649#comment-13722649 ] Adrien Grand commented on LUCENE-5127: -- +1 FixedGapTermsIndex should use monotonic compression --- Key: LUCENE-5127 URL: https://issues.apache.org/jira/browse/LUCENE-5127
[jira] [Reopened] (LUCENE-5144) Nuke FacetRequest.createAggregator
[ https://issues.apache.org/jira/browse/LUCENE-5144?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Hoss Man reopened LUCENE-5144: -- Shai: your committed changes for this issue included a nocommit comment. rmuir changed it to a TODO in these commits... http://svn.apache.org/r1508137 http://svn.apache.org/r1508139 ...if this is an appropriate change and your goal was to address this on a more long-term basis, then just re-resolve, but I wanted to make sure it was on your radar in case this is a genuine "this code should not have been committed as is" situation. Nuke FacetRequest.createAggregator -- Key: LUCENE-5144 URL: https://issues.apache.org/jira/browse/LUCENE-5144 -- This message is automatically generated by JIRA. 
[jira] [Commented] (LUCENE-5144) Nuke FacetRequest.createAggregator
[ https://issues.apache.org/jira/browse/LUCENE-5144?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13722653#comment-13722653 ] Robert Muir commented on LUCENE-5144: - Thanks Hoss, I almost forgot! I changed the nocommit to a TODO temporarily just to unbreak jenkins. Nuke FacetRequest.createAggregator -- Key: LUCENE-5144 URL: https://issues.apache.org/jira/browse/LUCENE-5144
[jira] [Commented] (LUCENE-5127) FixedGapTermsIndex should use monotonic compression
[ https://issues.apache.org/jira/browse/LUCENE-5127?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13722697#comment-13722697 ] ASF subversion and git services commented on LUCENE-5127: - Commit 1508147 from [~rcmuir] in branch 'dev/trunk' [ https://svn.apache.org/r1508147 ] LUCENE-5127: FixedGapTermsIndex should use monotonic compression FixedGapTermsIndex should use monotonic compression --- Key: LUCENE-5127 URL: https://issues.apache.org/jira/browse/LUCENE-5127
[JENKINS] Lucene-Solr-trunk-Linux (32bit/jdk1.8.0-ea-b99) - Build # 6786 - Still Failing!
Build: http://jenkins.thetaphi.de/job/Lucene-Solr-trunk-Linux/6786/ Java: 32bit/jdk1.8.0-ea-b99 -server -XX:+UseSerialGC All tests passed Build Log: [...truncated 15029 lines...] BUILD FAILED /mnt/ssd/jenkins/workspace/Lucene-Solr-trunk-Linux/build.xml:389: The following error occurred while executing this line: /mnt/ssd/jenkins/workspace/Lucene-Solr-trunk-Linux/build.xml:88: The following files contain @author tags, tabs or nocommits: * lucene/facet/src/java/org/apache/lucene/facet/old/OldFacetsAccumulator.java Total time: 41 minutes 28 seconds Build step 'Invoke Ant' marked build as failure Description set: Java: 32bit/jdk1.8.0-ea-b99 -server -XX:+UseSerialGC Archiving artifacts Recording test results Email was triggered for: Failure Sending email for trigger: Failure - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Resolved] (LUCENE-5127) FixedGapTermsIndex should use monotonic compression
[ https://issues.apache.org/jira/browse/LUCENE-5127?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Robert Muir resolved LUCENE-5127. - Resolution: Fixed Fix Version/s: 5.0 Resolving for trunk only. I think the situation is already confusing in 4.x and backporting seems risky... FixedGapTermsIndex should use monotonic compression --- Key: LUCENE-5127 URL: https://issues.apache.org/jira/browse/LUCENE-5127 Fix For: 5.0
[jira] [Commented] (LUCENE-4583) StraightBytesDocValuesField fails if bytes > 32k
[ https://issues.apache.org/jira/browse/LUCENE-4583?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13722740#comment-13722740 ] David Smiley commented on LUCENE-4583: -- Cool; I didn't know of the Facet42 codec with its support for large doc values. Looks like I can use it without faceting. I'll have to try that. +1 to commit. StraightBytesDocValuesField fails if bytes > 32k Key: LUCENE-4583 URL: https://issues.apache.org/jira/browse/LUCENE-4583 Project: Lucene - Core Issue Type: Bug Components: core/index Affects Versions: 4.0, 4.1, 5.0 Reporter: David Smiley Priority: Critical Fix For: 5.0, 4.5 Attachments: LUCENE-4583.patch, LUCENE-4583.patch, LUCENE-4583.patch, LUCENE-4583.patch, LUCENE-4583.patch, LUCENE-4583.patch I didn't observe any limitations on the size of a bytes-based DocValues field value in the docs. It appears that the limit is 32k, although I didn't get any friendly error telling me that was the limit. 32k is kind of small IMO; I suspect this limit is unintended and as such is a bug. The following test fails:
{code:java}
public void testBigDocValue() throws IOException {
  Directory dir = newDirectory();
  IndexWriter writer = new IndexWriter(dir, writerConfig(false));
  Document doc = new Document();
  BytesRef bytes = new BytesRef((4 + 4) * 4097); // 4096 works
  bytes.length = bytes.bytes.length; // byte data doesn't matter
  doc.add(new StraightBytesDocValuesField("dvField", bytes));
  writer.addDocument(doc);
  writer.commit();
  writer.close();
  DirectoryReader reader = DirectoryReader.open(dir);
  DocValues docValues = MultiDocValues.getDocValues(reader, "dvField");
  // FAILS IF BYTES IS BIG!
  docValues.getSource().getBytes(0, bytes);
  reader.close();
  dir.close();
}
{code}
-- This message is automatically generated by JIRA. 
[jira] [Commented] (LUCENE-5145) Added AppendingPackedLongBuffer extended AbstractAppendingLongBuffer family (customizable compression ratio + bulk retrieval)
[ https://issues.apache.org/jira/browse/LUCENE-5145?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13722770#comment-13722770 ] Adrien Grand commented on LUCENE-5145: -- Thanks Boaz, the patch looks very good! - I like the fact that the addition of the new bulk API helped make fillValues final! - OrdinalMap.subIndexes, SortedDocValuesWriter.pending and SortedSetDocValuesWriter.pending are 0-based, so they could use the new {{AppendingPackedLongBuffer}} instead of {{AppendingDeltaPackedLongBuffer}}; can you update the patch? Added AppendingPackedLongBuffer extended AbstractAppendingLongBuffer family (customizable compression ratio + bulk retrieval) --- Key: LUCENE-5145 URL: https://issues.apache.org/jira/browse/LUCENE-5145 Project: Lucene - Core Issue Type: Improvement Reporter: Boaz Leskes Assignee: Adrien Grand Attachments: LUCENE-5145.patch Made acceptableOverheadRatio configurable. Added bulk get to the AbstractAppendingLongBuffer classes, for faster retrieval. Introduced a new variant, AppendingPackedLongBuffer, which relies solely on PackedInts as a back-end. This new class is useful where people have non-negative numbers with a fairly uniform distribution over a fixed (limited) range, e.g. facet ordinals. To distinguish it from AppendingPackedLongBuffer, the delta-based AppendingLongBuffer was renamed to AppendingDeltaPackedLongBuffer. Fixed an issue with NullReader where it didn't respect its valueCount in bulk gets.
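The distinction behind Adrien's suggestion can be sketched quickly: a delta-packed buffer subtracts a per-block minimum before packing, which only pays off when values don't already start near zero. A toy illustration (standalone, not the actual Lucene classes):

```java
// Why 0-based ordinals can use plain packing: delta packing stores
// (value - blockMin), but when min == 0 the bit width is identical, so the
// extra delta layer (and the stored minimum) buys nothing.
public class PackedVsDeltaPacked {
    /** Bits needed to pack a non-negative value up to max. */
    static int bitsRequired(long max) {
        return Math.max(1, 64 - Long.numberOfLeadingZeros(max));
    }

    public static void main(String[] args) {
        long[] ords = {0, 3, 1, 7, 2, 5}; // 0-based ordinals, as in OrdinalMap.subIndexes
        long min = Long.MAX_VALUE, max = 0;
        for (long o : ords) { min = Math.min(min, o); max = Math.max(max, o); }

        int directBits = bitsRequired(max);       // plain packed buffer
        int deltaBits = bitsRequired(max - min);  // delta packed: same when min == 0
        System.out.println("direct: " + directBits + " bits, delta: " + deltaBits + " bits");
    }
}
```

Delta packing would win for something like large, clustered file offsets (min far from 0); ordinals that start at 0 get nothing from it.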
[jira] [Created] (LUCENE-5150) WAH8DocIdSet: dense sets compression
Adrien Grand created LUCENE-5150: Summary: WAH8DocIdSet: dense sets compression Key: LUCENE-5150 URL: https://issues.apache.org/jira/browse/LUCENE-5150 Project: Lucene - Core Issue Type: Improvement Reporter: Adrien Grand Assignee: Adrien Grand Priority: Trivial -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Updated] (LUCENE-5150) WAH8DocIdSet: dense sets compression
[ https://issues.apache.org/jira/browse/LUCENE-5150?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Adrien Grand updated LUCENE-5150: - Description: In LUCENE-5101, Paul Elschot mentioned that it would be interesting to be able to encode the inverse set to also compress very dense sets. WAH8DocIdSet: dense sets compression Key: LUCENE-5150 URL: https://issues.apache.org/jira/browse/LUCENE-5150 Project: Lucene - Core Issue Type: Improvement Reporter: Adrien Grand Assignee: Adrien Grand Priority: Trivial In LUCENE-5101, Paul Elschot mentioned that it would be interesting to be able to encode the inverse set to also compress very dense sets. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Updated] (LUCENE-5122) DiskDV probably shouldn't use BlockPackedReader for SortedDV doc-to-ord
[ https://issues.apache.org/jira/browse/LUCENE-5122?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Robert Muir updated LUCENE-5122: Attachment: LUCENE-5122.patch Here's a patch. I'll do some benchmarking. DiskDV probably shouldn't use BlockPackedReader for SortedDV doc-to-ord -- Key: LUCENE-5122 URL: https://issues.apache.org/jira/browse/LUCENE-5122 Project: Lucene - Core Issue Type: Improvement Reporter: Robert Muir Attachments: LUCENE-5122.patch I don't think blocking provides any benefit here in general. We can assume the ordinals are essentially random, and since SortedDV is single-valued, it's probably better to just use the simpler PackedInts directly? I guess the only case where it would help is if you sorted your segments by that DV field. But it seems kind of weird/esoteric to sort your index by a deref'ed string value; e.g. I don't think it's even supported by SortingMP. For the SortedSet ord stream, this can exceed 2B values, so for now I think it should stay as BlockPackedReader, but it could use a large block size...
[jira] [Updated] (LUCENE-5150) WAH8DocIdSet: dense sets compression
[ https://issues.apache.org/jira/browse/LUCENE-5150?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Adrien Grand updated LUCENE-5150: - Attachment: LUCENE-5150.patch Here is a patch. It reserves an additional bit in the header to say whether the encoding should be inverted (meaning clean words are actually 0xFF instead of 0x00). It should reduce the amount of memory required to build and store dense sets. In spite of this change, compression ratios remain the same for sparse sets. For random dense sets, I observed compression ratios of 87% when the load factor is 90% and 20% when the load factor is 99% (vs. 100% before). WAH8DocIdSet: dense sets compression Key: LUCENE-5150 URL: https://issues.apache.org/jira/browse/LUCENE-5150 Project: Lucene - Core Issue Type: Improvement Reporter: Adrien Grand Assignee: Adrien Grand Priority: Trivial Attachments: LUCENE-5150.patch In LUCENE-5101, Paul Elschot mentioned that it would be interesting to be able to encode the inverse set to also compress very dense sets.
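A minimal sketch of the inverse-encoding idea, assuming nothing about WAH8DocIdSet's real internals (the class and method names below are made up for illustration): when 0xFF words dominate a bitmap, complement it before encoding and record that choice in a header bit, so runs of all-set words compress the same way runs of all-clear words do in sparse sets.

```java
/**
 * Sketch of the "inverse encoding" idea from the patch: if most words of a
 * bitmap are 0xFF (all bits set), store the complement so the dense set's
 * clean words become 0x00 and compress like a sparse set's. Purely
 * illustrative — not WAH8DocIdSet's actual code.
 */
public class InverseEncodingSketch {
    /** Decide whether to store the complement: true when 0xFF words dominate. */
    public static boolean shouldInvert(byte[] words) {
        int allSet = 0, allClear = 0;
        for (byte w : words) {
            if (w == (byte) 0xFF) allSet++;
            else if (w == 0) allClear++;
        }
        return allSet > allClear;
    }

    /**
     * Complement every word; applied before encoding when shouldInvert is
     * true. The extra header bit mentioned above would record this so a
     * reader knows to undo it while iterating.
     */
    public static byte[] invert(byte[] words) {
        byte[] out = new byte[words.length];
        for (int i = 0; i < words.length; i++) out[i] = (byte) ~words[i];
        return out;
    }
}
```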
[jira] [Resolved] (LUCENE-5144) Nuke FacetRequest.createAggregator
[ https://issues.apache.org/jira/browse/LUCENE-5144?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Shai Erera resolved LUCENE-5144. Resolution: Fixed Thanks Hoss and Rob. Sorry for letting this nocommit slip through. I removed the TODO as the intention was to remove that piece of code. Nuke FacetRequest.createAggregator -- Key: LUCENE-5144 URL: https://issues.apache.org/jira/browse/LUCENE-5144 Project: Lucene - Core Issue Type: Improvement Components: modules/facet Reporter: Shai Erera Assignee: Shai Erera Fix For: 5.0, 4.5 Attachments: LUCENE-5144.patch Aggregator was replaced by FacetsAggregator. FacetRequest has createAggregator(), which by default throws a UOE. It was left there until we migrate the aggregators to FacetsAggregator -- now all of our requests support FacetsAggregator. Aggregator is used only by StandardFacetsAccumulator, which too needs to vanish at some point, but currently it's the only one which handles sampling, complements aggregation and partitions. What I'd like to do is remove FacetRequest.createAggregator and in StandardFacetsAccumulator support only CountFacetRequest and SumScoreFacetRequest, which are the only ones that make sense for sampling and partitions. SumScore does not even support complements (which only work for counting). I'll also rename StandardFA to OldStandardFA. The plan is to eventually implement a SamplingAccumulator, PartitionsAccumulator/Aggregator and ComplementsAggregator, removing that class entirely. Until then ...
[jira] [Resolved] (LUCENE-5147) Consider returning a Map&lt;FacetRequest,FacetResult&gt; from FacetsAccumulator
[ https://issues.apache.org/jira/browse/LUCENE-5147?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Shai Erera resolved LUCENE-5147. Resolution: Won't Fix I started to do it, but this has a large impact on tests. I don't see how much value it brings, plus an app can easily put the results in a map and look up requests: {code} Map<FacetRequest,FacetResult> results = new HashMap<>(); for (FacetResult fres : facetResults) { results.put(fres.getFacetRequest(), fres); } {code} Resolving as Won't Fix for now; if this becomes a problem we can reopen. Consider returning a Map<FacetRequest,FacetResult> from FacetsAccumulator - Key: LUCENE-5147 URL: https://issues.apache.org/jira/browse/LUCENE-5147 Project: Lucene - Core Issue Type: Improvement Components: modules/facet Reporter: Shai Erera Today the API returns a List, which suggests there's an ordering going on. This may be confusing if one uses FacetsAccumulator.create, which results in a MultiFacetsAccumulator, and then the order of the results does not correspond to the order of the requests. Rather than trying to enforce ordering, a simple mapping may be better even for consuming apps, since they will be able to easily look up desired results.
[JENKINS-MAVEN] Lucene-Solr-Maven-trunk #923: POMs out of sync
Build: https://builds.apache.org/job/Lucene-Solr-Maven-trunk/923/ 2 tests failed. FAILED: org.apache.solr.cloud.BasicDistributedZkTest.org.apache.solr.cloud.BasicDistributedZkTest Error Message: 1 thread leaked from SUITE scope at org.apache.solr.cloud.BasicDistributedZkTest: 1) Thread[id=4826, name=recoveryCmdExecutor-2303-thread-1, state=RUNNABLE, group=TGRP-BasicDistributedZkTest] at java.net.PlainSocketImpl.socketConnect(Native Method) at java.net.AbstractPlainSocketImpl.doConnect(AbstractPlainSocketImpl.java:339) at java.net.AbstractPlainSocketImpl.connectToAddress(AbstractPlainSocketImpl.java:200) at java.net.AbstractPlainSocketImpl.connect(AbstractPlainSocketImpl.java:182) at java.net.SocksSocketImpl.connect(SocksSocketImpl.java:392) at java.net.Socket.connect(Socket.java:579) at org.apache.http.conn.scheme.PlainSocketFactory.connectSocket(PlainSocketFactory.java:127) at org.apache.http.impl.conn.DefaultClientConnectionOperator.openConnection(DefaultClientConnectionOperator.java:180) at org.apache.http.impl.conn.ManagedClientConnectionImpl.open(ManagedClientConnectionImpl.java:294) at org.apache.http.impl.client.DefaultRequestDirector.tryConnect(DefaultRequestDirector.java:645) at org.apache.http.impl.client.DefaultRequestDirector.execute(DefaultRequestDirector.java:480) at org.apache.http.impl.client.AbstractHttpClient.execute(AbstractHttpClient.java:906) at org.apache.http.impl.client.AbstractHttpClient.execute(AbstractHttpClient.java:805) at org.apache.http.impl.client.AbstractHttpClient.execute(AbstractHttpClient.java:784) at org.apache.solr.client.solrj.impl.HttpSolrServer.request(HttpSolrServer.java:365) at org.apache.solr.client.solrj.impl.HttpSolrServer.request(HttpSolrServer.java:180) at org.apache.solr.cloud.SyncStrategy$1.run(SyncStrategy.java:291) at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145) at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615) at 
java.lang.Thread.run(Thread.java:724) Stack Trace: com.carrotsearch.randomizedtesting.ThreadLeakError: 1 thread leaked from SUITE scope at org.apache.solr.cloud.BasicDistributedZkTest: 1) Thread[id=4826, name=recoveryCmdExecutor-2303-thread-1, state=RUNNABLE, group=TGRP-BasicDistributedZkTest] at java.net.PlainSocketImpl.socketConnect(Native Method) at java.net.AbstractPlainSocketImpl.doConnect(AbstractPlainSocketImpl.java:339) at java.net.AbstractPlainSocketImpl.connectToAddress(AbstractPlainSocketImpl.java:200) at java.net.AbstractPlainSocketImpl.connect(AbstractPlainSocketImpl.java:182) at java.net.SocksSocketImpl.connect(SocksSocketImpl.java:392) at java.net.Socket.connect(Socket.java:579) at org.apache.http.conn.scheme.PlainSocketFactory.connectSocket(PlainSocketFactory.java:127) at org.apache.http.impl.conn.DefaultClientConnectionOperator.openConnection(DefaultClientConnectionOperator.java:180) at org.apache.http.impl.conn.ManagedClientConnectionImpl.open(ManagedClientConnectionImpl.java:294) at org.apache.http.impl.client.DefaultRequestDirector.tryConnect(DefaultRequestDirector.java:645) at org.apache.http.impl.client.DefaultRequestDirector.execute(DefaultRequestDirector.java:480) at org.apache.http.impl.client.AbstractHttpClient.execute(AbstractHttpClient.java:906) at org.apache.http.impl.client.AbstractHttpClient.execute(AbstractHttpClient.java:805) at org.apache.http.impl.client.AbstractHttpClient.execute(AbstractHttpClient.java:784) at org.apache.solr.client.solrj.impl.HttpSolrServer.request(HttpSolrServer.java:365) at org.apache.solr.client.solrj.impl.HttpSolrServer.request(HttpSolrServer.java:180) at org.apache.solr.cloud.SyncStrategy$1.run(SyncStrategy.java:291) at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145) at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615) at java.lang.Thread.run(Thread.java:724) at __randomizedtesting.SeedInfo.seed([2F2D4C74C67902F4]:0) FAILED: 
org.apache.solr.cloud.BasicDistributedZkTest.org.apache.solr.cloud.BasicDistributedZkTest Error Message: There are still zombie threads that couldn't be terminated: 1) Thread[id=4826, name=recoveryCmdExecutor-2303-thread-1, state=RUNNABLE, group=TGRP-BasicDistributedZkTest] at java.net.PlainSocketImpl.socketConnect(Native Method) at java.net.AbstractPlainSocketImpl.doConnect(AbstractPlainSocketImpl.java:339) at java.net.AbstractPlainSocketImpl.connectToAddress(AbstractPlainSocketImpl.java:200) at java.net.AbstractPlainSocketImpl.connect(AbstractPlainSocketImpl.java:182) at
[jira] [Commented] (SOLR-4981) BasicDistributedZkTest fails on FreeBSD jenkins due to thread leak.
[ https://issues.apache.org/jira/browse/SOLR-4981?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13722891#comment-13722891 ] Mark Miller commented on SOLR-4981: --- Just tried tweaking the connect timeout - it was fairly high at 45 seconds and the thread linger may just not have been long enough. I dropped it to 15s and will see how that goes. BasicDistributedZkTest fails on FreeBSD jenkins due to thread leak. --- Key: SOLR-4981 URL: https://issues.apache.org/jira/browse/SOLR-4981 Project: Solr Issue Type: Test Components: SolrCloud Reporter: Mark Miller Assignee: Mark Miller Priority: Minor
[jira] [Commented] (SOLR-5059) 4.4 refguide pages on schemaless schema rest api for adding fields
[ https://issues.apache.org/jira/browse/SOLR-5059?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13722961#comment-13722961 ] Steve Rowe commented on SOLR-5059: -- bq. I just happened to be looking at the FAQ yesterday and noticed that it has a question Does Solr support schemaless mode? that probably needs to reflect this new support. Thanks Jack, I've updated the answer. 4.4 refguide pages on schemaless schema rest api for adding fields Key: SOLR-5059 URL: https://issues.apache.org/jira/browse/SOLR-5059 Project: Solr Issue Type: Sub-task Components: documentation Reporter: Hoss Man Assignee: Steve Rowe breaking off from parent... * https://cwiki.apache.org/confluence/display/solr/Documents%2C+Fields%2C+and+Schema+Design ** SOLR-4897: Add solr/example/example-schemaless/, an example config set for schemaless mode. (Steve Rowe) *** CT: Schemaless in general needs to be added. The most likely place today is a new page under https://cwiki.apache.org/confluence/display/solr/Documents%2C+Fields%2C+and+Schema+Design * https://cwiki.apache.org/confluence/display/solr/Schema+API ** SOLR-3251: Dynamically add fields to schema. (Steve Rowe, Robert Muir, yonik) *** CT: Add to https://cwiki.apache.org/confluence/display/solr/Schema+API ** SOLR-5010: Add support for creating copy fields to the Fields REST API (gsingers) *** CT: Add to https://cwiki.apache.org/confluence/display/solr/Schema+API
[jira] [Commented] (LUCENE-4335) Builds should regenerate all generated sources
[ https://issues.apache.org/jira/browse/LUCENE-4335?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13722974#comment-13722974 ] Steve Rowe commented on LUCENE-4335: bq. I don't want to set up a fixed JFlex on Jenkins, I want to download it with IVY, so before resolving this issue we should have a JFlex version available. If Steve Rowe is not able to release the version on Maven, we should maybe fork jflex on Google Code and make a release including the ANT task. I can't promise I'll release JFlex anytime soon, sorry. If you want to fork, you can certainly do that. FYI, Gerwin Klein, the JFlex founder, has done some work (maybe all that needs to be done? not sure at this point) to convert JFlex to a BSD license. I'll review the source and see what state that effort is in - BSD licensing should simplify forking, I think. Builds should regenerate all generated sources -- Key: LUCENE-4335 URL: https://issues.apache.org/jira/browse/LUCENE-4335 Project: Lucene - Core Issue Type: Improvement Reporter: Michael McCandless Assignee: Michael McCandless Attachments: LUCENE-4335.patch, LUCENE-4335.patch, LUCENE-4335.patch We have more and more sources that are generated programmatically (query parsers, fuzzy levN tables from Moman, packed ints specialized decoders, etc.), and it's dangerous because developers may directly edit the generated sources and forget to edit the meta-source. It's happened to me several times ... most recently just after landing the BlockPostingsFormat branch. I think we should re-gen all of these in our builds and fail the build if this creates a difference. I know some generators (eg JavaCC) embed timestamps and so always create mods ... we can leave them out of this for starters (or maybe post-process the sources to remove the timestamps) ...
[jira] [Commented] (LUCENE-5146) AnalyzingSuggester sort order doesn't respect the actual weight
[ https://issues.apache.org/jira/browse/LUCENE-5146?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13722981#comment-13722981 ] Uwe Schindler commented on LUCENE-5146: --- Sorry but your code is of course wrong :-) AnalyzingSuggester sort order doesn't respect the actual weight --- Key: LUCENE-5146 URL: https://issues.apache.org/jira/browse/LUCENE-5146 Project: Lucene - Core Issue Type: Bug Components: modules/spellchecker Affects Versions: 4.4 Reporter: Simon Willnauer Fix For: 5.0, 4.5 Attachments: LUCENE-5146.patch Uwe would say: sorry but your code is wrong. We don't actually read the weight value in AnalyzingComparator, which can cause really odd suggestions since we read parts of the input as the weight. None of our tests catch that, so I will go ahead and add some tests for it as well.
[jira] [Commented] (SOLR-5082) Implement ie=charset parameter
[ https://issues.apache.org/jira/browse/SOLR-5082?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13722988#comment-13722988 ] Uwe Schindler commented on SOLR-5082: - [~elyograg]: Are you fine with this code? From my tests here I have seen no slowdown for query-string parsing, it is as fast as before; any slowdown is smaller than measurable. In any case, the current URLDecoder is much more efficient than the one embedded into Jetty (the one with broken UTF8 in earlier versions). The slowest part in the whole code is MultiMapSolrParams#add, because it reallocates arrays all the time on duplicate keys... Implement ie=charset parameter -- Key: SOLR-5082 URL: https://issues.apache.org/jira/browse/SOLR-5082 Project: Solr Issue Type: Improvement Affects Versions: 4.4 Reporter: Shawn Heisey Assignee: Uwe Schindler Priority: Minor Fix For: 5.0, 4.5 Attachments: SOLR-5082.patch, SOLR-5082.patch Allow a user to send a query or update to Solr in a character set other than UTF-8 and inform Solr what charset to use with an ie parameter, for input encoding. This was discussed in SOLR-4265 and SOLR-4283. Changing the default charset is a bad idea because distributed search (SolrCloud) relies on UTF-8.
[jira] [Updated] (SOLR-4953) solrconfig.xml parsing should fail hard if there are multiple indexConfig/ blocks
[ https://issues.apache.org/jira/browse/SOLR-4953?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Hoss Man updated SOLR-4953: --- Attachment: SOLR-4953.patch it occurred to me last night that instead of just dealing explicitly with indexConfig here, we could probably help improve the validation of a lot of config parsing with a relatively simple change to Config.getNode: throw an error in any case where Solr is looking for a single Node/String/Int/Boolean and multiple values are found instead. I wasn't sure how badly this might break things, but i've been testing it out today, and except for a few cases where the text() xpath expression was getting abused (instead of a simple node check), it seems fairly straightforward. So here's a patch that broadens the scope of the issue to fail hard if any single-valued config option is found more than once in the config. solrconfig.xml parsing should fail hard if there are multiple indexConfig/ blocks --- Key: SOLR-4953 URL: https://issues.apache.org/jira/browse/SOLR-4953 Project: Solr Issue Type: Bug Reporter: Hoss Man Assignee: Hoss Man Attachments: SOLR-4953.patch, SOLR-4953.patch while reviewing some code i think i noticed that if there are multiple {{indexConfig/}} blocks in solrconfig.xml, one just wins and the rest are ignored. this should be a hard failure situation, and we should have a TestBadConfig method to verify it.
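The stricter lookup described above can be sketched with the JDK's own XPath API (the class and method names here are illustrative, not Solr's actual Config class): evaluate the expression as a node set and fail hard when more than one node matches, instead of silently taking the first.

```java
import java.io.ByteArrayInputStream;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;

/**
 * Sketch of a Config.getNode-style lookup that fails hard on ambiguity.
 * Names are illustrative — this is not Solr's real Config code.
 */
public class StrictConfigSketch {
    /** Returns the single matching node, null if none, and throws if several match. */
    public static Node getSingleNode(Document doc, String xpath) throws Exception {
        NodeList nodes = (NodeList) XPathFactory.newInstance().newXPath()
                .evaluate(xpath, doc, XPathConstants.NODESET);
        if (nodes.getLength() > 1) {
            throw new IllegalStateException(
                    nodes.getLength() + " matches for " + xpath + ", expected at most 1");
        }
        return nodes.getLength() == 0 ? null : nodes.item(0);
    }

    /** Helper to parse an XML string into a DOM document. */
    public static Document parse(String xml) throws Exception {
        return DocumentBuilderFactory.newInstance().newDocumentBuilder()
                .parse(new ByteArrayInputStream(xml.getBytes("UTF-8")));
    }
}
```

With this shape, a solrconfig.xml containing two indexConfig blocks would raise an error at load time rather than having one silently win.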
[jira] [Updated] (SOLR-4953) Config XML parsing should fail hard if an xpath is expected to match at most one node/string/int/boolean and multiple values are found
[ https://issues.apache.org/jira/browse/SOLR-4953?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Hoss Man updated SOLR-4953: --- Description: while reviewing some code i think i noticed that if there are multiple {{indexConfig/}} blocks in solrconfig.xml, one just wins and the rest are ignored. this should be a hard failure situation, and we should have a TestBadConfig method to verify it. --- broadened goal of issue to fail if configuration contains multiple nodes/values for any option where only one value is expected. was: while reviewing some code i think i noticed that if there are multiple {{indexConfig/}} blocks in solrconfig.xml, one just wins and the rest are ignored. this should be a hard failure situation, and we should have a TestBadConfig method to verify it. Issue Type: Improvement (was: Bug) Summary: Config XML parsing should fail hard if an xpath is expected to match at most one node/string/int/boolean and multiple values are found (was: solrconfig.xml parsing should fail hard if there are multiple indexConfig/ blocks) Config XML parsing should fail hard if an xpath is expected to match at most one node/string/int/boolean and multiple values are found Key: SOLR-4953 URL: https://issues.apache.org/jira/browse/SOLR-4953 Project: Solr Issue Type: Improvement Reporter: Hoss Man Assignee: Hoss Man Attachments: SOLR-4953.patch, SOLR-4953.patch while reviewing some code i think i noticed that if there are multiple {{indexConfig/}} blocks in solrconfig.xml, one just wins and the rest are ignored. this should be a hard failure situation, and we should have a TestBadConfig method to verify it. --- broadened goal of issue to fail if configuration contains multiple nodes/values for any option where only one value is expected.
Re: [Solr Wiki] Update of UsingMailingLists by HossMan
Just happened to notice at the end of that update: Lucid Imagination and Sematext also maintain SOLR-powered archives at: http://www.lucidimagination.com/search/ and ... s.b. LucidWorks and http://find.searchhub.org/, although the latter actually says Welcome to the temporary SearchHub. To learn more about this site, please click here and has a bad image URL. In any case, the name/text is outdated. And, on the searchhub.org menu for Reference Materials it has Solr Reference Guide with this link: http://searchhub.org/category/reference-materials/solr-reference-guide-2/ That doesn't mention or link to the new Apache Solr Reference Guide. Maybe you could pass these comments over to whoever works with SearchHub. -- Jack Krupansky -Original Message- From: Apache Wiki Sent: Monday, July 29, 2013 6:06 PM To: Apache Wiki Subject: [Solr Wiki] Update of UsingMailingLists by HossMan Dear Wiki user, You have subscribed to a wiki page or wiki category on Solr Wiki for change notification. The UsingMailingLists page has been changed by HossMan: https://wiki.apache.org/solr/UsingMailingLists?action=diff&rev1=9&rev2=10 Comment: ref guide links == Some general guidelines == *First and foremost: Try to find the answer before posting. There's no faster way to get the answer to your question than finding it's already been answered. Some of the places to look are: - *The SOLR wiki at: http://lucene.apache.org/solr/. + * The Official Solr Documentation: https://lucene.apache.org/solr/documentation.html + * In particular, check the Solr Reference Guide for the version of Solr you are using, or check the [[https://cwiki.apache.org/confluence/display/solr/|the live draft]] of the next version of the guide for the latest updates. + * The Solr Community Wiki: https://wiki.apache.org/solr/ - *Search the users' list archives. Try the nabble searchable archive at: http://old.nabble.com/Solr-f14479.html. 
Lucid Imagination and Sematext also maintain SOLR-powered archives at: http://www.lucidimagination.com/search/ and http://search-lucene.com/. + * Search the users' list archives. Try the nabble searchable archive at: http://old.nabble.com/Solr-f14479.html. Lucid Imagination and Sematext also maintain SOLR-powered archives at: http://www.lucidimagination.com/search/ and http://search-lucene.com/. *And, of course, web searches (Google, Cuil, or other favorite web search engine). *Be aware of all the advice in the extremely well written: [[http://catb.org/~esr/faqs/smart-questions.html|How to ask questions the smart way]]
[jira] [Created] (SOLR-5087) CoreAdminHandler.handleMergeAction generating NullPointerException
Patrick Hunt created SOLR-5087: -- Summary: CoreAdminHandler.handleMergeAction generating NullPointerException Key: SOLR-5087 URL: https://issues.apache.org/jira/browse/SOLR-5087 Project: Solr Issue Type: Bug Affects Versions: 4.4 Reporter: Patrick Hunt Assignee: Patrick Hunt Fix For: 5.0, 4.5 CoreAdminHandler.handleMergeAction is generating NullPointerException If directoryFactory.get(...) in handleMergeAction throws an exception the original error is lost as the finally clause will attempt to clean up and generate an NPE. (notice that dirsToBeReleased is pre-allocated with nulls that are not filled in) {noformat} ERROR org.apache.solr.core.SolrCore: java.lang.NullPointerException at org.apache.solr.core.CachingDirectoryFactory.release(CachingDirectoryFactory.java:430) at org.apache.solr.handler.admin.CoreAdminHandler.handleMergeAction(CoreAdminHandler.java:380) at org.apache.solr.handler.admin.CoreAdminHandler.handleRequestBody(CoreAdminHandler.java:180) {noformat}
[jira] [Updated] (SOLR-5087) CoreAdminHandler.handleMergeAction generating NullPointerException
[ https://issues.apache.org/jira/browse/SOLR-5087?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Patrick Hunt updated SOLR-5087: --- Attachment: SOLR-5087.patch This patch fixes the problem by catching/logging/rethrowing the original problem. I've also made some changes to the code to make it less likely that the cleanup (finally clause) will fail. The test I added fails w/o the fix applied. This patch applies/passes for me on both trunk and branch4x. CoreAdminHandler.handleMergeAction generating NullPointerException -- Key: SOLR-5087 URL: https://issues.apache.org/jira/browse/SOLR-5087 Project: Solr Issue Type: Bug Affects Versions: 4.4 Reporter: Patrick Hunt Assignee: Patrick Hunt Fix For: 5.0, 4.5 Attachments: SOLR-5087.patch CoreAdminHandler.handleMergeAction is generating NullPointerException If directoryFactory.get(...) in handleMergeAction throws an exception the original error is lost as the finally clause will attempt to clean up and generate an NPE. (notice that dirsToBeReleased is pre-allocated with nulls that are not filled in) {noformat} ERROR org.apache.solr.core.SolrCore: java.lang.NullPointerException at org.apache.solr.core.CachingDirectoryFactory.release(CachingDirectoryFactory.java:430) at org.apache.solr.handler.admin.CoreAdminHandler.handleMergeAction(CoreAdminHandler.java:380) at org.apache.solr.handler.admin.CoreAdminHandler.handleRequestBody(CoreAdminHandler.java:180) {noformat}
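The failure mode here is a general Java pattern, sketched below with made-up names (not the actual CoreAdminHandler code): an exception thrown from a finally block replaces the exception already in flight, so iterating a null-filled cleanup array turns the original error into an NPE. A null check in the cleanup lets the original propagate.

```java
/**
 * Sketch of the bug shape described in SOLR-5087: a resource array
 * pre-allocated with nulls is cleaned up in a finally block, and when the
 * acquire step throws before filling it, the cleanup NPEs and masks the
 * real error. Names are illustrative, not Solr's actual code.
 */
public class FinallyNpeSketch {
    interface Resource { void release(); }

    /** Buggy shape: the finally clause assumes every slot was filled. */
    static void buggy(int n) {
        Resource[] acquired = new Resource[n]; // pre-allocated with nulls
        try {
            throw new RuntimeException("original failure"); // acquire fails early
        } finally {
            for (Resource r : acquired) {
                r.release(); // NPE here replaces "original failure"
            }
        }
    }

    /** Fixed shape: null-check during cleanup so the original exception escapes. */
    static void fixed(int n) {
        Resource[] acquired = new Resource[n];
        try {
            throw new RuntimeException("original failure");
        } finally {
            for (Resource r : acquired) {
                if (r != null) r.release();
            }
        }
    }
}
```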
[jira] [Commented] (SOLR-5082) Implement ie=charset parameter
[ https://issues.apache.org/jira/browse/SOLR-5082?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13723126#comment-13723126 ] ASF subversion and git services commented on SOLR-5082: --- Commit 1508237 from [~thetaphi] in branch 'dev/branches/branch_4x' [ https://svn.apache.org/r1508237 ] Merged revision(s) 1508236 from lucene/dev/trunk: SOLR-5082: The encoding of URL-encoded query parameters can be changed with the ie (input encoding) parameter, e.g. select?q=m%FCller&ie=ISO-8859-1. The default is UTF-8. To change the encoding of POSTed content, use the Content-Type HTTP header Implement ie=charset parameter -- Key: SOLR-5082 URL: https://issues.apache.org/jira/browse/SOLR-5082 Project: Solr Issue Type: Improvement Affects Versions: 4.4 Reporter: Shawn Heisey Assignee: Uwe Schindler Priority: Minor Fix For: 5.0, 4.5 Attachments: SOLR-5082.patch, SOLR-5082.patch Allow a user to send a query or update to Solr in a character set other than UTF-8 and inform Solr what charset to use with an ie parameter, for input encoding. This was discussed in SOLR-4265 and SOLR-4283. Changing the default charset is a bad idea because distributed search (SolrCloud) relies on UTF-8.
[jira] [Commented] (SOLR-5082) Implement ie=charset parameter
[ https://issues.apache.org/jira/browse/SOLR-5082?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13723124#comment-13723124 ] ASF subversion and git services commented on SOLR-5082: --- Commit 1508236 from [~thetaphi] in branch 'dev/trunk' [ https://svn.apache.org/r1508236 ] SOLR-5082: The encoding of URL-encoded query parameters can be changed with the ie (input encoding) parameter, e.g. select?q=m%FCller&ie=ISO-8859-1. The default is UTF-8. To change the encoding of POSTed content, use the Content-Type HTTP header Implement ie=charset parameter -- Key: SOLR-5082 URL: https://issues.apache.org/jira/browse/SOLR-5082 Project: Solr Issue Type: Improvement Affects Versions: 4.4 Reporter: Shawn Heisey Assignee: Uwe Schindler Priority: Minor Fix For: 5.0, 4.5 Attachments: SOLR-5082.patch, SOLR-5082.patch Allow a user to send a query or update to Solr in a character set other than UTF-8 and inform Solr what charset to use with an ie parameter, for input encoding. This was discussed in SOLR-4265 and SOLR-4283. Changing the default charset is a bad idea because distributed search (SolrCloud) relies on UTF-8.
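For illustration, the effect of the ie parameter can be approximated with the JDK's java.net.URLDecoder (a hedged sketch only — Solr ships its own, more efficient decoder rather than this class): the same percent-encoded bytes decode to different strings depending on which charset the caller names.

```java
import java.net.URLDecoder;

/**
 * Sketch of what a caller-selected input encoding means: decode
 * percent-encoded query-parameter bytes with the given charset instead of
 * assuming UTF-8. Illustrative only — not Solr's actual parameter parser.
 */
public class InputEncodingSketch {
    public static String decode(String percentEncoded, String charset) throws Exception {
        // %FC is a valid "ü" in ISO-8859-1 but an invalid byte sequence in UTF-8,
        // which is exactly why the charset must be chosen by the sender.
        return URLDecoder.decode(percentEncoded, charset);
    }
}
```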
[jira] [Resolved] (SOLR-5082) Implement ie=charset parameter
[ https://issues.apache.org/jira/browse/SOLR-5082?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Uwe Schindler resolved SOLR-5082. - Resolution: Fixed
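To see why the server needs to be told the input encoding, here is a small self-contained sketch (class name `InputEncodingDemo` is made up for illustration): the same query term percent-encodes to different byte sequences under ISO-8859-1 and UTF-8, so a server that assumes UTF-8 would mis-decode an ISO-8859-1 request unless the client adds `ie=ISO-8859-1`.

```java
import java.net.URLEncoder;

// Demonstrates why the ie parameter exists: the same term percent-encodes
// differently depending on the charset the client used.
public class InputEncodingDemo {
    public static String encode(String s, String charset) {
        try {
            return URLEncoder.encode(s, charset);
        } catch (java.io.UnsupportedEncodingException e) {
            throw new RuntimeException(e);
        }
    }

    public static void main(String[] args) {
        // ISO-8859-1 encodes u-umlaut as the single byte 0xFC...
        System.out.println("ISO-8859-1: " + encode("m\u00FCller", "ISO-8859-1")); // m%FCller
        // ...while UTF-8 uses two bytes, 0xC3 0xBC.
        System.out.println("UTF-8:      " + encode("m\u00FCller", "UTF-8"));      // m%C3%BCller
        // A client sending the first form should append &ie=ISO-8859-1 so the
        // server decodes %FC correctly instead of assuming UTF-8.
    }
}
```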
[jira] [Assigned] (SOLR-3284) StreamingUpdateSolrServer swallows exceptions
[ https://issues.apache.org/jira/browse/SOLR-3284?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Shawn Heisey reassigned SOLR-3284: -- Assignee: Shawn Heisey StreamingUpdateSolrServer swallows exceptions - Key: SOLR-3284 URL: https://issues.apache.org/jira/browse/SOLR-3284 Project: Solr Issue Type: Improvement Components: clients - java Affects Versions: 3.5, 4.0-ALPHA Reporter: Shawn Heisey Assignee: Shawn Heisey Attachments: SOLR-3284.patch StreamingUpdateSolrServer eats exceptions thrown by lower-level code, such as HttpClient, when doing adds. It may happen with other methods too, though I know that query and deleteByQuery will throw exceptions. I believe that this is a result of the queue/Runner design. That's what makes SUSS perform better, but it means you sacrifice the ability to programmatically determine that there was a problem with your update. All errors are logged via slf4j, but that's not terribly helpful except for determining what went wrong after the fact. When using CommonsHttpSolrServer, I've been able to rely on getting an exception thrown by pretty much any error, letting me use try/catch to detect problems. There's probably enough dependent code out there that it would not be a good idea to change the design of SUSS, unless there were alternate constructors or additional methods available to configure new/old behavior. Fixing this is probably not trivial, so it's probably a better idea to come up with a new server object based on CHSS. This is outside my current skillset.
[jira] [Commented] (SOLR-3284) StreamingUpdateSolrServer swallows exceptions
[ https://issues.apache.org/jira/browse/SOLR-3284?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13723187#comment-13723187 ] Shawn Heisey commented on SOLR-3284: I have a proposed patch that is very likely to need updating because it is so old. There is an issue for CloudSolrServer (the one to route documents to the correct shard) that has a concurrent mode which apparently still throws exceptions. Can that be adapted for use here?
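The queue/Runner design described above does not have to swallow errors. The following is a generic sketch of the pattern, not SUSS's actual code (`ErrorCapturingRunner` is a made-up name): record the first background failure in an `AtomicReference` so the caller can re-throw it, instead of only logging it.

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicReference;

// Sketch of the pattern the issue asks for (not Solr's actual API):
// a queue/runner that records the first background failure so the caller
// can detect it programmatically, rather than only logging it.
public class ErrorCapturingRunner implements AutoCloseable {
    private final ExecutorService pool = Executors.newFixedThreadPool(2);
    private final AtomicReference<Throwable> firstError = new AtomicReference<>();

    public void submit(Runnable task) {
        pool.execute(() -> {
            try {
                task.run();
            } catch (Throwable t) {
                firstError.compareAndSet(null, t); // keep only the first failure
            }
        });
    }

    // Callers can poll this between batches to detect failures early.
    public Throwable firstError() { return firstError.get(); }

    @Override
    public void close() throws Exception {
        pool.shutdown();
        pool.awaitTermination(10, TimeUnit.SECONDS);
        Throwable t = firstError.get();
        if (t != null) {
            throw new RuntimeException("background update failed", t);
        }
    }
}
```

A caller wraps its adds in `submit(...)` and gets an exception on `close()` instead of silence, which restores the try/catch workflow the reporter relied on with CommonsHttpSolrServer.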
[jira] [Commented] (SOLR-5087) CoreAdminHandler.handleMergeAction generating NullPointerException
[ https://issues.apache.org/jira/browse/SOLR-5087?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13723190#comment-13723190 ] Mark Miller commented on SOLR-5087: --- Looks good to me - there is a little back-compat breakage in the merge command, but I think that's fine. Just calling it out in case anyone else has a concern there. CoreAdminHandler.handleMergeAction generating NullPointerException -- Key: SOLR-5087 URL: https://issues.apache.org/jira/browse/SOLR-5087 Project: Solr Issue Type: Bug Affects Versions: 4.4 Reporter: Patrick Hunt Assignee: Patrick Hunt Fix For: 5.0, 4.5 Attachments: SOLR-5087.patch CoreAdminHandler.handleMergeAction is generating a NullPointerException. If directoryFactory.get(...) in handleMergeAction throws an exception, the original error is lost because the finally clause will attempt to clean up and generate an NPE. (Notice that dirsToBeReleased is pre-allocated with nulls that are not filled in.) {noformat} ERROR org.apache.solr.core.SolrCore: java.lang.NullPointerException at org.apache.solr.core.CachingDirectoryFactory.release(CachingDirectoryFactory.java:430) at org.apache.solr.handler.admin.CoreAdminHandler.handleMergeAction(CoreAdminHandler.java:380) at org.apache.solr.handler.admin.CoreAdminHandler.handleRequestBody(CoreAdminHandler.java:180) {noformat}
[jira] [Commented] (SOLR-5087) CoreAdminHandler.handleMergeAction generating NullPointerException
[ https://issues.apache.org/jira/browse/SOLR-5087?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13723200#comment-13723200 ] Patrick Hunt commented on SOLR-5087: Oh, yes. I forgot about that; it seemed like an internal operation though. LMK if it should be reverted. (It was cleaner to push the List usage through, but not critical.)
[jira] [Commented] (SOLR-5087) CoreAdminHandler.handleMergeAction generating NullPointerException
[ https://issues.apache.org/jira/browse/SOLR-5087?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13723203#comment-13723203 ] Mark Miller commented on SOLR-5087: --- bq. it seemed like an internal operation though Technically it's part of the UpdateProcessor chain user plugin point APIs - but we are kind of ad hoc with back compat in these APIs - I think it's rare enough to do something custom with the merge command that I'm not personally worried about it though.
[jira] [Commented] (SOLR-5057) queryResultCache should not related with the order of fq's list
[ https://issues.apache.org/jira/browse/SOLR-5057?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13723268#comment-13723268 ] Feihong Huang commented on SOLR-5057: - So, can anyone make a final decision on this feature? Hi Erickson, if we decide to fix this, who is responsible for submitting the patch? Can I do it? queryResultCache should not related with the order of fq's list --- Key: SOLR-5057 URL: https://issues.apache.org/jira/browse/SOLR-5057 Project: Solr Issue Type: Improvement Components: search Affects Versions: 4.0, 4.1, 4.2, 4.3 Reporter: Feihong Huang Assignee: Erick Erickson Priority: Minor Attachments: SOLR-5057.patch, SOLR-5057.patch Original Estimate: 48h Remaining Estimate: 48h There are two queries with the same meaning below, but case 2 can't use the queryResultCache after case 1 is executed. case1: q=*:*&fq=field1:value1&fq=field2:value2 case2: q=*:*&fq=field2:value2&fq=field1:value1 I think the queryResultCache should not be related to the order of the fq list.
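The fix the issue asks for amounts to canonicalizing the fq clauses before building the cache key. This is a self-contained sketch with a made-up helper (`FqCacheKey`), not Solr's actual cache code: sort a copy of the filter list so both orderings map to the same entry.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

// Sketch (hypothetical helper, not Solr's actual queryResultCache code):
// make the cache key insensitive to fq order by sorting a copy of the
// filter list before building the key.
public class FqCacheKey {
    public static String key(String q, List<String> fqs) {
        List<String> sorted = new ArrayList<>(fqs);
        Collections.sort(sorted); // canonical order: fq order no longer matters
        return q + "|" + sorted;
    }

    public static void main(String[] args) {
        String k1 = key("*:*", Arrays.asList("field1:value1", "field2:value2"));
        String k2 = key("*:*", Arrays.asList("field2:value2", "field1:value1"));
        System.out.println(k1.equals(k2)); // both orderings hit the same entry
    }
}
```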
[jira] [Commented] (SOLR-5087) CoreAdminHandler.handleMergeAction generating NullPointerException
[ https://issues.apache.org/jira/browse/SOLR-5087?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13723362#comment-13723362 ] Shalin Shekhar Mangar commented on SOLR-5087: - bq. there is a little back compat breakage in the merge command, but I think that's fine. That should be fine. Patch looks good. Thanks Patrick!
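The bug pattern behind SOLR-5087 is worth spelling out. Below is a minimal stand-alone reproduction with stand-in types (`Dir`, `acquire` are made up; this is the assumed shape, not the actual CoreAdminHandler code): a finally block that blindly releases a pre-allocated array of nulls throws an NPE that masks the original exception from the failed acquire, and a null check restores the original error.

```java
// Minimal reproduction of the "NPE in finally masks the real error" pattern
// described in the issue, using stand-in types.
public class FinallyMaskingDemo {
    interface Dir { void release(); }

    static Dir acquire() { throw new IllegalStateException("original failure"); }

    // Buggy: the NPE from releasing nulls replaces the original exception.
    public static void buggy() {
        Dir[] dirs = new Dir[2]; // pre-allocated with nulls, like dirsToBeReleased
        try {
            dirs[0] = acquire();
            dirs[1] = acquire();
        } finally {
            for (Dir d : dirs) {
                d.release(); // NPE when acquire() failed before filling the slots
            }
        }
    }

    // Fixed: only release what was actually acquired.
    public static void fixed() {
        Dir[] dirs = new Dir[2];
        try {
            dirs[0] = acquire();
            dirs[1] = acquire();
        } finally {
            for (Dir d : dirs) {
                if (d != null) {
                    d.release();
                }
            }
        }
    }
}
```

With the null check, the caller sees the original `IllegalStateException` instead of the misleading `NullPointerException` from the cleanup path.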
[jira] [Commented] (SOLR-5081) Highly parallel document insertion hangs SolrCloud
[ https://issues.apache.org/jira/browse/SOLR-5081?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13723378#comment-13723378 ] Noble Paul commented on SOLR-5081: -- Can you please throw some more light on the system?
# numShards
# Replication factor
# maxShardsPerNode (I guess it is 1)
# Average size per doc
# VM startup params (-Xmx, -Xms, GC params etc.)
# How are you indexing? Are you using SolrJ and the CloudSolrServer? How many clients are used to index the data?
Highly parallel document insertion hangs SolrCloud -- Key: SOLR-5081 URL: https://issues.apache.org/jira/browse/SOLR-5081 Project: Solr Issue Type: Bug Components: SolrCloud Affects Versions: 4.3.1 Reporter: Mike Schrag Attachments: threads.txt If I do a highly parallel document load using a Hadoop cluster into an 18-node SolrCloud cluster, I can deadlock Solr every time. The ulimits on the nodes are:
core file size (blocks, -c) 0
data seg size (kbytes, -d) unlimited
scheduling priority (-e) 0
file size (blocks, -f) unlimited
pending signals (-i) 1031181
max locked memory (kbytes, -l) unlimited
max memory size (kbytes, -m) unlimited
open files (-n) 32768
pipe size (512 bytes, -p) 8
POSIX message queues (bytes, -q) 819200
real-time priority (-r) 0
stack size (kbytes, -s) 10240
cpu time (seconds, -t) unlimited
max user processes (-u) 515590
virtual memory (kbytes, -v) unlimited
file locks (-x) unlimited
The open file count is only around 4000 when this happens. If I bounce all the servers, things start working again, which makes me think this is Solr and not ZK. I'll attach the stack trace from one of the servers.
[jira] [Created] (SOLR-5088) ClassCastException is thrown when trying to use custom SearchHandler.
Pavel Yaskevich created SOLR-5088: - Summary: ClassCastException is thrown when trying to use custom SearchHandler. Key: SOLR-5088 URL: https://issues.apache.org/jira/browse/SOLR-5088 Project: Solr Issue Type: Bug Affects Versions: 4.4 Reporter: Pavel Yaskevich Hi guys, I'm trying to replace solr.SearchHandler with a custom one in solrconfig.xml for one of the stores, and it throws the following exception: {noformat} Caused by: org.apache.solr.common.SolrException: RequestHandler init failure at org.apache.solr.core.RequestHandlers.initHandlersFromConfig(RequestHandlers.java:167) at org.apache.solr.core.SolrCore.init(SolrCore.java:772) ... 13 more Caused by: org.apache.solr.common.SolrException: Error Instantiating Request Handler, org.my.solr.index.CustomSearchHandler failed to instantiate org.apache.solr.request.SolrRequestHandler at org.apache.solr.core.SolrCore.createInstance(SolrCore.java:551) at org.apache.solr.core.SolrCore.createRequestHandler(SolrCore.java:603) at org.apache.solr.core.RequestHandlers.initHandlersFromConfig(RequestHandlers.java:153) ... 14 more Caused by: java.lang.ClassCastException: class org.my.solr.index.CustomSearchHandler at java.lang.Class.asSubclass(Class.java:3116) at org.apache.solr.core.SolrResourceLoader.findClass(SolrResourceLoader.java:433) at org.apache.solr.core.SolrResourceLoader.findClass(SolrResourceLoader.java:381) at org.apache.solr.core.SolrCore.createInstance(SolrCore.java:530) ... 16 more {noformat} I actually tried extending SearchHandler and implementing SolrRequestHandler, as well as extending RequestHandlerBase, and it's all the same ClassCastException result... org.my.solr.index.CustomSearchHandler is definitely in the class path and recompiled on every retry. Maybe I'm doing something terribly wrong?
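The stack trace points at `Class.asSubclass` inside `SolrResourceLoader.findClass`, which throws `ClassCastException` whenever the loaded class does not implement the requested type *as seen by that classloader*. A plausible cause here (an assumption about this report, not a confirmed diagnosis) is a second copy of the Solr jars on the plugin classpath, so the custom handler implements a different `SolrRequestHandler` class than the one the core sees. The mechanism can be shown with stand-in types:

```java
// Demonstrates the mechanism behind the reported stack trace:
// Class.asSubclass throws ClassCastException when the candidate class does
// not implement the requested type. Handler/OtherHandler are stand-ins for
// two distinct copies of SolrRequestHandler loaded by different
// classloaders (an assumed cause, shown here with ordinary types).
public class AsSubclassDemo {
    interface Handler {}                  // the copy Solr's loader sees
    interface OtherHandler {}             // the copy the plugin was compiled against
    static class CustomHandler implements OtherHandler {} // "wrong" interface

    public static boolean castFails() {
        try {
            // Same call SolrResourceLoader.findClass makes before instantiating.
            CustomHandler.class.asSubclass(Handler.class);
            return false;
        } catch (ClassCastException expected) {
            return true;
        }
    }
}
```

If that is the cause, the fix is to ensure the plugin jar contains only the custom classes and resolves Solr's classes from the server's own classpath.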
[jira] [Commented] (SOLR-5081) Highly parallel document insertion hangs SolrCloud
[ https://issues.apache.org/jira/browse/SOLR-5081?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13723402#comment-13723402 ] Mike Schrag commented on SOLR-5081: ---
1. numShards=20
2. RF=3
3. maxShardsPerNode=1000 (aka just a big number ... we overcommit shards in this environment)
4. not very big ... maybe 0.5-1k
5. -Xms10g -Xmx10g -XX:MaxPermSize=1G -XX:+UseConcMarkSweepGC -XX:+CMSIncrementalMode -XX:CMSInitiatingOccupancyFraction=60 -XX:-OmitStackTraceInFastThrow
6. SolrJ + CloudSolrServer + when you say clients, do you mean threads, or actual client JVM instances? Talking more generically in terms of threads, I know it works at around 15-20 threads, but 100 threads makes it go sadfaced.
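Since the cluster is reported stable at roughly 15-20 concurrent indexing threads, a client-side mitigation while the deadlock is investigated is to cap the number of in-flight requests. This is a hypothetical sketch (`BoundedIndexer` is a made-up name, not part of SolrJ), not a fix for the underlying deadlock:

```java
import java.util.concurrent.Semaphore;

// Hypothetical client-side mitigation sketch (not a fix for the reported
// deadlock): bound the number of in-flight indexing requests with a
// semaphore, since the cluster is reported stable at ~15-20 threads.
public class BoundedIndexer {
    private final Semaphore inFlight;

    public BoundedIndexer(int maxConcurrent) {
        this.inFlight = new Semaphore(maxConcurrent);
    }

    public void index(Runnable sendBatch) {
        inFlight.acquireUninterruptibly(); // blocks once maxConcurrent are outstanding
        try {
            sendBatch.run(); // e.g. one add/commit batch against the cluster
        } finally {
            inFlight.release();
        }
    }
}
```

Each Hadoop task would route its batches through one shared `BoundedIndexer`, keeping total concurrency under the threshold regardless of how many mapper threads exist.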
[JENKINS] Lucene-Solr-NightlyTests-trunk - Build # 335 - Failure
Build: https://builds.apache.org/job/Lucene-Solr-NightlyTests-trunk/335/ 2 tests failed. FAILED: org.apache.lucene.codecs.simpletext.TestSimpleTextPostingsFormat.testEmptyField Error Message: Stack Trace: java.lang.AssertionError at __randomizedtesting.SeedInfo.seed([6D64DFCE9911F67B:B07B8389FB8EACE0]:0) at org.junit.Assert.fail(Assert.java:92) at org.junit.Assert.assertTrue(Assert.java:43) at org.junit.Assert.assertTrue(Assert.java:54) at org.apache.lucene.index.BasePostingsFormatTestCase.testEmptyField(BasePostingsFormatTestCase.java:1154) at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57) at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) at java.lang.reflect.Method.invoke(Method.java:606) at com.carrotsearch.randomizedtesting.RandomizedRunner.invoke(RandomizedRunner.java:1559) at com.carrotsearch.randomizedtesting.RandomizedRunner.access$600(RandomizedRunner.java:79) at com.carrotsearch.randomizedtesting.RandomizedRunner$6.evaluate(RandomizedRunner.java:737) at com.carrotsearch.randomizedtesting.RandomizedRunner$7.evaluate(RandomizedRunner.java:773) at com.carrotsearch.randomizedtesting.RandomizedRunner$8.evaluate(RandomizedRunner.java:787) at org.apache.lucene.util.TestRuleSetupTeardownChained$1.evaluate(TestRuleSetupTeardownChained.java:50) at org.apache.lucene.util.TestRuleFieldCacheSanity$1.evaluate(TestRuleFieldCacheSanity.java:51) at org.apache.lucene.util.AbstractBeforeAfterRule$1.evaluate(AbstractBeforeAfterRule.java:46) at com.carrotsearch.randomizedtesting.rules.SystemPropertiesInvariantRule$1.evaluate(SystemPropertiesInvariantRule.java:55) at org.apache.lucene.util.TestRuleThreadAndTestName$1.evaluate(TestRuleThreadAndTestName.java:49) at org.apache.lucene.util.TestRuleIgnoreAfterMaxFailures$1.evaluate(TestRuleIgnoreAfterMaxFailures.java:70) at 
org.apache.lucene.util.TestRuleMarkFailure$1.evaluate(TestRuleMarkFailure.java:48) at com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36) at com.carrotsearch.randomizedtesting.ThreadLeakControl$StatementRunner.run(ThreadLeakControl.java:358) at com.carrotsearch.randomizedtesting.ThreadLeakControl.forkTimeoutingTask(ThreadLeakControl.java:782) at com.carrotsearch.randomizedtesting.ThreadLeakControl$3.evaluate(ThreadLeakControl.java:442) at com.carrotsearch.randomizedtesting.RandomizedRunner.runSingleTest(RandomizedRunner.java:746) at com.carrotsearch.randomizedtesting.RandomizedRunner$3.evaluate(RandomizedRunner.java:648) at com.carrotsearch.randomizedtesting.RandomizedRunner$4.evaluate(RandomizedRunner.java:682) at com.carrotsearch.randomizedtesting.RandomizedRunner$5.evaluate(RandomizedRunner.java:693) at org.apache.lucene.util.AbstractBeforeAfterRule$1.evaluate(AbstractBeforeAfterRule.java:46) at org.apache.lucene.util.TestRuleStoreClassName$1.evaluate(TestRuleStoreClassName.java:42) at com.carrotsearch.randomizedtesting.rules.SystemPropertiesInvariantRule$1.evaluate(SystemPropertiesInvariantRule.java:55) at com.carrotsearch.randomizedtesting.rules.NoShadowingOrOverridesOnMethodsRule$1.evaluate(NoShadowingOrOverridesOnMethodsRule.java:39) at com.carrotsearch.randomizedtesting.rules.NoShadowingOrOverridesOnMethodsRule$1.evaluate(NoShadowingOrOverridesOnMethodsRule.java:39) at com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36) at org.apache.lucene.util.TestRuleAssertionsRequired$1.evaluate(TestRuleAssertionsRequired.java:43) at org.apache.lucene.util.TestRuleMarkFailure$1.evaluate(TestRuleMarkFailure.java:48) at org.apache.lucene.util.TestRuleIgnoreAfterMaxFailures$1.evaluate(TestRuleIgnoreAfterMaxFailures.java:70) at org.apache.lucene.util.TestRuleIgnoreTestSuites$1.evaluate(TestRuleIgnoreTestSuites.java:55) at 
com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36) at com.carrotsearch.randomizedtesting.ThreadLeakControl$StatementRunner.run(ThreadLeakControl.java:358) at java.lang.Thread.run(Thread.java:724) FAILED: org.apache.lucene.codecs.simpletext.TestSimpleTextPostingsFormat.testEmptyFieldAndEmptyTerm Error Message: Stack Trace: java.lang.AssertionError at __randomizedtesting.SeedInfo.seed([6D64DFCE9911F67B:EF1EF4C8B9869F55]:0) at org.junit.Assert.fail(Assert.java:92) at org.junit.Assert.assertTrue(Assert.java:43) at org.junit.Assert.assertTrue(Assert.java:54) at org.apache.lucene.index.BasePostingsFormatTestCase.testEmptyFieldAndEmptyTerm(BasePostingsFormatTestCase.java:1177)
[jira] [Updated] (SOLR-4951) randomize merge policy testing in solr
[ https://issues.apache.org/jira/browse/SOLR-4951?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Hoss Man updated SOLR-4951: --- Attachment: SOLR-4951.patch here's a patch showing what i had in mind. Some of the final getter/setter methods in MergePolicy make writing a true proxy class challenging, but i think this works out well enough, and i included some reflection based tests to try and help future proof against the risk of changes being made to the API that result in the class not behaving the same as whatever random impl it wraps. randomize merge policy testing in solr -- Key: SOLR-4951 URL: https://issues.apache.org/jira/browse/SOLR-4951 Project: Solr Issue Type: Sub-task Reporter: Hoss Man Attachments: SOLR-4951.patch split off from SOLR-4942... * add a new RandomMergePolicy that implements MergePolicy by proxying to another instance selected at creation using one of the LuceneTestCase.new...MergePolicy methods * updated test configs to refer to this new MergePolicy * borrow the tests.shardhandler.randomSeed logic in SolrTestCaseJ4 to give our RandomMergePolicy a consistent seed at runtime. 
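The seeded-selection idea behind RandomMergePolicy can be sketched with stand-in types (`RandomPolicyPicker` is made up; the real patch proxies Lucene's `MergePolicy`): pick one of several policy factories from a fixed seed, so a test run is random across runs but reproducible from the logged seed.

```java
import java.util.Random;
import java.util.function.Supplier;

// Sketch of the "consistent seed at runtime" idea from the issue, with
// stand-in types: the same seed always selects the same delegate, so a
// failing run can be reproduced from its logged seed.
public class RandomPolicyPicker {
    @SafeVarargs
    public static <T> T pick(long seed, Supplier<T>... factories) {
        Random r = new Random(seed); // fixed seed => deterministic choice
        return factories[r.nextInt(factories.length)].get();
    }
}
```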
[jira] [Commented] (SOLR-5088) ClassCastException is thrown when trying to use custom SearchHandler.
[ https://issues.apache.org/jira/browse/SOLR-5088?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13723438#comment-13723438 ] Pavel Yaskevich commented on SOLR-5088: --- The same (ClassCastException) happens when I try to extend QueryComponent; I don't declare any methods in the extending class though - could that be a problem? <searchComponent name="query" class="org.my.solr.index.CustomQueryComponent" /> Here is the class definition I use: {noformat} package org.my.solr.index; import org.apache.solr.handler.component.QueryComponent; public class CustomQueryComponent extends QueryComponent {} {noformat} It throws the following: {noformat} org.apache.solr.common.SolrException: Error Instantiating SearchComponent, org.my.solr.index.CustomQueryComponent failed to instantiate org.apache.solr.handler.component.SearchComponent at org.apache.solr.core.SolrCore.init(SolrCore.java:835) at org.apache.solr.core.SolrCore.init(SolrCore.java:629) at org.apache.solr.core.CoreContainer.createFromLocal(CoreContainer.java:622) at org.apache.solr.core.CoreContainer.create(CoreContainer.java:657) at org.apache.solr.core.CoreContainer$1.call(CoreContainer.java:364) at org.apache.solr.core.CoreContainer$1.call(CoreContainer.java:356) at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:334) at java.util.concurrent.FutureTask.run(FutureTask.java:166) at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471) at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:334) at java.util.concurrent.FutureTask.run(FutureTask.java:166) at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145) at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615) at java.lang.Thread.run(Thread.java:724) Caused by: org.apache.solr.common.SolrException: Error Instantiating SearchComponent, org.my.solr.index.CustomQueryComponent failed to instantiate org.apache.solr.handler.component.SearchComponent at 
org.apache.solr.core.SolrCore.createInstance(SolrCore.java:551) at org.apache.solr.core.SolrCore.createInitInstance(SolrCore.java:586) at org.apache.solr.core.SolrCore.initPlugins(SolrCore.java:2173) at org.apache.solr.core.SolrCore.initPlugins(SolrCore.java:2167) at org.apache.solr.core.SolrCore.initPlugins(SolrCore.java:2200) at org.apache.solr.core.SolrCore.loadSearchComponents(SolrCore.java:1231) at org.apache.solr.core.SolrCore.init(SolrCore.java:766) ... 13 more Caused by: java.lang.ClassCastException: class org.my.solr.index.CustomQueryComponent at java.lang.Class.asSubclass(Class.java:3116) at org.apache.solr.core.SolrResourceLoader.findClass(SolrResourceLoader.java:433) at org.apache.solr.core.SolrResourceLoader.findClass(SolrResourceLoader.java:381) at org.apache.solr.core.SolrCore.createInstance(SolrCore.java:530) ... 19 more ERROR 22:49:54,923 null:org.apache.solr.common.SolrException: Unable to create core: test.users at org.apache.solr.core.CoreContainer.recordAndThrow(CoreContainer.java:1150) at org.apache.solr.core.CoreContainer.create(CoreContainer.java:666) at org.apache.solr.core.CoreContainer$1.call(CoreContainer.java:364) at org.apache.solr.core.CoreContainer$1.call(CoreContainer.java:356) at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:334) at java.util.concurrent.FutureTask.run(FutureTask.java:166) at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471) at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:334) at java.util.concurrent.FutureTask.run(FutureTask.java:166) at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145) at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615) at java.lang.Thread.run(Thread.java:724) {noformat} ClassCastException is thrown when trying to use custom SearchHandler. 
- Key: SOLR-5088 URL: https://issues.apache.org/jira/browse/SOLR-5088 Project: Solr Issue Type: Bug Affects Versions: 4.4 Reporter: Pavel Yaskevich