[jira] [Commented] (SOLR-4817) Solr should not fall back to the back compat built in solr.xml in SolrCloud mode.
[ https://issues.apache.org/jira/browse/SOLR-4817?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13760008#comment-13760008 ]

Shalin Shekhar Mangar commented on SOLR-4817:
---------------------------------------------

Just FYI, the copyMinConf, copyMinFullSetup and copySolrHomeToTemp methods throw the following exception with Solrj tests:

{quote}
[junit4] ERROR 0.69s | MultiCoreExampleJettyTest.testDeleteInstanceDir
[junit4] Throwable #1: java.lang.RuntimeException: Cannot find resource: /Users/shalinmangar/work/oss/solr-trunk/solr/build/solr-solrj/test/J0/solr/collection1
[junit4]    at __randomizedtesting.SeedInfo.seed([2AFBC83FDA207BB2:4160F4A68E96AEF0]:0)
[junit4]    at org.apache.solr.SolrTestCaseJ4.getFile(SolrTestCaseJ4.java:1571)
[junit4]    at org.apache.solr.SolrTestCaseJ4.TEST_HOME(SolrTestCaseJ4.java:1576)
[junit4]    at org.apache.solr.SolrTestCaseJ4.copyMinConf(SolrTestCaseJ4.java:1618)
[junit4]    at org.apache.solr.SolrTestCaseJ4.copyMinConf(SolrTestCaseJ4.java:1603)
[junit4]    at org.apache.solr.client.solrj.embedded.MultiCoreExampleJettyTest.testDeleteInstanceDir(MultiCoreExampleJettyTest.java:117)
{quote}

You can reproduce the error above with the patch in SOLR-5023.

Solr should not fall back to the back compat built in solr.xml in SolrCloud mode.
---------------------------------------------------------------------------------
                Key: SOLR-4817
                URL: https://issues.apache.org/jira/browse/SOLR-4817
            Project: Solr
         Issue Type: Bug
         Components: SolrCloud
           Reporter: Mark Miller
           Assignee: Erick Erickson
           Priority: Minor
            Fix For: 4.5, 5.0
        Attachments: SOLR-4817.patch, SOLR-4817.patch, SOLR-4817.patch, SOLR-4817.patch, SOLR-4817.patch

A hard error is much more useful, and this built-in solr.xml is not very good for SolrCloud: with the old-style solr.xml with cores in it you won't have persistence, and with the new style it's not really ideal either. I think failing on this instead makes solr.home easier to debug - but only in SolrCloud mode for now, due to back compat. We might want to pull the whole internal solr.xml for 5.0.
--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators.
For more information on JIRA, see: http://www.atlassian.com/software/jira

---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Commented] (LUCENE-2562) Make Luke a Lucene/Solr Module
[ https://issues.apache.org/jira/browse/LUCENE-2562?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13760019#comment-13760019 ]

Ajay Bhat commented on LUCENE-2562:
-----------------------------------

The TokenStream reset call was needed to display the tokens generated by the Analyzer. I think that's the only change that was required.

The main problem for me is that the analyzers above are not giving the result, which I've been looking into. I had figured that since PatternAnalyzer is deprecated it would not give the result, so it might be a good idea to remove it from the list of analyzers. But there are also some analyzers that aren't deprecated, like SnowballAnalyzer and QueryAutoStopWordAnalyzer.

Also, as per the schedule of my proposal, I've done some work on the themes of the application. I'll contribute another patch for that soon.

Make Luke a Lucene/Solr Module
------------------------------
                Key: LUCENE-2562
                URL: https://issues.apache.org/jira/browse/LUCENE-2562
            Project: Lucene - Core
         Issue Type: Task
           Reporter: Mark Miller
             Labels: gsoc2013
        Attachments: LUCENE-2562.patch, luke1.jpg, luke2.jpg, luke3.jpg, Luke-ALE-1.png, Luke-ALE-2.png, Luke-ALE-3.png, Luke-ALE-4.png, Luke-ALE-5.png

see "RE: Luke - in need of maintainer": http://markmail.org/message/m4gsto7giltvrpuf
Web-based Luke: http://markmail.org/message/4xwps7p7ifltme5q

I think it would be great if there was a version of Luke that always worked with trunk - and it would also be great if it was easier to match Luke jars with Lucene versions. While I'd like to get GWT Luke into the mix as well, I think the easiest starting point is to straight-port Luke to another UI toolkit before abstracting out DTO objects that both GWT Luke and Pivot Luke could share.

I've started slowly converting Luke's use of Thinlet to Apache Pivot. I haven't had, and don't have, a lot of time for this at the moment, but I've plugged away here and there over the past week or two. There is still a *lot* to do.
[jira] [Commented] (SOLR-3765) Wrong handling of documents with same id in cross collection searches
[ https://issues.apache.org/jira/browse/SOLR-3765?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13760023#comment-13760023 ]

Furkan KAMACI commented on SOLR-3765:
-------------------------------------

Has anything been done for this issue?

Wrong handling of documents with same id in cross collection searches
---------------------------------------------------------------------
                Key: SOLR-3765
                URL: https://issues.apache.org/jira/browse/SOLR-3765
            Project: Solr
         Issue Type: Bug
         Components: search, SolrCloud
   Affects Versions: 4.0
        Environment: Self-built version of Solr from 4.x branch (revision )
           Reporter: Per Steffensen
             Labels: collections, inconsistency, numFound, search

Dialog with myself from the solr-users mailing list. Per Steffensen wrote:

{quote}
Hi

Due to what we have seen in recent tests, I am in doubt about how Solr search is actually supposed to behave.
* Searching with distrib=true&q=*:*&rows=10&collection=x,y,z&sort=timestamp asc
** Is Solr supposed to return the 10 documents with the lowest timestamp across all documents in all slices of collections x, y and z, or is it supposed to just pick 10 random documents from those slices and sort only those 10 randomly selected documents?
** Put another way: is this search supposed to be consistent, returning exactly the same set of documents when performed several times (with no documents updated between consecutive searches)?
{quote}

Fortunately I believe the answer is that it ought to return the 10 documents with the lowest timestamp across all documents in all slices of collections x, y and z. The reason I asked was that I got different responses for consecutive similar requests. Now I believe it can be explained by the bug described below.

I guess that when you do cross-collection/shard searches, the request-handling Solr forwards the query to all involved shards simultaneously and merges sub-results into the final result as they are returned from the shards. Because of the "consider documents with the same id as the same document even though they come from different collections" bug, it is somewhat random (depending on which shards respond first/last), for a given id, which collection the document with that specific id is taken from. And if documents with the same id from different collections have different timestamps, it is random where that document ends up in the final sorted result. So I believe this inconsistency can be explained by the bug described below.

{quote}
* A search returns a numFound field telling how many documents in all match the search criteria, even though not all of those documents are returned by the search. It is a crazy question to ask, but I will ask it anyway because we actually see a problem with this: isn't it correct that two searches which differ only in the rows number (documents to be returned) should always return the same value for numFound?
{quote}

Well, I found out myself what the problem is (or seems to be) - see:
http://lucene.472066.n3.nabble.com/Changing-value-of-start-parameter-affects-numFound-td2460645.html
http://lucene.472066.n3.nabble.com/numFound-inconsistent-for-different-rows-param-td3997269.html
http://lucene.472066.n3.nabble.com/Solr-v3-5-0-numFound-changes-when-paging-through-results-on-8-shard-cluster-td3990400.html

Until 4.0 this bug could be ignored, because it was OK for a cross-shards search to consider documents with identical ids as duplicates and therefore only return/count one of them. It is still OK in 4.0 within the same collection, but across collections identical ids should not be considered duplicates and should not reduce the documents returned/counted. So I believe this feature has now become a bug in 4.0 when it comes to cross-collection searches.

{quote}
Thanks!
Regards, Steff
{quote}
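The merge behaviour Steffensen describes can be sketched as a toy model. This is plain Java, not Solr code; the class and method names are hypothetical stand-ins for the coordinator's merge step, and the "keep the first hit per id" rule models the reported bug:

```java
import java.util.*;

// Hypothetical, simplified model of the merge step described above: each shard
// returns (id, timestamp) hits, and the coordinator merges them into one sorted
// result, collapsing hits that share an id. Which duplicate survives depends on
// shard response order, which is what makes cross-collection results unstable.
public class CrossCollectionMerge {

    // One hit from one shard: document id plus the sort value (timestamp).
    record Hit(String id, long timestamp) {}

    // Merge shard responses in arrival order, keeping only the first hit seen
    // for each id (the "same id == same document" assumption that is wrong
    // across collections).
    static List<Hit> mergeCollapsingById(List<List<Hit>> shardResponses) {
        Map<String, Hit> byId = new LinkedHashMap<>();
        for (List<Hit> shard : shardResponses) {
            for (Hit h : shard) {
                byId.putIfAbsent(h.id(), h);
            }
        }
        List<Hit> merged = new ArrayList<>(byId.values());
        merged.sort(Comparator.comparingLong(Hit::timestamp));
        return merged;
    }

    public static void main(String[] args) {
        // Collections x and y both contain a document with id "doc1",
        // but with different timestamps.
        List<Hit> shardX = List.of(new Hit("doc1", 100), new Hit("doc2", 50));
        List<Hit> shardY = List.of(new Hit("doc1", 10), new Hit("doc3", 75));

        // Same data, different arrival order => a different "doc1" survives,
        // hence a different sort position for it in the final result.
        List<Hit> xFirst = mergeCollapsingById(List.of(shardX, shardY));
        List<Hit> yFirst = mergeCollapsingById(List.of(shardY, shardX));
        System.out.println(xFirst); // the surviving doc1 has timestamp 100 here
        System.out.println(yFirst); // the surviving doc1 has timestamp 10 here
    }
}
```

With rows smaller than the merged total, the collapsed duplicates also shrink the returned/counted set by a shard-order-dependent amount, which matches the inconsistent numFound observations in the linked threads.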
[jira] [Commented] (LUCENE-5057) Hunspell stemmer generates multiple tokens
[ https://issues.apache.org/jira/browse/LUCENE-5057?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13760034#comment-13760034 ]

Lukas Vlcek commented on LUCENE-5057:
-------------------------------------

Agreed, Chris. Thanks.

Hunspell stemmer generates multiple tokens
------------------------------------------
                Key: LUCENE-5057
                URL: https://issues.apache.org/jira/browse/LUCENE-5057
            Project: Lucene - Core
         Issue Type: Improvement
   Affects Versions: 4.3
           Reporter: Luca Cavanna
           Assignee: Adrien Grand

The Hunspell stemmer seems to be generating multiple tokens: the original token plus the available stems. That might be a good thing in some cases, but it is a different behaviour compared to the other stemmers and causes problems as well. I would rather have an option to decide whether it should output only the available stems, or the stems plus the original token. I'm not sure, though, whether it's possible to have only a single stem indexed, which would be even better in my opinion. When I look at how Snowball works, only one token is indexed (the stem), and that works great. Probably there's something I'm missing in how Hunspell works.

Here is my issue: I have a query composed of multiple terms, which is analyzed using stemming, and a boolean query is generated out of it. All is fine when adding all clauses as SHOULD (OR operator), but if I add all clauses as MUST (AND operator), then I can get back only the documents that contain the stem originated by exactly the same original word.

Example for the Dutch language I'm working with: "fiets" (bicycle in Dutch); its plural is "fietsen". If I index "fietsen" I get both "fietsen" and "fiets" indexed, but if I index "fiets" I get only "fiets" indexed. When I query for "fietsen whatever" I get the following boolean query: field:fiets field:fietsen field:whatever. If I apply the AND operator and use MUST clauses for each subquery, then I can only find the documents that originally contained "fietsen", not the ones that originally contained "fiets", which is not really what stemming is about.

Any thoughts on this? I also wonder whether it could be a dictionary issue, since I see that different words with "fiets" as their root don't get the same stems, and using the AND operator at query time is a big issue. I would love to contribute on this and am looking forward to your feedback.
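The AND-clause failure mode Cavanna reports can be reduced to a toy model. This is plain Java, not Lucene's analysis API; analyze() is a hard-coded stand-in for the Hunspell filter's "original token plus stems" output, using the Dutch example from the report:

```java
import java.util.*;

// Hypothetical toy model of the indexing/matching behaviour described above.
// "Analysis" here is a hard-coded stand-in for the Hunspell stemmer: it emits
// the original token plus its stems, the way the report says the filter does.
public class MultiStemAndQuery {

    static Set<String> analyze(String term) {
        // Hunspell-style: original token + stem. Hard-coded for the Dutch
        // example from the report; not a real dictionary lookup.
        if (term.equals("fietsen")) return Set.of("fietsen", "fiets");
        return Set.of(term);
    }

    // AND semantics with one MUST clause per emitted token: a document must
    // contain every token the query analysis produced.
    static boolean matchesAllTokens(Set<String> indexedTokens, String queryTerm) {
        return indexedTokens.containsAll(analyze(queryTerm));
    }

    public static void main(String[] args) {
        Set<String> docFietsen = analyze("fietsen"); // indexed as {fietsen, fiets}
        Set<String> docFiets   = analyze("fiets");   // indexed as {fiets} only

        // Querying "fietsen" with MUST clauses: only the document that
        // originally contained "fietsen" matches, defeating stemming.
        System.out.println(matchesAllTokens(docFietsen, "fietsen")); // true
        System.out.println(matchesAllTokens(docFiets, "fietsen"));   // false
    }
}
```

The asymmetry is the point: the "fiets" document never gets a "fietsen" token at index time, so any MUST clause on "fietsen" excludes it, exactly as described.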
[jira] [Commented] (SOLR-5201) UIMAUpdateRequestProcessor should reuse the AnalysisEngine
[ https://issues.apache.org/jira/browse/SOLR-5201?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13760038#comment-13760038 ]

Jun Ohtani commented on SOLR-5201:
----------------------------------

Thanks Tommaso. Sorry, I misunderstood the relationship between UIMAUpdateRequestProcessorFactory and AnalysisEngine. My co-worker uses this patch; it works without problems. Will you commit the above patch to branch_4x?

UIMAUpdateRequestProcessor should reuse the AnalysisEngine
----------------------------------------------------------
                Key: SOLR-5201
                URL: https://issues.apache.org/jira/browse/SOLR-5201
            Project: Solr
         Issue Type: Improvement
         Components: contrib - UIMA
   Affects Versions: 4.4
           Reporter: Tommaso Teofili
           Assignee: Tommaso Teofili
            Fix For: 4.5, 5.0
        Attachments: SOLR-5201-ae-cache-every-request_branch_4x.patch, SOLR-5201-ae-cache-only-single-request_branch_4x.patch

As reported in http://markmail.org/thread/2psiyl4ukaejl4fx, UIMAUpdateRequestProcessor instantiates an AnalysisEngine for each request, which is bad for performance; therefore it'd be nice if such AEs could be reused whenever that's possible.
[jira] [Commented] (LUCENE-4734) FastVectorHighlighter Overlapping Proximity Queries Do Not Highlight
[ https://issues.apache.org/jira/browse/LUCENE-4734?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13760081#comment-13760081 ]

Simon Willnauer commented on LUCENE-4734:
-----------------------------------------

bq. The real question is: does it make more sense to invest time in LUCENE-2878 rather than further complicating FVH? FVH works great for simple phrase and single term queries but it has so many corner cases..

+1 let's do it

+1 to revert the change!

FastVectorHighlighter Overlapping Proximity Queries Do Not Highlight
--------------------------------------------------------------------
                Key: LUCENE-4734
                URL: https://issues.apache.org/jira/browse/LUCENE-4734
            Project: Lucene - Core
         Issue Type: Bug
         Components: modules/highlighter
   Affects Versions: 4.0, 4.1, 5.0
           Reporter: Ryan Lauck
           Assignee: Adrien Grand
             Labels: fastvectorhighlighter, highlighter
            Fix For: 5.0, 4.5
        Attachments: LUCENE-4734-2.patch, lucene-4734.patch, LUCENE-4734.patch

If a proximity phrase query overlaps with any other query term, it will not be highlighted.

Example text: A B C D E F G

Example queries:
"B E"~10 D (D will be highlighted instead of "B C D E")
"B E"~10 "C F"~10 (nothing will be highlighted)

This can be traced to the FieldPhraseList constructor's inner while loop. For the first example query, the first TermInfo popped off the stack will be B. The second TermInfo will be D, which will not be found in the submap for "B E"~10 and will trigger a failed match.
[jira] [Commented] (LUCENE-5101) make it easier to plugin different bitset implementations to CachingWrapperFilter
[ https://issues.apache.org/jira/browse/LUCENE-5101?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13760102#comment-13760102 ]

ASF subversion and git services commented on LUCENE-5101:
---------------------------------------------------------

Commit 1520525 from [~jpountz] in branch 'dev/trunk'
[ https://svn.apache.org/r1520525 ]
LUCENE-5101: Make it easier to plugin different bitset implementations to CachingWrapperFilter.

make it easier to plugin different bitset implementations to CachingWrapperFilter
---------------------------------------------------------------------------------
                Key: LUCENE-5101
                URL: https://issues.apache.org/jira/browse/LUCENE-5101
            Project: Lucene - Core
         Issue Type: Improvement
           Reporter: Robert Muir
        Attachments: DocIdSetBenchmark.java, LUCENE-5101.patch, LUCENE-5101.patch, LUCENE-5101.patch, LUCENE-5101.patch

Currently this is possible, but it's not so friendly:

{code}
protected DocIdSet docIdSetToCache(DocIdSet docIdSet, AtomicReader reader) throws IOException {
  if (docIdSet == null) {
    // this is better than returning null, as the nonnull result can be cached
    return EMPTY_DOCIDSET;
  } else if (docIdSet.isCacheable()) {
    return docIdSet;
  } else {
    final DocIdSetIterator it = docIdSet.iterator();
    // null is allowed to be returned by iterator(),
    // in this case we wrap with the sentinel set,
    // which is cacheable.
    if (it == null) {
      return EMPTY_DOCIDSET;
    } else {
      /* INTERESTING PART */
      final FixedBitSet bits = new FixedBitSet(reader.maxDoc());
      bits.or(it);
      return bits;
      /* END INTERESTING PART */
    }
  }
}
{code}

Is there any value to having all this other logic in the protected API? It seems like something that's not useful for a subclass... Maybe this stuff can become final, and INTERESTING PART calls a simpler method, something like:

{code}
protected DocIdSet cacheImpl(DocIdSetIterator iterator, AtomicReader reader) {
  final FixedBitSet bits = new FixedBitSet(reader.maxDoc());
  bits.or(iterator);
  return bits;
}
{code}
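The template-method shape proposed above can be sketched without Lucene at all. This is a plain-Java stand-in, not the actual CachingWrapperFilter API: java.util.BitSet and an int stream play the roles of FixedBitSet and DocIdSetIterator, and the null/sentinel plumbing is elided:

```java
import java.util.BitSet;
import java.util.PrimitiveIterator;
import java.util.stream.IntStream;

// Sketch (plain-Java stand-in, not the Lucene API) of the proposed refactor:
// the null/cacheable/sentinel plumbing lives in a final method, and subclasses
// override only the small cacheImpl hook that decides which bitset
// implementation backs the cache.
public class CachingFilterSketch {

    // Stand-in for docIdSetToCache(DocIdSet, AtomicReader): takes a stream of
    // matching doc ids and the segment's maxDoc. Final, so the fixed plumbing
    // (elided here) cannot be broken by subclasses.
    protected final BitSet docIdSetToCache(IntStream docIds, int maxDoc) {
        // The "interesting part" is delegated to the overridable hook.
        return cacheImpl(docIds.iterator(), maxDoc);
    }

    // The overridable hook, mirroring the proposed cacheImpl(iterator, reader).
    // The default behaves like the FixedBitSet path in the original code.
    protected BitSet cacheImpl(PrimitiveIterator.OfInt it, int maxDoc) {
        BitSet bits = new BitSet(maxDoc);
        while (it.hasNext()) {
            bits.set(it.nextInt());
        }
        return bits;
    }

    public static void main(String[] args) {
        // A subclass would override cacheImpl to plug in, say, a compressed
        // set; callers never see the plumbing.
        CachingFilterSketch f = new CachingFilterSketch();
        BitSet cached = f.docIdSetToCache(IntStream.of(1, 5, 7), 10);
        System.out.println(cached); // {1, 5, 7}
    }
}
```

The design point is the same as in the issue: subclasses get exactly one small, well-defined extension point instead of having to re-implement (and possibly get wrong) the null and sentinel handling.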
[jira] [Created] (SOLR-5217) CachedSqlEntity fails with stored procedure
Hardik Upadhyay created SOLR-5217:
----------------------------------

Summary: CachedSqlEntity fails with stored procedure
    Key: SOLR-5217
    URL: https://issues.apache.org/jira/browse/SOLR-5217
Project: Solr
Issue Type: Bug
Components: contrib - DataImportHandler
Reporter: Hardik Upadhyay

When using DIH with CachedSqlEntityProcessor and importing data from MS SQL using stored procedures, it imports data for nested entities only once, and then every call with different arguments for nested entities is served only from the cache. My db-data-config is attached.
[jira] [Updated] (SOLR-5217) CachedSqlEntity fails with stored procedure
[ https://issues.apache.org/jira/browse/SOLR-5217?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Hardik Upadhyay updated SOLR-5217:
----------------------------------
Attachment: db-data-config.xml

CachedSqlEntity fails with stored procedure
-------------------------------------------
                Key: SOLR-5217
                URL: https://issues.apache.org/jira/browse/SOLR-5217
            Project: Solr
         Issue Type: Bug
         Components: contrib - DataImportHandler
           Reporter: Hardik Upadhyay
        Attachments: db-data-config.xml

When using DIH with CachedSqlEntityProcessor and importing data from MS SQL using stored procedures, it imports data for nested entities only once, and then every call with different arguments for nested entities is served only from the cache. My db-data-config is attached.
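The reported behaviour can be sketched as a toy model. This is plain Java, not DIH code, and it models only what the report describes (the stored procedure runs once and later calls with different arguments are answered from the cache); the class and method names are hypothetical:

```java
import java.util.*;
import java.util.function.Function;

// Hypothetical toy model of the caching behaviour described in the report:
// a cached entity runs its (stored-procedure) query once, then serves every
// later request from that cache, even when the procedure arguments differ.
public class CachedEntitySketch {
    private Map<String, List<String>> cache; // populated on first use only
    private final Function<String, List<String>> runProcedure;

    CachedEntitySketch(Function<String, List<String>> runProcedure) {
        this.runProcedure = runProcedure;
    }

    List<String> rowsFor(String arg) {
        if (cache == null) {
            // First call: execute the procedure with the first argument
            // and cache the result under that argument.
            cache = new HashMap<>();
            cache.put(arg, runProcedure.apply(arg));
        }
        // Later calls never re-run the procedure; a different argument
        // simply misses the cache and gets no rows.
        return cache.getOrDefault(arg, List.of());
    }

    public static void main(String[] args) {
        CachedEntitySketch entity =
            new CachedEntitySketch(arg -> List.of("row-for-" + arg));
        System.out.println(entity.rowsFor("A")); // [row-for-A]
        System.out.println(entity.rowsFor("B")); // [] - served from stale cache
    }
}
```

The fix direction implied by the report would be for the cache key to incorporate the procedure arguments, so that a call with new arguments executes the procedure instead of missing (or wrongly hitting) the cache.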
[JENKINS] Lucene-Solr-NightlyTests-trunk - Build # 372 - Failure
Build: https://builds.apache.org/job/Lucene-Solr-NightlyTests-trunk/372/

1 tests failed.

REGRESSION: org.apache.lucene.index.TestRollingUpdates.testRollingUpdates

Error Message:
Java heap space

Stack Trace:
java.lang.OutOfMemoryError: Java heap space
    at __randomizedtesting.SeedInfo.seed([5535725D2A0C4F09:DBCE73249992EAC2]:0)
    at org.apache.lucene.util.fst.BytesStore.<init>(BytesStore.java:62)
    at org.apache.lucene.util.fst.FST.<init>(FST.java:366)
    at org.apache.lucene.util.fst.FST.<init>(FST.java:301)
    at org.apache.lucene.codecs.memory.MemoryPostingsFormat$TermsReader.<init>(MemoryPostingsFormat.java:799)
    at org.apache.lucene.codecs.memory.MemoryPostingsFormat.fieldsProducer(MemoryPostingsFormat.java:861)
    at org.apache.lucene.codecs.perfield.PerFieldPostingsFormat$FieldsReader.<init>(PerFieldPostingsFormat.java:194)
    at org.apache.lucene.codecs.perfield.PerFieldPostingsFormat.fieldsProducer(PerFieldPostingsFormat.java:233)
    at org.apache.lucene.index.SegmentCoreReaders.<init>(SegmentCoreReaders.java:128)
    at org.apache.lucene.index.SegmentReader.<init>(SegmentReader.java:56)
    at org.apache.lucene.index.ReadersAndLiveDocs.getReader(ReadersAndLiveDocs.java:111)
    at org.apache.lucene.index.ReadersAndLiveDocs.getReadOnlyClone(ReadersAndLiveDocs.java:166)
    at org.apache.lucene.index.StandardDirectoryReader.open(StandardDirectoryReader.java:97)
    at org.apache.lucene.index.IndexWriter.getReader(IndexWriter.java:377)
    at org.apache.lucene.index.TestRollingUpdates.testRollingUpdates(TestRollingUpdates.java:113)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:606)
    at com.carrotsearch.randomizedtesting.RandomizedRunner.invoke(RandomizedRunner.java:1559)
    at com.carrotsearch.randomizedtesting.RandomizedRunner.access$600(RandomizedRunner.java:79)
    at com.carrotsearch.randomizedtesting.RandomizedRunner$6.evaluate(RandomizedRunner.java:737)
    at com.carrotsearch.randomizedtesting.RandomizedRunner$7.evaluate(RandomizedRunner.java:773)
    at com.carrotsearch.randomizedtesting.RandomizedRunner$8.evaluate(RandomizedRunner.java:787)
    at org.apache.lucene.util.TestRuleSetupTeardownChained$1.evaluate(TestRuleSetupTeardownChained.java:50)
    at org.apache.lucene.util.TestRuleFieldCacheSanity$1.evaluate(TestRuleFieldCacheSanity.java:51)
    at org.apache.lucene.util.AbstractBeforeAfterRule$1.evaluate(AbstractBeforeAfterRule.java:46)
    at com.carrotsearch.randomizedtesting.rules.SystemPropertiesInvariantRule$1.evaluate(SystemPropertiesInvariantRule.java:55)
    at org.apache.lucene.util.TestRuleThreadAndTestName$1.evaluate(TestRuleThreadAndTestName.java:49)
    at org.apache.lucene.util.TestRuleIgnoreAfterMaxFailures$1.evaluate(TestRuleIgnoreAfterMaxFailures.java:70)
    at org.apache.lucene.util.TestRuleMarkFailure$1.evaluate(TestRuleMarkFailure.java:48)
    at com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)
    at com.carrotsearch.randomizedtesting.ThreadLeakControl$StatementRunner.run(ThreadLeakControl.java:358)

Build Log:
[...truncated 282 lines...]
[junit4] Suite: org.apache.lucene.index.TestRollingUpdates
[junit4]   2> NOTE: download the large Jenkins line-docs file by running 'ant get-jenkins-line-docs' in the lucene directory.
[junit4]   2> NOTE: reproduce with: ant test -Dtestcase=TestRollingUpdates -Dtests.method=testRollingUpdates -Dtests.seed=5535725D2A0C4F09 -Dtests.multiplier=2 -Dtests.nightly=true -Dtests.slow=true -Dtests.linedocsfile=/home/hudson/lucene-data/enwiki.random.lines.txt -Dtests.locale=cs -Dtests.timezone=Etc/GMT -Dtests.file.encoding=US-ASCII
[junit4] ERROR 21.9s J0 | TestRollingUpdates.testRollingUpdates
[junit4] Throwable #1: java.lang.OutOfMemoryError: Java heap space
[junit4]    at __randomizedtesting.SeedInfo.seed([5535725D2A0C4F09:DBCE73249992EAC2]:0)
[junit4]    at org.apache.lucene.util.fst.BytesStore.<init>(BytesStore.java:62)
[junit4]    at org.apache.lucene.util.fst.FST.<init>(FST.java:366)
[junit4]    at org.apache.lucene.util.fst.FST.<init>(FST.java:301)
[junit4]    at org.apache.lucene.codecs.memory.MemoryPostingsFormat$TermsReader.<init>(MemoryPostingsFormat.java:799)
[junit4]    at org.apache.lucene.codecs.memory.MemoryPostingsFormat.fieldsProducer(MemoryPostingsFormat.java:861)
[junit4]    at org.apache.lucene.codecs.perfield.PerFieldPostingsFormat$FieldsReader.<init>(PerFieldPostingsFormat.java:194)
[junit4]    at
[jira] [Commented] (LUCENE-5101) make it easier to plugin different bitset implementations to CachingWrapperFilter
[ https://issues.apache.org/jira/browse/LUCENE-5101?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13760104#comment-13760104 ]

ASF subversion and git services commented on LUCENE-5101:
---------------------------------------------------------

Commit 1520527 from [~jpountz] in branch 'dev/branches/branch_4x'
[ https://svn.apache.org/r1520527 ]
LUCENE-5101: Make it easier to plugin different bitset implementations to CachingWrapperFilter.

make it easier to plugin different bitset implementations to CachingWrapperFilter
---------------------------------------------------------------------------------
                Key: LUCENE-5101
                URL: https://issues.apache.org/jira/browse/LUCENE-5101
            Project: Lucene - Core
         Issue Type: Improvement
           Reporter: Robert Muir
        Attachments: DocIdSetBenchmark.java, LUCENE-5101.patch, LUCENE-5101.patch, LUCENE-5101.patch, LUCENE-5101.patch

Currently this is possible, but it's not so friendly:

{code}
protected DocIdSet docIdSetToCache(DocIdSet docIdSet, AtomicReader reader) throws IOException {
  if (docIdSet == null) {
    // this is better than returning null, as the nonnull result can be cached
    return EMPTY_DOCIDSET;
  } else if (docIdSet.isCacheable()) {
    return docIdSet;
  } else {
    final DocIdSetIterator it = docIdSet.iterator();
    // null is allowed to be returned by iterator(),
    // in this case we wrap with the sentinel set,
    // which is cacheable.
    if (it == null) {
      return EMPTY_DOCIDSET;
    } else {
      /* INTERESTING PART */
      final FixedBitSet bits = new FixedBitSet(reader.maxDoc());
      bits.or(it);
      return bits;
      /* END INTERESTING PART */
    }
  }
}
{code}

Is there any value to having all this other logic in the protected API? It seems like something that's not useful for a subclass... Maybe this stuff can become final, and INTERESTING PART calls a simpler method, something like:

{code}
protected DocIdSet cacheImpl(DocIdSetIterator iterator, AtomicReader reader) {
  final FixedBitSet bits = new FixedBitSet(reader.maxDoc());
  bits.or(iterator);
  return bits;
}
{code}
[jira] [Resolved] (LUCENE-5101) make it easier to plugin different bitset implementations to CachingWrapperFilter
[ https://issues.apache.org/jira/browse/LUCENE-5101?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Adrien Grand resolved LUCENE-5101.
----------------------------------
    Resolution: Fixed
 Fix Version/s: 4.5
                5.0

Committed, thanks Robert!

make it easier to plugin different bitset implementations to CachingWrapperFilter
---------------------------------------------------------------------------------
                Key: LUCENE-5101
                URL: https://issues.apache.org/jira/browse/LUCENE-5101
            Project: Lucene - Core
         Issue Type: Improvement
           Reporter: Robert Muir
            Fix For: 5.0, 4.5
        Attachments: DocIdSetBenchmark.java, LUCENE-5101.patch, LUCENE-5101.patch, LUCENE-5101.patch, LUCENE-5101.patch

Currently this is possible, but it's not so friendly:

{code}
protected DocIdSet docIdSetToCache(DocIdSet docIdSet, AtomicReader reader) throws IOException {
  if (docIdSet == null) {
    // this is better than returning null, as the nonnull result can be cached
    return EMPTY_DOCIDSET;
  } else if (docIdSet.isCacheable()) {
    return docIdSet;
  } else {
    final DocIdSetIterator it = docIdSet.iterator();
    // null is allowed to be returned by iterator(),
    // in this case we wrap with the sentinel set,
    // which is cacheable.
    if (it == null) {
      return EMPTY_DOCIDSET;
    } else {
      /* INTERESTING PART */
      final FixedBitSet bits = new FixedBitSet(reader.maxDoc());
      bits.or(it);
      return bits;
      /* END INTERESTING PART */
    }
  }
}
{code}

Is there any value to having all this other logic in the protected API? It seems like something that's not useful for a subclass... Maybe this stuff can become final, and INTERESTING PART calls a simpler method, something like:

{code}
protected DocIdSet cacheImpl(DocIdSetIterator iterator, AtomicReader reader) {
  final FixedBitSet bits = new FixedBitSet(reader.maxDoc());
  bits.or(iterator);
  return bits;
}
{code}
[jira] [Commented] (LUCENE-4734) FastVectorHighlighter Overlapping Proximity Queries Do Not Highlight
[ https://issues.apache.org/jira/browse/LUCENE-4734?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13760118#comment-13760118 ]

ASF subversion and git services commented on LUCENE-4734:
---------------------------------------------------------

Commit 1520536 from [~jpountz] in branch 'dev/trunk'
[ https://svn.apache.org/r1520536 ]
Revert LUCENE-4734.

FastVectorHighlighter Overlapping Proximity Queries Do Not Highlight
--------------------------------------------------------------------
                Key: LUCENE-4734
                URL: https://issues.apache.org/jira/browse/LUCENE-4734
            Project: Lucene - Core
         Issue Type: Bug
         Components: modules/highlighter
   Affects Versions: 4.0, 4.1, 5.0
           Reporter: Ryan Lauck
           Assignee: Adrien Grand
             Labels: fastvectorhighlighter, highlighter
            Fix For: 5.0, 4.5
        Attachments: LUCENE-4734-2.patch, lucene-4734.patch, LUCENE-4734.patch

If a proximity phrase query overlaps with any other query term, it will not be highlighted.

Example text: A B C D E F G

Example queries:
"B E"~10 D (D will be highlighted instead of "B C D E")
"B E"~10 "C F"~10 (nothing will be highlighted)

This can be traced to the FieldPhraseList constructor's inner while loop. For the first example query, the first TermInfo popped off the stack will be B. The second TermInfo will be D, which will not be found in the submap for "B E"~10 and will trigger a failed match.
[jira] [Commented] (LUCENE-4734) FastVectorHighlighter Overlapping Proximity Queries Do Not Highlight
[ https://issues.apache.org/jira/browse/LUCENE-4734?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13760138#comment-13760138 ]

ASF subversion and git services commented on LUCENE-4734:
---------------------------------------------------------

Commit 1520544 from [~jpountz] in branch 'dev/branches/branch_4x'
[ https://svn.apache.org/r1520544 ]
Revert LUCENE-4734.

FastVectorHighlighter Overlapping Proximity Queries Do Not Highlight
--------------------------------------------------------------------
                Key: LUCENE-4734
                URL: https://issues.apache.org/jira/browse/LUCENE-4734
            Project: Lucene - Core
         Issue Type: Bug
         Components: modules/highlighter
   Affects Versions: 4.0, 4.1, 5.0
           Reporter: Ryan Lauck
           Assignee: Adrien Grand
             Labels: fastvectorhighlighter, highlighter
            Fix For: 5.0, 4.5
        Attachments: LUCENE-4734-2.patch, lucene-4734.patch, LUCENE-4734.patch

If a proximity phrase query overlaps with any other query term, it will not be highlighted.

Example text: A B C D E F G

Example queries:
"B E"~10 D (D will be highlighted instead of "B C D E")
"B E"~10 "C F"~10 (nothing will be highlighted)

This can be traced to the FieldPhraseList constructor's inner while loop. For the first example query, the first TermInfo popped off the stack will be B. The second TermInfo will be D, which will not be found in the submap for "B E"~10 and will trigger a failed match.
[jira] [Updated] (SOLR-5202) Support easier overrides of Carrot2 clustering attributes via XML data sets exported from the Workbench.
[ https://issues.apache.org/jira/browse/SOLR-5202?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Dawid Weiss updated SOLR-5202:
------------------------------
Attachment: SOLR-5202.patch

Support easier overrides of Carrot2 clustering attributes via XML data sets exported from the Workbench.
--------------------------------------------------------------------------------------------------------
                Key: SOLR-5202
                URL: https://issues.apache.org/jira/browse/SOLR-5202
            Project: Solr
         Issue Type: New Feature
           Reporter: Dawid Weiss
           Assignee: Dawid Weiss
            Fix For: 4.5, 5.0
        Attachments: SOLR-5202.patch
[jira] [Commented] (SOLR-5202) Support easier overrides of Carrot2 clustering attributes via XML data sets exported from the Workbench.
[ https://issues.apache.org/jira/browse/SOLR-5202?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13760145#comment-13760145 ] Dawid Weiss commented on SOLR-5202: --- Todo: the example should come with the default Carrot2 algorithms preconfigured (by name) and with sensible default attribute XMLs. The benefit is twofold - better out-of-the-box source for copy-pasting and a clear indication where overridden resources must be located. We should provide the defaults for Lingo, STC and kmeans perhaps. Support easier overrides of Carrot2 clustering attributes via XML data sets exported from the Workbench. Key: SOLR-5202 URL: https://issues.apache.org/jira/browse/SOLR-5202 Project: Solr Issue Type: New Feature Reporter: Dawid Weiss Assignee: Dawid Weiss Fix For: 4.5, 5.0 Attachments: SOLR-5202.patch -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Comment Edited] (SOLR-5202) Support easier overrides of Carrot2 clustering attributes via XML data sets exported from the Workbench.
[ https://issues.apache.org/jira/browse/SOLR-5202?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13760145#comment-13760145 ] Dawid Weiss edited comment on SOLR-5202 at 9/6/13 11:36 AM: Todo: the example should come with the default Carrot2 algorithms preconfigured (by name) and with sensible default attribute XMLs. The benefit is twofold - better out-of-the-box source for copy-pasting and a clear indication where overridden resources must be located. We should provide the defaults for Lingo, STC and kmeans perhaps. Another thing is that LEXICAL_RESOURCES_DIR no longer reflects the true purpose of that folder... perhaps it should be aliased to something more sensible. was (Author: dweiss): Todo: the example should come with the default Carrot2 algorithms preconfigured (by name) and with sensible default attribute XMLs. The benefit is twofold - better out-of-the-box source for copy-pasting and a clear indication where overridden resources must be located. We should provide the defaults for Lingo, STC and kmeans perhaps. Support easier overrides of Carrot2 clustering attributes via XML data sets exported from the Workbench. Key: SOLR-5202 URL: https://issues.apache.org/jira/browse/SOLR-5202 Project: Solr Issue Type: New Feature Reporter: Dawid Weiss Assignee: Dawid Weiss Fix For: 4.5, 5.0 Attachments: SOLR-5202.patch -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
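The "preconfigured by name" idea in the comment above might look roughly like the following solrconfig.xml fragment. This is an illustrative sketch only — the engine names and layout follow the 4.x clustering contrib's conventions, and the Carrot2 algorithm class names are the upstream defaults, but none of this is taken from the attached patch:

```xml
<!-- Hypothetical example config: one named engine per default algorithm,
     with a conventional location for overridden attribute/resource XMLs. -->
<searchComponent name="clustering" class="solr.clustering.ClusteringComponent">
  <lst name="engine">
    <str name="name">lingo</str>
    <str name="carrot.algorithm">org.carrot2.clustering.lingo.LingoClusteringAlgorithm</str>
    <str name="carrot.resourcesDir">clustering/carrot2</str>
  </lst>
  <lst name="engine">
    <str name="name">stc</str>
    <str name="carrot.algorithm">org.carrot2.clustering.stc.STCClusteringAlgorithm</str>
  </lst>
  <lst name="engine">
    <str name="name">kmeans</str>
    <str name="carrot.algorithm">org.carrot2.clustering.kmeans.BisectingKMeansClusteringAlgorithm</str>
  </lst>
</searchComponent>
```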
[jira] [Commented] (LUCENE-3069) Lucene should have an entirely memory resident term dictionary
[ https://issues.apache.org/jira/browse/LUCENE-3069?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13760160#comment-13760160 ] Han Jiang commented on LUCENE-3069: --- Mike, thanks for the review! bq. In general, couldn't the writer re-use the reader's TermState? I'm afraid this somewhat makes codes longer? I'll make a patch to see this. {quote} Have you run first do no harm perf tests? Ie, compare current trunk w/ default Codec to branch w/ default Codec? Just to make sure there are no surprises... {quote} Yes, no surprise yet. bq. Why does Lucene41PostingsWriter have impersonation code? Yeah, these should be removed. {quote} I forget: why does the postings reader/writer need to handle delta coding again (take an absolute boolean argument)? Was it because of pulsing or sep? It's fine for now (progress not perfection) ... but not clean, since delta coding is really an encoding detail so in theory the terms dict should own that ... {quote} Ah, yes, because of pulsing. This is because.. PulsingPostingsBase is more than a PostingsBaseFormat. It somewhat acts like a term dict, e.g. it needs to understand how terms are structured in one block (term No.1 uses absolute value, term No.x use delta value) then judge how to restruct the inlined and wrapped block (No.1 still uses absolute value, but the first-non-pulsed term will need absolute encoding as well). Without the argument 'absolute', the real term dictionary will do the delta encoding itself, then PulsingPostingsBase will be confused, and all wrapped PostingsBase have to encode metadata values without delta-format. {quote} The new .smy file for Pulsing is sort of strange ... but necessary since it always uses 0 longs, so we have to store this somewhere ... you could put it into FieldInfo attributes instead? {quote} Yeah, it is another hairy thing... the reason is, we don't have a 'PostingsTrailer' for PostingsBaseFormat. 
Pulsing will not know the longs size for each field, until all the fields are consumed... and it should not write those longsSize to termsOut in close() since the term dictionary will use the DirTrailer hack here. (maybe every term dictionary should close postingsWriter first, then write field summary and close itself? I'm not sure though). bq. Should we backport this to 4.x? Yeah, OK! Lucene should have an entirely memory resident term dictionary -- Key: LUCENE-3069 URL: https://issues.apache.org/jira/browse/LUCENE-3069 Project: Lucene - Core Issue Type: Improvement Components: core/index, core/search Affects Versions: 4.0-ALPHA Reporter: Simon Willnauer Assignee: Han Jiang Labels: gsoc2013 Fix For: 5.0, 4.5 Attachments: df-ttf-estimate.txt, example.png, LUCENE-3069.patch, LUCENE-3069.patch, LUCENE-3069.patch, LUCENE-3069.patch, LUCENE-3069.patch, LUCENE-3069.patch, LUCENE-3069.patch, LUCENE-3069.patch, LUCENE-3069.patch, LUCENE-3069.patch, LUCENE-3069.patch, LUCENE-3069.patch, LUCENE-3069.patch, LUCENE-3069.patch FST based TermDictionary has been a great improvement yet it still uses a delta codec file for scanning to terms. Some environments have enough memory available to keep the entire FST based term dict in memory. We should add a TermDictionary implementation that encodes all needed information for each term into the FST (custom fst.Output) and builds a FST from the entire term not just the delta. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
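The "absolute" flag debated above is easier to follow with a toy model. This is illustrative only, not the branch's actual on-disk encoding: per-term metadata longs are delta-coded against the previous term within a block, and any term written with absolute=true re-bases the stream — which is what a wrapper like Pulsing needs for the first non-pulsed term:

```java
import java.util.Arrays;

// Toy delta coding of per-term metadata longs within a block.
public class DeltaBlock {
  // absoluteAt[i] == true means term i is written as an absolute value
  static long[] encode(long[] values, boolean[] absoluteAt) {
    long[] out = new long[values.length];
    long prev = 0;
    for (int i = 0; i < values.length; i++) {
      out[i] = absoluteAt[i] ? values[i] : values[i] - prev;
      prev = values[i];
    }
    return out;
  }

  static long[] decode(long[] encoded, boolean[] absoluteAt) {
    long[] out = new long[encoded.length];
    long prev = 0;
    for (int i = 0; i < encoded.length; i++) {
      out[i] = absoluteAt[i] ? encoded[i] : prev + encoded[i];
      prev = out[i];
    }
    return out;
  }

  public static void main(String[] args) {
    long[] filePointers = {100, 130, 165, 400};
    boolean[] absoluteAt = {true, false, false, true}; // term 3 re-based
    long[] enc = encode(filePointers, absoluteAt);
    System.out.println(Arrays.toString(enc)); // [100, 30, 35, 400]
    System.out.println(Arrays.equals(decode(enc, absoluteAt), filePointers)); // true
  }
}
```

If the terms dictionary delta-coded on its own without exposing the flag, the wrapper could not re-base — which is the confusion Han describes.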
[jira] [Updated] (SOLR-2548) Multithreaded faceting
[ https://issues.apache.org/jira/browse/SOLR-2548?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Erick Erickson updated SOLR-2548: - Attachment: SOLR-2548.patch Final patch, including CHANGES.txt entry. Multithreaded faceting -- Key: SOLR-2548 URL: https://issues.apache.org/jira/browse/SOLR-2548 Project: Solr Issue Type: Improvement Components: search Affects Versions: 3.1 Reporter: Janne Majaranta Assignee: Erick Erickson Priority: Minor Labels: facet Attachments: SOLR-2548_4.2.1.patch, SOLR-2548_for_31x.patch, SOLR-2548.patch, SOLR-2548.patch, SOLR-2548.patch, SOLR-2548.patch, SOLR-2548.patch, SOLR-2548.patch Add multithreading support for faceting. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Commented] (SOLR-4817) Solr should not fall back to the back compat built in solr.xml in SolrCloud mode.
[ https://issues.apache.org/jira/browse/SOLR-4817?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13760226#comment-13760226 ] Erick Erickson commented on SOLR-4817: -- bq: I think it's all a bit of a mess right now Yeah, it certainly is but I haven't had the energy to try to straighten it out either. Maybe we can share some of the work Solr should not fall back to the back compat built in solr.xml in SolrCloud mode. - Key: SOLR-4817 URL: https://issues.apache.org/jira/browse/SOLR-4817 Project: Solr Issue Type: Bug Components: SolrCloud Reporter: Mark Miller Assignee: Erick Erickson Priority: Minor Fix For: 4.5, 5.0 Attachments: SOLR-4817.patch, SOLR-4817.patch, SOLR-4817.patch, SOLR-4817.patch, SOLR-4817.patch A hard error is much more useful, and this built in solr.xml is not very good for solrcloud - with the old style solr.xml with cores in it, you won't have persistence and with the new style, it's not really ideal either. I think it makes it easier to debug solr.home to fail on this instead - but just in solrcloud mode for now due to back compat. We might want to pull the whole internal solr.xml for 5.0. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Commented] (SOLR-4817) Solr should not fall back to the back compat built in solr.xml in SolrCloud mode.
[ https://issues.apache.org/jira/browse/SOLR-4817?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13760221#comment-13760221 ] Mark Miller commented on SOLR-4817: --- I think it's all a bit of a mess right now (the test configs situation) - we should clean this up more. I intend to take a crack at it at some point. It's still too haphazard what is done in what tests and too difficult to understand and follow when writing new tests or debugging old ones. Solr should not fall back to the back compat built in solr.xml in SolrCloud mode. - Key: SOLR-4817 URL: https://issues.apache.org/jira/browse/SOLR-4817 Project: Solr Issue Type: Bug Components: SolrCloud Reporter: Mark Miller Assignee: Erick Erickson Priority: Minor Fix For: 4.5, 5.0 Attachments: SOLR-4817.patch, SOLR-4817.patch, SOLR-4817.patch, SOLR-4817.patch, SOLR-4817.patch A hard error is much more useful, and this built in solr.xml is not very good for solrcloud - with the old style solr.xml with cores in it, you won't have persistence and with the new style, it's not really ideal either. I think it makes it easier to debug solr.home to fail on this instead - but just in solrcloud mode for now due to back compat. We might want to pull the whole internal solr.xml for 5.0. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Commented] (SOLR-5024) java client(solrj 4.1.0) can not get the ngroup number.
[ https://issues.apache.org/jira/browse/SOLR-5024?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13760239#comment-13760239 ] Sandro Mario Zbinden commented on SOLR-5024: This error exists in Solr 4.2 too. java client(solrj 4.1.0) can not get the ngroup number. --- Key: SOLR-5024 URL: https://issues.apache.org/jira/browse/SOLR-5024 Project: Solr Issue Type: Bug Components: clients - java Affects Versions: 4.1 Reporter: sun Priority: Minor Labels: none Original Estimate: 10m Remaining Estimate: 10m When adding these parameters (group=true&group.field=topicid&group.ngroups=true&group.format=simple) to SolrJ, I cannot get the group number. It's easy to fix: at line 221 of QueryResponse.java, an if-else should be added, just like the one at lines 203 to 208. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
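As a concrete illustration, the grouping request in question looks like the following (host, port, core name, and q value are hypothetical stand-ins; only the four group parameters come from the report):

```
http://localhost:8983/solr/collection1/select?q=*:*&group=true&group.field=topicid&group.ngroups=true&group.format=simple
```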
[jira] [Updated] (SOLR-5216) Document updates to SolrCloud can cause a distributed deadlock.
[ https://issues.apache.org/jira/browse/SOLR-5216?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Mark Miller updated SOLR-5216: -- Priority: Critical (was: Major) Document updates to SolrCloud can cause a distributed deadlock. --- Key: SOLR-5216 URL: https://issues.apache.org/jira/browse/SOLR-5216 Project: Solr Issue Type: Bug Components: SolrCloud Reporter: Mark Miller Assignee: Mark Miller Priority: Critical Fix For: 4.5, 5.0 Attachments: SOLR-5216.patch -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Commented] (LUCENE-3069) Lucene should have an entirely memory resident term dictionary
[ https://issues.apache.org/jira/browse/LUCENE-3069?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13760259#comment-13760259 ] ASF subversion and git services commented on LUCENE-3069: - Commit 1520592 from [~billy] in branch 'dev/branches/lucene3069' [ https://svn.apache.org/r1520592 ] LUCENE-3069: remove impersonate codes, fix typo Lucene should have an entirely memory resident term dictionary -- Key: LUCENE-3069 URL: https://issues.apache.org/jira/browse/LUCENE-3069 Project: Lucene - Core Issue Type: Improvement Components: core/index, core/search Affects Versions: 4.0-ALPHA Reporter: Simon Willnauer Assignee: Han Jiang Labels: gsoc2013 Fix For: 5.0, 4.5 Attachments: df-ttf-estimate.txt, example.png, LUCENE-3069.patch, LUCENE-3069.patch, LUCENE-3069.patch, LUCENE-3069.patch, LUCENE-3069.patch, LUCENE-3069.patch, LUCENE-3069.patch, LUCENE-3069.patch, LUCENE-3069.patch, LUCENE-3069.patch, LUCENE-3069.patch, LUCENE-3069.patch, LUCENE-3069.patch, LUCENE-3069.patch FST based TermDictionary has been a great improvement yet it still uses a delta codec file for scanning to terms. Some environments have enough memory available to keep the entire FST based term dict in memory. We should add a TermDictionary implementation that encodes all needed information for each term into the FST (custom fst.Output) and builds a FST from the entire term not just the delta. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Commented] (LUCENE-3069) Lucene should have an entirely memory resident term dictionary
[ https://issues.apache.org/jira/browse/LUCENE-3069?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13760325#comment-13760325 ] Han Jiang commented on LUCENE-3069: --- I think this is ready to commit to trunk now, and I'll wait for a day or two before committing it. :) Lucene should have an entirely memory resident term dictionary -- Key: LUCENE-3069 URL: https://issues.apache.org/jira/browse/LUCENE-3069 Project: Lucene - Core Issue Type: Improvement Components: core/index, core/search Affects Versions: 4.0-ALPHA Reporter: Simon Willnauer Assignee: Han Jiang Labels: gsoc2013 Fix For: 5.0, 4.5 Attachments: df-ttf-estimate.txt, example.png, LUCENE-3069.patch, LUCENE-3069.patch, LUCENE-3069.patch, LUCENE-3069.patch, LUCENE-3069.patch, LUCENE-3069.patch, LUCENE-3069.patch, LUCENE-3069.patch, LUCENE-3069.patch, LUCENE-3069.patch, LUCENE-3069.patch, LUCENE-3069.patch, LUCENE-3069.patch, LUCENE-3069.patch FST based TermDictionary has been a great improvement yet it still uses a delta codec file for scanning to terms. Some environments have enough memory available to keep the entire FST based term dict in memory. We should add a TermDictionary implementation that encodes all needed information for each term into the FST (custom fst.Output) and builds a FST from the entire term not just the delta. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Commented] (LUCENE-3069) Lucene should have an entirely memory resident term dictionary
[ https://issues.apache.org/jira/browse/LUCENE-3069?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13760328#comment-13760328 ] Michael McCandless commented on LUCENE-3069: Thanks Han. I think we can just leave the .smy as is for now, and keep passing boolean absolute down. We can later improve these ... I think we should first land this on trunk and let jenkins chew on it for a while ... and if all seems good, then back port. Lucene should have an entirely memory resident term dictionary -- Key: LUCENE-3069 URL: https://issues.apache.org/jira/browse/LUCENE-3069 Project: Lucene - Core Issue Type: Improvement Components: core/index, core/search Affects Versions: 4.0-ALPHA Reporter: Simon Willnauer Assignee: Han Jiang Labels: gsoc2013 Fix For: 5.0, 4.5 Attachments: df-ttf-estimate.txt, example.png, LUCENE-3069.patch, LUCENE-3069.patch, LUCENE-3069.patch, LUCENE-3069.patch, LUCENE-3069.patch, LUCENE-3069.patch, LUCENE-3069.patch, LUCENE-3069.patch, LUCENE-3069.patch, LUCENE-3069.patch, LUCENE-3069.patch, LUCENE-3069.patch, LUCENE-3069.patch, LUCENE-3069.patch FST based TermDictionary has been a great improvement yet it still uses a delta codec file for scanning to terms. Some environments have enough memory available to keep the entire FST based term dict in memory. We should add a TermDictionary implementation that encodes all needed information for each term into the FST (custom fst.Output) and builds a FST from the entire term not just the delta. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Commented] (SOLR-5217) CachedSqlEntity fails with stored procedure
[ https://issues.apache.org/jira/browse/SOLR-5217?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13760303#comment-13760303 ] Shalin Shekhar Mangar commented on SOLR-5217: - I don't think this is a bug. CachedSqlEntityProcessor will execute the query only once and that is its USP. If you don't want the caching, then just use SqlEntityProcessor. CachedSqlEntity fails with stored procedure --- Key: SOLR-5217 URL: https://issues.apache.org/jira/browse/SOLR-5217 Project: Solr Issue Type: Bug Components: contrib - DataImportHandler Reporter: Hardik Upadhyay Attachments: db-data-config.xml When using DIH with CachedSqlEntityProcessor and importing data from MS-sql using stored procedures, it imports data for nested entities only once and then every call with different arguments for nested entities are only served from cache.My db-data-config is attached. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
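The distinction Shalin draws can be sketched in DIH config terms. The table and column names below are invented for illustration (the attached db-data-config.xml is not reproduced here): CachedSqlEntityProcessor is for the case where one result set, keyed by a lookup column, can legitimately answer every parent row; per-parent stored-procedure calls with different arguments want plain SqlEntityProcessor:

```xml
<!-- Illustrative only: a child entity queried once and served from cache
     via cacheKey/cacheLookup. If each parent needs a fresh call with its
     own arguments, use processor="SqlEntityProcessor" on the child instead. -->
<entity name="parent" processor="SqlEntityProcessor"
        query="select id, title from parent_table">
  <entity name="child" processor="CachedSqlEntityProcessor"
          query="select parent_id, detail from child_table"
          cacheKey="parent_id" cacheLookup="parent.id"/>
</entity>
```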
[jira] [Commented] (LUCENE-3069) Lucene should have an entirely memory resident term dictionary
[ https://issues.apache.org/jira/browse/LUCENE-3069?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13760304#comment-13760304 ] ASF subversion and git services commented on LUCENE-3069: - Commit 1520618 from [~billy] in branch 'dev/branches/lucene3069' [ https://svn.apache.org/r1520618 ] LUCENE-3069: reuse customized TermState in PBF Lucene should have an entirely memory resident term dictionary -- Key: LUCENE-3069 URL: https://issues.apache.org/jira/browse/LUCENE-3069 Project: Lucene - Core Issue Type: Improvement Components: core/index, core/search Affects Versions: 4.0-ALPHA Reporter: Simon Willnauer Assignee: Han Jiang Labels: gsoc2013 Fix For: 5.0, 4.5 Attachments: df-ttf-estimate.txt, example.png, LUCENE-3069.patch, LUCENE-3069.patch, LUCENE-3069.patch, LUCENE-3069.patch, LUCENE-3069.patch, LUCENE-3069.patch, LUCENE-3069.patch, LUCENE-3069.patch, LUCENE-3069.patch, LUCENE-3069.patch, LUCENE-3069.patch, LUCENE-3069.patch, LUCENE-3069.patch, LUCENE-3069.patch FST based TermDictionary has been a great improvement yet it still uses a delta codec file for scanning to terms. Some environments have enough memory available to keep the entire FST based term dict in memory. We should add a TermDictionary implementation that encodes all needed information for each term into the FST (custom fst.Output) and builds a FST from the entire term not just the delta. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Resolved] (LUCENE-5200) HighFreqTerms has confusing behavior with -t option
[ https://issues.apache.org/jira/browse/LUCENE-5200?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Robert Muir resolved LUCENE-5200. - Resolution: Fixed Fix Version/s: 4.5 5.0 HighFreqTerms has confusing behavior with -t option --- Key: LUCENE-5200 URL: https://issues.apache.org/jira/browse/LUCENE-5200 Project: Lucene - Core Issue Type: Bug Components: modules/other Reporter: Robert Muir Fix For: 5.0, 4.5 Attachments: LUCENE-5200.patch {code} * <code>HighFreqTerms</code> class extracts the top n most frequent terms * (by document frequency) from an existing Lucene index and reports their * document frequency. * <p> * If the -t flag is given, both document frequency and total tf (total * number of occurrences) are reported, ordered by descending total tf. {code} Problem #1: It's tricky what happens with -t: if you ask for the top-100 terms, it requests the top-100 terms (by docFreq), then resorts the top-N by totalTermFreq. So it's not really the top 100 most frequently occurring terms. Problem #2: Using the -t option can be confusing and slow: the reported docFreq includes deletions, but totalTermFreq does not (it actually walks postings lists if there is even one deletion). I think this is a relic from 3.x days when Lucene did not support this statistic. I think we should just always output both TermsEnum.docFreq() and TermsEnum.totalTermFreq(), and -t just determines the comparator of the PQ. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
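The proposal in the last paragraph of the description — always report both docFreq and totalTermFreq, with -t selecting only the ranking key — can be sketched as follows. Class and field names are illustrative, not Lucene's actual HighFreqTerms code:

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.Comparator;
import java.util.List;

public class TopTerms {
  static class TermStats {
    final String term; final int docFreq; final long totalTermFreq;
    TermStats(String term, int docFreq, long totalTermFreq) {
      this.term = term; this.docFreq = docFreq; this.totalTermFreq = totalTermFreq;
    }
  }

  // Both statistics stay attached to every returned term; the flag only
  // decides which one orders the top-N selection.
  static List<String> top(List<TermStats> stats, int n, boolean byTotalTermFreq) {
    Comparator<TermStats> cmp;
    if (byTotalTermFreq) {
      cmp = new Comparator<TermStats>() {
        public int compare(TermStats a, TermStats b) {
          return Long.compare(b.totalTermFreq, a.totalTermFreq); // descending
        }
      };
    } else {
      cmp = new Comparator<TermStats>() {
        public int compare(TermStats a, TermStats b) {
          return Integer.compare(b.docFreq, a.docFreq); // descending
        }
      };
    }
    List<TermStats> copy = new ArrayList<TermStats>(stats);
    Collections.sort(copy, cmp);
    List<String> result = new ArrayList<String>();
    for (int i = 0; i < Math.min(n, copy.size()); i++) result.add(copy.get(i).term);
    return result;
  }

  public static void main(String[] args) {
    List<TermStats> stats = Arrays.asList(
        new TermStats("rare-but-dense", 2, 100),  // few docs, many occurrences
        new TermStats("widespread", 50, 60));
    System.out.println(top(stats, 1, false)); // [widespread]
    System.out.println(top(stats, 1, true));  // [rare-but-dense]
  }
}
```

The point of the fix is visible in the two calls: the same data yields a different top term depending only on the comparator, and neither statistic is hidden from the output.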
[jira] [Commented] (LUCENE-5200) HighFreqTerms has confusing behavior with -t option
[ https://issues.apache.org/jira/browse/LUCENE-5200?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13760296#comment-13760296 ] ASF subversion and git services commented on LUCENE-5200: - Commit 1520616 from [~rcmuir] in branch 'dev/branches/branch_4x' [ https://svn.apache.org/r1520616 ] LUCENE-5200: HighFreqTerms has confusing behavior with -t option HighFreqTerms has confusing behavior with -t option --- Key: LUCENE-5200 URL: https://issues.apache.org/jira/browse/LUCENE-5200 Project: Lucene - Core Issue Type: Bug Components: modules/other Reporter: Robert Muir Attachments: LUCENE-5200.patch {code} * <code>HighFreqTerms</code> class extracts the top n most frequent terms * (by document frequency) from an existing Lucene index and reports their * document frequency. * <p> * If the -t flag is given, both document frequency and total tf (total * number of occurrences) are reported, ordered by descending total tf. {code} Problem #1: It's tricky what happens with -t: if you ask for the top-100 terms, it requests the top-100 terms (by docFreq), then resorts the top-N by totalTermFreq. So it's not really the top 100 most frequently occurring terms. Problem #2: Using the -t option can be confusing and slow: the reported docFreq includes deletions, but totalTermFreq does not (it actually walks postings lists if there is even one deletion). I think this is a relic from 3.x days when Lucene did not support this statistic. I think we should just always output both TermsEnum.docFreq() and TermsEnum.totalTermFreq(), and -t just determines the comparator of the PQ. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
Re: [JENKINS] Solr-Artifacts-4.x - Build # 402 - Failure
Java 6 doesn't have this; I committed a fix. On Fri, Sep 6, 2013 at 10:12 AM, Apache Jenkins Server jenk...@builds.apache.org wrote: Build: https://builds.apache.org/job/Solr-Artifacts-4.x/402/ No tests ran. Build Log: [...truncated 8808 lines...] [javac] Compiling 20 source files to /usr/home/hudson/hudson-slave/workspace/Solr-Artifacts-4.x/lucene/build/misc/classes/java [javac] /usr/home/hudson/hudson-slave/workspace/Solr-Artifacts-4.x/lucene/misc/src/java/org/apache/lucene/misc/HighFreqTerms.java:140: cannot find symbol [javac] symbol : method compare(int,int) [javac] location: class java.lang.Long [javac] int res = Long.compare(a.docFreq, b.docFreq); [javac] ^ [javac] /usr/home/hudson/hudson-slave/workspace/Solr-Artifacts-4.x/lucene/misc/src/java/org/apache/lucene/misc/HighFreqTerms.java:158: cannot find symbol [javac] symbol : method compare(long,long) [javac] location: class java.lang.Long [javac] int res = Long.compare(a.totalTermFreq, b.totalTermFreq); [javac] ^ [javac] Note: Some input files use or override a deprecated API. [javac] Note: Recompile with -Xlint:deprecation for details. [javac] 2 errors BUILD FAILED /usr/home/hudson/hudson-slave/workspace/Solr-Artifacts-4.x/solr/common-build.xml:374: The following error occurred while executing this line: /usr/home/hudson/hudson-slave/workspace/Solr-Artifacts-4.x/lucene/module-build.xml:573: The following error occurred while executing this line: /usr/home/hudson/hudson-slave/workspace/Solr-Artifacts-4.x/lucene/module-build.xml:507: The following error occurred while executing this line: /usr/home/hudson/hudson-slave/workspace/Solr-Artifacts-4.x/lucene/common-build.xml:477: The following error occurred while executing this line: /usr/home/hudson/hudson-slave/workspace/Solr-Artifacts-4.x/lucene/common-build.xml:1625: Compile failed; see the compiler error output for details.
Total time: 1 minute 14 seconds Build step 'Invoke Ant' marked build as failure Archiving artifacts Publishing Javadoc Email was triggered for: Failure Sending email for trigger: Failure - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
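The committed fix itself isn't shown in the thread, but Long.compare(long, long) and Integer.compare(int, int) only arrived in Java 7, so the standard Java 6-compatible replacement is a hand-rolled comparison along these lines (presented as the usual workaround, not necessarily the exact commit):

```java
public class LongCompare {
  // Same semantics as Java 7's Long.compare(a, b), in Java 6-safe code.
  // Note: "return (int) (a - b)" would be wrong here — the subtraction can
  // overflow int when cast, so the explicit three-way test is required.
  static int compare(long a, long b) {
    return a < b ? -1 : (a == b ? 0 : 1);
  }

  public static void main(String[] args) {
    System.out.println(compare(3L, 7L)); // -1
    System.out.println(compare(7L, 7L)); // 0
    System.out.println(compare(9L, 2L)); // 1
  }
}
```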
[JENKINS-MAVEN] Lucene-Solr-Maven-4.x #439: POMs out of sync
Build: https://builds.apache.org/job/Lucene-Solr-Maven-4.x/439/ No tests ran. Build Log: [...truncated 3328 lines...] - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[JENKINS] Solr-Artifacts-4.x - Build # 402 - Failure
Build: https://builds.apache.org/job/Solr-Artifacts-4.x/402/ No tests ran. Build Log: [...truncated 8808 lines...] [javac] Compiling 20 source files to /usr/home/hudson/hudson-slave/workspace/Solr-Artifacts-4.x/lucene/build/misc/classes/java [javac] /usr/home/hudson/hudson-slave/workspace/Solr-Artifacts-4.x/lucene/misc/src/java/org/apache/lucene/misc/HighFreqTerms.java:140: cannot find symbol [javac] symbol : method compare(int,int) [javac] location: class java.lang.Long [javac] int res = Long.compare(a.docFreq, b.docFreq); [javac] ^ [javac] /usr/home/hudson/hudson-slave/workspace/Solr-Artifacts-4.x/lucene/misc/src/java/org/apache/lucene/misc/HighFreqTerms.java:158: cannot find symbol [javac] symbol : method compare(long,long) [javac] location: class java.lang.Long [javac] int res = Long.compare(a.totalTermFreq, b.totalTermFreq); [javac] ^ [javac] Note: Some input files use or override a deprecated API. [javac] Note: Recompile with -Xlint:deprecation for details. [javac] 2 errors BUILD FAILED /usr/home/hudson/hudson-slave/workspace/Solr-Artifacts-4.x/solr/common-build.xml:374: The following error occurred while executing this line: /usr/home/hudson/hudson-slave/workspace/Solr-Artifacts-4.x/lucene/module-build.xml:573: The following error occurred while executing this line: /usr/home/hudson/hudson-slave/workspace/Solr-Artifacts-4.x/lucene/module-build.xml:507: The following error occurred while executing this line: /usr/home/hudson/hudson-slave/workspace/Solr-Artifacts-4.x/lucene/common-build.xml:477: The following error occurred while executing this line: /usr/home/hudson/hudson-slave/workspace/Solr-Artifacts-4.x/lucene/common-build.xml:1625: Compile failed; see the compiler error output for details. 
Total time: 1 minute 14 seconds Build step 'Invoke Ant' marked build as failure Archiving artifacts Publishing Javadoc Email was triggered for: Failure Sending email for trigger: Failure - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Commented] (LUCENE-5200) HighFreqTerms has confusing behavior with -t option
[ https://issues.apache.org/jira/browse/LUCENE-5200?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13760294#comment-13760294 ] ASF subversion and git services commented on LUCENE-5200: - Commit 1520615 from [~rcmuir] in branch 'dev/trunk' [ https://svn.apache.org/r1520615 ] LUCENE-5200: HighFreqTerms has confusing behavior with -t option HighFreqTerms has confusing behavior with -t option --- Key: LUCENE-5200 URL: https://issues.apache.org/jira/browse/LUCENE-5200 Project: Lucene - Core Issue Type: Bug Components: modules/other Reporter: Robert Muir Attachments: LUCENE-5200.patch {code} * <code>HighFreqTerms</code> class extracts the top n most frequent terms * (by document frequency) from an existing Lucene index and reports their * document frequency. * <p> * If the -t flag is given, both document frequency and total tf (total * number of occurrences) are reported, ordered by descending total tf. {code} Problem #1: It's tricky what happens with -t: if you ask for the top-100 terms, it requests the top-100 terms (by docFreq), then resorts the top-N by totalTermFreq. So it's not really the top 100 most frequently occurring terms. Problem #2: Using the -t option can be confusing and slow: the reported docFreq includes deletions, but totalTermFreq does not (it actually walks postings lists if there is even one deletion). I think this is a relic from 3.x days when Lucene did not support this statistic. I think we should just always output both TermsEnum.docFreq() and TermsEnum.totalTermFreq(), and -t just determines the comparator of the PQ. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Updated] (LUCENE-5197) Add a method to SegmentReader to get the current index heap memory size
[ https://issues.apache.org/jira/browse/LUCENE-5197?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Robert Muir updated LUCENE-5197: Attachment: LUCENE-5197.patch Some minor cleanups / improvements: Fixed calculations for all-in-ram DV impls: for the esoteric/deprecated ones, it just uses RUE rather than making the code complicated. Facet42 is easy though and accounts correctly now. Added missing null check for VariableGapReader's FST (it can happen when there are no terms). Add a method to SegmentReader to get the current index heap memory size --- Key: LUCENE-5197 URL: https://issues.apache.org/jira/browse/LUCENE-5197 Project: Lucene - Core Issue Type: Improvement Components: core/codecs, core/index Reporter: Areek Zillur Attachments: LUCENE-5197.patch, LUCENE-5197.patch, LUCENE-5197.patch, LUCENE-5197.patch, LUCENE-5197.patch It would be useful to at least estimate the index heap size being used by Lucene. Ideally a method exposing this information at the SegmentReader level. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
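A rough sketch of the accounting approach discussed in the patch, summing cheap per-component estimates instead of walking every object with RamUsageEstimator. HeapAccountable and SegmentHeapSketch are hypothetical names for illustration, not Lucene's actual classes:

```java
import java.util.List;

// Hypothetical: each index component (terms index, doc values, norms, ...)
// reports its own estimated heap usage in bytes.
interface HeapAccountable {
    long ramBytesUsed();
}

public class SegmentHeapSketch {
    // A reader-level estimate is then just the sum over its components.
    public static long totalRamBytesUsed(List<HeapAccountable> components) {
        long total = 0;
        for (HeapAccountable c : components) {
            total += c.ramBytesUsed();
        }
        return total;
    }
}
```

The design choice is that each codec component knows its own in-heap structures best, so per-component estimates stay cheap and only the esoteric or deprecated implementations need a generic object walk.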
[jira] [Commented] (SOLR-2548) Multithreaded faceting
[ https://issues.apache.org/jira/browse/SOLR-2548?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13760427#comment-13760427 ] ASF subversion and git services commented on SOLR-2548: --- Commit 1520645 from [~erickoerickson] in branch 'dev/trunk' [ https://svn.apache.org/r1520645 ] SOLR-2548, Multithread faceting Multithreaded faceting -- Key: SOLR-2548 URL: https://issues.apache.org/jira/browse/SOLR-2548 Project: Solr Issue Type: Improvement Components: search Affects Versions: 3.1 Reporter: Janne Majaranta Assignee: Erick Erickson Priority: Minor Labels: facet Attachments: SOLR-2548_4.2.1.patch, SOLR-2548_for_31x.patch, SOLR-2548.patch, SOLR-2548.patch, SOLR-2548.patch, SOLR-2548.patch, SOLR-2548.patch, SOLR-2548.patch Add multithreading support for faceting. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[JENKINS] Lucene-Solr-NightlyTests-4.x - Build # 368 - Failure
Build: https://builds.apache.org/job/Lucene-Solr-NightlyTests-4.x/368/ All tests passed Build Log: [...truncated 3511 lines...] [javac] Compiling 20 source files to /usr/home/hudson/hudson-slave/workspace/Lucene-Solr-NightlyTests-4.x/lucene/build/misc/classes/java [javac] /usr/home/hudson/hudson-slave/workspace/Lucene-Solr-NightlyTests-4.x/lucene/misc/src/java/org/apache/lucene/misc/HighFreqTerms.java:140: cannot find symbol [javac] symbol : method compare(int,int) [javac] location: class java.lang.Long [javac] int res = Long.compare(a.docFreq, b.docFreq); [javac] ^ [javac] /usr/home/hudson/hudson-slave/workspace/Lucene-Solr-NightlyTests-4.x/lucene/misc/src/java/org/apache/lucene/misc/HighFreqTerms.java:158: cannot find symbol [javac] symbol : method compare(long,long) [javac] location: class java.lang.Long [javac] int res = Long.compare(a.totalTermFreq, b.totalTermFreq); [javac] ^ [javac] Note: Some input files use or override a deprecated API. [javac] Note: Recompile with -Xlint:deprecation for details. 
[javac] 2 errors BUILD FAILED /usr/home/hudson/hudson-slave/workspace/Lucene-Solr-NightlyTests-4.x/build.xml:409: The following error occurred while executing this line: /usr/home/hudson/hudson-slave/workspace/Lucene-Solr-NightlyTests-4.x/build.xml:382: The following error occurred while executing this line: /usr/home/hudson/hudson-slave/workspace/Lucene-Solr-NightlyTests-4.x/build.xml:39: The following error occurred while executing this line: /usr/home/hudson/hudson-slave/workspace/Lucene-Solr-NightlyTests-4.x/lucene/build.xml:551: The following error occurred while executing this line: /usr/home/hudson/hudson-slave/workspace/Lucene-Solr-NightlyTests-4.x/lucene/common-build.xml:1887: The following error occurred while executing this line: /usr/home/hudson/hudson-slave/workspace/Lucene-Solr-NightlyTests-4.x/lucene/module-build.xml:58: The following error occurred while executing this line: /usr/home/hudson/hudson-slave/workspace/Lucene-Solr-NightlyTests-4.x/lucene/common-build.xml:477: The following error occurred while executing this line: /usr/home/hudson/hudson-slave/workspace/Lucene-Solr-NightlyTests-4.x/lucene/common-build.xml:1625: Compile failed; see the compiler error output for details. Total time: 39 minutes 17 seconds Build step 'Invoke Ant' marked build as failure Archiving artifacts Recording test results Email was triggered for: Failure Sending email for trigger: Failure - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
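The two "cannot find symbol" errors above are consistent with Long.compare(long, long) and Integer.compare(int, int) only existing since Java 7, while branch_4x still compiles against a Java 6 source level. A Java 6-safe replacement can be written by hand, for example:

```java
public class CompareSketch {
    // Java 6-safe equivalent of Long.compare(long, long), which was added in Java 7.
    public static int compareLongs(long a, long b) {
        return a < b ? -1 : (a == b ? 0 : 1);
    }

    // Same idea for the docFreq (int) comparison. Note this form also avoids
    // the overflow bug of the common "return (int) (a - b)" shortcut.
    public static int compareInts(int a, int b) {
        return a < b ? -1 : (a == b ? 0 : 1);
    }
}
```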
[jira] [Resolved] (SOLR-2548) Multithreaded faceting
[ https://issues.apache.org/jira/browse/SOLR-2548?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Erick Erickson resolved SOLR-2548. -- Resolution: Fixed Fix Version/s: 5.0 4.5 Thanks Janne and Gun! Multithreaded faceting -- Key: SOLR-2548 URL: https://issues.apache.org/jira/browse/SOLR-2548 Project: Solr Issue Type: Improvement Components: search Affects Versions: 3.1 Reporter: Janne Majaranta Assignee: Erick Erickson Priority: Minor Labels: facet Fix For: 4.5, 5.0 Attachments: SOLR-2548_4.2.1.patch, SOLR-2548_for_31x.patch, SOLR-2548.patch, SOLR-2548.patch, SOLR-2548.patch, SOLR-2548.patch, SOLR-2548.patch, SOLR-2548.patch Add multithreading support for faceting. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Commented] (SOLR-5216) Document updates to SolrCloud can cause a distributed deadlock.
[ https://issues.apache.org/jira/browse/SOLR-5216?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13760516#comment-13760516 ] Tim Vaillancourt commented on SOLR-5216: Hey guys, We tested this patch and unfortunately encountered some serious issues after a few hours of 500 update-batches/sec. Our update batch is 10 docs, so we are writing about 5000 docs/sec total, using autoCommit to commit the updates (no explicit commits). Our environment: * Solr 4.3.1 w/SOLR-5216 patch. * Jetty 9, Java 1.7. * 3 Solr instances, 1 per physical server. * 1 collection. * 3 shards. * 2 replicas (each instance is a leader and a replica). * Soft autoCommit is 1000ms. * Hard autoCommit is 15000ms. After a few hours of this testing, we see many of these stalled transactions, and the Solr instances start to see each other as down, flooding our Solr logs with Connection Refused exceptions, and otherwise no useful logs (that I could see). Stack /select seems stalled on: http://pastebin.com/Y1NCrXGC Stack /update seems stalled on: http://pastebin.com/cFLbC8Y9 Lastly, I have a summary of the ERROR-severity logs from this 24-hour soak. My script normalizes the ERROR-severity stack traces and returns them in order of occurrence. Summary of my solr.log: http://pastebin.com/pBdMAWeb Thanks! Tim Vaillancourt Document updates to SolrCloud can cause a distributed deadlock. --- Key: SOLR-5216 URL: https://issues.apache.org/jira/browse/SOLR-5216 Project: Solr Issue Type: Bug Components: SolrCloud Reporter: Mark Miller Assignee: Mark Miller Priority: Critical Fix For: 4.5, 5.0 Attachments: SOLR-5216.patch
[jira] [Comment Edited] (SOLR-5216) Document updates to SolrCloud can cause a distributed deadlock.
[ https://issues.apache.org/jira/browse/SOLR-5216?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13760516#comment-13760516 ] Tim Vaillancourt edited comment on SOLR-5216 at 9/6/13 7:01 PM: Hey guys, We tested this patch and unfortunately encountered some serious issues a few hours of 500 update-batches/sec. Our update batch is 10 docs, so we are writing about 5000 docs/sec total, using autoCommit to commit the updates (no explicit commits). Our environment: * Solr 4.3.1 w/SOLR-5216 patch. * Jetty 9, Java 1.7. * 3 solr instances, 1 per physical server. * 1 collection. * 3 shards. * 2 replicas (each instance is a leader and a replica). * Soft autoCommit is 1000ms. * Hard autoCommit is 15000ms. After about 6 hours of stress-testing this patch, we see many of these stalled transactions (below), and the Solr instances start to see each other as down, flooding our Solr logs with Connection Refused exceptions, and otherwise no obviously-useful logs that I could see. I did notice some stalled transactions on both /select and /update, however. This never occurred without this patch. Stack /select seems stalled on: http://pastebin.com/Y1NCrXGC Stack /update seems stalled on: http://pastebin.com/cFLbC8Y9 Lastly, I have a summary of the ERROR-severity logs from this 24-hour soak. My script normalizes the ERROR-severity stack traces and returns them in order of occurrence. Summary of my solr.log: http://pastebin.com/pBdMAWeb Thanks! Tim Vaillancourt was (Author: tvaillancourt): Hey guys, We tested this patch and unfortunately encountered some serious issues a few hours of 500 update-batches/sec. Our update batch is 10 docs, so we are writing about 5000 docs/sec total, using autoCommit to commit the updates (no explicit commits). Our environment: * Solr 4.3.1 w/SOLR-5216 patch. * Jetty 9, Java 1.7. * 3 solr instances, 1 per physical server. * 1 collection. * 3 shards. * 2 replicas (each instance is a leader and a replica). 
* Soft autoCommit is 1000ms. * Hard autoCommit is 15000ms. After a few hours of this testing, we see many of these stalled transactions, and the solr instances start to see each other as down, flooding our solr logs with Connection Refused exceptions, and otherwise no useful logs (that I could see). Stack /select seems stalled on: http://pastebin.com/Y1NCrXGC Stack /update seems stalled on: http://pastebin.com/cFLbC8Y9 Lastly, I have a summary of the ERROR-severity logs from this 24-hour soak. My script normalizes the ERROR-severity stack traces and returns them in order of ocurrance. Summary of my solr.log: http://pastebin.com/pBdMAWeb Thanks! Tim Vaillancourt Document updates to SolrCloud can cause a distributed deadlock. --- Key: SOLR-5216 URL: https://issues.apache.org/jira/browse/SOLR-5216 Project: Solr Issue Type: Bug Components: SolrCloud Reporter: Mark Miller Assignee: Mark Miller Priority: Critical Fix For: 4.5, 5.0 Attachments: SOLR-5216.patch -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Commented] (SOLR-2548) Multithreaded faceting
[ https://issues.apache.org/jira/browse/SOLR-2548?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13760528#comment-13760528 ] ASF subversion and git services commented on SOLR-2548: --- Commit 1520670 from [~erickoerickson] in branch 'dev/branches/branch_4x' [ https://svn.apache.org/r1520670 ] SOLR-2548, Multithread faceting Multithreaded faceting -- Key: SOLR-2548 URL: https://issues.apache.org/jira/browse/SOLR-2548 Project: Solr Issue Type: Improvement Components: search Affects Versions: 3.1 Reporter: Janne Majaranta Assignee: Erick Erickson Priority: Minor Labels: facet Attachments: SOLR-2548_4.2.1.patch, SOLR-2548_for_31x.patch, SOLR-2548.patch, SOLR-2548.patch, SOLR-2548.patch, SOLR-2548.patch, SOLR-2548.patch, SOLR-2548.patch Add multithreading support for faceting. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Created] (SOLR-5218) Unable to extend SolrJettyTestBase within a Parametrized test
Steve Davids created SOLR-5218: -- Summary: Unable to extend SolrJettyTestBase within a Parametrized test Key: SOLR-5218 URL: https://issues.apache.org/jira/browse/SOLR-5218 Project: Solr Issue Type: Bug Components: Tests Affects Versions: 4.3.1 Reporter: Steve Davids Fix For: 4.5, 5.0 I would like to create a unit test that extends SolrJettyTestBase using the JUnit Parameterized test format. When I try to run the test I get the following messages: Method beforeClass() should be public Method afterClass() should be public at java.lang.reflect.Constructor.newInstance(Unknown Source)... Obviously it would be great if we could make those public so I can use the JUnit Runner.
[jira] [Commented] (SOLR-5202) Support easier overrides of Carrot2 clustering attributes via XML data sets exported from the Workbench.
[ https://issues.apache.org/jira/browse/SOLR-5202?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13760569#comment-13760569 ] ASF subversion and git services commented on SOLR-5202: --- Commit 1520683 from [~dawidweiss] in branch 'dev/branches/branch_4x' [ https://svn.apache.org/r1520683 ] SOLR-5202: follow-up to CHANGES.txt Support easier overrides of Carrot2 clustering attributes via XML data sets exported from the Workbench. Key: SOLR-5202 URL: https://issues.apache.org/jira/browse/SOLR-5202 Project: Solr Issue Type: New Feature Reporter: Dawid Weiss Assignee: Dawid Weiss Fix For: 4.5, 5.0 Attachments: SOLR-5202.patch -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Created] (SOLR-5219) Refactor selection of the default clustering algorithm
Dawid Weiss created SOLR-5219: - Summary: Refactor selection of the default clustering algorithm Key: SOLR-5219 URL: https://issues.apache.org/jira/browse/SOLR-5219 Project: Solr Issue Type: Improvement Reporter: Dawid Weiss Assignee: Dawid Weiss Priority: Minor Fix For: 4.5, 5.0 This is currently quite messy: the user needs to explicitly name the 'default' algorithm. The logic should be: 1) if there's only one algorithm, it becomes the default, 2) if there's more than one algorithm, the first one becomes the default one. 3) for back-compat, if there is an algorithm called 'default', it does become the default one. The code will simplify a great deal too.
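The three rules above can be sketched as a small selection function. The names here are hypothetical and not the actual Solr clustering component API:

```java
import java.util.List;

public class DefaultEngineSketch {
    /**
     * Picks the default algorithm from the configured list:
     * rule 3 first (back-compat: an algorithm literally named "default" wins),
     * then rules 1 and 2, which both collapse to "take the first one".
     */
    public static String defaultAlgorithm(List<String> configured) {
        if (configured.isEmpty()) {
            throw new IllegalArgumentException("no clustering algorithms configured");
        }
        // Rule 3: explicit back-compat override.
        if (configured.contains("default")) {
            return "default";
        }
        // Rule 1 (single algorithm) and rule 2 (first of many) are the same branch.
        return configured.get(0);
    }
}
```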
[jira] [Resolved] (SOLR-5202) Support easier overrides of Carrot2 clustering attributes via XML data sets exported from the Workbench.
[ https://issues.apache.org/jira/browse/SOLR-5202?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Dawid Weiss resolved SOLR-5202. --- Resolution: Fixed Support easier overrides of Carrot2 clustering attributes via XML data sets exported from the Workbench. Key: SOLR-5202 URL: https://issues.apache.org/jira/browse/SOLR-5202 Project: Solr Issue Type: New Feature Reporter: Dawid Weiss Assignee: Dawid Weiss Fix For: 4.5, 5.0 Attachments: SOLR-5202.patch -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Commented] (SOLR-5202) Support easier overrides of Carrot2 clustering attributes via XML data sets exported from the Workbench.
[ https://issues.apache.org/jira/browse/SOLR-5202?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13760563#comment-13760563 ] ASF subversion and git services commented on SOLR-5202: --- Commit 1520678 from [~dawidweiss] in branch 'dev/branches/branch_4x' [ https://svn.apache.org/r1520678 ] SOLR-5202: Support easier overrides of Carrot2 clustering attributes via XML data sets exported from the Workbench. Polished clustering configuration examples. Support easier overrides of Carrot2 clustering attributes via XML data sets exported from the Workbench. Key: SOLR-5202 URL: https://issues.apache.org/jira/browse/SOLR-5202 Project: Solr Issue Type: New Feature Reporter: Dawid Weiss Assignee: Dawid Weiss Fix For: 4.5, 5.0 Attachments: SOLR-5202.patch -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Commented] (SOLR-5218) Unable to extend SolrJettyTestBase within a Parametrized test
[ https://issues.apache.org/jira/browse/SOLR-5218?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13760577#comment-13760577 ] Dawid Weiss commented on SOLR-5218: --- We use a runner that does not follow all of JUnit conventions (and there are reason why it doesn't). JUnit requires all hooks to be public methods but this leads to accidental overrides and missed super calls. In RandomizedRunner a private hook is always called, regardless of the shadowing/ override. If you want to use a parameterized test, use RandomizedRunner's factory instead, as is shown here: https://github.com/carrotsearch/randomizedtesting/blob/master/examples/maven/src/main/java/com/carrotsearch/examples/randomizedrunner/Test007ParameterizedTests.java Unable to extend SolrJettyTestBase within a Parametrized test - Key: SOLR-5218 URL: https://issues.apache.org/jira/browse/SOLR-5218 Project: Solr Issue Type: Bug Components: Tests Affects Versions: 4.3.1 Reporter: Steve Davids Fix For: 4.5, 5.0 I would like to create a unit test that extends SolrJettyTestBase using the JUnit Parameterized test format. When I try to run the test I get the following messages: Method beforeClass() should be public Method afterClass() should be public at java.lang.reflect.Constructor.newInstance(Unkown Source)... Obviously it would be great if we could make those public so I can use the JUnit Runner. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Resolved] (SOLR-5218) Unable to extend SolrJettyTestBase within a Parametrized test
[ https://issues.apache.org/jira/browse/SOLR-5218?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Dawid Weiss resolved SOLR-5218. --- Resolution: Won't Fix Assignee: Dawid Weiss Unable to extend SolrJettyTestBase within a Parametrized test - Key: SOLR-5218 URL: https://issues.apache.org/jira/browse/SOLR-5218 Project: Solr Issue Type: Bug Components: Tests Affects Versions: 4.3.1 Reporter: Steve Davids Assignee: Dawid Weiss Fix For: 4.5, 5.0 I would like to create a unit test that extends SolrJettyTestBase using the JUnit Parameterized test format. When I try to run the test I get the following messages: Method beforeClass() should be public Method afterClass() should be public at java.lang.reflect.Constructor.newInstance(Unkown Source)... Obviously it would be great if we could make those public so I can use the JUnit Runner. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Commented] (SOLR-5202) Support easier overrides of Carrot2 clustering attributes via XML data sets exported from the Workbench.
[ https://issues.apache.org/jira/browse/SOLR-5202?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13760560#comment-13760560 ] ASF subversion and git services commented on SOLR-5202: --- Commit 1520677 from [~dawidweiss] in branch 'dev/trunk' [ https://svn.apache.org/r1520677 ] SOLR-5202: Support easier overrides of Carrot2 clustering attributes via XML data sets exported from the Workbench. Polished clustering configuration examples. Support easier overrides of Carrot2 clustering attributes via XML data sets exported from the Workbench. Key: SOLR-5202 URL: https://issues.apache.org/jira/browse/SOLR-5202 Project: Solr Issue Type: New Feature Reporter: Dawid Weiss Assignee: Dawid Weiss Fix For: 4.5, 5.0 Attachments: SOLR-5202.patch -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Commented] (SOLR-5202) Support easier overrides of Carrot2 clustering attributes via XML data sets exported from the Workbench.
[ https://issues.apache.org/jira/browse/SOLR-5202?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13760566#comment-13760566 ] ASF subversion and git services commented on SOLR-5202: --- Commit 1520681 from [~dawidweiss] in branch 'dev/trunk' [ https://svn.apache.org/r1520681 ] SOLR-5202: follow-up to CHANGES.txt Support easier overrides of Carrot2 clustering attributes via XML data sets exported from the Workbench. Key: SOLR-5202 URL: https://issues.apache.org/jira/browse/SOLR-5202 Project: Solr Issue Type: New Feature Reporter: Dawid Weiss Assignee: Dawid Weiss Fix For: 4.5, 5.0 Attachments: SOLR-5202.patch -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[JENKINS] Lucene-Solr-trunk-Linux (32bit/jdk1.7.0_25) - Build # 7347 - Still Failing!
Build: http://jenkins.thetaphi.de/job/Lucene-Solr-trunk-Linux/7347/ Java: 32bit/jdk1.7.0_25 -client -XX:+UseConcMarkSweepGC All tests passed Build Log: [...truncated 31814 lines...] BUILD FAILED /mnt/ssd/jenkins/workspace/Lucene-Solr-trunk-Linux/build.xml:396: The following error occurred while executing this line: /mnt/ssd/jenkins/workspace/Lucene-Solr-trunk-Linux/build.xml:335: The following error occurred while executing this line: /mnt/ssd/jenkins/workspace/Lucene-Solr-trunk-Linux/extra-targets.xml:66: The following error occurred while executing this line: /mnt/ssd/jenkins/workspace/Lucene-Solr-trunk-Linux/extra-targets.xml:139: The following files are missing svn:eol-style (or binary svn:mime-type): * ./solr/contrib/clustering/src/test-files/clustering/solr/collection1/conf/clustering/carrot2/mock-external-attrs-attributes.xml * ./solr/example/solr/collection1/conf/clustering/carrot2/default-attributes.xml * ./solr/example/solr/collection1/conf/clustering/carrot2/kmeans-attributes.xml * ./solr/example/solr/collection1/conf/clustering/carrot2/stc-attributes.xml Total time: 51 minutes 32 seconds Build step 'Invoke Ant' marked build as failure Description set: Java: 32bit/jdk1.7.0_25 -client -XX:+UseConcMarkSweepGC Archiving artifacts Recording test results Email was triggered for: Failure Sending email for trigger: Failure - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[JENKINS] Lucene-Solr-Tests-trunk-Java7 - Build # 4296 - Failure
Build: https://builds.apache.org/job/Lucene-Solr-Tests-trunk-Java7/4296/ All tests passed Build Log: [...truncated 35271 lines...] BUILD FAILED /usr/home/hudson/hudson-slave/workspace/Lucene-Solr-Tests-trunk-Java7/build.xml:396: The following error occurred while executing this line: /usr/home/hudson/hudson-slave/workspace/Lucene-Solr-Tests-trunk-Java7/build.xml:335: The following error occurred while executing this line: /usr/home/hudson/hudson-slave/workspace/Lucene-Solr-Tests-trunk-Java7/extra-targets.xml:66: The following error occurred while executing this line: /usr/home/hudson/hudson-slave/workspace/Lucene-Solr-Tests-trunk-Java7/extra-targets.xml:139: The following files are missing svn:eol-style (or binary svn:mime-type): * ./solr/contrib/clustering/src/test-files/clustering/solr/collection1/conf/clustering/carrot2/mock-external-attrs-attributes.xml * ./solr/example/solr/collection1/conf/clustering/carrot2/default-attributes.xml * ./solr/example/solr/collection1/conf/clustering/carrot2/kmeans-attributes.xml * ./solr/example/solr/collection1/conf/clustering/carrot2/stc-attributes.xml Total time: 82 minutes 19 seconds Build step 'Invoke Ant' marked build as failure Archiving artifacts Recording test results Email was triggered for: Failure Sending email for trigger: Failure - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Created] (SOLR-5220) Marking server as zombie due to 4xx response is odd
Jessica Cheng created SOLR-5220: --- Summary: Marking server as zombie due to 4xx response is odd Key: SOLR-5220 URL: https://issues.apache.org/jira/browse/SOLR-5220 Project: Solr Issue Type: Bug Components: clients - java Affects Versions: 4.4 Reporter: Jessica Cheng In LBHttpSolrServer.request, a request is retried and the server marked as a zombie if the return code is 404, 403, 503, or 500, and the comment says we retry on 404 or 403 or 503 - you can see this on solr shutdown. I think returning a 503 on a shutdown is reasonable, but not 4xx, which is supposed to be a client error. But even if this can't be fixed systematically on the server-side, it seems like on the client side we can retry on another server without marking the current server as dead, because most likely when the server returns a 403 (Forbidden) or 404 (Not Found), it's not because it's dead.
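The suggested client-side policy, retry on another server but only mark a node dead for server-side failures, might look like the following sketch. This is an illustration of the proposal, not LBHttpSolrServer's actual code:

```java
public class RetryPolicySketch {
    // Status codes on which LBHttpSolrServer retries today, per the report above.
    public static boolean shouldRetry(int status) {
        return status == 403 || status == 404 || status == 500 || status == 503;
    }

    // Proposed refinement: only 5xx responses suggest a dead or overloaded node.
    // A 403 (Forbidden) or 404 (Not Found) is a client error and should not get
    // the server marked as a zombie, even though the request may be retried elsewhere.
    public static boolean shouldMarkZombie(int status) {
        return status == 500 || status == 503;
    }
}
```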
[jira] [Comment Edited] (SOLR-5215) Deadlock in Solr Cloud ConnectionManager
[ https://issues.apache.org/jira/browse/SOLR-5215?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13760815#comment-13760815 ] Feihong Huang edited comment on SOLR-5215 at 9/7/13 12:45 AM: -- Thanks to Ricard to find the reason. I also encounter this issue in our production application servers. was (Author: ainihong001): Thanks to Ricard to finding the reason. I also encounter this issue in our production application servers. Deadlock in Solr Cloud ConnectionManager Key: SOLR-5215 URL: https://issues.apache.org/jira/browse/SOLR-5215 Project: Solr Issue Type: Bug Components: clients - java, SolrCloud Affects Versions: 4.2.1 Environment: Linux 2.6.18-164.el5 #1 SMP Tue Aug 18 15:51:48 EDT 2009 x86_64 x86_64 x86_64 GNU/Linux java version 1.6.0_18 Java(TM) SE Runtime Environment (build 1.6.0_18-b07) Java HotSpot(TM) 64-Bit Server VM (build 16.0-b13, mixed mode) Reporter: Ricardo Merizalde We are constantly seeing a deadlocks in our production application servers. The problem seems to be that a thread A: - tries to process an event and acquires the ConnectionManager lock - the update callback acquires connectionUpdateLock and invokes waitForConnected - waitForConnected tries to acquire the ConnectionManager lock (which already has) - waitForConnected calls wait and release the ConnectionManager lock (but still has the connectionUpdateLock) The a thread B: - tries to process an event and acquires the ConnectionManager lock - the update call back tries to acquire connectionUpdateLock but gets blocked holding the ConnectionManager lock and preventing thread A from getting out of the wait state. 
Here is part of the thread dump: http-0.0.0.0-8080-82-EventThread daemon prio=10 tid=0x59965800 nid=0x3e81 waiting for monitor entry [0x57169000] java.lang.Thread.State: BLOCKED (on object monitor) at org.apache.solr.common.cloud.ConnectionManager.process(ConnectionManager.java:71) - waiting to lock 0x2aab1b0e0ce0 (a org.apache.solr.common.cloud.ConnectionManager) at org.apache.zookeeper.ClientCnxn$EventThread.processEvent(ClientCnxn.java:519) at org.apache.zookeeper.ClientCnxn$EventThread.run(ClientCnxn.java:495) http-0.0.0.0-8080-82-EventThread daemon prio=10 tid=0x5ad4 nid=0x3e67 waiting for monitor entry [0x4dbd4000] java.lang.Thread.State: BLOCKED (on object monitor) at org.apache.solr.common.cloud.ConnectionManager$1.update(ConnectionManager.java:98) - waiting to lock 0x2aab1b0e0f78 (a java.lang.Object) at org.apache.solr.common.cloud.DefaultConnectionStrategy.reconnect(DefaultConnectionStrategy.java:46) at org.apache.solr.common.cloud.ConnectionManager.process(ConnectionManager.java:91) - locked 0x2aab1b0e0ce0 (a org.apache.solr.common.cloud.ConnectionManager) at org.apache.zookeeper.ClientCnxn$EventThread.processEvent(ClientCnxn.java:519) at org.apache.zookeeper.ClientCnxn$EventThread.run(ClientCnxn.java:495) http-0.0.0.0-8080-82-EventThread daemon prio=10 tid=0x2aac4c2f7000 nid=0x3d9a waiting for monitor entry [0x42821000] java.lang.Thread.State: BLOCKED (on object monitor) at java.lang.Object.wait(Native Method) - waiting on 0x2aab1b0e0ce0 (a org.apache.solr.common.cloud.ConnectionManager) at org.apache.solr.common.cloud.ConnectionManager.waitForConnected(ConnectionManager.java:165) - locked 0x2aab1b0e0ce0 (a org.apache.solr.common.cloud.ConnectionManager) at org.apache.solr.common.cloud.ConnectionManager$1.update(ConnectionManager.java:98) - locked 0x2aab1b0e0f78 (a java.lang.Object) at org.apache.solr.common.cloud.DefaultConnectionStrategy.reconnect(DefaultConnectionStrategy.java:46) at 
org.apache.solr.common.cloud.ConnectionManager.process(ConnectionManager.java:91) - locked 0x2aab1b0e0ce0 (a org.apache.solr.common.cloud.ConnectionManager) at org.apache.zookeeper.ClientCnxn$EventThread.processEvent(ClientCnxn.java:519) at org.apache.zookeeper.ClientCnxn$EventThread.run(ClientCnxn.java:495) Found one Java-level deadlock: = http-0.0.0.0-8080-82-EventThread: waiting to lock monitor 0x5c7694b0 (object 0x2aab1b0e0ce0, a org.apache.solr.common.cloud.ConnectionManager), which is held by http-0.0.0.0-8080-82-EventThread http-0.0.0.0-8080-82-EventThread: waiting to lock monitor 0x2aac4c314978 (object 0x2aab1b0e0f78, a java.lang.Object), which is held by
[jira] [Commented] (SOLR-5215) Deadlock in Solr Cloud ConnectionManager
[ https://issues.apache.org/jira/browse/SOLR-5215?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13760815#comment-13760815 ] Feihong Huang commented on SOLR-5215:
-
Thanks to Ricardo for finding the reason. I have also encountered this issue on our production application servers.

Deadlock in Solr Cloud ConnectionManager

Key: SOLR-5215
URL: https://issues.apache.org/jira/browse/SOLR-5215
Project: Solr
Issue Type: Bug
Components: clients - java, SolrCloud
Affects Versions: 4.2.1
Environment: Linux 2.6.18-164.el5 #1 SMP Tue Aug 18 15:51:48 EDT 2009 x86_64 x86_64 x86_64 GNU/Linux; java version 1.6.0_18, Java(TM) SE Runtime Environment (build 1.6.0_18-b07), Java HotSpot(TM) 64-Bit Server VM (build 16.0-b13, mixed mode)
Reporter: Ricardo Merizalde

We are constantly seeing deadlocks in our production application servers. The problem seems to be that a thread A:
- tries to process an event and acquires the ConnectionManager lock
- the update callback acquires connectionUpdateLock and invokes waitForConnected
- waitForConnected tries to acquire the ConnectionManager lock (which it already holds)
- waitForConnected calls wait, releasing the ConnectionManager lock (but still holding connectionUpdateLock)

Then a thread B:
- tries to process an event and acquires the ConnectionManager lock
- the update callback tries to acquire connectionUpdateLock but gets blocked, still holding the ConnectionManager lock and preventing thread A from getting out of the wait state.
Here is part of the thread dump:

"http-0.0.0.0-8080-82-EventThread" daemon prio=10 tid=0x59965800 nid=0x3e81 waiting for monitor entry [0x57169000]
   java.lang.Thread.State: BLOCKED (on object monitor)
	at org.apache.solr.common.cloud.ConnectionManager.process(ConnectionManager.java:71)
	- waiting to lock <0x2aab1b0e0ce0> (a org.apache.solr.common.cloud.ConnectionManager)
	at org.apache.zookeeper.ClientCnxn$EventThread.processEvent(ClientCnxn.java:519)
	at org.apache.zookeeper.ClientCnxn$EventThread.run(ClientCnxn.java:495)

"http-0.0.0.0-8080-82-EventThread" daemon prio=10 tid=0x5ad4 nid=0x3e67 waiting for monitor entry [0x4dbd4000]
   java.lang.Thread.State: BLOCKED (on object monitor)
	at org.apache.solr.common.cloud.ConnectionManager$1.update(ConnectionManager.java:98)
	- waiting to lock <0x2aab1b0e0f78> (a java.lang.Object)
	at org.apache.solr.common.cloud.DefaultConnectionStrategy.reconnect(DefaultConnectionStrategy.java:46)
	at org.apache.solr.common.cloud.ConnectionManager.process(ConnectionManager.java:91)
	- locked <0x2aab1b0e0ce0> (a org.apache.solr.common.cloud.ConnectionManager)
	at org.apache.zookeeper.ClientCnxn$EventThread.processEvent(ClientCnxn.java:519)
	at org.apache.zookeeper.ClientCnxn$EventThread.run(ClientCnxn.java:495)

"http-0.0.0.0-8080-82-EventThread" daemon prio=10 tid=0x2aac4c2f7000 nid=0x3d9a waiting for monitor entry [0x42821000]
   java.lang.Thread.State: BLOCKED (on object monitor)
	at java.lang.Object.wait(Native Method)
	- waiting on <0x2aab1b0e0ce0> (a org.apache.solr.common.cloud.ConnectionManager)
	at org.apache.solr.common.cloud.ConnectionManager.waitForConnected(ConnectionManager.java:165)
	- locked <0x2aab1b0e0ce0> (a org.apache.solr.common.cloud.ConnectionManager)
	at org.apache.solr.common.cloud.ConnectionManager$1.update(ConnectionManager.java:98)
	- locked <0x2aab1b0e0f78> (a java.lang.Object)
	at org.apache.solr.common.cloud.DefaultConnectionStrategy.reconnect(DefaultConnectionStrategy.java:46)
	at org.apache.solr.common.cloud.ConnectionManager.process(ConnectionManager.java:91)
	- locked <0x2aab1b0e0ce0> (a org.apache.solr.common.cloud.ConnectionManager)
	at org.apache.zookeeper.ClientCnxn$EventThread.processEvent(ClientCnxn.java:519)
	at org.apache.zookeeper.ClientCnxn$EventThread.run(ClientCnxn.java:495)

Found one Java-level deadlock:
=============================
"http-0.0.0.0-8080-82-EventThread":
  waiting to lock monitor 0x5c7694b0 (object 0x2aab1b0e0ce0, a org.apache.solr.common.cloud.ConnectionManager),
  which is held by "http-0.0.0.0-8080-82-EventThread"
"http-0.0.0.0-8080-82-EventThread":
  waiting to lock monitor 0x2aac4c314978 (object 0x2aab1b0e0f78, a java.lang.Object),
  which is held by "http-0.0.0.0-8080-82-EventThread"
"http-0.0.0.0-8080-82-EventThread":
  waiting to lock monitor 0x5c7694b0 (object 0x2aab1b0e0ce0, a org.apache.solr.common.cloud.ConnectionManager),
  which is held by
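The failure mode described above is the classic nested-monitor problem: `Object.wait()` releases only the monitor of the object being waited on, not any other locks the waiting thread holds. A minimal standalone sketch of this behavior (the names `manager` and `updateLock` are illustrative stand-ins for ConnectionManager and connectionUpdateLock, not the actual Solr code):

```java
import java.util.concurrent.CountDownLatch;

public class WaitReleasesOneMonitor {
    static final Object manager = new Object();     // stands in for the ConnectionManager monitor
    static final Object updateLock = new Object();  // stands in for connectionUpdateLock

    public static void main(String[] args) throws InterruptedException {
        CountDownLatch aboutToWait = new CountDownLatch(1);

        // Thread A: takes updateLock, then waits on `manager` (like
        // update() -> waitForConnected() -> wait() in the report above).
        Thread a = new Thread(() -> {
            synchronized (updateLock) {
                synchronized (manager) {
                    try {
                        aboutToWait.countDown();
                        manager.wait(500);   // releases `manager` only, NOT updateLock
                    } catch (InterruptedException ignored) { }
                }
            }
        });
        a.start();
        aboutToWait.await();
        Thread.sleep(50);  // give A time to actually enter wait()

        // The main thread CAN take `manager` -- wait() released it...
        boolean gotManager;
        synchronized (manager) { gotManager = true; }

        // ...but a second thread needing updateLock stays blocked,
        // which is exactly how thread B deadlocks in SOLR-5215.
        Thread b = new Thread(() -> { synchronized (updateLock) { } });
        b.start();
        b.join(100);
        boolean updateLockStillHeld = b.isAlive();

        System.out.println("manager acquired while A waits: " + gotManager);
        System.out.println("updateLock still held by A: " + updateLockStillHeld);

        a.join();  // A's timed wait expires, releasing both locks
        b.join();
    }
}
```

Here the timed `wait(500)` keeps the demo from hanging; in the real code the wait is what leaves connectionUpdateLock pinned while thread B spins up.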
[jira] [Updated] (LUCENE-5198) Strengthen the function of Min should match, making it select BooleanClause as Occur.MUST according to the weight of query
[ https://issues.apache.org/jira/browse/LUCENE-5198?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] HeXin updated LUCENE-5198:
--
Description: In the current version, when we use BooleanQuery to do a disjunction, the top scorer selects documents that satisfy at least mm of the sub-scorers. But in some cases we would like any sub-scorer whose weight is larger than a threshold to be selected as Occur.MUST automatically. The threshold would be configurable, defaulting to the minimum integer. Any comments are welcome.

was: In some cases, we want min-should-match to select a BooleanClause as Occur.MUST according to the weight of the query. Only if the weight is larger than the threshold is the clause selected as Occur.MUST. The threshold would be configurable, defaulting to the minimum integer. Any comments are welcome.

Strengthen the function of Min should match, making it select BooleanClause as Occur.MUST according to the weight of query
--
Key: LUCENE-5198
URL: https://issues.apache.org/jira/browse/LUCENE-5198
Project: Lucene - Core
Issue Type: Improvement
Components: core/search
Affects Versions: 4.4
Reporter: HeXin
Priority: Trivial

In the current version, when we use BooleanQuery to do a disjunction, the top scorer selects documents that satisfy at least mm of the sub-scorers. But in some cases we would like any sub-scorer whose weight is larger than a threshold to be selected as Occur.MUST automatically. The threshold would be configurable, defaulting to the minimum integer. Any comments are welcome.
--
This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators. For more information on JIRA, see: http://www.atlassian.com/software/jira
-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org
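For context, the minimum-should-match ("mm") constraint the issue builds on means a document matches a disjunction only when at least mm of its SHOULD clauses match it; the proposal would additionally promote high-weight clauses to required ones. A plain-Java sketch of the baseline mm semantics only (illustrative predicates standing in for sub-scorers, not Lucene's scorer code):

```java
import java.util.List;
import java.util.function.Predicate;

public class MinShouldMatchSketch {
    // A doc matches the disjunction when at least `mm` SHOULD clauses hit it.
    static boolean matches(List<Predicate<String>> shouldClauses, int mm, String doc) {
        long hits = shouldClauses.stream().filter(c -> c.test(doc)).count();
        return hits >= mm;
    }

    public static void main(String[] args) {
        List<Predicate<String>> clauses = List.of(
                d -> d.contains("lucene"),
                d -> d.contains("solr"),
                d -> d.contains("search"));
        // "lucene search engine" hits the first and third clauses (2 of 3).
        System.out.println(matches(clauses, 2, "lucene search engine")); // true
        System.out.println(matches(clauses, 3, "lucene search engine")); // false
    }
}
```

Under the proposal, a clause whose weight exceeded the configured threshold would instead behave like a MUST clause: its failure alone would reject the document regardless of mm.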
[JENKINS] Lucene-Solr-SmokeRelease-4.x - Build # 106 - Still Failing
Build: https://builds.apache.org/job/Lucene-Solr-SmokeRelease-4.x/106/

No tests ran.

Build Log:
[...truncated 34242 lines...]
prepare-release-no-sign:
[mkdir] Created dir: /usr/home/hudson/hudson-slave/workspace/Lucene-Solr-SmokeRelease-4.x/lucene/build/fakeRelease
[copy] Copying 416 files to /usr/home/hudson/hudson-slave/workspace/Lucene-Solr-SmokeRelease-4.x/lucene/build/fakeRelease/lucene
[copy] Copying 194 files to /usr/home/hudson/hudson-slave/workspace/Lucene-Solr-SmokeRelease-4.x/lucene/build/fakeRelease/solr
[exec] JAVA6_HOME is /home/hudson/tools/java/latest1.6
[exec] JAVA7_HOME is /home/hudson/tools/java/latest1.7
[exec] NOTE: output encoding is US-ASCII
[exec]
[exec] Load release URL file:/usr/home/hudson/hudson-slave/workspace/Lucene-Solr-SmokeRelease-4.x/lucene/build/fakeRelease/...
[exec]
[exec] Test Lucene...
[exec]   test basics...
[exec]   get KEYS
[exec]     0.1 MB in 0.01 sec (10.1 MB/sec)
[exec]   check changes HTML...
[exec]   download lucene-4.5.0-src.tgz...
[exec]     27.1 MB in 0.04 sec (681.4 MB/sec)
[exec]     verify md5/sha1 digests
[exec]   download lucene-4.5.0.tgz...
[exec]     49.0 MB in 0.07 sec (690.3 MB/sec)
[exec]     verify md5/sha1 digests
[exec]   download lucene-4.5.0.zip...
[exec]     58.9 MB in 0.11 sec (516.1 MB/sec)
[exec]     verify md5/sha1 digests
[exec]   unpack lucene-4.5.0.tgz...
[exec]     verify JAR/WAR metadata...
[exec]     test demo with 1.6...
[exec]       got 5723 hits for query lucene
[exec]     test demo with 1.7...
[exec]       got 5723 hits for query lucene
[exec]     check Lucene's javadoc JAR
[exec]
[exec] /usr/home/hudson/hudson-slave/workspace/Lucene-Solr-SmokeRelease-4.x/lucene/build/fakeReleaseTmp/unpack/lucene-4.5.0/docs/core/org/apache/lucene/util/AttributeSource.html
[exec]   broken details HTML: Method Detail: addAttributeImpl: closing </code> does not match opening <T>
[exec]   broken details HTML: Method Detail: getAttribute: closing </code> does not match opening <T>
[exec] Traceback (most recent call last):
[exec]   File "/usr/home/hudson/hudson-slave/workspace/Lucene-Solr-SmokeRelease-4.x/dev-tools/scripts/smokeTestRelease.py", line 1450, in <module>
[exec]     main()
[exec]   File "/usr/home/hudson/hudson-slave/workspace/Lucene-Solr-SmokeRelease-4.x/dev-tools/scripts/smokeTestRelease.py", line 1394, in main
[exec]     smokeTest(baseURL, svnRevision, version, tmpDir, isSigned, testArgs)
[exec]   File "/usr/home/hudson/hudson-slave/workspace/Lucene-Solr-SmokeRelease-4.x/dev-tools/scripts/smokeTestRelease.py", line 1431, in smokeTest
[exec]     unpackAndVerify('lucene', tmpDir, artifact, svnRevision, version, testArgs)
[exec]   File "/usr/home/hudson/hudson-slave/workspace/Lucene-Solr-SmokeRelease-4.x/dev-tools/scripts/smokeTestRelease.py", line 607, in unpackAndVerify
[exec]     verifyUnpacked(project, artifact, unpackPath, svnRevision, version, testArgs)
[exec]   File "/usr/home/hudson/hudson-slave/workspace/Lucene-Solr-SmokeRelease-4.x/dev-tools/scripts/smokeTestRelease.py", line 786, in verifyUnpacked
[exec]     checkJavadocpath('%s/docs' % unpackPath)
[exec]   File "/usr/home/hudson/hudson-slave/workspace/Lucene-Solr-SmokeRelease-4.x/dev-tools/scripts/smokeTestRelease.py", line 904, in checkJavadocpath
[exec]     raise RuntimeError('missing javadocs package summaries!')
[exec] RuntimeError: missing javadocs package summaries!
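The "broken details HTML" messages come from the smoke tester's javadoc validation, which flags a closing tag that does not match the most recently opened one (here a stray generic `<T>` left unclosed before `</code>`). A hypothetical sketch of that style of stack-based check (not the actual smokeTestRelease.py logic, which is Python, and the regex here ignores attributes and self-closing tags):

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class TagBalanceCheck {
    // Returns a message for the first mismatched closing tag, or null if balanced.
    static String firstMismatch(String html) {
        Pattern tag = Pattern.compile("</?([a-zA-Z]+)>");
        Deque<String> open = new ArrayDeque<>();
        Matcher m = tag.matcher(html);
        while (m.find()) {
            String name = m.group(1);
            if (!m.group().startsWith("</")) {
                open.push(name);                       // opening tag: remember it
            } else {
                String top = open.isEmpty() ? null : open.pop();
                if (!name.equals(top)) {               // closing tag must match top of stack
                    return "closing /" + name + " does not match opening " + top;
                }
            }
        }
        return null;
    }

    public static void main(String[] args) {
        System.out.println(firstMismatch("<code>getAttribute</code>"));  // balanced -> null
        System.out.println(firstMismatch("<T>addAttributeImpl</code>")); // the failure above
    }
}
```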
BUILD FAILED
/usr/home/hudson/hudson-slave/workspace/Lucene-Solr-SmokeRelease-4.x/build.xml:321: exec returned: 1

Total time: 19 minutes 30 seconds
Build step 'Invoke Ant' marked build as failure
Email was triggered for: Failure
Sending email for trigger: Failure