date:20120417


 [ 
https://issues.apache.org/jira/browse/SOLR-2990?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sami Siren updated SOLR-2990:
-

Component/s: (was: clients - java)
 contrib - Solr Cell (Tika extraction)

 solr OOM issues
 ---

 Key: SOLR-2990
 URL: https://issues.apache.org/jira/browse/SOLR-2990
 Project: Solr
  Issue Type: Bug
  Components: contrib - Solr Cell (Tika extraction)
Affects Versions: 4.0
 Environment: CentOS 5.x/6.x
 Solr Build apache-solr-4.0-2011-11-04_09-29-42 (includes tika 1.0)
 java -server -Xms2G -Xmx2G -XX:+HeapDumpOnOutOfMemoryError 
 -XX:HeapDumpPath=/var/log/oom/solr.dump.1 -Dsolr.data.dir=/opt/solr.data 
 -Djava.util.logging.config.file=solr-logging.properties -DSTOP.PORT=8907 
 -DSTOP.KEY=STOP -jar start.jar
Reporter: Rob Tulloh

 We see intermittent issues with OutOfMemory caused by tika failing to process 
 content. Here is an example:
 Dec 29, 2011 7:12:05 AM org.apache.solr.common.SolrException log
 SEVERE: java.lang.OutOfMemoryError: Java heap space
 at 
 org.apache.poi.hmef.attribute.TNEFAttribute.init(TNEFAttribute.java:50)
 at 
 org.apache.poi.hmef.attribute.TNEFAttribute.create(TNEFAttribute.java:76)
 at org.apache.poi.hmef.HMEFMessage.process(HMEFMessage.java:74)
 at org.apache.poi.hmef.HMEFMessage.process(HMEFMessage.java:98)
 at org.apache.poi.hmef.HMEFMessage.process(HMEFMessage.java:98)
 at org.apache.poi.hmef.HMEFMessage.process(HMEFMessage.java:98)
 at org.apache.poi.hmef.HMEFMessage.init(HMEFMessage.java:63)
 at 
 org.apache.tika.parser.microsoft.TNEFParser.parse(TNEFParser.java:79)
 at 
 org.apache.tika.parser.CompositeParser.parse(CompositeParser.java:242)
 at 
 org.apache.tika.parser.CompositeParser.parse(CompositeParser.java:242)
 at 
 org.apache.tika.parser.AutoDetectParser.parse(AutoDetectParser.java:129)
 at 
 org.apache.solr.handler.extraction.ExtractingDocumentLoader.load(ExtractingDocumentLoader.java:195)
 at 
 org.apache.solr.handler.ContentStreamHandlerBase.handleRequestBody(ContentStreamHandlerBase.java:58)
 at 
 org.apache.solr.handler.RequestHandlerBase.handleRequest(RequestHandlerBase.java:129)
 at 
 org.apache.solr.core.RequestHandlers$LazyRequestHandlerWrapper.handleRequest(RequestHandlers.java:244)
 at org.apache.solr.core.SolrCore.execute(SolrCore.java:1478)
 at 
 org.apache.solr.servlet.SolrDispatchFilter.execute(SolrDispatchFilter.java:353)
 at 
 org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:248)
 at 
 org.mortbay.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1212)
 at 
 org.mortbay.jetty.servlet.ServletHandler.handle(ServletHandler.java:399)
 at 
 org.mortbay.jetty.security.SecurityHandler.handle(SecurityHandler.java:216)
 at 
 org.mortbay.jetty.servlet.SessionHandler.handle(SessionHandler.java:182)
 at 
 org.mortbay.jetty.handler.ContextHandler.handle(ContextHandler.java:766)
 at 
 org.mortbay.jetty.webapp.WebAppContext.handle(WebAppContext.java:450)

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira



-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org

[jira] [Created] (SOLR-3365) Data import using local time to mark last_index_time

2012-04-17 Thread Bartosz Cembor (Created) (JIRA)

Data import using local time to mark last_index_time


 Key: SOLR-3365
 URL: https://issues.apache.org/jira/browse/SOLR-3365
 Project: Solr
  Issue Type: Bug
  Components: contrib - DataImportHandler
 Environment: 1 mysql data source server
2 solr servers 
Reporter: Bartosz Cembor


Class org.apache.solr.handler.dataimport.DataImporter

setIndexStartTime(new Date());

When there is difference in time beetwen servers (mysql and solr) some 
documents are not indexed 

I think DataImporter should take time from mysql database (SELECT NOW()) and 
use it for mark start_index_time



--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira



-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org

[jira] [Commented] (SOLR-3284) StreamingUpdateSolrServer swallows exceptions


[ 
https://issues.apache.org/jira/browse/SOLR-3284?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13255433#comment-13255433
 ] 

Sami Siren commented on SOLR-3284:
--

What are the operations/error situations where you are not seeing an Exception 
when you expect one?

By default the ConcurrentUpdateSolrServer (StreamingUpdateSolrServer) just logs 
the exceptions from updates but you can override this functionality:

{code}
SolrServer server = new 
ConcurrentUpdateSolrServer(http://127.0.0.1:8983/solr;, 1, 1){
  public void handleError(Throwable ex) {
//do something with the Throwable here
System.out.println(Something wrong! + ex.getMessage());
  }
};

server.add(new SolrInputDocument());

{code}

The current exception reporting is pretty limited and it is impossible to see 
which operation triggered the exception but such improvements should be done in 
separate issues.

 StreamingUpdateSolrServer swallows exceptions
 -

 Key: SOLR-3284
 URL: https://issues.apache.org/jira/browse/SOLR-3284
 Project: Solr
  Issue Type: Improvement
  Components: clients - java
Affects Versions: 3.5, 4.0
Reporter: Shawn Heisey

 StreamingUpdateSolrServer eats exceptions thrown by lower level code, such as 
 HttpClient, when doing adds.  It may happen with other methods, though I know 
 that query and deleteByQuery will throw exceptions.  I believe that this is a 
 result of the queue/Runner design.  That's what makes SUSS perform better, 
 but it means you sacrifice the ability to programmatically determine that 
 there was a problem with your update.  All errors are logged via slf4j, but 
 that's not terribly helpful except with determining what went wrong after the 
 fact.
 When using CommonsHttpSolrServer, I've been able to rely on getting an 
 exception thrown by pretty much any error, letting me use try/catch to detect 
 problems.
 There's probably enough dependent code out there that it would not be a good 
 idea to change the design of SUSS, unless there were alternate constructors 
 or additional methods available to configure new/old behavior.  Fixing this 
 is probably not trivial, so it's probably a better idea to come up with a new 
 server object based on CHSS.  This is outside my current skillset.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira



-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org

[jira] [Updated] (SOLR-2483) DIH - an uppercase problem in query parameters


 [ 
https://issues.apache.org/jira/browse/SOLR-2483?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sami Siren updated SOLR-2483:
-

Component/s: (was: clients - java)

 DIH - an uppercase problem in query parameters
 --

 Key: SOLR-2483
 URL: https://issues.apache.org/jira/browse/SOLR-2483
 Project: Solr
  Issue Type: Bug
  Components: contrib - DataImportHandler
Affects Versions: 3.1
 Environment: Windows Vista
 Java 1.6
Reporter: Lubo Torok
  Labels: DataImportHandler, entity, newdev, parameter, sql

 I have two tables called PROBLEM and KOMENTAR(means 'comment' in English) 
 in DB. One problem can have more comments. I want to index them all.
 schema.xml looks as follows
 ... some fields ...
  field name=problem_id type=string stored=true required=true/
 ... some fields...
 data-config.xml:
 document name=problemy
 entity name=problem query=select to_char(id) as problem_id, nazov as 
 problem_nazov, cislo as problem_cislo, popis as problem_popis from problem 
 pk=problem_id
   entity name=komentar query=select id as komentar_id, nazov as 
 komentar_nazov, text as komentar_text from komentar where 
 to_char(fk_problem)='${problem.PROBLEM_ID}'/   
 /entity  
   /document
 If you write '${problem.PROBLEM_ID}' in lower case, i.e. 
 '${problem.problem_id}' SOLR will not import the inner entity. Seems strange 
 to me and it took me some time to figure this out.
 Note that primary key in PROBLEM is called ID. I defined the alias 
 problem_id (yes,lower case) in SQL. In schema, there is this field defined 
 as problem_id again in lower case. But, when I run
 http://localhost:8983/solr/dataimport?command=full-importdebug=trueverbose=on
 so I can see some debug information there is this part
 ...
 lst name=verbose-output
 −
 lst name=entity:problem
 −
 lst name=document#1
 −
 str name=query
 select to_char(id) as problem_id, nazov as problem_nazov, cislo as 
 problem_cislo, popis as problem_popis from problem
 /str
 str name=time-taken0:0:0.465/str
 str--- row #1-/str
 str name=PROBLEM_NAZOVtest zodpovedneho/str
 str name=PROBLEM_ID2533274790395945/str
 str name=PROBLEM_CISLO201009304/str
 str name=PROBLEM_POPIScsfdewafedewfw/str
 str-/str
 −
 lst name=entity:komentar
 −
 str name=query
 select id as komentar_id, nazov as komentar_nazov, text as komentar_text from 
 komentar where to_char(fk_problem)='2533274790395945'
 /str
 ...
 where you can see that, internally, the fields of PROBLEM are represented 
 in uppercase despite the user (me) had not defined them this way. The result 
 is I guess that parameter referring to the parent entity ${entity.field} 
 should always be in uppercase, i.e. ${entity.FIELD}.
 Here is an example of the indexed entity as written after full-import command 
 with debug and verbose on:
 arr name=documents
 −
 lst
 −
 arr name=problem_nazov
 strtest zodpovedneho/str
 /arr
 −
 arr name=problem_id
 str2533274790395945/str
 /arr
 −
 arr name=problem_cislo
 str201009304/str
 /arr
 −
 arr name=problem_popis
 strcsfdewafedewfw/str
 /arr
 −
 arr name=komentar_id
 strjava.math.BigDecimal:5066549580791985/str
 /arr
 −
 arr name=komentar_text
 stra.TXT/str
 /arr
 /lst
 here are the field names in lower case. I consider this as a bug. Maybe I am 
 wrong and its a feature. I work with SOLR only for few days.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira



-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org

[jira] [Updated] (SOLR-1888) Annotated beans source generation with maven plugin


 [ 
https://issues.apache.org/jira/browse/SOLR-1888?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sami Siren updated SOLR-1888:
-

Component/s: (was: clients - java)

 Annotated beans source generation with maven plugin
 ---

 Key: SOLR-1888
 URL: https://issues.apache.org/jira/browse/SOLR-1888
 Project: Solr
  Issue Type: New Feature
Affects Versions: 1.5
 Environment: java, maven
Reporter: Matthias Epheser
 Fix For: 4.0

 Attachments: maven-solr-plugin.zip


 As I stumbled over a lot of copy pasting while creating java annotated beans 
 representing a schema.xml, i decided to make a shortcut and create a maven 
 plugin.
 Think about it as source generation similar to castor/jaxb code generation 
 from an xsd. You just point to a schema.xml and connect to the 
 generate-sources phase. This leads to a java bean in 
 target/generated-sources/solr that contains all fields from the schema well 
 annotated. 
 The mapping reads the fields section and maps field=string to 
 solr.StringField to java.lang String etc. Multivalured fields generate lists, 
 dynamic fields Maps. Currently the code generation is plain simple, just a 
 fileWriter with some intends. The getValidJavaName(String name) may act more 
 professional than now.
 Just install the plugin contained in the zip using mvn install and connect it 
 to an existing solrj project:
 {{
 plugin
groupIdorg.apache.solr/groupId
artifactIdmaven-solr-plugin/artifactId
configuration
  schemaFiletest-data/solr/conf/schema.xml/schemaFile
  qualifiedNameorg.test.MyBean/qualifiedName
/configuration
executions
   execution
 phasegenerate-sources/phase
 goals
   goalgenerate/goal
 /goals
   /execution
 /executions
  /plugin
 }}
 The generated fiels will be automatically added to the classpath after the 
 first run of mvn generate/compile. So just execute mvn eclipse:eclipse once 
 after that. After every change in the schema, generate again and your bean 
 will be updated and fields and getters and setters will be present.
  

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira



-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org

[jira] [Updated] (SOLR-1212) TestNG Test Case


 [ 
https://issues.apache.org/jira/browse/SOLR-1212?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sami Siren updated SOLR-1212:
-

Component/s: (was: clients - java)

 TestNG Test Case 
 -

 Key: SOLR-1212
 URL: https://issues.apache.org/jira/browse/SOLR-1212
 Project: Solr
  Issue Type: New Feature
Affects Versions: 1.4
 Environment: Java 6
Reporter: Karthik K
 Fix For: 4.0

 Attachments: SOLR-1212.patch, testng-5.9-jdk15.jar

   Original Estimate: 1h
  Remaining Estimate: 1h

 TestNG equivalent of AbstractSolrTestCase , without using JUnit altogether . 
 New Class created: AbstractSolrNGTest 
 LICENSE.txt , NOTICE.txt modified as appropriate. ( TestNG under Apache 
 License 2.0 ) 
 TestNG 5.9-jdk15 added to lib. 
 Justification:  In some workplaces - people are moving towards TestNG and 
 take out JUnit altogether from the classpath. Hence useful in those cases.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira



-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org

Wildcard queries are not analyzed

2012-04-17 Thread Björn Kremer


Hello,


maybe I have found a little lucene problem: Wildcard queries are not 
analyzed correctly. I'm using the german analyzer with the 
'GermanDIN2Stemmer'.


In the lucene-index my name('Björn') is stored as 'bjorn'. If I performe 
a wildcard query like 'björ*' the function 'GetPrefixQuery' does not 
analyze the search term. So the query result is 'björ*' instead of 
'bjor*'. (björ* = no match, bjor* = match)



Thank You
Björn

[jira] [Updated] (SOLR-1196) Incorrect matches when using non alphanumeric search string !@#$%\^\\*


 [ 
https://issues.apache.org/jira/browse/SOLR-1196?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sami Siren updated SOLR-1196:
-

Component/s: (was: clients - java)

 Incorrect matches when using non alphanumeric search string !@#$%\^\\*\(\)
 ---

 Key: SOLR-1196
 URL: https://issues.apache.org/jira/browse/SOLR-1196
 Project: Solr
  Issue Type: Bug
Affects Versions: 1.3
 Environment: Solr 1.3/ Java 1.6/ Win XP/Eclipse 3.3
Reporter: Sam Michael

 When matching strings that do not include alphanumeric chars, all the data is 
 returned as matches. (There is actually no match, so nothing should be 
 returned.)
 When I run a query like  - (activity_type:NAME) AND title:(\!@#$%\^\*\(\)) 
 all the documents are returned even though there is not a single match. There 
 is no title that matches the string (which has been escaped).
 My document structure is as follows
 doc
 str name=activity_typeNAME/str
 str name=titleBathing/str
 
 /doc 
 The title field is of type text_title which is described below. 
 fieldType name=text_title class=solr.TextField 
 positionIncrementGap=100
   analyzer type=index
 tokenizer class=solr.WhitespaceTokenizerFactory/
 filter class=solr.WordDelimiterFilterFactory generateWordParts=1 
 generateNumberParts=1 catenateWords=1 catenateNumbers=1 catenateAll=1 
 splitOnCaseChange=1/
 filter class=solr.LowerCaseFilterFactory/
 filter class=solr.RemoveDuplicatesTokenFilterFactory/
   /analyzer
   analyzer type=query
 tokenizer class=solr.WhitespaceTokenizerFactory/
 filter class=solr.SynonymFilterFactory synonyms=synonyms.txt 
 ignoreCase=true expand=true/
 filter class=solr.WordDelimiterFilterFactory generateWordParts=1 
 generateNumberParts=1 catenateWords=1 catenateNumbers=1 catenateAll=1 
 splitOnCaseChange=1/
 filter class=solr.LowerCaseFilterFactory/
 filter class=solr.RemoveDuplicatesTokenFilterFactory/
   /analyzer
 /fieldType 
 -
 Yonik's analysis as follows.
 str name=rawquerystring-features:foo features:(\!@#$%\^\*\(\))/str
 str name=querystring-features:foo features:(\!@#$%\^\*\(\))/str
 str name=parsedquery-features:foo/str
 str name=parsedquery_toString-features:foo/str
 The text analysis is throwing away non alphanumeric chars (probably
 the WordDelimiterFilter).  The Lucene (and Solr) query parser throws
 away term queries when the token is zero length (after analysis).
 Solr then interprets the left over -features:foo as all documents
 not containing foo in the features field, so you get a bunch of
 matches. 
 As per his suggestion, a bug is filed.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira



-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org

why the of advance(int target) function of DocIdSetIterator is defined with uncertain?

2012-04-17 Thread Li Li

hi all,
I am now hacking the BooleanScorer2 to let it keep the docID() of the
leaf scorer(mostly possible TermScorer) the same as the top-level Scorer.
Why I want to do this is: When I Collect a doc, I want to know which term
is matched(especially for BooleanClause whose Occur is SHOULD). we have
discussed some solutions, such as adding bit masks in disjunction scorers.
with this method, when we finds a matched doc, we can recursively find
which leaf scorer is matched. But we think it's not very efficient and not
convenient to use(this is my proposal but not agreed by others in our
team). and then we came up with another one: Modifying DisjunctionSumScorer.
   we analysed the codes and found that the only Scorers used by
BooleanScorer2 that will make the children scorers' docID() not equal to
parent is an anonymous class inherited from DisjunctionSumScorer. All other
ones including SingleMatchScorer, countingConjunctionSumScorer(anonymous),
dualConjuctionSumScorer, ReqOptSumScorer and ReqExclScorer are fit our need.
   The implementation algorithm of DisjunctionSumScorer use a heap to find
the smallest doc. after finding a matched doc, the currentDoc is the
matched doc and all the scorers in the heap will call nextDoc() so all of
the scorers' current docID the nextDoc of currentDoc. if there are N level
DisjunctionSumScorer, the leaf scorer's current doc is the n-th next docId
of the root of the scorer tree.
   So we modify the DisjuctionSumScorer and let it behavior as we expected.
And then I wrote some TestCase and it works well. And also I wrote some
random generated TermScorer and compared the nextDoc(),score() and
advance(int) method of original DisjunctionSumScorer and modified one.
nextDoc() and score() and exactly the same. But for advance(int target), we
found some interesting and strange things.
   at the beginning, I think if target is less than current docID, it will
just return current docID and do nothing. this assumption let my algorithm
go wrong. Then I read the codes of TermScorer and found each call of
advance(int) of TermScorer will call nextDoc() no matter whether current
docID is larger than target or not.
   So I am confused and then read the javadoc of DocIdSetIterator:
- javadoc of DocIdSetIterator.advance(int
target)-

int org.apache.lucene.search.DocIdSetIterator.advance(int target) throws
IOException

Advances to the first beyond (see NOTE below) the current whose document
number is greater than or equal
 to target. Returns the current document number or NO_MORE_DOCS if there
are no more docs in the set.
Behaves as if written:
 int advance(int target) {
   int doc;
   while ((doc = nextDoc())  target) {
   }
   return doc;
 }
 Some implementations are considerably more efficient than that.
NOTE: when target  current implementations may opt not to advance beyond
their current docID().
NOTE: this method may be called with NO_MORE_DOCS for efficiency by some
Scorers. If your
 implementation cannot efficiently determine that it should exhaust, it is
recommended that you check for
 that value in each call to this method.
NOTE: after the iterator has exhausted you should not call this method, as
it may result in unpredicted
 behavior.
--
Then I modified my algorithm again and found that
DisjunctionSumScorer.advance(int target) has some strange behavior. most of
the cases, it will return currentDoc if target  currentDoc. but in some
boundary condition, it will not.
it's not a bug but let me sad. I thought my algorithm has some bug because
it's advance method is not exactly the same as original
DisjunctionSumScorer's.
codes of DisjunctionSumScorer---
  @Override
  public int advance(int target) throws IOException {
if (scorerDocQueue.size()  minimumNrMatchers) {
  return currentDoc = NO_MORE_DOCS;
}
if (target = currentDoc) {
  return currentDoc;
}
   
---
for most case if (target = currentDoc) it will return currentDoc;
but if previous advance will make sub scorers exhausted, then if may return
NO_MORE_DOCS
an example is:
   currentDoc=-1
   minimumNrMatchers=1
   subScorers:
  TermScorer: docIds: [1, 2, 6]
  TermScorer: docIds: [2, 4]
after first call advance(5)
currentDoc=6
only first scorer is now in the heap, scorerDocQueue.size()==1
then call advance(6)
because scorerDocQueue.size()  minimumNrMatchers, it just return
NO_MORE_DOCS

My question is why the advance(int target) method is defined like this? for
the reason of efficient or any other reasons?

Re: why the of advance(int target) function of DocIdSetIterator is defined with uncertain?

2012-04-17 Thread Li Li

some mistakes of the example:
after first call advance(5)
currentDoc=6
first scorer's nextDoc is called to in advance, the heap is empty now.
then call advance(6)
because scorerDocQueue.size()  minimumNrMatchers, it just return
NO_MORE_DOCS

On Tue, Apr 17, 2012 at 6:37 PM, Li Li fancye...@gmail.com wrote:

 hi all,
 I am now hacking the BooleanScorer2 to let it keep the docID() of the
 leaf scorer(mostly possible TermScorer) the same as the top-level Scorer.
 Why I want to do this is: When I Collect a doc, I want to know which term
 is matched(especially for BooleanClause whose Occur is SHOULD). we have
 discussed some solutions, such as adding bit masks in disjunction scorers.
 with this method, when we finds a matched doc, we can recursively find
 which leaf scorer is matched. But we think it's not very efficient and not
 convenient to use(this is my proposal but not agreed by others in our
 team). and then we came up with another one: Modifying DisjunctionSumScorer.
we analysed the codes and found that the only Scorers used by
 BooleanScorer2 that will make the children scorers' docID() not equal to
 parent is an anonymous class inherited from DisjunctionSumScorer. All other
 ones including SingleMatchScorer, countingConjunctionSumScorer(anonymous),
 dualConjuctionSumScorer, ReqOptSumScorer and ReqExclScorer are fit our need.
The implementation algorithm of DisjunctionSumScorer use a heap to find
 the smallest doc. after finding a matched doc, the currentDoc is the
 matched doc and all the scorers in the heap will call nextDoc() so all of
 the scorers' current docID the nextDoc of currentDoc. if there are N level
 DisjunctionSumScorer, the leaf scorer's current doc is the n-th next docId
 of the root of the scorer tree.
So we modify the DisjuctionSumScorer and let it behavior as we
 expected. And then I wrote some TestCase and it works well. And also I
 wrote some random generated TermScorer and compared the nextDoc(),score()
 and advance(int) method of original DisjunctionSumScorer and modified one.
 nextDoc() and score() and exactly the same. But for advance(int target), we
 found some interesting and strange things.
at the beginning, I think if target is less than current docID, it will
 just return current docID and do nothing. this assumption let my algorithm
 go wrong. Then I read the codes of TermScorer and found each call of
 advance(int) of TermScorer will call nextDoc() no matter whether current
 docID is larger than target or not.
So I am confused and then read the javadoc of DocIdSetIterator:
 - javadoc of DocIdSetIterator.advance(int
 target)-

 int org.apache.lucene.search.DocIdSetIterator.advance(int target) throws
 IOException

 Advances to the first beyond (see NOTE below) the current whose document
 number is greater than or equal
  to target. Returns the current document number or NO_MORE_DOCS if there
 are no more docs in the set.
 Behaves as if written:
  int advance(int target) {
int doc;
while ((doc = nextDoc())  target) {
}
return doc;
  }
  Some implementations are considerably more efficient than that.
 NOTE: when target  current implementations may opt not to advance beyond
 their current docID().
 NOTE: this method may be called with NO_MORE_DOCS for efficiency by some
 Scorers. If your
  implementation cannot efficiently determine that it should exhaust, it is
 recommended that you check for
  that value in each call to this method.
 NOTE: after the iterator has exhausted you should not call this method, as
 it may result in unpredicted
  behavior.
 --
 Then I modified my algorithm again and found that
 DisjunctionSumScorer.advance(int target) has some strange behavior. most of
 the cases, it will return currentDoc if target  currentDoc. but in some
 boundary condition, it will not.
 it's not a bug but let me sad. I thought my algorithm has some bug because
 it's advance method is not exactly the same as original
 DisjunctionSumScorer's.
 codes of DisjunctionSumScorer---
   @Override
   public int advance(int target) throws IOException {
 if (scorerDocQueue.size()  minimumNrMatchers) {
   return currentDoc = NO_MORE_DOCS;
 }
 if (target = currentDoc) {
   return currentDoc;
 }

 ---
 for most case if (target = currentDoc) it will return currentDoc;
 but if previous advance will make sub scorers exhausted, then if may
 return NO_MORE_DOCS
 an example is:
currentDoc=-1
minimumNrMatchers=1
subScorers:
   TermScorer: docIds: [1, 2, 6]
   TermScorer: docIds: [2, 4]
 after first call advance(5)
 currentDoc=6
 only first scorer is now in the heap, scorerDocQueue.size()==1
 then call advance(6)
 because scorerDocQueue.size()  minimumNrMatchers, it just return
 NO_MORE_DOCS

 My question is why the advance(int target) method is defined like this?
 for the reason of efficient or any

[jira] [Updated] (SOLR-1976) Commit on StreamingUpdateSolrServer can happen before all previously added docs have been sent to Solr


 [ 
https://issues.apache.org/jira/browse/SOLR-1976?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sami Siren updated SOLR-1976:
-

Priority: Minor  (was: Major)

I tried to reproduce this with a testcase but so far after doing few thousand 
iterations of clean, add, commit, check result-size: no success

Stephen: can you provide a testcase that show this error?

 Commit on StreamingUpdateSolrServer can happen before all previously added 
 docs have been sent to Solr
 --

 Key: SOLR-1976
 URL: https://issues.apache.org/jira/browse/SOLR-1976
 Project: Solr
  Issue Type: Bug
  Components: clients - java
Affects Versions: 1.4.1
Reporter: Stephen Duncan Jr
Priority: Minor

 Because of it's multi-threaded nature, calling commit on 
 StreamingUpdateSolrServer  can send the commit before all the added documents 
 have been sent to Solr.  Calling blockUntilFinished() does not change this.  
 It needs to be possible to send a commit that will commit all the documents 
 that have been added previously.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira



-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org

[jira] [Resolved] (SOLR-2024) StreamingUpdateSolrServer encounters ConcurrentModificationException


 [ 
https://issues.apache.org/jira/browse/SOLR-2024?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sami Siren resolved SOLR-2024.
--

Resolution: Won't Fix

SOLR-1711 was fixed in 3.1 and 4.0. 

I think that there will be no more releases from 1.4.x branch so I am closing 
this as Won't Fix.

 StreamingUpdateSolrServer encounters ConcurrentModificationException
 

 Key: SOLR-2024
 URL: https://issues.apache.org/jira/browse/SOLR-2024
 Project: Solr
  Issue Type: Bug
  Components: clients - java
Affects Versions: 1.4.1
Reporter: Reuben Firmin

 We are intermittently encountering this bug when using the 
 StreamingUpdateSolrServer. It appears to be caused by many near-simultaneous 
 requests to the add method. We were initially using 1.4 dev, but have since 
 updated to 1.4.1, and are still encountering the issue.
 java.util.ConcurrentModificationException
 at 
 java.util.AbstractList$Itr.checkForComodification(AbstractList.java:372)
 at java.util.AbstractList$Itr.next(AbstractList.java:343)
 at 
 org.hibernate.collection.AbstractPersistentCollection$IteratorProxy.next(AbstractPersistentCollection.java:577)
 at 
 org.apache.solr.client.solrj.util.ClientUtils.writeXML(ClientUtils.java:105)
 at 
 org.apache.solr.client.solrj.request.UpdateRequest.writeXML(UpdateRequest.java:213)
 at 
 org.apache.solr.client.solrj.impl.StreamingUpdateSolrServer$Runner$1.writeRequest(StreamingUpdateSolrServer.java:100)
 at 
 org.apache.commons.httpclient.methods.EntityEnclosingMethod.writeRequestBody(EntityEnclosingMethod.java:499)
 at 
 org.apache.commons.httpclient.HttpMethodBase.writeRequest(HttpMethodBase.java:2114)
 at 
 org.apache.commons.httpclient.HttpMethodBase.execute(HttpMethodBase.java:1096)
 at 
 org.apache.commons.httpclient.HttpMethodDirector.executeWithRetry(HttpMethodDirector.java:398)
 at 
 org.apache.commons.httpclient.HttpMethodDirector.executeMethod(HttpMethodDirector.java:171)
 at 
 org.apache.commons.httpclient.HttpClient.executeMethod(HttpClient.java:397)
 at 
 org.apache.commons.httpclient.HttpClient.executeMethod(HttpClient.java:323)
 at 
 org.apache.solr.client.solrj.impl.StreamingUpdateSolrServer$Runner.run(StreamingUpdateSolrServer.java:137)
 at 
 java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
 at 
 java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
 at java.lang.Thread.run(Thread.java:619)

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira



-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org

[jira] [Commented] (SOLR-1976) Commit on StreamingUpdateSolrServer can happen before all previously added docs have been sent to Solr

2012-04-17 Thread Stephen Duncan Jr (Commented) (JIRA)


[ 
https://issues.apache.org/jira/browse/SOLR-1976?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13255472#comment-13255472
 ] 

Stephen Duncan Jr commented on SOLR-1976:
-

I have not been able to reproduce this with newer versions of Solr.

 Commit on StreamingUpdateSolrServer can happen before all previously added 
 docs have been sent to Solr
 --

 Key: SOLR-1976
 URL: https://issues.apache.org/jira/browse/SOLR-1976
 Project: Solr
  Issue Type: Bug
  Components: clients - java
Affects Versions: 1.4.1
Reporter: Stephen Duncan Jr
Priority: Minor

 Because of it's multi-threaded nature, calling commit on 
 StreamingUpdateSolrServer  can send the commit before all the added documents 
 have been sent to Solr.  Calling blockUntilFinished() does not change this.  
 It needs to be possible to send a commit that will commit all the documents 
 that have been added previously.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira



-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org

[jira] [Resolved] (SOLR-1976) Commit on StreamingUpdateSolrServer can happen before all previously added docs have been sent to Solr


 [ 
https://issues.apache.org/jira/browse/SOLR-1976?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sami Siren resolved SOLR-1976.
--

Resolution: Cannot Reproduce

Thanks for getting back. I'll resolve this issue then.

 Commit on StreamingUpdateSolrServer can happen before all previously added 
 docs have been sent to Solr
 --

 Key: SOLR-1976
 URL: https://issues.apache.org/jira/browse/SOLR-1976
 Project: Solr
  Issue Type: Bug
  Components: clients - java
Affects Versions: 1.4.1
Reporter: Stephen Duncan Jr
Priority: Minor

 Because of it's multi-threaded nature, calling commit on 
 StreamingUpdateSolrServer  can send the commit before all the added documents 
 have been sent to Solr.  Calling blockUntilFinished() does not change this.  
 It needs to be possible to send a commit that will commit all the documents 
 that have been added previously.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira



-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org

[jira] [Commented] (SOLR-2755) StreamingUpdateSolrServer is hard-coded to write XML data. It should integrate the RequestWriter API so that it can be used to send binary update payloads.


[ 
https://issues.apache.org/jira/browse/SOLR-2755?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13255486#comment-13255486
 ] 

Sami Siren commented on SOLR-2755:
--

SOLR-1565 added support for javabin. 

Patrick: Is there something that this solution adds that was not part of 
SOLR-1565?

 StreamingUpdateSolrServer is hard-coded to write XML data. It should 
 integrate the RequestWriter API so that it can be used to send binary update 
 payloads.
 ---

 Key: SOLR-2755
 URL: https://issues.apache.org/jira/browse/SOLR-2755
 Project: Solr
  Issue Type: Bug
  Components: clients - java
Affects Versions: 3.3
Reporter: Patrick Sauts
  Labels: patch
 Fix For: 4.0

 Attachments: patch-StreamingUpdateSolrServer.txt


 The aim of this patch is to use the RequestWriter API with 
 StreamingUpdateSolrServer.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira



-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org

[jira] [Commented] (LUCENE-3965) consolidate all api modules in one place and un!@$# packaging for 4.0


[ 
https://issues.apache.org/jira/browse/LUCENE-3965?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13255489#comment-13255489
 ] 

Robert Muir commented on LUCENE-3965:
-

I plan to commit this later today.

 consolidate all api modules in one place and un!@$# packaging for 4.0
 -

 Key: LUCENE-3965
 URL: https://issues.apache.org/jira/browse/LUCENE-3965
 Project: Lucene - Java
  Issue Type: Task
  Components: general/build
Affects Versions: 4.0
Reporter: Robert Muir
 Attachments: LUCENE-3965.patch, LUCENE-3965.patch, LUCENE-3965.patch, 
 LUCENE-3965.patch, LUCENE-3965_module_build.patch, 
 LUCENE-3965_module_build_pname.patch


 I think users get confused about how svn/source is structured,
 when in fact we are just producing a modular build.
 I think it would be more clear if the lucene stuff was underneath
 modules/, thats where our modular API is.
 we could still package this up as lucene.tar.gz if we want, and even name
 modules/core lucene-core.jar, but i think this would be a lot better
 organized than the current:
 * lucene
 * lucene/contrib
 * modules
 confusion.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira



-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org

[jira] [Updated] (SOLR-2300) snapinstaller on slave is failing

[
https://issues.apache.org/jira/browse/SOLR-2300?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]

Sami Siren updated SOLR-2300:
-

Component/s: (was: clients - java)
replication (scripts)

This setup is giving issue only in linux. Is this known bug on linux?

I think the deletion problem is a combination of using nfs and solr (or some
other process) holding some file open.

I realize this issue is pretty old but have you tried with more recent version
of Solr, does the problem still exist?

snapinstaller on slave is failing
-

Key: SOLR-2300
URL: https://issues.apache.org/jira/browse/SOLR-2300
Project: Solr
Issue Type: Bug
Components: replication (scripts)
Affects Versions: 1.3
Environment: Linux, Jboss 5.0GA, solr 1.3.0
Reporter: sakunthala padmanabhuni

Hi,
We are using Solr on Mac OSX and it is working fine. Same setup we have
moved to Linux. We have master, slave setup. Every 5 minutes, index will be
replicated from Master to Slave and will be installed on slave. But on Linux
on the slave when the snapinstaller script is called, it is failing and
showing below error in logs.
/bin/rm: cannot remove
`/ngs/app/esearcht/Slave2index/data/index/.nfs000111030749':
Device or resource busy
This error is occuring in snapinstaller script at below lines.
cp -lr ${name}/ ${data_dir}/index.tmp$$ \
/bin/rm -rf ${data_dir}/index \
mv -f ${data_dir}/index.tmp$$ ${data_dir}/index
It is not able to remove the index folder. So the index.tmp files are keep
growing in the data directory.
Our data directory is /ngs/app/esearcht/Slave2index/data. When checked
with ls -al in the index directory, there are some .nfs files still there,
which are not letting index directory to be deleted. And these .nfs files
are still being used by SOLR in jboss.
This setup is giving issue only in linux. Is this known bug on linux?

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org

[jira] [Commented] (SOLR-3174) Visualize Cluster State

2012-04-17 Thread Erick Erickson (Commented) (JIRA)


[ 
https://issues.apache.org/jira/browse/SOLR-3174?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13255492#comment-13255492
 ] 

Erick Erickson commented on SOLR-3174:
--

Cool, applied cleanly.

Thanks

 Visualize Cluster State
 ---

 Key: SOLR-3174
 URL: https://issues.apache.org/jira/browse/SOLR-3174
 Project: Solr
  Issue Type: New Feature
  Components: web gui
Reporter: Ryan McKinley
Assignee: Stefan Matheis (steffkes)
 Attachments: SOLR-3174-graph.png, SOLR-3174-graph.png, 
 SOLR-3174-rgraph.png, SOLR-3174-rgraph.png, SOLR-3174.patch, SOLR-3174.patch, 
 SOLR-3174.patch, SOLR-3174.patch


 It would be great to visualize the cluster state in the new UI. 
 See Mark's wish:
 https://issues.apache.org/jira/browse/SOLR-3162?focusedCommentId=13218272page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-13218272

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira



-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org

[jira] [Resolved] (SOLR-3341) CloudSolrServer javadoc needs help


 [ 
https://issues.apache.org/jira/browse/SOLR-3341?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sami Siren resolved SOLR-3341.
--

   Resolution: Fixed
Fix Version/s: 4.0
 Assignee: Sami Siren

Doc added, thanks Benson.

 CloudSolrServer javadoc needs help 
 ---

 Key: SOLR-3341
 URL: https://issues.apache.org/jira/browse/SOLR-3341
 Project: Solr
  Issue Type: Bug
  Components: clients - java
Affects Versions: 4.0
Reporter: Benson Margulies
Assignee: Sami Siren
 Fix For: 4.0

 Attachments: 0001-Add-more-javadoc-to-CloudSolrServer.patch


 The zkHost parameter needs some explanation. It's actually a host:port.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira



-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org

[jira] [Resolved] (SOLR-2873) StreamingUpdateSolrServer does not provide public access to shutdown ExecutorService scheduler


 [ 
https://issues.apache.org/jira/browse/SOLR-2873?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sami Siren resolved SOLR-2873.
--

Resolution: Invalid

Today there is a shutdown method that calls scheduler.shutdown(), perhaps it 
was added in some other issue.

 StreamingUpdateSolrServer does not provide public access to shutdown 
 ExecutorService scheduler
 --

 Key: SOLR-2873
 URL: https://issues.apache.org/jira/browse/SOLR-2873
 Project: Solr
  Issue Type: Bug
  Components: clients - java
Affects Versions: 4.0
 Environment: N/A
Reporter: Michael Gibney
Priority: Minor
 Fix For: 4.0

   Original Estimate: 5m
  Remaining Estimate: 5m

 Applications do not exit until the StreamingUpdateSolrServer ExecutorService 
 threads have died.  Currently, with no way to manually shut down the 
 ExecutorService, an application that has completed execution will hang for 
 60s waiting for the keepAlive time on the pooled runner threads to expire.  
 This could be addressed by adding a single method to 
 StreamingUpdateSolrServer:
 {code:borderStyle=solid}
 public void shutdown() {
 scheduler.shutdown();
 }
 {code} 

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira



-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org

[jira] [Resolved] (SOLR-1812) StreamingUpdateSolrServer creates an OutputStreamWriter that it never closes


 [ 
https://issues.apache.org/jira/browse/SOLR-1812?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sami Siren resolved SOLR-1812.
--

   Resolution: Invalid
Fix Version/s: (was: 4.0)

Since this issue was created the internals have changed so that 
OutputStreamWriter is no longer used. I am reolving this.

 StreamingUpdateSolrServer creates an OutputStreamWriter that it never closes
 

 Key: SOLR-1812
 URL: https://issues.apache.org/jira/browse/SOLR-1812
 Project: Solr
  Issue Type: Bug
  Components: clients - java
Affects Versions: 1.4
Reporter: Mark Miller
Priority: Minor



--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira



-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org

Re: why the of advance(int target) function of DocIdSetIterator is defined with uncertain?

2012-04-17 Thread Mikhail Khludnev

Hello,

I can't help with the particular question, but can share some experience.
My task is roughly the same I've found the patch
https://issues.apache.org/jira/browse/LUCENE-2686 is absolutely useful
(with one small addition, I'll post it in comments soon). By using it I
have disjunction summing query with steady subscorers.

Regards

On Tue, Apr 17, 2012 at 2:37 PM, Li Li fancye...@gmail.com wrote:

 hi all,
 I am now hacking the BooleanScorer2 to let it keep the docID() of the
 leaf scorer(mostly possible TermScorer) the same as the top-level Scorer.
 Why I want to do this is: When I Collect a doc, I want to know which term
 is matched(especially for BooleanClause whose Occur is SHOULD). we have
 discussed some solutions, such as adding bit masks in disjunction scorers.
 with this method, when we finds a matched doc, we can recursively find
 which leaf scorer is matched. But we think it's not very efficient and not
 convenient to use(this is my proposal but not agreed by others in our
 team). and then we came up with another one: Modifying DisjunctionSumScorer.
we analysed the codes and found that the only Scorers used by
 BooleanScorer2 that will make the children scorers' docID() not equal to
 parent is an anonymous class inherited from DisjunctionSumScorer. All other
 ones including SingleMatchScorer, countingConjunctionSumScorer(anonymous),
 dualConjuctionSumScorer, ReqOptSumScorer and ReqExclScorer are fit our need.
The implementation algorithm of DisjunctionSumScorer use a heap to find
 the smallest doc. after finding a matched doc, the currentDoc is the
 matched doc and all the scorers in the heap will call nextDoc() so all of
 the scorers' current docID the nextDoc of currentDoc. if there are N level
 DisjunctionSumScorer, the leaf scorer's current doc is the n-th next docId
 of the root of the scorer tree.
So we modify the DisjuctionSumScorer and let it behavior as we
 expected. And then I wrote some TestCase and it works well. And also I
 wrote some random generated TermScorer and compared the nextDoc(),score()
 and advance(int) method of original DisjunctionSumScorer and modified one.
 nextDoc() and score() and exactly the same. But for advance(int target), we
 found some interesting and strange things.
at the beginning, I think if target is less than current docID, it will
 just return current docID and do nothing. this assumption let my algorithm
 go wrong. Then I read the codes of TermScorer and found each call of
 advance(int) of TermScorer will call nextDoc() no matter whether current
 docID is larger than target or not.
So I am confused and then read the javadoc of DocIdSetIterator:
 - javadoc of DocIdSetIterator.advance(int
 target)-

 int org.apache.lucene.search.DocIdSetIterator.advance(int target) throws
 IOException

 Advances to the first beyond (see NOTE below) the current whose document
 number is greater than or equal
  to target. Returns the current document number or NO_MORE_DOCS if there
 are no more docs in the set.
 Behaves as if written:
  int advance(int target) {
int doc;
while ((doc = nextDoc())  target) {
}
return doc;
  }
  Some implementations are considerably more efficient than that.
 NOTE: when target  current implementations may opt not to advance beyond
 their current docID().
 NOTE: this method may be called with NO_MORE_DOCS for efficiency by some
 Scorers. If your
  implementation cannot efficiently determine that it should exhaust, it is
 recommended that you check for
  that value in each call to this method.
 NOTE: after the iterator has exhausted you should not call this method, as
 it may result in unpredicted
  behavior.
 --
 Then I modified my algorithm again and found that
 DisjunctionSumScorer.advance(int target) has some strange behavior. most of
 the cases, it will return currentDoc if target  currentDoc. but in some
 boundary condition, it will not.
 it's not a bug but let me sad. I thought my algorithm has some bug because
 it's advance method is not exactly the same as original
 DisjunctionSumScorer's.
 codes of DisjunctionSumScorer---
   @Override
   public int advance(int target) throws IOException {
 if (scorerDocQueue.size()  minimumNrMatchers) {
   return currentDoc = NO_MORE_DOCS;
 }
 if (target = currentDoc) {
   return currentDoc;
 }

 ---
 for most case if (target = currentDoc) it will return currentDoc;
 but if previous advance will make sub scorers exhausted, then if may
 return NO_MORE_DOCS
 an example is:
currentDoc=-1
minimumNrMatchers=1
subScorers:
   TermScorer: docIds: [1, 2, 6]
   TermScorer: docIds: [2, 4]
 after first call advance(5)
 currentDoc=6
 only first scorer is now in the heap, scorerDocQueue.size()==1
 then call advance(6)
 because scorerDocQueue.size()  minimumNrMatchers, it just return
 NO_MORE_DOCS

 My question is why the

[jira] [Commented] (LUCENE-2686) DisjunctionSumScorer should not call .score on sub scorers until consumer calls .score

2012-04-17 Thread Mikhail Khludnev (Commented) (JIRA)


[ 
https://issues.apache.org/jira/browse/LUCENE-2686?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13255507#comment-13255507
 ] 

Mikhail Khludnev commented on LUCENE-2686:
--

Hello,

I used this patch not by applying directly, but introducing ShouldQuery in my 
codebase which extends BQ and provides steadychild scorers for disjunction. 
It works great, but one tests are spinning in infinite loop. My amendment 
breaks possible infinite loop in constant query scorer:
{code}

ConstantScoreQuery.ConstantScorer.score()

@Override
protected boolean score(Collector collector, int max, int firstDocID) 
throws IOException {
if (docIdSetIterator instanceof Scorer) {
final boolean score = ((Scorer) 
docIdSetIterator).score(wrapCollector(collector), max, firstDocID);

 // let's break the loop
 final boolean result = score  ( docIdSetIterator.docID() != 
NO_MORE_DOCS);
return result;
} else {
return super.score(collector, max, firstDocID);
}
}
{code}


 DisjunctionSumScorer should not call .score on sub scorers until consumer 
 calls .score
 --

 Key: LUCENE-2686
 URL: https://issues.apache.org/jira/browse/LUCENE-2686
 Project: Lucene - Java
  Issue Type: Bug
  Components: core/search
Reporter: Michael McCandless
Assignee: Michael McCandless
 Fix For: 4.0

 Attachments: LUCENE-2686.patch, LUCENE-2686.patch, 
 Test2LUCENE2590.java


 Spinoff from java-user thread question about Scorer.freq() from Koji...
 BooleanScorer2 uses DisjunctionSumScorer to score only-SHOULD-clause boolean 
 queries.
 But, this scorer does too much work for collectors that never call .score, 
 because it scores while it's matching.  It should only call .score on the 
 subs when the caller calls its .score.
 This also has the side effect of messing up advanced collectors that gather 
 the freq() of the subs (using LUCENE-2590).

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira



-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org

[jira] [Commented] (SOLR-3219) StreamingUpdateSolrServer is not quiet at INFO, but CommonsHttpSolrServer is


[ 
https://issues.apache.org/jira/browse/SOLR-3219?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13255509#comment-13255509
 ] 

Sami Siren commented on SOLR-3219:
--

bq. Can you tell me how to turn off SUSS logging?

Questions are best asked in the mailing lists. When using the JUL binding one 
can configure the required level for messages for a class like this:

classname=LEVEL

so for example:

org.apache.solr.client.solrj.impl.StreamingUpdateSolrServer$Runner=SEVERE


 StreamingUpdateSolrServer is not quiet at INFO, but CommonsHttpSolrServer is
 

 Key: SOLR-3219
 URL: https://issues.apache.org/jira/browse/SOLR-3219
 Project: Solr
  Issue Type: Bug
  Components: clients - java
Affects Versions: 3.5, 4.0
Reporter: Shawn Heisey
Priority: Minor

 When using CommonsHttpSolrServer, nothing gets logged by SolrJ at the INFO 
 level.  When using StreamingUpdateSolrServer, I have seen two messages logged 
 each time it is used:
 Mar 08, 2012 4:41:01 PM 
 org.apache.solr.client.solrj.impl.StreamingUpdateSolrServer$Runner run
 INFO: starting runner: 
 org.apache.solr.client.solrj.impl.StreamingUpdateSolrServer$Runner@6bf28508
 Mar 08, 2012 4:41:01 PM 
 org.apache.solr.client.solrj.impl.StreamingUpdateSolrServer$Runner run
 INFO: finished: 
 org.apache.solr.client.solrj.impl.StreamingUpdateSolrServer$Runner@6bf28508
 I think one of these behaviors should be considered a bug.  My preference is 
 to move the logging in SUSS out of INFO so it is silent like CHSS.  If the 
 decision is to leave it at INFO, I'll just live with it.  A knob to make it 
 configurable would be cool, but that's probably a fair amount of work.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira



-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org

[jira] [Resolved] (SOLR-3219) StreamingUpdateSolrServer is not quiet at INFO, but CommonsHttpSolrServer is

2012-04-17 Thread Sami Siren (Issue Comment Edited) (JIRA)


 [ 
https://issues.apache.org/jira/browse/SOLR-3219?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sami Siren resolved SOLR-3219.
--

Resolution: Won't Fix

 StreamingUpdateSolrServer is not quiet at INFO, but CommonsHttpSolrServer is
 

 Key: SOLR-3219
 URL: https://issues.apache.org/jira/browse/SOLR-3219
 Project: Solr
  Issue Type: Bug
  Components: clients - java
Affects Versions: 3.5, 4.0
Reporter: Shawn Heisey
Priority: Minor

 When using CommonsHttpSolrServer, nothing gets logged by SolrJ at the INFO 
 level.  When using StreamingUpdateSolrServer, I have seen two messages logged 
 each time it is used:
 Mar 08, 2012 4:41:01 PM 
 org.apache.solr.client.solrj.impl.StreamingUpdateSolrServer$Runner run
 INFO: starting runner: 
 org.apache.solr.client.solrj.impl.StreamingUpdateSolrServer$Runner@6bf28508
 Mar 08, 2012 4:41:01 PM 
 org.apache.solr.client.solrj.impl.StreamingUpdateSolrServer$Runner run
 INFO: finished: 
 org.apache.solr.client.solrj.impl.StreamingUpdateSolrServer$Runner@6bf28508
 I think one of these behaviors should be considered a bug.  My preference is 
 to move the logging in SUSS out of INFO so it is silent like CHSS.  If the 
 decision is to leave it at INFO, I'll just live with it.  A knob to make it 
 configurable would be cool, but that's probably a fair amount of work.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira



-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org

[jira] [Issue Comment Edited] (SOLR-3219) StreamingUpdateSolrServer is not quiet at INFO, but CommonsHttpSolrServer is


[ 
https://issues.apache.org/jira/browse/SOLR-3219?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13255509#comment-13255509
 ] 

Sami Siren edited comment on SOLR-3219 at 4/17/12 12:40 PM:


bq. Can you tell me how to turn off SUSS logging?

Questions are best asked in the mailing lists. When using the JUL binding one 
can configure the required level for messages for a class like this:

classname.level=LEVEL

so for example:

org.apache.solr.client.solrj.impl.StreamingUpdateSolrServer.level=SEVERE


  was (Author: siren):
bq. Can you tell me how to turn off SUSS logging?

Questions are best asked in the mailing lists. When using the JUL binding one 
can configure the required level for messages for a class like this:

classname=LEVEL

so for example:

org.apache.solr.client.solrj.impl.StreamingUpdateSolrServer$Runner=SEVERE

  
 StreamingUpdateSolrServer is not quiet at INFO, but CommonsHttpSolrServer is
 

 Key: SOLR-3219
 URL: https://issues.apache.org/jira/browse/SOLR-3219
 Project: Solr
  Issue Type: Bug
  Components: clients - java
Affects Versions: 3.5, 4.0
Reporter: Shawn Heisey
Priority: Minor

 When using CommonsHttpSolrServer, nothing gets logged by SolrJ at the INFO 
 level.  When using StreamingUpdateSolrServer, I have seen two messages logged 
 each time it is used:
 Mar 08, 2012 4:41:01 PM 
 org.apache.solr.client.solrj.impl.StreamingUpdateSolrServer$Runner run
 INFO: starting runner: 
 org.apache.solr.client.solrj.impl.StreamingUpdateSolrServer$Runner@6bf28508
 Mar 08, 2012 4:41:01 PM 
 org.apache.solr.client.solrj.impl.StreamingUpdateSolrServer$Runner run
 INFO: finished: 
 org.apache.solr.client.solrj.impl.StreamingUpdateSolrServer$Runner@6bf28508
 I think one of these behaviors should be considered a bug.  My preference is 
 to move the logging in SUSS out of INFO so it is silent like CHSS.  If the 
 decision is to leave it at INFO, I'll just live with it.  A knob to make it 
 configurable would be cool, but that's probably a fair amount of work.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira



-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org

[jira] [Commented] (SOLR-2898) Support grouped faceting

2012-04-17 Thread Bjorn Hijmans (Commented) (JIRA)


[ 
https://issues.apache.org/jira/browse/SOLR-2898?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13255514#comment-13255514
 ] 

Bjorn Hijmans commented on SOLR-2898:
-

Hi Martijn, any idea when range and query facets will be added?

 Support grouped faceting
 

 Key: SOLR-2898
 URL: https://issues.apache.org/jira/browse/SOLR-2898
 Project: Solr
  Issue Type: New Feature
Reporter: Martijn van Groningen
 Fix For: 4.0

 Attachments: SOLR-2898.patch, SOLR-2898.patch, SOLR-2898.patch, 
 SOLR-2898.patch


 Support grouped faceting. As described in LUCENE-3097 (matrix counts).

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira



-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org

[jira] [Commented] (LUCENE-3965) consolidate all api modules in one place and un!@$# packaging for 4.0

2012-04-17 Thread Steven Rowe (Commented) (JIRA)


[ 
https://issues.apache.org/jira/browse/LUCENE-3965?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13255520#comment-13255520
 ] 

Steven Rowe commented on LUCENE-3965:
-

+1 to commit - good progress!

Tests work from the top level, and I tried {{ant test}} in a couple of modules' 
directories, which also worked.  Build output all seems to be going to the 
right place (under {{lucene/build/}}).

I scanned the changed build files, and I didn't see any problems.

I searched {{\*build.xml}} for modules/ and contrib.  modules/ seems to 
be gone, but there are several names that still have contrib in them (e.g. 
{{test-contrib}}) in {{lucene/build.xml}} and {{lucene/common-build.xml}}.  
These names can be fixed later.

I didn't look at javadocs or packaging - I assume anything you've done there 
will be better than it was :).

 consolidate all api modules in one place and un!@$# packaging for 4.0
 -

 Key: LUCENE-3965
 URL: https://issues.apache.org/jira/browse/LUCENE-3965
 Project: Lucene - Java
  Issue Type: Task
  Components: general/build
Affects Versions: 4.0
Reporter: Robert Muir
 Attachments: LUCENE-3965.patch, LUCENE-3965.patch, LUCENE-3965.patch, 
 LUCENE-3965.patch, LUCENE-3965_module_build.patch, 
 LUCENE-3965_module_build_pname.patch


 I think users get confused about how svn/source is structured,
 when in fact we are just producing a modular build.
 I think it would be more clear if the lucene stuff was underneath
 modules/, thats where our modular API is.
 we could still package this up as lucene.tar.gz if we want, and even name
 modules/core lucene-core.jar, but i think this would be a lot better
 organized than the current:
 * lucene
 * lucene/contrib
 * modules
 confusion.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira



-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org

Re: [JENKINS] Lucene-Solr-tests-only-trunk - Build # 13221 - Failure

Is this jenkins going insane?

ERROR: Publisher hudson.tasks.junit.JUnitResultArchiver aborted due to exception
hudson.remoting.RequestAbortedException:
hudson.remoting.RequestAbortedException: java.io.IOException:
Unexpected termination of the channel
at hudson.remoting.Request.call(Request.java:149)
at hudson.remoting.Channel.call(Channel.java:646)
at hudson.FilePath.act(FilePath.java:821)
at hudson.FilePath.act(FilePath.java:814)
at hudson.tasks.junit.JUnitParser.parse(JUnitParser.java:83)
at 
hudson.tasks.junit.JUnitResultArchiver.parse(JUnitResultArchiver.java:122)
at 
hudson.tasks.junit.JUnitResultArchiver.perform(JUnitResultArchiver.java:134)
at hudson.tasks.BuildStepMonitor$1.perform(BuildStepMonitor.java:19)
at 
hudson.model.AbstractBuild$AbstractRunner.perform(AbstractBuild.java:705)
at 
hudson.model.AbstractBuild$AbstractRunner.performAllBuildSteps(AbstractBuild.java:680)
at 
hudson.model.AbstractBuild$AbstractRunner.performAllBuildSteps(AbstractBuild.java:658)
at hudson.model.Build$RunnerImpl.post2(Build.java:162)
at 
hudson.model.AbstractBuild$AbstractRunner.post(AbstractBuild.java:627)
at hudson.model.Run.run(Run.java:1438)
at hudson.model.FreeStyleBuild.run(FreeStyleBuild.java:46)
at hudson.model.ResourceController.execute(ResourceController.java:88)
at hudson.model.Executor.run(Executor.java:238)
Caused by: hudson.remoting.RequestAbortedException:
java.io.IOException: Unexpected termination of the channel
at hudson.remoting.Request.abort(Request.java:273)
at hudson.remoting.Channel.terminate(Channel.java:702)
at 
hudson.remoting.SynchronousCommandTransport$ReaderThread.run(SynchronousCommandTransport.java:69)
Caused by: java.io.IOException: Unexpected termination of the channel
at 
hudson.remoting.SynchronousCommandTransport$ReaderThread.run(SynchronousCommandTransport.java:50)
Caused by: java.io.EOFException
at 
java.io.ObjectInputStream$BlockDataInputStream.peekByte(ObjectInputStream.java:2571)
at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1315)
at java.io.ObjectInputStream.readObject(ObjectInputStream.java:369)
at hudson.remoting.Command.readFrom(Command.java:90)
at 
hudson.remoting.ClassicCommandTransport.read(ClassicCommandTransport.java:59)
at 
hudson.remoting.SynchronousCommandTransport$ReaderThread.run(SynchronousCommandTransport.java:48)
Email was triggered for: Failure
Sending email for trigger: Failure
ERROR: Error: No workspace found!


On Tue, Apr 17, 2012 at 1:51 PM, Apache Jenkins Server
jenk...@builds.apache.org wrote:
 Build: https://builds.apache.org/job/Lucene-Solr-tests-only-trunk/13221/

 No tests ran.

 Build Log (for compile errors):
 [...truncated 20703 lines...]




 -
 To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
 For additional commands, e-mail: dev-h...@lucene.apache.org

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org

RE: [JENKINS] Lucene-Solr-tests-only-trunk - Build # 13221 - Failure

2012-04-17 Thread Uwe Schindler

Happens sometimes.

-
Uwe Schindler
H.-H.-Meier-Allee 63, D-28213 Bremen
http://www.thetaphi.de
eMail: u...@thetaphi.de


 -Original Message-
 From: dawid.we...@gmail.com [mailto:dawid.we...@gmail.com] On Behalf Of
 Dawid Weiss
 Sent: Tuesday, April 17, 2012 2:57 PM
 To: dev@lucene.apache.org
 Subject: Re: [JENKINS] Lucene-Solr-tests-only-trunk - Build # 13221 - Failure
 
 Is this jenkins going insane?
 
 ERROR: Publisher hudson.tasks.junit.JUnitResultArchiver aborted due to
 exception
 hudson.remoting.RequestAbortedException:
 hudson.remoting.RequestAbortedException: java.io.IOException:
 Unexpected termination of the channel
   at hudson.remoting.Request.call(Request.java:149)
   at hudson.remoting.Channel.call(Channel.java:646)
   at hudson.FilePath.act(FilePath.java:821)
   at hudson.FilePath.act(FilePath.java:814)
   at hudson.tasks.junit.JUnitParser.parse(JUnitParser.java:83)
   at
 hudson.tasks.junit.JUnitResultArchiver.parse(JUnitResultArchiver.java:122)
   at
 hudson.tasks.junit.JUnitResultArchiver.perform(JUnitResultArchiver.java:134)
   at hudson.tasks.BuildStepMonitor$1.perform(BuildStepMonitor.java:19)
   at
 hudson.model.AbstractBuild$AbstractRunner.perform(AbstractBuild.java:705)
   at
 hudson.model.AbstractBuild$AbstractRunner.performAllBuildSteps(AbstractBuil
 d.java:680)
   at
 hudson.model.AbstractBuild$AbstractRunner.performAllBuildSteps(AbstractBuil
 d.java:658)
   at hudson.model.Build$RunnerImpl.post2(Build.java:162)
   at
 hudson.model.AbstractBuild$AbstractRunner.post(AbstractBuild.java:627)
   at hudson.model.Run.run(Run.java:1438)
   at hudson.model.FreeStyleBuild.run(FreeStyleBuild.java:46)
   at
 hudson.model.ResourceController.execute(ResourceController.java:88)
   at hudson.model.Executor.run(Executor.java:238)
 Caused by: hudson.remoting.RequestAbortedException:
 java.io.IOException: Unexpected termination of the channel
   at hudson.remoting.Request.abort(Request.java:273)
   at hudson.remoting.Channel.terminate(Channel.java:702)
   at
 hudson.remoting.SynchronousCommandTransport$ReaderThread.run(Synchron
 ousCommandTransport.java:69)
 Caused by: java.io.IOException: Unexpected termination of the channel
   at
 hudson.remoting.SynchronousCommandTransport$ReaderThread.run(Synchron
 ousCommandTransport.java:50)
 Caused by: java.io.EOFException
   at
 java.io.ObjectInputStream$BlockDataInputStream.peekByte(ObjectInputStream
 .java:2571)
   at
 java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1315)
   at java.io.ObjectInputStream.readObject(ObjectInputStream.java:369)
   at hudson.remoting.Command.readFrom(Command.java:90)
   at
 hudson.remoting.ClassicCommandTransport.read(ClassicCommandTransport.ja
 va:59)
   at
 hudson.remoting.SynchronousCommandTransport$ReaderThread.run(Synchron
 ousCommandTransport.java:48)
 Email was triggered for: Failure
 Sending email for trigger: Failure
 ERROR: Error: No workspace found!
 
 
 On Tue, Apr 17, 2012 at 1:51 PM, Apache Jenkins Server
 jenk...@builds.apache.org wrote:
  Build:
  https://builds.apache.org/job/Lucene-Solr-tests-only-trunk/13221/
 
  No tests ran.
 
  Build Log (for compile errors):
  [...truncated 20703 lines...]
 
 
 
 
  -
  To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For
  additional commands, e-mail: dev-h...@lucene.apache.org
 
 -
 To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional
 commands, e-mail: dev-h...@lucene.apache.org


-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org

[jira] [Commented] (SOLR-899) NullPointerException in ClientUtils.writeXML on NULL field value

[
https://issues.apache.org/jira/browse/SOLR-899?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13255539#comment-13255539
]

Sami Siren commented on SOLR-899:
-

This does not seem to be a problem with trunk version of solrj: I can add
fields with null values and they are silently being ignored when the document
is serialized without NPEs being thrown.

NullPointerException in ClientUtils.writeXML on NULL field value

Key: SOLR-899
URL: https://issues.apache.org/jira/browse/SOLR-899
Project: Solr
Issue Type: Bug
Components: clients - java
Affects Versions: 1.3
Reporter: Todd Feak
Priority: Minor

This exception occurs if I have a field in a document with a null value.
java.lang.NullPointerException
at
org.apache.solr.client.solrj.util.ClientUtils.writeXML(ClientUtils.java:117)
at
org.apache.solr.client.solrj.request.UpdateRequest.getXML(UpdateRequest.java:169)
at
org.apache.solr.client.solrj.request.UpdateRequest.getContentStreams(UpdateRequest.java:160)
...
Previous versions of this class had a null check, which was subsequently
removed. I have no problem with removing the previous null-check, as it
seemed to hide a programming mistake (i.e. null values). However, I think
that the exception that occurs here could at least be a bit more informative.
Performing a null check and then throwing some sort of RuntimeException or
IOException with a descriptive message would be very helpful. Such as
Failure, NULL value for field named[foo] detected.
Alternatively, I think that an argument could be made that this NULL
shouldn't have been allowed in the document in the first place. If that is
the case, then NULL checks with similarly helpful messages could be performed
upstream of this issue. I personally lean this way, as I prefer to find a
programming mistake closer to the source of the issue. It allows me to find
out exactly where the NULL was inserted in the first place.

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org

[jira] [Commented] (LUCENE-3965) consolidate all api modules in one place and un!@$# packaging for 4.0


[ 
https://issues.apache.org/jira/browse/LUCENE-3965?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=1329#comment-1329
 ] 

Robert Muir commented on LUCENE-3965:
-

running a final test first. I committed the fixes to nightly/

However, we could encounter a failed build from the svn check, due to the 
removal
of the bogus build directories and their svn:ignores (would just be leftover 
relics).

Once i commit I will ask Uwe to clean the workspaces to (hopefully) prevent 
that,
but its possible one build could slip through... (and of course the possibility
i have some other bugs)

 consolidate all api modules in one place and un!@$# packaging for 4.0
 -

 Key: LUCENE-3965
 URL: https://issues.apache.org/jira/browse/LUCENE-3965
 Project: Lucene - Java
  Issue Type: Task
  Components: general/build
Affects Versions: 4.0
Reporter: Robert Muir
 Attachments: LUCENE-3965.patch, LUCENE-3965.patch, LUCENE-3965.patch, 
 LUCENE-3965.patch, LUCENE-3965_module_build.patch, 
 LUCENE-3965_module_build_pname.patch


 I think users get confused about how svn/source is structured,
 when in fact we are just producing a modular build.
 I think it would be more clear if the lucene stuff was underneath
 modules/, thats where our modular API is.
 we could still package this up as lucene.tar.gz if we want, and even name
 modules/core lucene-core.jar, but i think this would be a lot better
 organized than the current:
 * lucene
 * lucene/contrib
 * modules
 confusion.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira



-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org

Documentation of boolean FunctionQueries

2012-04-17 Thread Jan Høydahl

Hi,

I've given a stab on documenting the boolean Functions which Yonik added in 
SOLR-2136

  http://wiki.apache.org/solr/FunctionQuery#Boolean_Functions

I'm not a wizard on these, maybe I've forgotten some as well. Peer review 
appreciated.

--
Jan Høydahl, search solution architect
Cominvent AS - www.cominvent.com
Solr Training - www.solrtraining.com


-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org

[jira] [Commented] (SOLR-3069) add ability to commit w/o opening a new searcher

2012-04-17 Thread Mark Miller (Commented) (JIRA)


[ 
https://issues.apache.org/jira/browse/SOLR-3069?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13255579#comment-13255579
 ] 

Mark Miller commented on SOLR-3069:
---

This is in and can be resolved right?

 add ability to commit w/o opening a new searcher
 

 Key: SOLR-3069
 URL: https://issues.apache.org/jira/browse/SOLR-3069
 Project: Solr
  Issue Type: New Feature
Reporter: Yonik Seeley
Assignee: Yonik Seeley
 Fix For: 4.0

 Attachments: SOLR-3069.patch, SOLR-3069.patch


 Solr should have the ability to commit without opening a new searcher.  This 
 becomes even more useful now that we have softCommit to open a new NRT 
 searcher.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira



-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org

[jira] [Resolved] (SOLR-3069) add ability to commit w/o opening a new searcher

2012-04-17 Thread Yonik Seeley (Resolved) (JIRA)


 [ 
https://issues.apache.org/jira/browse/SOLR-3069?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Yonik Seeley resolved SOLR-3069.


Resolution: Fixed

 add ability to commit w/o opening a new searcher
 

 Key: SOLR-3069
 URL: https://issues.apache.org/jira/browse/SOLR-3069
 Project: Solr
  Issue Type: New Feature
Reporter: Yonik Seeley
Assignee: Yonik Seeley
 Fix For: 4.0

 Attachments: SOLR-3069.patch, SOLR-3069.patch


 Solr should have the ability to commit without opening a new searcher.  This 
 becomes even more useful now that we have softCommit to open a new NRT 
 searcher.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira



-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org

[jira] [Commented] (SOLR-3284) StreamingUpdateSolrServer swallows exceptions

2012-04-17 Thread Shawn Heisey (Issue Comment Edited) (JIRA)

[
https://issues.apache.org/jira/browse/SOLR-3284?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13255602#comment-13255602
]

Shawn Heisey commented on SOLR-3284:

If the Solr server goes down in between updates done with the concurrent
server, doing further updates will fail, but the calling code will not know
that. With the Commons or Http server, an exception is thrown that my code
catches.

I don't think that just overriding handleError is enough. If Solr goes down
but the machine is still up, you have immediate failure detection because the
connection will be refused. If the server goes away entirely, it could take a
couple of minutes to fail. You would have to provide methods to check that 1)
all background operations are complete and 2) they were error free.

I can no longer remember whether an exception is thrown when trying a commit
against a down machine with the concurrent server. IIRC it does throw one in
this instance. I definitely believe that it should. Perhaps the current
handleError code could update class-level members (with names like boolean
updateErrored and SolrServerException updateException) that could be checked
and used by the commit method. If they are set, it would reset them and throw
an exception (fast-fail) without actually trying the commit. There should
probably be a constructor option and a set method to either activate this new
behavior or restore the original behavior.

When I first designed my code, I was relying on the exceptions thrown by the
commons server when doing the actual update, so it's too late by the time it
reaches the commit - it has already updated the position values. I now realize
that this is incorrect design, though I might never have figured it out without
my attempt to use the concurrent server. It's going to be a bit painful to
redesign my code to put off updating position values until after a successful
commit operation. It's something I do intend to do.

StreamingUpdateSolrServer swallows exceptions
-

Key: SOLR-3284
URL: https://issues.apache.org/jira/browse/SOLR-3284
Project: Solr
Issue Type: Improvement
Components: clients - java
Affects Versions: 3.5, 4.0
Reporter: Shawn Heisey

StreamingUpdateSolrServer eats exceptions thrown by lower level code, such as
HttpClient, when doing adds. It may happen with other methods, though I know
that query and deleteByQuery will throw exceptions. I believe that this is a
result of the queue/Runner design. That's what makes SUSS perform better,
but it means you sacrifice the ability to programmatically determine that
there was a problem with your update. All errors are logged via slf4j, but
that's not terribly helpful except with determining what went wrong after the
fact.
When using CommonsHttpSolrServer, I've been able to rely on getting an
exception thrown by pretty much any error, letting me use try/catch to detect
problems.
There's probably enough dependent code out there that it would not be a good
idea to change the design of SUSS, unless there were alternate constructors
or additional methods available to configure new/old behavior. Fixing this
is probably not trivial, so it's probably a better idea to come up with a new
server object based on CHSS. This is outside my current skillset.

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org

[jira] [Created] (LUCENE-3994) some nightly tests take hours

2012-04-17 Thread Robert Muir (Created) (JIRA)

some nightly tests take hours
-

 Key: LUCENE-3994
 URL: https://issues.apache.org/jira/browse/LUCENE-3994
 Project: Lucene - Java
  Issue Type: Bug
  Components: general/build
Affects Versions: 4.0
Reporter: Robert Muir


The nightly builds are taking 4-7 hours.

This is caused by a few bad apples (can be seen 
https://builds.apache.org/job/Lucene-trunk/1896/testReport/).

The top 5 are (all in analysis):

* TestSynonymMapFilter: 1 hr 54 min
* TestRandomChains: 1 hr 22 min
* TestRemoveDuplicatesTokenFilter: 32 min
* TestMappingCharFilter: 28 min
* TestWordDelimiterFilter: 22 min

so thats 4.5 hours right there for that run


--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira



-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org

[jira] [Commented] (SOLR-3362) FacetComponent throws NPE when doing distributed query

2012-04-17 Thread Yonik Seeley (Commented) (JIRA)


[ 
https://issues.apache.org/jira/browse/SOLR-3362?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13255639#comment-13255639
 ] 

Yonik Seeley commented on SOLR-3362:


So it seems like we're getting back a term we didn't ask for.
One way this could happen is if the encoding is messed up - StrUtils.join and 
splitSmart are used for this, and I don't see an obvious error there.

At this point perhaps we should add a check at line 489 and log terms that come 
back that we don't ask for.

 FacetComponent throws NPE when doing distributed query
 --

 Key: SOLR-3362
 URL: https://issues.apache.org/jira/browse/SOLR-3362
 Project: Solr
  Issue Type: Bug
  Components: SolrCloud
Affects Versions: 4.0
 Environment: RHEL 
 lucene svn revision 1308309
Reporter: Jamie Johnson

 When executing a query against a field in my index I am getting the following 
 exception
 The query I am executing is as follows:
 http://host:port/solr/collection1/select?q=bobfacet=truefacet.field=organization
 debugging the FacetComponent line 489 sfc is null
 SEVERE: java.lang.NullPointerException
at 
 org.apache.solr.handler.component.FacetComponent.refineFacets(FacetComponent.java:489)
at 
 org.apache.solr.handler.component.FacetComponent.handleResponses(FacetComponent.java:278)
at 
 org.apache.solr.handler.component.SearchHandler.handleRequestBody(SearchHandler.java:307)
at 
 org.apache.solr.handler.RequestHandlerBase.handleRequest(RequestHandlerBase.java:129)
at org.apache.solr.core.SolrCore.execute(SolrCore.java:1550)
at 
 org.apache.solr.servlet.SolrDispatchFilter.execute(SolrDispatchFilter.java:442)
at 
 org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:263)
at 
 org.eclipse.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1337)
at 
 org.eclipse.jetty.servlet.ServletHandler.doHandle(ServletHandler.java:484)
at 
 org.eclipse.jetty.server.handler.ScopedHandler.handle(ScopedHandler.java:119)
at 
 org.eclipse.jetty.security.SecurityHandler.handle(SecurityHandler.java:524)
at 
 org.eclipse.jetty.server.session.SessionHandler.doHandle(SessionHandler.java:233)
at 
 org.eclipse.jetty.server.handler.ContextHandler.doHandle(ContextHandler.java:1065)
at 
 org.eclipse.jetty.servlet.ServletHandler.doScope(ServletHandler.java:413)
at 
 org.eclipse.jetty.server.session.SessionHandler.doScope(SessionHandler.java:192)
at 
 org.eclipse.jetty.server.handler.ContextHandler.doScope(ContextHandler.java:999)
at 
 org.eclipse.jetty.server.handler.ScopedHandler.handle(ScopedHandler.java:117)
at 
 org.eclipse.jetty.server.handler.ContextHandlerCollection.handle(ContextHandlerCollection.java:250)
at 
 org.eclipse.jetty.server.handler.HandlerCollection.handle(HandlerCollection.java:149)
at 
 org.eclipse.jetty.server.handler.HandlerWrapper.handle(HandlerWrapper.java:111)
at org.eclipse.jetty.server.Server.handle(Server.java:351)
at 
 org.eclipse.jetty.server.AbstractHttpConnection.handleRequest(AbstractHttpConnection.java:454)
at 
 org.eclipse.jetty.server.BlockingHttpConnection.handleRequest(BlockingHttpConnection.java:47)
at 
 org.eclipse.jetty.server.AbstractHttpConnection.headerComplete(AbstractHttpConnection.java:890)
at 
 org.eclipse.jetty.server.AbstractHttpConnection$RequestHandler.headerComplete(AbstractHttpConnection.java:944)
at org.eclipse.jetty.http.HttpParser.parseNext(HttpParser.java:634)
at 
 org.eclipse.jetty.http.HttpParser.parseAvailable(HttpParser.java:230)
at 
 org.eclipse.jetty.server.BlockingHttpConnection.handle(BlockingHttpConnection.java:66)
at 
 org.eclipse.jetty.server.bio.SocketConnector$ConnectorEndPoint.run(SocketConnector.java:254)
at 
 org.eclipse.jetty.util.thread.QueuedThreadPool.runJob(QueuedThreadPool.java:599)
at 
 org.eclipse.jetty.util.thread.QueuedThreadPool$3.run(QueuedThreadPool.java:534)
at java.lang.Thread.run(Thread.java:662)

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira



-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org

[jira] [Resolved] (SOLR-2231) DataImportHandler - MultiThreaded - Logging

2012-04-17 Thread James Dyer (Resolved) (JIRA)


 [ 
https://issues.apache.org/jira/browse/SOLR-2231?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

James Dyer resolved SOLR-2231.
--

Resolution: Won't Fix
  Assignee: James Dyer

Multi-Threading was removed from DIH in 4.0

 DataImportHandler - MultiThreaded - Logging
 ---

 Key: SOLR-2231
 URL: https://issues.apache.org/jira/browse/SOLR-2231
 Project: Solr
  Issue Type: Improvement
Affects Versions: 1.5
Reporter: Fuad Efendi
Assignee: James Dyer
Priority: Trivial

 Please use
 {code}
 if (LOG.isInfoEnabled()) LOG.info(...)
 {code}
 For instance, line 95 of ThreadedEntityProcessorWrapper creates huge log 
 output which is impossible to manage via logging properties:
 LOG.info(arow : +arow);
 This line (in a loop) will output results of all SQL from a database (and 
 will slow down SOLR performance). It's even better to use LOG.debug instead 
 of LOG.info, INFO is enabled by default.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira



-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org

[jira] [Updated] (SOLR-3284) StreamingUpdateSolrServer swallows exceptions

2012-04-17 Thread Shawn Heisey (Updated) (JIRA)

[
https://issues.apache.org/jira/browse/SOLR-3284?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]

Shawn Heisey updated SOLR-3284:
---

Attachment: SOLR-3284.patch

First crack at a patch for throwing delayed exceptions. It should do this on
any request when a previous request resulted in an error, not just on commits.

StreamingUpdateSolrServer swallows exceptions
-

Key: SOLR-3284
URL: https://issues.apache.org/jira/browse/SOLR-3284
Project: Solr
Issue Type: Improvement
Components: clients - java
Affects Versions: 3.5, 4.0
Reporter: Shawn Heisey
Attachments: SOLR-3284.patch

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org

[jira] [Issue Comment Edited] (SOLR-3284) StreamingUpdateSolrServer swallows exceptions

[
https://issues.apache.org/jira/browse/SOLR-3284?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13255657#comment-13255657
]

Shawn Heisey edited comment on SOLR-3284 at 4/17/12 3:45 PM:
-

First crack at a patch for throwing delayed exceptions. It should do this on
any request when a previous request resulted in an error, not just on commits.
I did not attempt to make any unit tests. I'm not entirely sure how unit tests
work when things are supposed to succeed, how to simulate a failure is even
less obvious.

was (Author: elyograg):
First crack at a patch for throwing delayed exceptions. It should do this
on any request when a previous request resulted in an error, not just on
commits.

StreamingUpdateSolrServer swallows exceptions
-

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org

[jira] [Resolved] (SOLR-1867) CachedSQLentity processor is using unbounded hashmap

2012-04-17 Thread James Dyer (Resolved) (JIRA)

[
https://issues.apache.org/jira/browse/SOLR-1867?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]

James Dyer resolved SOLR-1867.
--

Resolution: Won't Fix

SOLR-2382 allowed for plugable cache implementations (3.6 4.x). There are 2
(uncommitted thus far) cache implementations that store the cache on-disk (See
SOLR-2613 and SOLR-2948).

Finally, if the memory leak described was due to ThreadLocal usage, this was
eliminated in 4.0 with the removal of the threads feature.

CachedSQLentity processor is using unbounded hashmap
-

Key: SOLR-1867
URL: https://issues.apache.org/jira/browse/SOLR-1867
Project: Solr
Issue Type: Bug
Components: contrib - DataImportHandler
Affects Versions: 1.4
Reporter: barani

I am using cachedSqlEntityprocessor in DIH to index the data. Please find a
sample dataconfig structure,
entity x query=select * from x --- object
entity y query=select * from y processor=cachedSqlEntityprocessor
cachekey=y.id cachevalue=x.id -- object properties
For each and every object I would be retrieveing corresponding object
properties (in my subqueries).
I get in to OOM very often and I think thats a trade off if I use
cachedSqlEntityprocessor.
My assumption is that when I use cachedSqlEntityprocessor the indexing
happens as follows,
First entity x will get executed and the entire table gets stored in cache
next entity y gets executed and entire table gets stored in cache
Finally the compasion heppens through hash map .
So always I need to have the memory allocated to SOLR JVM more than or equal
to the data present in tables.
One more issue is that even after SOLR completes indexing, the memory used
previously is not getting released. I could still see the JVM consuming 1.5
GB after the indexing completes. I tried to use Java hotspot options but
didnt see any differences.. GC is not getting invoked even after a long time
when using CachedSQLentity processor
Main issue seem to be the fact that the CachedSQLentity processor cache is
an unbounded HashMap, with no option to bound it.
Reference:
http://n3.nabble.com/Need-info-on-CachedSQLentity-processor-tt698418.html#a698418

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org

Problem running all of a module's tests under IntelliJ: Wrong test finished.

2012-04-17 Thread Steven A Rowe

Hi Dawid :)

Do you use IntelliJ?  There appears to be some form of bad interaction between 
the new RandomizedTesting library additions and IntelliJ's test runner.

When I try to run all of an IntelliJ module's tests under IntelliJ, e.g. 
analyzers-common or lucene (including core and test-framework), not all tests 
run; those that don't run are reported as not started.  The external test 
process reports Wrong test finished. (???) and then returns exit code -1.

This behavior is relatively new - I don't think the modules/*-lucene/ move is 
the culprit (the IntelliJ lucene+test-framework module didn't move and it has 
this issue).
 
Here's the output from running all analyzers-common tests:

--
C:\Program Files\Java\jdk1.6.0_21\bin\java -ea -DtempDir=temp 
-Didea.launcher.port=7541 -Didea.launcher.bin.path=C:\Program Files 
(x86)\JetBrains\IntelliJ IDEA 11.1\bin -Dfile.encoding=UTF-8 -classpath 
C:\Program Files (x86)\JetBrains\IntelliJ IDEA 11.1\lib\idea_rt.jar;C:\Program 
Files (x86)\JetBrains\IntelliJ IDEA 
11.1\plugins\junit\lib\junit-rt.jar;C:\Program 
Files\Java\jdk1.6.0_21\jre\lib\alt-rt.jar;C:\Program 
Files\Java\jdk1.6.0_21\jre\lib\charsets.jar;C:\Program 
Files\Java\jdk1.6.0_21\jre\lib\deploy.jar;C:\Program 
Files\Java\jdk1.6.0_21\jre\lib\javaws.jar;C:\Program 
Files\Java\jdk1.6.0_21\jre\lib\jce.jar;C:\Program 
Files\Java\jdk1.6.0_21\jre\lib\jsse.jar;C:\Program 
Files\Java\jdk1.6.0_21\jre\lib\management-agent.jar;C:\Program 
Files\Java\jdk1.6.0_21\jre\lib\plugin.jar;C:\Program 
Files\Java\jdk1.6.0_21\jre\lib\resources.jar;C:\Program 
Files\Java\jdk1.6.0_21\jre\lib\rt.jar;C:\Program 
Files\Java\jdk1.6.0_21\jre\lib\ext\dnsns.jar;C:\Program 
Files\Java\jdk1.6.0_21\jre\lib\ext\localedata.jar;C:\Program 
Files\Java\jdk1.6.0_21\jre\lib\ext\sunjce_provider.jar;C:\svn\lucene\dev\trunk\lucene\build\analysis\analyzers-common\classes\test;C:\svn\lucene\dev\trunk\lucene\build\analysis\analyzers-common\classes\java;C:\svn\lucene\dev\trunk\lucene\test-framework\lib\junit-4.10.jar;C:\svn\lucene\dev\trunk\lucene\test-framework\lib\randomizedtesting-runner-1.2.0.jar;C:\svn\lucene\dev\trunk\lucene\build\lucene-idea\classes\test;C:\svn\lucene\dev\trunk\lucene\build\lucene-idea\classes\java;C:\svn\lucene\dev\trunk\lucene\test-framework\lib\ant-1.7.1.jar;C:\svn\lucene\dev\trunk\lucene\test-framework\lib\ant-junit-1.7.1.jar
 com.intellij.rt.execution.application.AppMain 
com.intellij.rt.execution.junit.JUnitStarter -ideVersion5 
@C:\Users\sarowe\AppData\Local\Temp\idea_junit3377604973713774012.tmp 
-socket53790

Test '.default package.WordBreakTestUnicode_6_0_0' ignored
Test 
'org.apache.lucene.analysis.pattern.TestPatternReplaceCharFilter.testNastyPattern'
 ignored

Wrong test finished. Last started: [] stopped: 
testNastyPattern(org.apache.lucene.analysis.pattern.TestPatternReplaceCharFilter);
 class org.junit.runner.Description

Process finished with exit code -1
--


Steve

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org

unsubscribe

2012-04-17 Thread olivier sallou

-- 

gpg key id: 4096R/326D8438  (keyring.debian.org)

Key fingerprint = 5FB4 6F83 D3B9 5204 6335  D26D 78DC 68DB 326D 8438

[jira] [Updated] (LUCENE-3994) some nightly tests take hours

2012-04-17 Thread Robert Muir (Updated) (JIRA)


 [ 
https://issues.apache.org/jira/browse/LUCENE-3994?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Robert Muir updated LUCENE-3994:


Attachment: LUCENE-3994.patch

Patch, removing n^2 growth in these tests, and some other tuning of atLeast.

In general, when tests like this hog the cpu for so long, we lose coverage 
overall.

I'll keep an eye on the nightlies for other cpu-hogs.

Here are the new timings for analyzers/ tests after the patch.

'ant test' with no multiplier:
{noformat}
BUILD SUCCESSFUL
Total time: 1 minute 28 seconds
{noformat}

'ant test -Dtests.multiplier=3 -Dtests.nightly=true'
{noformat}
BUILD SUCCESSFUL
Total time: 3 minutes 15 seconds
{noformat}


 some nightly tests take hours
 -

 Key: LUCENE-3994
 URL: https://issues.apache.org/jira/browse/LUCENE-3994
 Project: Lucene - Java
  Issue Type: Bug
  Components: general/build
Affects Versions: 4.0
Reporter: Robert Muir
 Attachments: LUCENE-3994.patch


 The nightly builds are taking 4-7 hours.
 This is caused by a few bad apples (can be seen 
 https://builds.apache.org/job/Lucene-trunk/1896/testReport/).
 The top 5 are (all in analysis):
 * TestSynonymMapFilter: 1 hr 54 min
 * TestRandomChains: 1 hr 22 min
 * TestRemoveDuplicatesTokenFilter: 32 min
 * TestMappingCharFilter: 28 min
 * TestWordDelimiterFilter: 22 min
 so thats 4.5 hours right there for that run

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira



-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org

[jira] [Resolved] (LUCENE-3994) some nightly tests take hours

2012-04-17 Thread Robert Muir (Resolved) (JIRA)


 [ 
https://issues.apache.org/jira/browse/LUCENE-3994?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Robert Muir resolved LUCENE-3994.
-

   Resolution: Fixed
Fix Version/s: 4.0

 some nightly tests take hours
 -

 Key: LUCENE-3994
 URL: https://issues.apache.org/jira/browse/LUCENE-3994
 Project: Lucene - Java
  Issue Type: Bug
  Components: general/build
Affects Versions: 4.0
Reporter: Robert Muir
 Fix For: 4.0

 Attachments: LUCENE-3994.patch


 The nightly builds are taking 4-7 hours.
 This is caused by a few bad apples (can be seen 
 https://builds.apache.org/job/Lucene-trunk/1896/testReport/).
 The top 5 are (all in analysis):
 * TestSynonymMapFilter: 1 hr 54 min
 * TestRandomChains: 1 hr 22 min
 * TestRemoveDuplicatesTokenFilter: 32 min
 * TestMappingCharFilter: 28 min
 * TestWordDelimiterFilter: 22 min
 so thats 4.5 hours right there for that run

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira



-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org

[jira] [Commented] (SOLR-3362) FacetComponent throws NPE when doing distributed query

2012-04-17 Thread Jamie Johnson (Commented) (JIRA)


[ 
https://issues.apache.org/jira/browse/SOLR-3362?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13255700#comment-13255700
 ] 

Jamie Johnson commented on SOLR-3362:
-

Essentially doing something like this?

{code}
if(sfc == null){
  //log message
  continue;
}
{code}

 FacetComponent throws NPE when doing distributed query
 --

 Key: SOLR-3362
 URL: https://issues.apache.org/jira/browse/SOLR-3362
 Project: Solr
  Issue Type: Bug
  Components: SolrCloud
Affects Versions: 4.0
 Environment: RHEL 
 lucene svn revision 1308309
Reporter: Jamie Johnson

 When executing a query against a field in my index I am getting the following 
 exception
 The query I am executing is as follows:
 http://host:port/solr/collection1/select?q=bobfacet=truefacet.field=organization
 debugging the FacetComponent line 489 sfc is null
 SEVERE: java.lang.NullPointerException
at 
 org.apache.solr.handler.component.FacetComponent.refineFacets(FacetComponent.java:489)
at 
 org.apache.solr.handler.component.FacetComponent.handleResponses(FacetComponent.java:278)
at 
 org.apache.solr.handler.component.SearchHandler.handleRequestBody(SearchHandler.java:307)
at 
 org.apache.solr.handler.RequestHandlerBase.handleRequest(RequestHandlerBase.java:129)
at org.apache.solr.core.SolrCore.execute(SolrCore.java:1550)
at 
 org.apache.solr.servlet.SolrDispatchFilter.execute(SolrDispatchFilter.java:442)
at 
 org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:263)
at 
 org.eclipse.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1337)
at 
 org.eclipse.jetty.servlet.ServletHandler.doHandle(ServletHandler.java:484)
at 
 org.eclipse.jetty.server.handler.ScopedHandler.handle(ScopedHandler.java:119)
at 
 org.eclipse.jetty.security.SecurityHandler.handle(SecurityHandler.java:524)
at 
 org.eclipse.jetty.server.session.SessionHandler.doHandle(SessionHandler.java:233)
at 
 org.eclipse.jetty.server.handler.ContextHandler.doHandle(ContextHandler.java:1065)
at 
 org.eclipse.jetty.servlet.ServletHandler.doScope(ServletHandler.java:413)
at 
 org.eclipse.jetty.server.session.SessionHandler.doScope(SessionHandler.java:192)
at 
 org.eclipse.jetty.server.handler.ContextHandler.doScope(ContextHandler.java:999)
at 
 org.eclipse.jetty.server.handler.ScopedHandler.handle(ScopedHandler.java:117)
at 
 org.eclipse.jetty.server.handler.ContextHandlerCollection.handle(ContextHandlerCollection.java:250)
at 
 org.eclipse.jetty.server.handler.HandlerCollection.handle(HandlerCollection.java:149)
at 
 org.eclipse.jetty.server.handler.HandlerWrapper.handle(HandlerWrapper.java:111)
at org.eclipse.jetty.server.Server.handle(Server.java:351)
at 
 org.eclipse.jetty.server.AbstractHttpConnection.handleRequest(AbstractHttpConnection.java:454)
at 
 org.eclipse.jetty.server.BlockingHttpConnection.handleRequest(BlockingHttpConnection.java:47)
at 
 org.eclipse.jetty.server.AbstractHttpConnection.headerComplete(AbstractHttpConnection.java:890)
at 
 org.eclipse.jetty.server.AbstractHttpConnection$RequestHandler.headerComplete(AbstractHttpConnection.java:944)
at org.eclipse.jetty.http.HttpParser.parseNext(HttpParser.java:634)
at 
 org.eclipse.jetty.http.HttpParser.parseAvailable(HttpParser.java:230)
at 
 org.eclipse.jetty.server.BlockingHttpConnection.handle(BlockingHttpConnection.java:66)
at 
 org.eclipse.jetty.server.bio.SocketConnector$ConnectorEndPoint.run(SocketConnector.java:254)
at 
 org.eclipse.jetty.util.thread.QueuedThreadPool.runJob(QueuedThreadPool.java:599)
at 
 org.eclipse.jetty.util.thread.QueuedThreadPool$3.run(QueuedThreadPool.java:534)
at java.lang.Thread.run(Thread.java:662)

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira



-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org

[jira] [Commented] (SOLR-3284) StreamingUpdateSolrServer swallows exceptions

[
https://issues.apache.org/jira/browse/SOLR-3284?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13255705#comment-13255705
]

Shawn Heisey commented on SOLR-3284:

After looking at existing tests to see how I might implement tests for this new
functionality, I couldn't see how to do it. Also, I noticed that there are
tests for SolrCloud and something else called ChaosMonkey. All tests in solr/
pass with this patch, but I don't know how SolrCloud might be affected. I
would hope that it already handles exceptions properly and therefore wouldn't
have any problems, but I have never looked at the code or used SolrCloud.

StreamingUpdateSolrServer swallows exceptions
-

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org

RE: Problem running all of a module's tests under IntelliJ: Wrong test finished.

2012-04-17 Thread Steven A Rowe

JetBrains has 7 issues in their issue tracker for IntelliJ IDEA that mention 
Wrong test finished. - all are marked closed  fixed.

http://youtrack.jetbrains.com/issues/IDEA?q=%22Wrong+test+finished%22

AFAICT, the problems mentioned in the above reports are of two main types:

1. Problem running concurrent tests (IDEA-54745).
2. Problem with exception thrown in @BeforeClass (IDEA-49505; IDEA-38287; 
IDEA-36591)

I tried adding -Dtests.jvms=1 to the run configuration for the 
analyzers-common module's tests, and IntelliJ still had the same problem (some 
tests didn't run  Wrong test finished), so I don't think the problem is #1.

Steve

-Original Message-
From: Steven A Rowe [mailto:sar...@syr.edu] 
Sent: Tuesday, April 17, 2012 12:05 PM
To: dev@lucene.apache.org
Subject: Problem running all of a module's tests under IntelliJ: Wrong test 
finished.

Hi Dawid :)

Do you use IntelliJ?  There appears to be some form of bad interaction between 
the new RandomizedTesting library additions and IntelliJ's test runner.

When I try to run all of an IntelliJ module's tests under IntelliJ, e.g. 
analyzers-common or lucene (including core and test-framework), not all tests 
run; those that don't run are reported as not started.  The external test 
process reports Wrong test finished. (???) and then returns exit code -1.

This behavior is relatively new - I don't think the modules/*-lucene/ move is 
the culprit (the IntelliJ lucene+test-framework module didn't move and it has 
this issue).
 
Here's the output from running all analyzers-common tests:

--
C:\Program Files\Java\jdk1.6.0_21\bin\java -ea -DtempDir=temp 
-Didea.launcher.port=7541 -Didea.launcher.bin.path=C:\Program Files 
(x86)\JetBrains\IntelliJ IDEA 11.1\bin -Dfile.encoding=UTF-8 -classpath 
C:\Program Files (x86)\JetBrains\IntelliJ IDEA 11.1\lib\idea_rt.jar;C:\Program 
Files (x86)\JetBrains\IntelliJ IDEA 
11.1\plugins\junit\lib\junit-rt.jar;C:\Program 
Files\Java\jdk1.6.0_21\jre\lib\alt-rt.jar;C:\Program 
Files\Java\jdk1.6.0_21\jre\lib\charsets.jar;C:\Program 
Files\Java\jdk1.6.0_21\jre\lib\deploy.jar;C:\Program 
Files\Java\jdk1.6.0_21\jre\lib\javaws.jar;C:\Program 
Files\Java\jdk1.6.0_21\jre\lib\jce.jar;C:\Program 
Files\Java\jdk1.6.0_21\jre\lib\jsse.jar;C:\Program 
Files\Java\jdk1.6.0_21\jre\lib\management-agent.jar;C:\Program 
Files\Java\jdk1.6.0_21\jre\lib\plugin.jar;C:\Program 
Files\Java\jdk1.6.0_21\jre\lib\resources.jar;C:\Program 
Files\Java\jdk1.6.0_21\jre\lib\rt.jar;C:\Program 
Files\Java\jdk1.6.0_21\jre\lib\ext\dnsns.jar;C:\Program 
Files\Java\jdk1.6.0_21\jre\lib\ext\localedata.jar;C:\Program 
Files\Java\jdk1.6.0_21\jre\lib\ext\sunjce_provider.jar;C:\svn\lucene\dev\trunk\lucene\build\analysis\analyzers-common\classes\test;C:\svn\lucene\dev\trunk\lucene\build\analysis\analyzers-common\classes\java;C:\svn\lucene\dev\trunk\lucene\test-framework\lib\junit-4.10.jar;C:\svn\lucene\dev\trunk\lucene\test-framework\lib\randomizedtesting-runner-1.2.0.jar;C:\svn\lucene\dev\trunk\lucene\build\lucene-idea\classes\test;C:\svn\lucene\dev\trunk\lucene\build\lucene-idea\classes\java;C:\svn\lucene\dev\trunk\lucene\test-framework\lib\ant-1.7.1.jar;C:\svn\lucene\dev\trunk\lucene\test-framework\lib\ant-junit-1.7.1.jar
 com.intellij.rt.execution.application.AppMain 
com.intellij.rt.execution.junit.JUnitStarter -ideVersion5 
@C:\Users\sarowe\AppData\Local\Temp\idea_junit3377604973713774012.tmp 
-socket53790

Test '.default package.WordBreakTestUnicode_6_0_0' ignored Test 
'org.apache.lucene.analysis.pattern.TestPatternReplaceCharFilter.testNastyPattern'
 ignored

Wrong test finished. Last started: [] stopped: 
testNastyPattern(org.apache.lucene.analysis.pattern.TestPatternReplaceCharFilter);
 class org.junit.runner.Description

Process finished with exit code -1
--


Steve

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional 
commands, e-mail: dev-h...@lucene.apache.org


-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org

[jira] [Commented] (SOLR-2889) Implement Adaptive Replacement Cache

[
https://issues.apache.org/jira/browse/SOLR-2889?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13255727#comment-13255727
]

Shawn Heisey commented on SOLR-2889:

bq. Would you mind posting some information about the results of your work and
how much performance gain you made. If you have benchmark results this would be
ideal. Did you notice any increase/decrease in memory and CPU demand?

I haven't done any extensive testing. The testing that I did do for SOLR-2906
suggested that the LFU cache did not offer any performance benefit over LRU,
but that it didn't really cause a performance detriment either. I think this
means that the idea was sound, but any speedups gained from the different
methodology were lost because of the basic and non-optimized implementation.

It was not a definitive test - I have two copies of my production distributed
index for redundancy purposes, with haproxy doing load balancing between the
two. I can set one set of servers to LFU and the other to LRU, but it's
production, so the two sets of servers never receive the same queries and I
don't really want to try any isolation tests on production equipment. My
testbed is too small for a doing tests with all production data - one server
with all resources smaller than production. I could do some tests with smaller
data sets that will fit entirely in RAM, but that will take a lot of planning
that I currently don't have time to do.

The LRU cache is highly optimized for speed, but I didn't really understand the
optimizations and they don't apply to LFU as far as I can tell. At this time I
am still using LRU cache because I don't dare change the production
configuration without authorization and I can't leave production servers in
test mode for very long.

Implement Adaptive Replacement Cache

Key: SOLR-2889
URL: https://issues.apache.org/jira/browse/SOLR-2889
Project: Solr
Issue Type: New Feature
Components: search
Affects Versions: 3.4
Reporter: Shawn Heisey
Priority: Minor

Currently Solr's caches are LRU, which doesn't look at hitcount to decide
which entries are most important. There is a method that takes both
frequency and time of cache hits into account:
http://en.wikipedia.org/wiki/Adaptive_Replacement_Cache
If it's feasible, this could be a good addition to Solr/Lucene.

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org

[jira] [Commented] (SOLR-1972) Need additional query stats in admin interface - median, 95th and 99th percentile


[ 
https://issues.apache.org/jira/browse/SOLR-1972?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13255730#comment-13255730
 ] 

Shawn Heisey commented on SOLR-1972:


Purely hypothetical stuff, probably way beyond my skills: Would it be possible 
(and useful) to use a Lucene index (RAMDirectory maybe) to store the query time 
data for performance reasons, or is the current array implementation good 
enough?


 Need additional query stats in admin interface - median, 95th and 99th 
 percentile
 -

 Key: SOLR-1972
 URL: https://issues.apache.org/jira/browse/SOLR-1972
 Project: Solr
  Issue Type: Improvement
Affects Versions: 1.4
Reporter: Shawn Heisey
Priority: Minor
 Attachments: SOLR-1972-branch3x-url_pattern.patch, 
 SOLR-1972-url_pattern.patch, SOLR-1972.patch, SOLR-1972.patch, 
 SOLR-1972.patch, SOLR-1972.patch, elyograg-1972-3.2.patch, 
 elyograg-1972-3.2.patch, elyograg-1972-trunk.patch, elyograg-1972-trunk.patch


 I would like to see more detailed query statistics from the admin GUI.  This 
 is what you can get now:
 requests : 809
 errors : 0
 timeouts : 0
 totalTime : 70053
 avgTimePerRequest : 86.59209
 avgRequestsPerSecond : 0.8148785 
 I'd like to see more data on the time per request - median, 95th percentile, 
 99th percentile, and any other statistical function that makes sense to 
 include.  In my environment, the first bunch of queries after startup tend to 
 take several seconds each.  I find that the average value tends to be useless 
 until it has several thousand queries under its belt and the caches are 
 thoroughly warmed.  The statistical functions I have mentioned would quickly 
 eliminate the influence of those initial slow queries.
 The system will have to store individual data about each query.  I don't know 
 if this is something Solr does already.  It would be nice to have a 
 configurable count of how many of the most recent data points are kept, to 
 control the amount of memory the feature uses.  The default value could be 
 something like 1024 or 4096.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira



-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org

[jira] [Commented] (SOLR-3362) FacetComponent throws NPE when doing distributed query

2012-04-17 Thread Yonik Seeley (Commented) (JIRA)


[ 
https://issues.apache.org/jira/browse/SOLR-3362?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13255746#comment-13255746
 ] 

Yonik Seeley commented on SOLR-3362:


Right.  I just checked something in for FacetComponent to log the term and 
other info.

 FacetComponent throws NPE when doing distributed query
 --

 Key: SOLR-3362
 URL: https://issues.apache.org/jira/browse/SOLR-3362
 Project: Solr
  Issue Type: Bug
  Components: SolrCloud
Affects Versions: 4.0
 Environment: RHEL 
 lucene svn revision 1308309
Reporter: Jamie Johnson

 When executing a query against a field in my index I am getting the following 
 exception
 The query I am executing is as follows:
 http://host:port/solr/collection1/select?q=bobfacet=truefacet.field=organization
 debugging the FacetComponent line 489 sfc is null
 SEVERE: java.lang.NullPointerException
at 
 org.apache.solr.handler.component.FacetComponent.refineFacets(FacetComponent.java:489)
at 
 org.apache.solr.handler.component.FacetComponent.handleResponses(FacetComponent.java:278)
at 
 org.apache.solr.handler.component.SearchHandler.handleRequestBody(SearchHandler.java:307)
at 
 org.apache.solr.handler.RequestHandlerBase.handleRequest(RequestHandlerBase.java:129)
at org.apache.solr.core.SolrCore.execute(SolrCore.java:1550)
at 
 org.apache.solr.servlet.SolrDispatchFilter.execute(SolrDispatchFilter.java:442)
at 
 org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:263)
at 
 org.eclipse.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1337)
at 
 org.eclipse.jetty.servlet.ServletHandler.doHandle(ServletHandler.java:484)
at 
 org.eclipse.jetty.server.handler.ScopedHandler.handle(ScopedHandler.java:119)
at 
 org.eclipse.jetty.security.SecurityHandler.handle(SecurityHandler.java:524)
at 
 org.eclipse.jetty.server.session.SessionHandler.doHandle(SessionHandler.java:233)
at 
 org.eclipse.jetty.server.handler.ContextHandler.doHandle(ContextHandler.java:1065)
at 
 org.eclipse.jetty.servlet.ServletHandler.doScope(ServletHandler.java:413)
at 
 org.eclipse.jetty.server.session.SessionHandler.doScope(SessionHandler.java:192)
at 
 org.eclipse.jetty.server.handler.ContextHandler.doScope(ContextHandler.java:999)
at 
 org.eclipse.jetty.server.handler.ScopedHandler.handle(ScopedHandler.java:117)
at 
 org.eclipse.jetty.server.handler.ContextHandlerCollection.handle(ContextHandlerCollection.java:250)
at 
 org.eclipse.jetty.server.handler.HandlerCollection.handle(HandlerCollection.java:149)
at 
 org.eclipse.jetty.server.handler.HandlerWrapper.handle(HandlerWrapper.java:111)
at org.eclipse.jetty.server.Server.handle(Server.java:351)
at 
 org.eclipse.jetty.server.AbstractHttpConnection.handleRequest(AbstractHttpConnection.java:454)
at 
 org.eclipse.jetty.server.BlockingHttpConnection.handleRequest(BlockingHttpConnection.java:47)
at 
 org.eclipse.jetty.server.AbstractHttpConnection.headerComplete(AbstractHttpConnection.java:890)
at 
 org.eclipse.jetty.server.AbstractHttpConnection$RequestHandler.headerComplete(AbstractHttpConnection.java:944)
at org.eclipse.jetty.http.HttpParser.parseNext(HttpParser.java:634)
at 
 org.eclipse.jetty.http.HttpParser.parseAvailable(HttpParser.java:230)
at 
 org.eclipse.jetty.server.BlockingHttpConnection.handle(BlockingHttpConnection.java:66)
at 
 org.eclipse.jetty.server.bio.SocketConnector$ConnectorEndPoint.run(SocketConnector.java:254)
at 
 org.eclipse.jetty.util.thread.QueuedThreadPool.runJob(QueuedThreadPool.java:599)
at 
 org.eclipse.jetty.util.thread.QueuedThreadPool$3.run(QueuedThreadPool.java:534)
at java.lang.Thread.run(Thread.java:662)

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira



-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org

[jira] [Created] (SOLR-3366) Restart of Solr during data import causes an empty index to be generated on restart

2012-04-17 Thread Kevin Osborn (Created) (JIRA)

Restart of Solr during data import causes an empty index to be generated on 
restart
---

 Key: SOLR-3366
 URL: https://issues.apache.org/jira/browse/SOLR-3366
 Project: Solr
  Issue Type: Bug
  Components: contrib - DataImportHandler, replication (java)
Affects Versions: 3.4
Reporter: Kevin Osborn


We use the DataImportHandler and Java replication in a fairly simple setup of a 
single master and 4 slaves. We had an operating index of about 16,000 
documents. The DataImportHandler is pulled periodically by an external service 
using the command=full-importclean=false command for a delta import.

While processing one of these commands, we did a deployment which required us 
to restart the application server (Tomcat 7). So, the import was interrupted. 
Prior to this deployment, the full index of 16,000 documents had been 
replicated to all slaves and was working correctly.

Upon restart, the master restarted with an empty index and then this empty 
index was replicated across all slaves. So, our search index was now empty.

My expected behavior was to lose any changes in the delta import (basically 
prior to the commit). However, I was not expecting to lose all data. Perhaps 
this is due to the fact that I am using the full-import method, even though it 
is really a delta, for performance reasons? Or does the data import just put 
the index in some sort of invalid state?

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira



-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org

[jira] [Updated] (SOLR-3358) Capture Logging Events from JUL and Log4j

2012-04-17 Thread Ryan McKinley (Updated) (JIRA)


 [ 
https://issues.apache.org/jira/browse/SOLR-3358?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ryan McKinley updated SOLR-3358:


Component/s: (was: web gui)
 Issue Type: New Feature  (was: Bug)
Summary: Capture Logging Events from JUL and Log4j  (was: Show Logging 
Events in Admin UI)

 Capture Logging Events from JUL and Log4j
 -

 Key: SOLR-3358
 URL: https://issues.apache.org/jira/browse/SOLR-3358
 Project: Solr
  Issue Type: New Feature
Reporter: Ryan McKinley
Assignee: Ryan McKinley
 Attachments: SOLR-3358-logging.patch, SOLR-3358-logging.patch


 The UI should be able to show the last few log messages.  To support this, we 
 will need to register an Appender (log4j) or Handler
 (JUL) and keep a buffer of recent log events.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira



-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org

[jira] [Resolved] (SOLR-3358) Capture Logging Events from JUL and Log4j

2012-04-17 Thread Ryan McKinley (Resolved) (JIRA)


 [ 
https://issues.apache.org/jira/browse/SOLR-3358?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ryan McKinley resolved SOLR-3358.
-

Resolution: Fixed

Added key infrastructure in r1327210. 

I will make new issues for ongoing work

 Capture Logging Events from JUL and Log4j
 -

 Key: SOLR-3358
 URL: https://issues.apache.org/jira/browse/SOLR-3358
 Project: Solr
  Issue Type: New Feature
Reporter: Ryan McKinley
Assignee: Ryan McKinley
 Attachments: SOLR-3358-logging.patch, SOLR-3358-logging.patch


 The UI should be able to show the last few log messages.  To support this, we 
 will need to register an Appender (log4j) or Handler
 (JUL) and keep a buffer of recent log events.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira



-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org

[jira] [Commented] (LUCENE-3994) some nightly tests take hours

2012-04-17 Thread Dawid Weiss (Commented) (JIRA)


[ 
https://issues.apache.org/jira/browse/LUCENE-3994?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13255790#comment-13255790
 ] 

Dawid Weiss commented on LUCENE-3994:
-

You could also update statistics -- remove the previous ones and run two three 
times, then update.

Alternatively, we could have jenkins update stats and fetch these from time to 
time.

 some nightly tests take hours
 -

 Key: LUCENE-3994
 URL: https://issues.apache.org/jira/browse/LUCENE-3994
 Project: Lucene - Java
  Issue Type: Bug
  Components: general/build
Affects Versions: 4.0
Reporter: Robert Muir
 Fix For: 4.0

 Attachments: LUCENE-3994.patch


 The nightly builds are taking 4-7 hours.
 This is caused by a few bad apples (can be seen 
 https://builds.apache.org/job/Lucene-trunk/1896/testReport/).
 The top 5 are (all in analysis):
 * TestSynonymMapFilter: 1 hr 54 min
 * TestRandomChains: 1 hr 22 min
 * TestRemoveDuplicatesTokenFilter: 32 min
 * TestMappingCharFilter: 28 min
 * TestWordDelimiterFilter: 22 min
 so thats 4.5 hours right there for that run

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira



-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org

[jira] [Resolved] (SOLR-178) add simple servlet displaying last N severe/warning logs

2012-04-17 Thread Ryan McKinley (Resolved) (JIRA)


 [ 
https://issues.apache.org/jira/browse/SOLR-178?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ryan McKinley resolved SOLR-178.


Resolution: Duplicate

This is implemented by SOLR-3358

 add simple servlet displaying last N severe/warning logs
 

 Key: SOLR-178
 URL: https://issues.apache.org/jira/browse/SOLR-178
 Project: Solr
  Issue Type: Wish
  Components: web gui
Reporter: Hoss Man

 From a discussion of masked errors when parsing hte schema/solrconfig...
 http://www.nabble.com/merely-a-suggestion%3A-schema.xml-validator-or-better-schema-validation-logging-tf3331929.html
 i've been thinking a Servlet that didn't depend on any special Solr code
 (so it will work even if SolrCore isn't initialized) but registeres a log
 handler and records the last N messages from Solr above a certain level
 would be handy to refer people to when they are having issues and aren't
 overly comfortable with log files.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira



-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org

[jira] [Created] (SOLR-3367) Show Logging Events in Admin UI

2012-04-17 Thread Ryan McKinley (Created) (JIRA)

Show Logging Events in Admin UI
---

 Key: SOLR-3367
 URL: https://issues.apache.org/jira/browse/SOLR-3367
 Project: Solr
  Issue Type: New Feature
Reporter: Ryan McKinley
 Fix For: 4.0


We can show logging events in the Admin UI.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira



-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org

[jira] [Commented] (LUCENE-3994) some nightly tests take hours


[ 
https://issues.apache.org/jira/browse/LUCENE-3994?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13255794#comment-13255794
 ] 

Robert Muir commented on LUCENE-3994:
-

I think statistics are mostly useless for nightly builds: since we pass huge 
multipliers and such?

If anything, this issue did more for the stats than any stats update could do, 
as these tests
now grow linearly instead of quadratically with the multiplier...

 some nightly tests take hours
 -

 Key: LUCENE-3994
 URL: https://issues.apache.org/jira/browse/LUCENE-3994
 Project: Lucene - Java
  Issue Type: Bug
  Components: general/build
Affects Versions: 4.0
Reporter: Robert Muir
 Fix For: 4.0

 Attachments: LUCENE-3994.patch


 The nightly builds are taking 4-7 hours.
 This is caused by a few bad apples (can be seen 
 https://builds.apache.org/job/Lucene-trunk/1896/testReport/).
 The top 5 are (all in analysis):
 * TestSynonymMapFilter: 1 hr 54 min
 * TestRandomChains: 1 hr 22 min
 * TestRemoveDuplicatesTokenFilter: 32 min
 * TestMappingCharFilter: 28 min
 * TestWordDelimiterFilter: 22 min
 so thats 4.5 hours right there for that run

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira



-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org

[jira] [Created] (LUCENENET-485) IndexOutOfRangeException in FrenchStemmer

2012-04-17 Thread Christopher Currens (Created) (JIRA)

IndexOutOfRangeException in FrenchStemmer
-

 Key: LUCENENET-485
 URL: https://issues.apache.org/jira/browse/LUCENENET-485
 Project: Lucene.Net
  Issue Type: Bug
  Components: Lucene.Net Contrib
Affects Versions: Lucene.Net 3.0.3
Reporter: Christopher Currens
 Fix For: Lucene.Net 3.0.3


{quote}
Hi list,

I am not sure how to report bugs, or even if anybody is interested in bug 
reports. However, I have been playing with lucene lately, and found out an 
implementation bug in the Frenchstemmer 
(/src/contrib/Analyzers/Fr/FrenchStemmer.cs). Whenever I tried to add a new 
document to an index, I got an index out of range error. So I looked at the 
code and fixed that issue: see my diff file attached.

Please note that I also changed a few funky characters to unicode notation. The 
code worked well with the funky characters, but I think it just looks better 
with the \uxxx bits...

Anyways, the important bits is the replacement of a couple of sb.Insert by 
sb.Append.

I hope this helps.

Cheers,
Sylvain
{quote}

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

Re: Buggy FrenchStemmer in trunk

2012-04-17 Thread Christopher Currens

Thanks, Sylvain, for the patch.

You can create issues at our issue tracker for Lucene.Net.
https://issues.apache.org/jira/browse/LUCENENET

You have a to create an account first to do it, but I've added your issue
here:  https://issues.apache.org/jira/browse/LUCENENET-485 and assigned it
to the 3.0.3 release.  If you decide to create an account, you can watch
the issue's progress, and I can assign you as the reporter, instead of me.


Thanks,
Christopher

On Sun, Apr 15, 2012 at 1:42 AM, Sylvain Rouillard sylv...@sylonline.bizwrote:

 Hi list,

 I am not sure how to report bugs, or even if anybody is interested in bug
 reports. However, I have been playing with lucene lately, and found out an
 implementation bug in the Frenchstemmer 
 (/src/contrib/Analyzers/Fr/**FrenchStemmer.cs).
 Whenever I tried to add a new document to an index, I got an index out of
 range error. So I looked at the code and fixed that issue: see my diff file
 attached.

 Please note that I also changed a few funky characters to unicode
 notation. The code worked well with the funky characters, but I think it
 just looks better with the \uxxx bits...

 Anyways, the important bits is the replacement of a couple of sb.Insert by
 sb.Append.

 I hope this helps.

 Cheers,
 Sylvain

Re: Problem running all of a module's tests under IntelliJ: Wrong test finished.

No, I don't use IntelliJ. I also don't know how they run their tests
but I suspect they use some hackish way to plug into junit runner (a
non-standard listener or something)?

My syspicion is that they pass Description objects as filters and
expect identical Description objects to appear on the listener's
output. This doesn't need to be the case and is not a contract
anywhere. Hard to tell, really.

I files an issue for this -
https://github.com/carrotsearch/randomizedtesting/issues/83

Dawid

On Tue, Apr 17, 2012 at 6:56 PM, Steven A Rowe sar...@syr.edu wrote:
 JetBrains has 7 issues in their issue tracker for IntelliJ IDEA that mention 
 Wrong test finished. - all are marked closed  fixed.

 http://youtrack.jetbrains.com/issues/IDEA?q=%22Wrong+test+finished%22

 AFAICT, the problems mentioned in the above reports are of two main types:

 1. Problem running concurrent tests (IDEA-54745).
 2. Problem with exception thrown in @BeforeClass (IDEA-49505; IDEA-38287; 
 IDEA-36591)

 I tried adding -Dtests.jvms=1 to the run configuration for the 
 analyzers-common module's tests, and IntelliJ still had the same problem 
 (some tests didn't run  Wrong test finished), so I don't think the problem 
 is #1.

 Steve

 -Original Message-
 From: Steven A Rowe [mailto:sar...@syr.edu]
 Sent: Tuesday, April 17, 2012 12:05 PM
 To: dev@lucene.apache.org
 Subject: Problem running all of a module's tests under IntelliJ: Wrong test 
 finished.

 Hi Dawid :)

 Do you use IntelliJ?  There appears to be some form of bad interaction 
 between the new RandomizedTesting library additions and IntelliJ's test 
 runner.

 When I try to run all of an IntelliJ module's tests under IntelliJ, e.g. 
 analyzers-common or lucene (including core and test-framework), not all tests 
 run; those that don't run are reported as not started.  The external test 
 process reports Wrong test finished. (???) and then returns exit code -1.

 This behavior is relatively new - I don't think the modules/*-lucene/ move 
 is the culprit (the IntelliJ lucene+test-framework module didn't move and it 
 has this issue).

 Here's the output from running all analyzers-common tests:

 --
 C:\Program Files\Java\jdk1.6.0_21\bin\java -ea -DtempDir=temp 
 -Didea.launcher.port=7541 -Didea.launcher.bin.path=C:\Program Files 
 (x86)\JetBrains\IntelliJ IDEA 11.1\bin -Dfile.encoding=UTF-8 -classpath 
 C:\Program Files (x86)\JetBrains\IntelliJ IDEA 
 11.1\lib\idea_rt.jar;C:\Program Files (x86)\JetBrains\IntelliJ IDEA 
 11.1\plugins\junit\lib\junit-rt.jar;C:\Program 
 Files\Java\jdk1.6.0_21\jre\lib\alt-rt.jar;C:\Program 
 Files\Java\jdk1.6.0_21\jre\lib\charsets.jar;C:\Program 
 Files\Java\jdk1.6.0_21\jre\lib\deploy.jar;C:\Program 
 Files\Java\jdk1.6.0_21\jre\lib\javaws.jar;C:\Program 
 Files\Java\jdk1.6.0_21\jre\lib\jce.jar;C:\Program 
 Files\Java\jdk1.6.0_21\jre\lib\jsse.jar;C:\Program 
 Files\Java\jdk1.6.0_21\jre\lib\management-agent.jar;C:\Program 
 Files\Java\jdk1.6.0_21\jre\lib\plugin.jar;C:\Program 
 Files\Java\jdk1.6.0_21\jre\lib\resources.jar;C:\Program 
 Files\Java\jdk1.6.0_21\jre\lib\rt.jar;C:\Program 
 Files\Java\jdk1.6.0_21\jre\lib\ext\dnsns.jar;C:\Program 
 Files\Java\jdk1.6.0_21\jre\lib\ext\localedata.jar;C:\Program 
 Files\Java\jdk1.6.0_21\jre\lib\ext\sunjce_provider.jar;C:\svn\lucene\dev\trunk\lucene\build\analysis\analyzers-common\classes\test;C:\svn\lucene\dev\trunk\lucene\build\analysis\analyzers-common\classes\java;C:\svn\lucene\dev\trunk\lucene\test-framework\lib\junit-4.10.jar;C:\svn\lucene\dev\trunk\lucene\test-framework\lib\randomizedtesting-runner-1.2.0.jar;C:\svn\lucene\dev\trunk\lucene\build\lucene-idea\classes\test;C:\svn\lucene\dev\trunk\lucene\build\lucene-idea\classes\java;C:\svn\lucene\dev\trunk\lucene\test-framework\lib\ant-1.7.1.jar;C:\svn\lucene\dev\trunk\lucene\test-framework\lib\ant-junit-1.7.1.jar
  com.intellij.rt.execution.application.AppMain 
 com.intellij.rt.execution.junit.JUnitStarter -ideVersion5 
 @C:\Users\sarowe\AppData\Local\Temp\idea_junit3377604973713774012.tmp 
 -socket53790

 Test '.default package.WordBreakTestUnicode_6_0_0' ignored Test 
 'org.apache.lucene.analysis.pattern.TestPatternReplaceCharFilter.testNastyPattern'
  ignored

 Wrong test finished. Last started: [] stopped: 
 testNastyPattern(org.apache.lucene.analysis.pattern.TestPatternReplaceCharFilter);
  class org.junit.runner.Description

 Process finished with exit code -1
 --


 Steve

 -
 To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional 
 commands, e-mail: dev-h...@lucene.apache.org


 -
 To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
 For additional commands, e-mail: dev-h...@lucene.apache.org


-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail:

[jira] [Updated] (LUCENENET-485) IndexOutOfRangeException in FrenchStemmer

2012-04-17 Thread Christopher Currens (Updated) (JIRA)

[
https://issues.apache.org/jira/browse/LUCENENET-485?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]

Christopher Currens updated LUCENENET-485:
--

Attachment: tt.diff

The diff doesn't specify files, but should be easy to figure out.

IndexOutOfRangeException in FrenchStemmer
-

Attachments: tt.diff

[jira] [Updated] (SOLR-3367) Show Logging Events in Admin UI

2012-04-17 Thread Ryan McKinley (Updated) (JIRA)


 [ 
https://issues.apache.org/jira/browse/SOLR-3367?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ryan McKinley updated SOLR-3367:


Component/s: web gui

 Show Logging Events in Admin UI
 ---

 Key: SOLR-3367
 URL: https://issues.apache.org/jira/browse/SOLR-3367
 Project: Solr
  Issue Type: New Feature
  Components: web gui
Reporter: Ryan McKinley
 Fix For: 4.0


 We can show logging events in the Admin UI.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira



-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org

[jira] [Commented] (LUCENE-3994) some nightly tests take hours

2012-04-17 Thread Dawid Weiss (Commented) (JIRA)


[ 
https://issues.apache.org/jira/browse/LUCENE-3994?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13255803#comment-13255803
 ] 

Dawid Weiss commented on LUCENE-3994:
-

Ok. I'll recalculate them from time to time. There is a large variance in tests 
anyway (this can also be computed from log stats because we can keep a history 
of N runs... it'd be interesting to see which tests have the largest variance).

 some nightly tests take hours
 -

 Key: LUCENE-3994
 URL: https://issues.apache.org/jira/browse/LUCENE-3994
 Project: Lucene - Java
  Issue Type: Bug
  Components: general/build
Affects Versions: 4.0
Reporter: Robert Muir
 Fix For: 4.0

 Attachments: LUCENE-3994.patch


 The nightly builds are taking 4-7 hours.
 This is caused by a few bad apples (can be seen 
 https://builds.apache.org/job/Lucene-trunk/1896/testReport/).
 The top 5 are (all in analysis):
 * TestSynonymMapFilter: 1 hr 54 min
 * TestRandomChains: 1 hr 22 min
 * TestRemoveDuplicatesTokenFilter: 32 min
 * TestMappingCharFilter: 28 min
 * TestWordDelimiterFilter: 22 min
 so thats 4.5 hours right there for that run

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira



-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org

[jira] [Commented] (LUCENE-3994) some nightly tests take hours


[ 
https://issues.apache.org/jira/browse/LUCENE-3994?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13255805#comment-13255805
 ] 

Robert Muir commented on LUCENE-3994:
-

Another thing i toned down here was the multithreaded testing in 
basetokenstreamtestcase,
there is something os-specific about freebsd's java that causes this to take a 
lot more time
than locally... thats why analysis tests take so long in nightly builds 
(especially with the n^2!)

 some nightly tests take hours
 -

 Key: LUCENE-3994
 URL: https://issues.apache.org/jira/browse/LUCENE-3994
 Project: Lucene - Java
  Issue Type: Bug
  Components: general/build
Affects Versions: 4.0
Reporter: Robert Muir
 Fix For: 4.0

 Attachments: LUCENE-3994.patch


 The nightly builds are taking 4-7 hours.
 This is caused by a few bad apples (can be seen 
 https://builds.apache.org/job/Lucene-trunk/1896/testReport/).
 The top 5 are (all in analysis):
 * TestSynonymMapFilter: 1 hr 54 min
 * TestRandomChains: 1 hr 22 min
 * TestRemoveDuplicatesTokenFilter: 32 min
 * TestMappingCharFilter: 28 min
 * TestWordDelimiterFilter: 22 min
 so thats 4.5 hours right there for that run

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira



-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org

Upgrading Java on my Mac and IntelliJ

2012-04-17 Thread Erick Erickson

An interesting thing happened on my Mac when I upgraded to: version
1.6.0_31. I had both the 1.5 and 1.6 SDKs set up so I could
compile/run Solr under the correct versions depending upon whether I
was on 3.x or trunk. I did nothing special during the install, but at
the end 1.5 was no longer available in IntelliJ. Of course I could set
the compatibility level to 1.5, but there's no 1.5 SDK choice any
more.

I assume there's some setting somewhere that I could pick, but frankly
unless and until I have to do more 3.x work I'm not going to bother
looking.

Just in case anyone else is upgrading and wonders if they're the only
ones who are lucky..

Erick

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org

[jira] [Created] (SOLR-3368) Index Logging Events

2012-04-17 Thread Ryan McKinley (Created) (JIRA)

Index Logging Events


 Key: SOLR-3368
 URL: https://issues.apache.org/jira/browse/SOLR-3368
 Project: Solr
  Issue Type: New Feature
Reporter: Ryan McKinley


In SOLR-3358, we capture logging events and hold them in memory.  To support 
search and longer history, we could optionally store the events in a solr index

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira



-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org

[jira] [Commented] (LUCENE-3994) some nightly tests take hours


[ 
https://issues.apache.org/jira/browse/LUCENE-3994?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13255814#comment-13255814
 ] 

Robert Muir commented on LUCENE-3994:
-

{quote}
There is a large variance in tests anyway
{quote}

Like this? :)

https://builds.apache.org/job/Lucene-trunk/1896/testReport/org.apache.lucene.index/TestIndexWriterReader/history/

 some nightly tests take hours
 -

 Key: LUCENE-3994
 URL: https://issues.apache.org/jira/browse/LUCENE-3994
 Project: Lucene - Java
  Issue Type: Bug
  Components: general/build
Affects Versions: 4.0
Reporter: Robert Muir
 Fix For: 4.0

 Attachments: LUCENE-3994.patch


 The nightly builds are taking 4-7 hours.
 This is caused by a few bad apples (can be seen 
 https://builds.apache.org/job/Lucene-trunk/1896/testReport/).
 The top 5 are (all in analysis):
 * TestSynonymMapFilter: 1 hr 54 min
 * TestRandomChains: 1 hr 22 min
 * TestRemoveDuplicatesTokenFilter: 32 min
 * TestMappingCharFilter: 28 min
 * TestWordDelimiterFilter: 22 min
 so thats 4.5 hours right there for that run

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira



-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org

Re: Wildcard queries are not analyzed

2012-04-17 Thread Christopher Currens

Thanks Björn.

So I've compared the code with the java equivalent, and the result from
java, via running the analyzer in the QueryParser:

Field:björ*

So, it seems to have the same behavior in Java as well.  I want to see if
this is a known issue or expected behavior in java, and go from there.  If
it is, can anyone think of any unexpected side effects to fixing this, so
björ* becomes bjor*?


Thanks,
Christopher


2012/4/17 Björn Kremer b...@patorg.de

 Hello,


 maybe I have found a little lucene problem: Wildcard queries are not
 analyzed correctly. I'm using the german analyzer with the
 'GermanDIN2Stemmer'.

 In the lucene-index my name('Björn') is stored as 'bjorn'. If I performe a
 wildcard query like 'björ*' the function 'GetPrefixQuery' does not analyze
 the search term. So the query result is 'björ*' instead of 'bjor*'. (björ*
 = no match, bjor* = match)


 Thank You
 Björn

Re: Wildcard queries are not analyzed

2012-04-17 Thread Christopher Currens

I should also add, that directly reading the token stream, will produce
bjor (no wildcard) from björ*.

Björn,

It would be great to see some example code that you're using to reproduce
this behavior, just to make sure we're testing it in the same way.  Also,
could I persuade you to create an issue for this here:
https://issues.apache.org/jira/browse/LUCENENET, so that we can keep track
of the progress on it?

Thanks,
Christopher

On Tue, Apr 17, 2012 at 11:34 AM, Christopher Currens 
currens.ch...@gmail.com wrote:

 Thanks Björn.

 So I've compared the code with the java equivalent, and the result from
 java, via running the analyzer in the QueryParser:

 Field:björ*

 So, it seems to have the same behavior in Java as well.  I want to see if
 this is a known issue or expected behavior in java, and go from there.  If
 it is, can anyone think of any unexpected side effects to fixing this, so
 björ* becomes bjor*?


 Thanks,
 Christopher


 2012/4/17 Björn Kremer b...@patorg.de

 Hello,


 maybe I have found a little lucene problem: Wildcard queries are not
 analyzed correctly. I'm using the german analyzer with the
 'GermanDIN2Stemmer'.

 In the lucene-index my name('Björn') is stored as 'bjorn'. If I performe
 a wildcard query like 'björ*' the function 'GetPrefixQuery' does not
 analyze the search term. So the query result is 'björ*' instead of 'bjor*'.
 (björ* = no match, bjor* = match)


 Thank You
 Björn

[jira] [Commented] (SOLR-3366) Restart of Solr during data import causes an empty index to be generated on restart

2012-04-17 Thread James Dyer (Commented) (JIRA)


[ 
https://issues.apache.org/jira/browse/SOLR-3366?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13255827#comment-13255827
 ] 

James Dyer commented on SOLR-3366:
--

I don't see how this would be related to DIH.  Even if you had clean=true, it 
doesn't commit the deletes until the entire update is complete.  So, like you 
say, we should expect to only lose the changes from the current import, not the 
entire index.

I wonder if this is a side-effect from using replication.  Sometimes, 
replication copies an entire new index to the slaves in a new directory, then 
writes this new directory to index.properties.  On restart solr looks for 
index.properties to find the appropriate index directory.  If this file had 
been touched or removed, possibly it restarted and didn't find the correct 
directory, then created a new index?  Of course, this would have affected the 
slaves only.

I vaguely remember there being a bug some releases back where index corruption 
could occur if the system is ungracefully shut down, and I see you're on 3.4.  
But then again, maybe my memory is failing me because I didn't see this in the 
release notes.

 Restart of Solr during data import causes an empty index to be generated on 
 restart
 ---

 Key: SOLR-3366
 URL: https://issues.apache.org/jira/browse/SOLR-3366
 Project: Solr
  Issue Type: Bug
  Components: contrib - DataImportHandler, replication (java)
Affects Versions: 3.4
Reporter: Kevin Osborn

 We use the DataImportHandler and Java replication in a fairly simple setup of 
 a single master and 4 slaves. We had an operating index of about 16,000 
 documents. The DataImportHandler is pulled periodically by an external 
 service using the command=full-importclean=false command for a delta 
 import.
 While processing one of these commands, we did a deployment which required us 
 to restart the application server (Tomcat 7). So, the import was interrupted. 
 Prior to this deployment, the full index of 16,000 documents had been 
 replicated to all slaves and was working correctly.
 Upon restart, the master restarted with an empty index and then this empty 
 index was replicated across all slaves. So, our search index was now empty.
 My expected behavior was to lose any changes in the delta import (basically 
 prior to the commit). However, I was not expecting to lose all data. Perhaps 
 this is due to the fact that I am using the full-import method, even though 
 it is really a delta, for performance reasons? Or does the data import just 
 put the index in some sort of invalid state?

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira



-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org

Re: (LUCENE-3994) some nightly tests take hours

 Like this? :)

 https://builds.apache.org/job/Lucene-trunk/1896/testReport/org.apache.lucene.index/TestIndexWriterReader/history/

If it's correlated with the commit of RandomizedRunner then I'd check
if it's not that asserting Random instance or random() being
repeatedly called gazillion times. Like I said -- there is an extra
cost for these assertions so in tight loops (where there is no
possibility of Random escaping to another thread, etc.), I'd just
create a simple Random rnd = new Random(random().nextLong()).

Dawid

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org

Re: (LUCENE-3994) some nightly tests take hours

I haven't looked, but i seriously doubt thats responsible at all.
Thats just an example crazy one.

More likely it got SimpleText codec :)

On Tue, Apr 17, 2012 at 3:49 PM, Dawid Weiss
dawid.we...@cs.put.poznan.pl wrote:
 Like this? :)

 https://builds.apache.org/job/Lucene-trunk/1896/testReport/org.apache.lucene.index/TestIndexWriterReader/history/

 If it's correlated with the commit of RandomizedRunner then I'd check
 if it's not that asserting Random instance or random() being
 repeatedly called gazillion times. Like I said -- there is an extra
 cost for these assertions so in tight loops (where there is no
 possibility of Random escaping to another thread, etc.), I'd just
 create a simple Random rnd = new Random(random().nextLong()).

 Dawid

 -
 To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
 For additional commands, e-mail: dev-h...@lucene.apache.org




-- 
lucidimagination.com

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org

[jira] [Commented] (SOLR-1972) Need additional query stats in admin interface - median, 95th and 99th percentile

2012-04-17 Thread Alan Woodward (Commented) (JIRA)


[ 
https://issues.apache.org/jira/browse/SOLR-1972?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13255895#comment-13255895
 ] 

Alan Woodward commented on SOLR-1972:
-

I wonder if Coda Hale's Metrics library might be worth using here?

http://metrics.codahale.com/

It already deals with rolling updates, and can expose measurements through JMX 
beans.

 Need additional query stats in admin interface - median, 95th and 99th 
 percentile
 -

 Key: SOLR-1972
 URL: https://issues.apache.org/jira/browse/SOLR-1972
 Project: Solr
  Issue Type: Improvement
Affects Versions: 1.4
Reporter: Shawn Heisey
Priority: Minor
 Attachments: SOLR-1972-branch3x-url_pattern.patch, 
 SOLR-1972-url_pattern.patch, SOLR-1972.patch, SOLR-1972.patch, 
 SOLR-1972.patch, SOLR-1972.patch, elyograg-1972-3.2.patch, 
 elyograg-1972-3.2.patch, elyograg-1972-trunk.patch, elyograg-1972-trunk.patch


 I would like to see more detailed query statistics from the admin GUI.  This 
 is what you can get now:
 requests : 809
 errors : 0
 timeouts : 0
 totalTime : 70053
 avgTimePerRequest : 86.59209
 avgRequestsPerSecond : 0.8148785 
 I'd like to see more data on the time per request - median, 95th percentile, 
 99th percentile, and any other statistical function that makes sense to 
 include.  In my environment, the first bunch of queries after startup tend to 
 take several seconds each.  I find that the average value tends to be useless 
 until it has several thousand queries under its belt and the caches are 
 thoroughly warmed.  The statistical functions I have mentioned would quickly 
 eliminate the influence of those initial slow queries.
 The system will have to store individual data about each query.  I don't know 
 if this is something Solr does already.  It would be nice to have a 
 configurable count of how many of the most recent data points are kept, to 
 control the amount of memory the feature uses.  The default value could be 
 something like 1024 or 4096.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira



-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org

question about seed

I see: [junit4] JUnit4 says hello. Random seed: A773AE0846178A0

at the start of the JVM. does this mean that all tests run with the
same initial seed?

For example looking at tests, it seems it does, e.g. in a test run all
my tests will run with the same codec for example.
(see https://builds.apache.org/job/Lucene-trunk/1897/console, many
assumes for PreFlex)

From a coverage perspective, if this is the case, its not good because
it means we don't get good mixed coverage of the different codecs...
So you could easily make a change that breaks some codec horribly and
not know, unless you do many 'ant test' runs.

Can we fix the random to maybe initialize based on this seed + a hash
of the class name itself?

-- 
lucidimagination.com

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org

Re: question about seed

Here's an example demonstrating what i mean. enable the SOP with
-Dtests.jvms=1 (like hudson does) and tail -f the tests log

then make the change in how we compute 'randomVal' for codec selection
and run again:

Index: lucene/test-framework/src/java/org/apache/lucene/util/LuceneTestCase.java
===
--- lucene/test-framework/src/java/org/apache/lucene/util/LuceneTestCase.java   
(revision
1327235)
+++ lucene/test-framework/src/java/org/apache/lucene/util/LuceneTestCase.java   
(working
copy)
@@ -405,7 +405,7 @@
 PREFLEX_IMPERSONATION_IS_ACTIVE = false;
 savedCodec = Codec.getDefault();
 final Codec codec;
-int randomVal = random().nextInt(10);
+int randomVal = new Random(random().nextLong() ^
getTestClass().getSimpleName().hashCode()).nextInt(10);

 if (Lucene3x.equals(TEST_CODEC) || (random.equals(TEST_CODEC)
 randomVal  2)) { // preflex-only setup
   codec = Codec.forName(Lucene3x);
@@ -436,6 +436,7 @@
 }

 Codec.setDefault(codec);
+System.out.println(codec= + codec);

 savedLocale = Locale.getDefault();


On Tue, Apr 17, 2012 at 4:13 PM, Robert Muir rcm...@gmail.com wrote:
 I see: [junit4] JUnit4 says hello. Random seed: A773AE0846178A0

 at the start of the JVM. does this mean that all tests run with the
 same initial seed?

 For example looking at tests, it seems it does, e.g. in a test run all
 my tests will run with the same codec for example.
 (see https://builds.apache.org/job/Lucene-trunk/1897/console, many
 assumes for PreFlex)

 From a coverage perspective, if this is the case, its not good because
 it means we don't get good mixed coverage of the different codecs...
 So you could easily make a change that breaks some codec horribly and
 not know, unless you do many 'ant test' runs.

 Can we fix the random to maybe initialize based on this seed + a hash
 of the class name itself?

 --
 lucidimagination.com



-- 
lucidimagination.com

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org

[jira] [Commented] (SOLR-3362) FacetComponent throws NPE when doing distributed query

2012-04-17 Thread Jamie Johnson (Commented) (JIRA)


[ 
https://issues.apache.org/jira/browse/SOLR-3362?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13255907#comment-13255907
 ] 

Jamie Johnson commented on SOLR-3362:
-

This certainly made the error I was having go away, should I be worried about a 
lower level issue that this change masks?

 FacetComponent throws NPE when doing distributed query
 --

 Key: SOLR-3362
 URL: https://issues.apache.org/jira/browse/SOLR-3362
 Project: Solr
  Issue Type: Bug
  Components: SolrCloud
Affects Versions: 4.0
 Environment: RHEL 
 lucene svn revision 1308309
Reporter: Jamie Johnson

 When executing a query against a field in my index I am getting the following 
 exception
 The query I am executing is as follows:
 http://host:port/solr/collection1/select?q=bobfacet=truefacet.field=organization
 debugging the FacetComponent line 489 sfc is null
 SEVERE: java.lang.NullPointerException
at 
 org.apache.solr.handler.component.FacetComponent.refineFacets(FacetComponent.java:489)
at 
 org.apache.solr.handler.component.FacetComponent.handleResponses(FacetComponent.java:278)
at 
 org.apache.solr.handler.component.SearchHandler.handleRequestBody(SearchHandler.java:307)
at 
 org.apache.solr.handler.RequestHandlerBase.handleRequest(RequestHandlerBase.java:129)
at org.apache.solr.core.SolrCore.execute(SolrCore.java:1550)
at 
 org.apache.solr.servlet.SolrDispatchFilter.execute(SolrDispatchFilter.java:442)
at 
 org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:263)
at 
 org.eclipse.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1337)
at 
 org.eclipse.jetty.servlet.ServletHandler.doHandle(ServletHandler.java:484)
at 
 org.eclipse.jetty.server.handler.ScopedHandler.handle(ScopedHandler.java:119)
at 
 org.eclipse.jetty.security.SecurityHandler.handle(SecurityHandler.java:524)
at 
 org.eclipse.jetty.server.session.SessionHandler.doHandle(SessionHandler.java:233)
at 
 org.eclipse.jetty.server.handler.ContextHandler.doHandle(ContextHandler.java:1065)
at 
 org.eclipse.jetty.servlet.ServletHandler.doScope(ServletHandler.java:413)
at 
 org.eclipse.jetty.server.session.SessionHandler.doScope(SessionHandler.java:192)
at 
 org.eclipse.jetty.server.handler.ContextHandler.doScope(ContextHandler.java:999)
at 
 org.eclipse.jetty.server.handler.ScopedHandler.handle(ScopedHandler.java:117)
at 
 org.eclipse.jetty.server.handler.ContextHandlerCollection.handle(ContextHandlerCollection.java:250)
at 
 org.eclipse.jetty.server.handler.HandlerCollection.handle(HandlerCollection.java:149)
at 
 org.eclipse.jetty.server.handler.HandlerWrapper.handle(HandlerWrapper.java:111)
at org.eclipse.jetty.server.Server.handle(Server.java:351)
at 
 org.eclipse.jetty.server.AbstractHttpConnection.handleRequest(AbstractHttpConnection.java:454)
at 
 org.eclipse.jetty.server.BlockingHttpConnection.handleRequest(BlockingHttpConnection.java:47)
at 
 org.eclipse.jetty.server.AbstractHttpConnection.headerComplete(AbstractHttpConnection.java:890)
at 
 org.eclipse.jetty.server.AbstractHttpConnection$RequestHandler.headerComplete(AbstractHttpConnection.java:944)
at org.eclipse.jetty.http.HttpParser.parseNext(HttpParser.java:634)
at 
 org.eclipse.jetty.http.HttpParser.parseAvailable(HttpParser.java:230)
at 
 org.eclipse.jetty.server.BlockingHttpConnection.handle(BlockingHttpConnection.java:66)
at 
 org.eclipse.jetty.server.bio.SocketConnector$ConnectorEndPoint.run(SocketConnector.java:254)
at 
 org.eclipse.jetty.util.thread.QueuedThreadPool.runJob(QueuedThreadPool.java:599)
at 
 org.eclipse.jetty.util.thread.QueuedThreadPool$3.run(QueuedThreadPool.java:534)
at java.lang.Thread.run(Thread.java:662)

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira



-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org

[jira] [Created] (LUCENE-3995) In LuceneTestCase.beforeClass, make a new random (also using the class hashcode) to vary defaults

2012-04-17 Thread Robert Muir (Created) (JIRA)

In LuceneTestCase.beforeClass, make a new random (also using the class 
hashcode) to vary defaults
-

 Key: LUCENE-3995
 URL: https://issues.apache.org/jira/browse/LUCENE-3995
 Project: Lucene - Java
  Issue Type: Improvement
  Components: general/test
Affects Versions: 4.0
Reporter: Robert Muir


In LuceneTestCase, we set many static defaults like:
* default codec
* default infostream impl
* default locale
* default timezone
* default similarity

Currently each test run gets a single seed for the run, which means for example 
across one test run
every single test will have say, SimpleText + infostream=off + Locale=german + 
timezone=EDT + similarity=BM25

Because of that, we lose lots of basic mixed coverage across tests, and it also 
means the unfortunate
individual who gets SimpleText or other slow options gets a REALLY SLOW test 
run, rather than amortizing
this across all test runs.

We should at least make a new random (getRandom() ^ className.hashCode()) to 
fix this so it works like before,
but unfortunately that only fixes it for LuceneTestCase.

Won't any subclasses that make random decisions in @BeforeClass (and we have 
many) still have the same problem?
Maybe RandomizedRunner can instead be improved here?


--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira



-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org

Re: (LUCENE-3994) some nightly tests take hours

2012-04-17 Thread Mark Miller (Issue Comment Edited) (JIRA)

Digging in a bit into the slow nightlies, I think this explains a lot
(see my other email: question about seed)

I opened https://issues.apache.org/jira/browse/LUCENE-3995, but
honestly i would really prefer if we didnt have to change our tests
here, if somehow, when we do getRandom() the class's hashcode or
something like that is xored with the seed to mix it up a bit, I think
it would be ok.

I can easily fix LuceneTestCase/SolrTestCaseJ4/etc but there are many
tests that do stuff in beforeClass (e.g. build a directory), so if all
tests are 'doing the same thing' we get less predictable test times,
and less test efficiency.

This is because there is redundancy in our tests (of course), which I
think is ok, actually good, as long as these defaults are varying
per-test-class such that we get wide coverage within a single 'ant
test'.

On Tue, Apr 17, 2012 at 3:53 PM, Robert Muir rcm...@gmail.com wrote:
 I haven't looked, but i seriously doubt thats responsible at all.
 Thats just an example crazy one.

 More likely it got SimpleText codec :)

 On Tue, Apr 17, 2012 at 3:49 PM, Dawid Weiss
 dawid.we...@cs.put.poznan.pl wrote:
 Like this? :)

 https://builds.apache.org/job/Lucene-trunk/1896/testReport/org.apache.lucene.index/TestIndexWriterReader/history/

 If it's correlated with the commit of RandomizedRunner then I'd check
 if it's not that asserting Random instance or random() being
 repeatedly called gazillion times. Like I said -- there is an extra
 cost for these assertions so in tight loops (where there is no
 possibility of Random escaping to another thread, etc.), I'd just
 create a simple Random rnd = new Random(random().nextLong()).

 Dawid

 -
 To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
 For additional commands, e-mail: dev-h...@lucene.apache.org




 --
 lucidimagination.com



-- 
lucidimagination.com

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org

[jira] [Commented] (SOLR-3362) FacetComponent throws NPE when doing distributed query

2012-04-17 Thread Yonik Seeley (Commented) (JIRA)


[ 
https://issues.apache.org/jira/browse/SOLR-3362?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13255915#comment-13255915
 ] 

Yonik Seeley commented on SOLR-3362:


bq. This certainly made the error I was having go away, should I be worried 
about a lower level issue that this change masks?

Yes, I'd be worried.  Do you see any Unexpected term returned errors in the 
logs?

 FacetComponent throws NPE when doing distributed query
 --

 Key: SOLR-3362
 URL: https://issues.apache.org/jira/browse/SOLR-3362
 Project: Solr
  Issue Type: Bug
  Components: SolrCloud
Affects Versions: 4.0
 Environment: RHEL 
 lucene svn revision 1308309
Reporter: Jamie Johnson

 When executing a query against a field in my index I am getting the following 
 exception
 The query I am executing is as follows:
 http://host:port/solr/collection1/select?q=bobfacet=truefacet.field=organization
 debugging the FacetComponent line 489 sfc is null
 SEVERE: java.lang.NullPointerException
at 
 org.apache.solr.handler.component.FacetComponent.refineFacets(FacetComponent.java:489)
at 
 org.apache.solr.handler.component.FacetComponent.handleResponses(FacetComponent.java:278)
at 
 org.apache.solr.handler.component.SearchHandler.handleRequestBody(SearchHandler.java:307)
at 
 org.apache.solr.handler.RequestHandlerBase.handleRequest(RequestHandlerBase.java:129)
at org.apache.solr.core.SolrCore.execute(SolrCore.java:1550)
at 
 org.apache.solr.servlet.SolrDispatchFilter.execute(SolrDispatchFilter.java:442)
at 
 org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:263)
at 
 org.eclipse.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1337)
at 
 org.eclipse.jetty.servlet.ServletHandler.doHandle(ServletHandler.java:484)
at 
 org.eclipse.jetty.server.handler.ScopedHandler.handle(ScopedHandler.java:119)
at 
 org.eclipse.jetty.security.SecurityHandler.handle(SecurityHandler.java:524)
at 
 org.eclipse.jetty.server.session.SessionHandler.doHandle(SessionHandler.java:233)
at 
 org.eclipse.jetty.server.handler.ContextHandler.doHandle(ContextHandler.java:1065)
at 
 org.eclipse.jetty.servlet.ServletHandler.doScope(ServletHandler.java:413)
at 
 org.eclipse.jetty.server.session.SessionHandler.doScope(SessionHandler.java:192)
at 
 org.eclipse.jetty.server.handler.ContextHandler.doScope(ContextHandler.java:999)
at 
 org.eclipse.jetty.server.handler.ScopedHandler.handle(ScopedHandler.java:117)
at 
 org.eclipse.jetty.server.handler.ContextHandlerCollection.handle(ContextHandlerCollection.java:250)
at 
 org.eclipse.jetty.server.handler.HandlerCollection.handle(HandlerCollection.java:149)
at 
 org.eclipse.jetty.server.handler.HandlerWrapper.handle(HandlerWrapper.java:111)
at org.eclipse.jetty.server.Server.handle(Server.java:351)
at 
 org.eclipse.jetty.server.AbstractHttpConnection.handleRequest(AbstractHttpConnection.java:454)
at 
 org.eclipse.jetty.server.BlockingHttpConnection.handleRequest(BlockingHttpConnection.java:47)
at 
 org.eclipse.jetty.server.AbstractHttpConnection.headerComplete(AbstractHttpConnection.java:890)
at 
 org.eclipse.jetty.server.AbstractHttpConnection$RequestHandler.headerComplete(AbstractHttpConnection.java:944)
at org.eclipse.jetty.http.HttpParser.parseNext(HttpParser.java:634)
at 
 org.eclipse.jetty.http.HttpParser.parseAvailable(HttpParser.java:230)
at 
 org.eclipse.jetty.server.BlockingHttpConnection.handle(BlockingHttpConnection.java:66)
at 
 org.eclipse.jetty.server.bio.SocketConnector$ConnectorEndPoint.run(SocketConnector.java:254)
at 
 org.eclipse.jetty.util.thread.QueuedThreadPool.runJob(QueuedThreadPool.java:599)
at 
 org.eclipse.jetty.util.thread.QueuedThreadPool$3.run(QueuedThreadPool.java:534)
at java.lang.Thread.run(Thread.java:662)

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira



-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org

[jira] [Commented] (SOLR-3174) Visualize Cluster State

2012-04-17 Thread Mark Miller (Commented) (JIRA)


[ 
https://issues.apache.org/jira/browse/SOLR-3174?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13255923#comment-13255923
 ] 

Mark Miller commented on SOLR-3174:
---

if not in live_nodes:
{noformat}
gone: gray
{noformat}

if in live_nodes:
{noformat}
active  : green
recovering  : yellow
down: orange
recovery_failed : red
{noformat}

 Visualize Cluster State
 ---

 Key: SOLR-3174
 URL: https://issues.apache.org/jira/browse/SOLR-3174
 Project: Solr
  Issue Type: New Feature
  Components: web gui
Reporter: Ryan McKinley
Assignee: Stefan Matheis (steffkes)
 Attachments: SOLR-3174-graph.png, SOLR-3174-graph.png, 
 SOLR-3174-rgraph.png, SOLR-3174-rgraph.png, SOLR-3174.patch, SOLR-3174.patch, 
 SOLR-3174.patch, SOLR-3174.patch


 It would be great to visualize the cluster state in the new UI. 
 See Mark's wish:
 https://issues.apache.org/jira/browse/SOLR-3162?focusedCommentId=13218272page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-13218272

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira



-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org

[jira] [Issue Comment Edited] (SOLR-3174) Visualize Cluster State


[ 
https://issues.apache.org/jira/browse/SOLR-3174?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13255923#comment-13255923
 ] 

Mark Miller edited comment on SOLR-3174 at 4/17/12 8:50 PM:


if not in live_nodes:
it could have any state - ignore the state and make the color gray

if in live_nodes:
Use the following color based on the state string in the first column.
{noformat}
active  : green
recovering  : yellow
down: orange
recovery_failed : red
{noformat}

  was (Author: markrmil...@gmail.com):
if not in live_nodes:
{noformat}
gone: gray
{noformat}

if in live_nodes:
{noformat}
active  : green
recovering  : yellow
down: orange
recovery_failed : red
{noformat}
  
 Visualize Cluster State
 ---

 Key: SOLR-3174
 URL: https://issues.apache.org/jira/browse/SOLR-3174
 Project: Solr
  Issue Type: New Feature
  Components: web gui
Reporter: Ryan McKinley
Assignee: Stefan Matheis (steffkes)
 Attachments: SOLR-3174-graph.png, SOLR-3174-graph.png, 
 SOLR-3174-rgraph.png, SOLR-3174-rgraph.png, SOLR-3174.patch, SOLR-3174.patch, 
 SOLR-3174.patch, SOLR-3174.patch


 It would be great to visualize the cluster state in the new UI. 
 See Mark's wish:
 https://issues.apache.org/jira/browse/SOLR-3162?focusedCommentId=13218272page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-13218272

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira



-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org

[jira] [Assigned] (LUCENE-3995) In LuceneTestCase.beforeClass, make a new random (also using the class hashcode) to vary defaults

2012-04-17 Thread Dawid Weiss (Assigned) (JIRA)


 [ 
https://issues.apache.org/jira/browse/LUCENE-3995?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Dawid Weiss reassigned LUCENE-3995:
---

Assignee: Dawid Weiss

 In LuceneTestCase.beforeClass, make a new random (also using the class 
 hashcode) to vary defaults
 -

 Key: LUCENE-3995
 URL: https://issues.apache.org/jira/browse/LUCENE-3995
 Project: Lucene - Java
  Issue Type: Improvement
  Components: general/test
Affects Versions: 4.0
Reporter: Robert Muir
Assignee: Dawid Weiss

 In LuceneTestCase, we set many static defaults like:
 * default codec
 * default infostream impl
 * default locale
 * default timezone
 * default similarity
 Currently each test run gets a single seed for the run, which means for 
 example across one test run
 every single test will have say, SimpleText + infostream=off + Locale=german 
 + timezone=EDT + similarity=BM25
 Because of that, we lose lots of basic mixed coverage across tests, and it 
 also means the unfortunate
 individual who gets SimpleText or other slow options gets a REALLY SLOW test 
 run, rather than amortizing
 this across all test runs.
 We should at least make a new random (getRandom() ^ className.hashCode()) to 
 fix this so it works like before,
 but unfortunately that only fixes it for LuceneTestCase.
 Won't any subclasses that make random decisions in @BeforeClass (and we have 
 many) still have the same problem?
 Maybe RandomizedRunner can instead be improved here?

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira



-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org

Re: question about seed

Understood and filed as a new feature.

Dawid

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org

[jira] [Updated] (LUCENE-3993) Polishing annoyances from JUnit4

2012-04-17 Thread Dawid Weiss (Updated) (JIRA)


 [ 
https://issues.apache.org/jira/browse/LUCENE-3993?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Dawid Weiss updated LUCENE-3993:


Description: 
- @Ignore and @TestGroup-ignored tests should report the reason much like 
assumption-ignored tests. 
  https://github.com/carrotsearch/randomizedtesting/issues/82
- perturb randomness in @BeforeClass hooks so that bad apples are more evently 
distributed across suites. 
  https://issues.apache.org/jira/browse/LUCENE-3995
- IntelliJ Idea test configs
  https://github.com/carrotsearch/randomizedtesting/issues/83

  was:
- @Ignore and @TestGroup-ignored tests should report the reason much like 
assumption-ignored tests. 
  https://github.com/carrotsearch/randomizedtesting/issues/82



 Polishing annoyances from JUnit4
 

 Key: LUCENE-3993
 URL: https://issues.apache.org/jira/browse/LUCENE-3993
 Project: Lucene - Java
  Issue Type: Sub-task
  Components: general/build, general/test
Reporter: Dawid Weiss
Assignee: Dawid Weiss
Priority: Minor
 Fix For: 4.0


 - @Ignore and @TestGroup-ignored tests should report the reason much like 
 assumption-ignored tests. 
   https://github.com/carrotsearch/randomizedtesting/issues/82
 - perturb randomness in @BeforeClass hooks so that bad apples are more 
 evently distributed across suites. 
   https://issues.apache.org/jira/browse/LUCENE-3995
 - IntelliJ Idea test configs
   https://github.com/carrotsearch/randomizedtesting/issues/83

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira



-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org

[jira] [Commented] (LUCENE-3995) In LuceneTestCase.beforeClass, make a new random (also using the class hashcode) to vary defaults

2012-04-17 Thread Dawid Weiss (Commented) (JIRA)


[ 
https://issues.apache.org/jira/browse/LUCENE-3995?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13255961#comment-13255961
 ] 

Dawid Weiss commented on LUCENE-3995:
-

Note to myself - this also affectes test coverage because it reduces static 
context entropy (as pointed by Robert, Uwe).

 In LuceneTestCase.beforeClass, make a new random (also using the class 
 hashcode) to vary defaults
 -

 Key: LUCENE-3995
 URL: https://issues.apache.org/jira/browse/LUCENE-3995
 Project: Lucene - Java
  Issue Type: Improvement
  Components: general/test
Affects Versions: 4.0
Reporter: Robert Muir
Assignee: Dawid Weiss

 In LuceneTestCase, we set many static defaults like:
 * default codec
 * default infostream impl
 * default locale
 * default timezone
 * default similarity
 Currently each test run gets a single seed for the run, which means for 
 example across one test run
 every single test will have say, SimpleText + infostream=off + Locale=german 
 + timezone=EDT + similarity=BM25
 Because of that, we lose lots of basic mixed coverage across tests, and it 
 also means the unfortunate
 individual who gets SimpleText or other slow options gets a REALLY SLOW test 
 run, rather than amortizing
 this across all test runs.
 We should at least make a new random (getRandom() ^ className.hashCode()) to 
 fix this so it works like before,
 but unfortunately that only fixes it for LuceneTestCase.
 Won't any subclasses that make random decisions in @BeforeClass (and we have 
 many) still have the same problem?
 Maybe RandomizedRunner can instead be improved here?

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira



-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org

Re: Problem running all of a module's tests under IntelliJ: Wrong test finished.