[jira] [Commented] (SOLR-3161) Use of 'qt' should be restricted to searching and should not start with a '/'

2012-03-20 Thread David Smiley (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-3161?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13233251#comment-13233251
 ] 

David Smiley commented on SOLR-3161:


As long as you provide a leading '/' to shards.qt, there is no problem, because 
the sharded request will use that as the path and not use 'qt'.  The smarts 
that make that happen are largely due to the logic in QueryRequest.getPath().  I 
just played around with this in tests and stepped through the code to prove it 
out.

This does remind me of another attack vector of sorts for what started all 
this.  Even with qt disabled, this still leaves the possibility of 
/mysearch?q=...&shards=...&shards.qt=/update
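The dispatch rule described above can be sketched as follows. This is a hypothetical helper in plain Java, not Solr's actual QueryRequest.getPath() code: a handler name with a leading '/' is used directly as the request path, while anything else is routed through /select with qt as a plain parameter.

```java
// Sketch of the path-vs-qt dispatch rule (illustrative only; the method
// name requestPath and this class are made up for the example).
public class HandlerDispatchSketch {
    public static String requestPath(String handler) {
        if (handler != null && handler.startsWith("/")) {
            return handler;                 // path-style handler, e.g. "/spell"
        }
        return "/select?qt=" + handler;     // legacy qt-style handler
    }

    public static void main(String[] args) {
        System.out.println(requestPath("/spell"));    // /spell
        System.out.println(requestPath("mycustom"));  // /select?qt=mycustom
    }
}
```

This also makes the attack vector visible: if shards.qt is allowed to carry a leading '/', it names an arbitrary handler path such as /update.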

 Use of 'qt' should be restricted to searching and should not start with a '/'
 -

 Key: SOLR-3161
 URL: https://issues.apache.org/jira/browse/SOLR-3161
 Project: Solr
  Issue Type: Improvement
  Components: search, web gui
Reporter: David Smiley
Assignee: David Smiley
 Fix For: 3.6, 4.0

 Attachments: SOLR-3161-disable-qt-by-default.patch, 
 SOLR-3161-dispatching-request-handler.patch, 
 SOLR-3161-dispatching-request-handler.patch


 I haven't yet looked at the code involved for suggestions here; I'm speaking 
 based on how I think things should and shouldn't work, based on intuitiveness 
 and security. In general I feel it is best practice to use '/'-leading 
 request handler names and not use qt, but I don't hate it enough when used 
 in limited (search-only) circumstances to propose its demise. But if someone 
 proposes its deprecation, then I am +1 for that.
 Here is my proposal:
 Solr should error if the parameter qt is supplied with a leading '/'. 
 (trunk only)
 Solr should only honor qt if the target request handler extends 
 solr.SearchHandler.
 The new admin UI should only use 'qt' when it has to. For the query screen, 
 it could present a little pop-up menu of handlers to choose from, including 
 /select?qt=mycustom for handlers that aren't named with a leading '/'. This 
 choice should be positioned at the top.
 And before I forget, I or someone else should investigate whether there are any 
 similar security problems with the shards.qt parameter. Perhaps shards.qt can 
 abide by the same rules outlined above.
 Does anyone foresee any problems with this proposal?
 On a related subject, I think the notion of a default request handler is bad 
 - the default="true" thing. Honestly I'm not sure what it does, since I 
 noticed Solr trunk redirects '/solr/' to the new admin UI at '/solr/#/'. 
 Assuming it doesn't do anything useful anymore, I think it would be clearer 
 to use <requestHandler name="/select" class="solr.SearchHandler"> instead of 
 what's there now. The delta is to put the leading '/' on this request handler 
 name, and remove the default attribute.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira



-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



Re: any general way of getting which attributes token stream has?

2012-03-20 Thread Koji Sekiguchi

(12/03/20 13:47), Robert Muir wrote:

I think we should probably change the QueryConverter api from:
 public abstract Collection<Token> convert(String original);
to:
 public abstract TokenStream convert(String original);

Currently attributes such as ReadingAttribute are lost...

If we really want a Collection we could alternatively have
Collection<AttributeSource>, which would also preserve attributes, but
it seems silly when QueryConverter could just return a TokenStream.

This makes SuggestQueryConverter extremely simple :)
In fact SpellingQueryConverter could be simple too: I think it's
basically just a regex tokenizer with a stopword list
(OR/AND)?
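The attribute-loss point above can be illustrated with a small sketch. These are made-up plain-Java classes, not Lucene's actual API: a converter that flattens tokens to bare strings drops per-token attributes such as a reading, while a converter that hands the token objects back, stream-style, keeps them.

```java
import java.util.*;

// Illustrative mimic of the two QueryConverter shapes discussed above.
public class ConverterSketch {
    // A token carrying extra attributes, e.g. "reading" for Japanese.
    public record Token(String term, Map<String, String> attrs) {}

    // Old-style conversion: flatten to terms only; attributes are lost.
    public static Collection<String> convertToTerms(List<Token> tokens) {
        List<String> terms = new ArrayList<>();
        for (Token t : tokens) terms.add(t.term());
        return terms;
    }

    // Stream-style conversion: pass the tokens through; attributes survive.
    public static Iterator<Token> convertToStream(List<Token> tokens) {
        return tokens.iterator();
    }
}
```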


Hi Robert,

Thanks for the comment.

As I'm investigating the Lucene spell checker further for Japanese,
I've realized that there is a more fundamental problem in it. I'll open a
JIRA ticket for it shortly. In the ticket, I'll change the api you mentioned
if needed.

koji
--
Query Log Visualizer for Apache Solr
http://soleami.com/




[jira] [Created] (LUCENE-3888) split off the spell check word and surface form in spell check dictionary

2012-03-20 Thread Koji Sekiguchi (Created) (JIRA)
split off the spell check word and surface form in spell check dictionary
-

 Key: LUCENE-3888
 URL: https://issues.apache.org/jira/browse/LUCENE-3888
 Project: Lucene - Java
  Issue Type: Improvement
  Components: modules/spellchecker
Reporter: Koji Sekiguchi
Priority: Minor
 Fix For: 3.6, 4.0


The "did you mean?" feature using Lucene's spell checker unfortunately cannot work well 
in a Japanese environment, and this is a longstanding problem: the logic needs comparatively 
long text to check spellings, but in some languages (e.g. Japanese), most words are too 
short for the spell checker.

I think, at least for Japanese, things can be improved if we split off the 
spell check word and the surface form in the spell check dictionary. Then we could 
use ReadingAttribute for spell checking but CharTermAttribute for suggesting, 
for example.




Re: any general way of getting which attributes token stream has?

2012-03-20 Thread Mikhail Khludnev
Hello Koji,

Can't it be done via tokenStream.reflectWith(AttributeReflector), with a
reflector which puts all the attributes' properties into a Token via reflection, or into
an AttributeSource?
WDYT?

2012/3/20 Koji Sekiguchi k...@r.email.ne.jp

 Is there any general way of getting/looking what attributes a token stream
 has?

 I want to use the spell checker with a query analyzer, where the analyzer
 generates a ReadingAttribute for each token, and I want to use the ReadingAttributes
 for spell checking. I think I can have my own SpellingQueryConverter extension
 to override the analyze method, but I saw the TODO comment in
 SpellingQueryConverter:
  protected void analyze(Collection<Token> result, Reader text, int offset)
      throws IOException {
    TokenStream stream = analyzer.reusableTokenStream("", text);
    // TODO: support custom attributes
    CharTermAttribute termAtt = stream.addAttribute(CharTermAttribute.class);
    FlagsAttribute flagsAtt = stream.addAttribute(FlagsAttribute.class);
    TypeAttribute typeAtt = stream.addAttribute(TypeAttribute.class);
    PayloadAttribute payloadAtt = stream.addAttribute(PayloadAttribute.class);
    PositionIncrementAttribute posIncAtt = stream.addAttribute(PositionIncrementAttribute.class);
    OffsetAttribute offsetAtt = stream.addAttribute(OffsetAttribute.class);
    :

 If we can have a general way of getting such information, I think it would
 be helpful not only for spell checking. (For example, SynonymFilter could add a
 PartOfSpeechAttribute if the original token has one.)

 koji
 --
 Query Log Visualizer for Apache Solr
 http://soleami.com/





-- 
Sincerely yours
Mikhail Khludnev
Lucid Certified
Apache Lucene/Solr Developer
Grid Dynamics

http://www.griddynamics.com
 mkhlud...@griddynamics.com


[jira] [Updated] (LUCENE-3888) split off the spell check word and surface form in spell check dictionary

2012-03-20 Thread Koji Sekiguchi (Updated) (JIRA)

 [ 
https://issues.apache.org/jira/browse/LUCENE-3888?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Koji Sekiguchi updated LUCENE-3888:
---

Attachment: LUCENE-3888.patch

The patch cannot be compiled yet, because I changed the return type of the 
method in the Dictionary interface but the implementing classes have not been 
updated.

Please give some comments, because I'm new to the spell checker. If there is no 
problem, I'll continue the work.

 split off the spell check word and surface form in spell check dictionary
 -

 Key: LUCENE-3888
 URL: https://issues.apache.org/jira/browse/LUCENE-3888
 Project: Lucene - Java
  Issue Type: Improvement
  Components: modules/spellchecker
Reporter: Koji Sekiguchi
Priority: Minor
 Fix For: 3.6, 4.0

 Attachments: LUCENE-3888.patch






[jira] [Updated] (LUCENE-3683) Add @Noisy annotation for uncontrollably noisy tests

2012-03-20 Thread Dawid Weiss (Updated) (JIRA)

 [ 
https://issues.apache.org/jira/browse/LUCENE-3683?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Dawid Weiss updated LUCENE-3683:


Fix Version/s: 4.0

 Add @Noisy annotation for uncontrollably noisy tests
 

 Key: LUCENE-3683
 URL: https://issues.apache.org/jira/browse/LUCENE-3683
 Project: Lucene - Java
  Issue Type: Test
Reporter: Robert Muir
Assignee: Dawid Weiss
 Fix For: 4.0

 Attachments: LUCENE-LUCENE3808-JOB1-142.log


 {code}
 /**
  * Annotation for test classes that are uncontrollably loud, and you
  * only want output if they actually fail, error, or VERBOSE is enabled.
  * @deprecated Fix your test to properly use {@link #VERBOSE} !
  */
 @Documented
 @Deprecated
 @Target(ElementType.TYPE)
 @Retention(RetentionPolicy.RUNTIME)
 public @interface Noisy {}
 {code}




[jira] [Commented] (LUCENE-3888) split off the spell check word and surface form in spell check dictionary

2012-03-20 Thread Robert Muir (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-3888?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13233291#comment-13233291
 ] 

Robert Muir commented on LUCENE-3888:
-

Koji: hmm, I think the problem is not in the Dictionary interface (which is 
actually ok),
but instead in the spellcheckers and suggesters themselves?

For spellchecking, I think we need to expose more analysis options in 
SpellChecker:
currently this is hardcoded at KeywordAnalyzer (it uses NOT_ANALYZED). 
Instead I think you should be able to pass an Analyzer: we would also
have a TokenFilter for Japanese that replaces the term text with the Reading from 
ReadingAttribute.

In the same way, suggest can analyze too. (LUCENE-3842 is already some work for 
that, especially
with the idea to support Japanese this exact same way).

So in short I think we should:
# create a TokenFilter (similar to BaseFormFilter) which copies 
ReadingAttribute into termAtt.
# refactor the 'n-gram analysis' in the spellchecker to work on actual token streams 
(this can
  also likely be implemented as token streams), allowing the user to set an Analyzer 
on SpellChecker
  to control how it analyzes text.
# continue to work on 'analysis for suggest' like LUCENE-3842.
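Step 1 above can be sketched in plain Java. This is not Lucene's real TokenFilter API, and Token and readingToTerm are made-up names for the example: a filter stage replaces each token's term text with its reading when one is present, so downstream n-gram spellchecking operates on readings rather than on short surface forms.

```java
import java.util.*;

// Illustrative reading-to-term filter stage (not Lucene code).
public class ReadingFilterSketch {
    public record Token(String term, String reading) {}

    public static List<Token> readingToTerm(List<Token> in) {
        List<Token> out = new ArrayList<>();
        for (Token t : in) {
            // Copy the reading into the term slot, as a
            // ReadingAttribute-to-termAtt filter would; tokens
            // without a reading keep their original term.
            String newTerm = (t.reading() != null) ? t.reading() : t.term();
            out.add(new Token(newTerm, t.reading()));
        }
        return out;
    }
}
```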

Note this use of analyzers in spellcheck/suggest is unrelated to Solr's current 
use of 'analyzers' 
which is only for some query manipulation and not very useful.


 split off the spell check word and surface form in spell check dictionary
 -

 Key: LUCENE-3888
 URL: https://issues.apache.org/jira/browse/LUCENE-3888
 Project: Lucene - Java
  Issue Type: Improvement
  Components: modules/spellchecker
Reporter: Koji Sekiguchi
Priority: Minor
 Fix For: 3.6, 4.0

 Attachments: LUCENE-3888.patch






[jira] [Updated] (LUCENE-3868) Thread interruptions shouldn't cause unhandled thread errors (or should they?).

2012-03-20 Thread Dawid Weiss (Updated) (JIRA)

 [ 
https://issues.apache.org/jira/browse/LUCENE-3868?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Dawid Weiss updated LUCENE-3868:


Fix Version/s: (was: flexscoring branch)
   4.0

 Thread interruptions shouldn't cause unhandled thread errors (or should 
 they?).
 ---

 Key: LUCENE-3868
 URL: https://issues.apache.org/jira/browse/LUCENE-3868
 Project: Lucene - Java
  Issue Type: Bug
Reporter: Dawid Weiss
Assignee: Dawid Weiss
Priority: Minor
 Fix For: 4.0


 This is a result of pulling uncaught exception catching to a rule above 
 interrupt in internalTearDown(); check how it was before and restore previous 
 behavior?




[jira] [Updated] (LUCENE-3206) FST package API refactoring

2012-03-20 Thread Dawid Weiss (Updated) (JIRA)

 [ 
https://issues.apache.org/jira/browse/LUCENE-3206?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Dawid Weiss updated LUCENE-3206:


Affects Version/s: (was: 3.2)
Fix Version/s: (was: flexscoring branch)
   4.0

 FST package API refactoring
 ---

 Key: LUCENE-3206
 URL: https://issues.apache.org/jira/browse/LUCENE-3206
 Project: Lucene - Java
  Issue Type: Improvement
  Components: core/FSTs
Reporter: Dawid Weiss
Assignee: Dawid Weiss
Priority: Minor
 Fix For: 4.0

 Attachments: LUCENE-3206.patch


 The current API is still marked @experimental, so I think there's still time 
 to fiddle with it. I've been using the current API for some time and I do 
 have some ideas for improvement. This is a placeholder for these -- I'll post 
 a patch once I have a working proof of concept.




Re: Using term offsets for hit highlighting

2012-03-20 Thread Alan Woodward
Thanks for all the offers of help!  It looks as though most of the hard work 
has already been done, which is exactly where I like to pick up projects.  :-)

Maybe the best place to start would be for me to rebase the branch against 
trunk, and see what still fits?  I think there have been some fairly major 
changes in the internals since July last year.

On 19 Mar 2012, at 17:07, Mike Sokolov wrote:

 I posted a patch with a Collector somewhat similar to what you described, 
 Alan - it's attached to one of the sub-issues 
 https://issues.apache.org/jira/browse/LUCENE-3318.   It is in a fairly 
 complete alpha state, but has seen no production use of course, since it 
 relies on the remainder of the unfinished work in that branch.  It works by 
 creating a TokenStream based on match positions returned from the query and 
 passing that to the existing Highlighter.  Please feel free to get in touch 
 if you decide to look into that and have questions.
 
 
 -Mike
 
 On 03/19/2012 11:51 AM, Simon Willnauer wrote:
  On Mon, Mar 19, 2012 at 4:50 PM, Uwe Schindler u...@thetaphi.de  wrote:
   
 Have you marked that for GSOC? Would be a good idea!
 
  yes I did
   
 -
 Uwe Schindler
 H.-H.-Meier-Allee 63, D-28213 Bremen
 http://www.thetaphi.de
 eMail: u...@thetaphi.de
 
 
 
 -Original Message-
 From: Simon Willnauer [mailto:simon.willna...@googlemail.com]
 Sent: Monday, March 19, 2012 4:43 PM
 To: dev@lucene.apache.org
 Subject: Re: Using term offsets for hit highlighting
 
 Alan, you made my day!
 
 The branch is kind of outdated but I looked at it lately and I can 
 certainly help
 to get it up to speed. The feature in that branch is quite a big one and 
 it's in a
 very early stage. Still I want to encourage you to take a look and work on 
 it. I
 promise all my help with the issues!
 
 let me know if you have questions!
 
 simon
 
 On Mon, Mar 19, 2012 at 3:52 PM, Alan Woodward
 alan.woodw...@romseysoftware.co.uk  wrote:
   
 Cool, thanks Robert.  I'll take a look at the JIRA ticket.
 
 On 19 Mar 2012, at 14:44, Robert Muir wrote:
 
 
 On Mon, Mar 19, 2012 at 10:38 AM, Alan Woodward
 alan.woodw...@romseysoftware.co.uk  wrote:
   
 Hello,
 
 The project I'm currently working on requires the reporting of exact
 hit positions from some pretty hairy queries, not all of which are
 covered by the existing highlighter modules.  I'm working round this
 by translating everything into SpanQueries, and using the getSpans()
 method to locate hits (I've extended the Spans interface to make
 term offsets available - see
 https://issues.apache.org/jira/browse/LUCENE-3826).  This works for
 our use-case, but isn't terribly efficient, and obviously isn't 
 applicable to
 
 non-Span queries.
   
 I've seen a bit of chatter on the list about using term offsets to
 provide accurate highlighting in Lucene.  I'm going to have a couple
 of weeks free in April, and I thought I might have a go at
 implementing this.  Mainly I'm wondering if there's already been
 thoughts about how to do it.  My current thoughts are to somehow
 extend the Weight and Scorer interface to make term offsets
 available; to get highlights for a given set of documents, you'd
 essentially run the query again, with a filter on just the documents
 you want highlighted, and have a custom collector that gets the term
 
 offsets in place of the scores.
   
 
 Hi Alan, Simon started some initial work on
 https://issues.apache.org/jira/browse/LUCENE-2878
 
 Some work and prototypes were done in a branch, but it might be
 lagging behind trunk a bit.
 
 Additionally at the time it was first done, I think we didn't yet
 support offsets in the postings lists.
 We've since added this and several codecs support it.
 
 --
 lucidimagination.com
 

[jira] [Updated] (LUCENE-3888) split off the spell check word and surface form in spell check dictionary

2012-03-20 Thread Robert Muir (Updated) (JIRA)

 [ 
https://issues.apache.org/jira/browse/LUCENE-3888?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Robert Muir updated LUCENE-3888:


Attachment: LUCENE-3888.patch

Here is a simple prototype of what I was suggesting; it allows you to specify an 
Analyzer to SpellChecker.

This Analyzer converts the 'surface form' into 'analyzed form' at index and 
query time: at index-time it forms n-grams based on the analyzed form, but 
stores the surface form for retrieval.

At query-time we have a similar process: the docFreq() etc checks are done on 
the surface form, but the actual spellchecking on the analyzed form.

The default Analyzer is null which means do nothing, and the patch has no 
tests, refactoring, or any of that.
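The index/query scheme described above can be sketched in plain Java. This is not the actual patch, and it omits the n-gram step: matching is done on the analyzed form (e.g. a reading), while the surface form is what gets stored and returned.

```java
import java.util.*;

// Illustrative surface-form/analyzed-form dictionary (class and method
// names are made up for the example).
public class SurfaceFormSketch {
    // analyzed form -> surface forms that produced it
    private final Map<String, List<String>> dict = new HashMap<>();

    // At index time: analyze the surface form, key the entry by the
    // analyzed form, but store the surface form for retrieval.
    public void add(String surface, String analyzed) {
        dict.computeIfAbsent(analyzed, k -> new ArrayList<>()).add(surface);
    }

    // At query time: look up by analyzed form; return the surface forms.
    public List<String> suggest(String analyzedQuery) {
        return dict.getOrDefault(analyzedQuery, List.of());
    }
}
```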


 split off the spell check word and surface form in spell check dictionary
 -

 Key: LUCENE-3888
 URL: https://issues.apache.org/jira/browse/LUCENE-3888
 Project: Lucene - Java
  Issue Type: Improvement
  Components: modules/spellchecker
Reporter: Koji Sekiguchi
Priority: Minor
 Fix For: 3.6, 4.0

 Attachments: LUCENE-3888.patch, LUCENE-3888.patch






[jira] [Updated] (LUCENE-3888) split off the spell check word and surface form in spell check dictionary

2012-03-20 Thread Robert Muir (Updated) (JIRA)

 [ 
https://issues.apache.org/jira/browse/LUCENE-3888?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Robert Muir updated LUCENE-3888:


Attachment: LUCENE-3888.patch

fix the obvious reset() problem... the real problem is I need to reset() my 
coffee mug.

 split off the spell check word and surface form in spell check dictionary
 -

 Key: LUCENE-3888
 URL: https://issues.apache.org/jira/browse/LUCENE-3888
 Project: Lucene - Java
  Issue Type: Improvement
  Components: modules/spellchecker
Reporter: Koji Sekiguchi
Priority: Minor
 Fix For: 3.6, 4.0

 Attachments: LUCENE-3888.patch, LUCENE-3888.patch, LUCENE-3888.patch






[jira] [Created] (LUCENE-3889) Remove/Uncommit SegmentingTokenizerBase

2012-03-20 Thread Robert Muir (Created) (JIRA)
Remove/Uncommit SegmentingTokenizerBase
---

 Key: LUCENE-3889
 URL: https://issues.apache.org/jira/browse/LUCENE-3889
 Project: Lucene - Java
  Issue Type: Task
Affects Versions: 3.6, 4.0
Reporter: Robert Muir


I added this class in LUCENE-3305 to support analyzers like Kuromoji,
but Kuromoji no longer needs it as of LUCENE-3767. So now nothing uses it.

I think we should uncommit it before releasing; svn doesn't forget, so
we can add it back if we want to refactor something like Thai or Smartcn
to use it.




[jira] [Updated] (LUCENE-3889) Remove/Uncommit SegmentingTokenizerBase

2012-03-20 Thread Robert Muir (Updated) (JIRA)

 [ 
https://issues.apache.org/jira/browse/LUCENE-3889?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Robert Muir updated LUCENE-3889:


Attachment: LUCENE-3889.patch

 Remove/Uncommit SegmentingTokenizerBase
 ---

 Key: LUCENE-3889
 URL: https://issues.apache.org/jira/browse/LUCENE-3889
 Project: Lucene - Java
  Issue Type: Task
Affects Versions: 3.6, 4.0
Reporter: Robert Muir
 Attachments: LUCENE-3889.patch






[jira] [Updated] (SOLR-2020) HttpComponentsSolrServer

2012-03-20 Thread Sami Siren (Updated) (JIRA)

 [ 
https://issues.apache.org/jira/browse/SOLR-2020?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sami Siren updated SOLR-2020:
-

Attachment: SOLR-2020.patch

Improved patch with cleanups + additional tests.

 HttpComponentsSolrServer
 

 Key: SOLR-2020
 URL: https://issues.apache.org/jira/browse/SOLR-2020
 Project: Solr
  Issue Type: New Feature
  Components: clients - java
Affects Versions: 1.4.1
 Environment: Any
Reporter: Chantal Ackermann
Priority: Minor
 Fix For: 4.0

 Attachments: HttpComponentsSolrServer.java, 
 HttpComponentsSolrServerTest.java, SOLR-2020-HttpSolrServer.patch, 
 SOLR-2020.patch, SOLR-2020.patch


 Implementation of SolrServer that uses the Apache Http Components framework.
 Http Components (http://hc.apache.org/) is the successor of Commons 
 HttpClient and thus HttpComponentsSolrServer would be a successor of 
 CommonsHttpSolrServer, in the future.




[jira] [Updated] (SOLR-445) Update Handlers abort with bad documents

2012-03-20 Thread Erick Erickson (Updated) (JIRA)

 [ 
https://issues.apache.org/jira/browse/SOLR-445?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Erick Erickson updated SOLR-445:


Fix Version/s: (was: 3.6)

 Update Handlers abort with bad documents
 

 Key: SOLR-445
 URL: https://issues.apache.org/jira/browse/SOLR-445
 Project: Solr
  Issue Type: Bug
  Components: update
Affects Versions: 1.3
Reporter: Will Johnson
Assignee: Erick Erickson
 Fix For: 4.0

 Attachments: SOLR-445-3_x.patch, SOLR-445.patch, SOLR-445.patch, 
 SOLR-445.patch, SOLR-445.patch, SOLR-445_3x.patch, solr-445.xml


 Has anyone run into the problem of handling bad documents / failures 
 mid-batch? I.e.:
 <add>
   <doc>
     <field name="id">1</field>
   </doc>
   <doc>
     <field name="id">2</field>
     <field name="myDateField">I_AM_A_BAD_DATE</field>
   </doc>
   <doc>
     <field name="id">3</field>
   </doc>
 </add>
 Right now Solr adds the first doc and then aborts.  It would seem like it 
 should either fail the entire batch, or log a message/return a code and then 
 continue on to add doc 3.  Option 1 would seem to be much harder to 
 accomplish and possibly require more memory, while Option 2 would require more 
 information to come back from the API.  I'm about to dig into this but I 
 thought I'd ask to see if anyone had any suggestions, thoughts or comments.




[jira] [Assigned] (SOLR-2242) Get distinct count of names for a facet field

2012-03-20 Thread Erick Erickson (Assigned) (JIRA)

 [ 
https://issues.apache.org/jira/browse/SOLR-2242?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Erick Erickson reassigned SOLR-2242:


Assignee: (was: Erick Erickson)

I won't get to this for 3.6

 Get distinct count of names for a facet field
 -

 Key: SOLR-2242
 URL: https://issues.apache.org/jira/browse/SOLR-2242
 Project: Solr
  Issue Type: New Feature
  Components: Response Writers
Affects Versions: 4.0
Reporter: Bill Bell
Priority: Minor
 Fix For: 4.0

 Attachments: NumFacetTermsFacetsTest.java, 
 SOLR-2242-notworkingtest.patch, SOLR-2242-solr40.patch, SOLR-2242.patch, 
 SOLR-2242.patch, SOLR-2242.patch, SOLR-2242.shard.patch, 
 SOLR-2242.shard.patch, SOLR-2242.shard.withtests.patch, 
 SOLR-2242.solr3.1-fix.patch, SOLR-2242.solr3.1.patch, 
 SOLR.2242.solr3.1.patch, SOLR.2242.v2.patch


 When returning facet.field=name of field you will get a list of matches for 
 distinct values. This is normal behavior. This patch tells you how many 
 distinct values you have (# of rows). Use with limit=-1 and mincount=1.
 The feature is called namedistinct. Here is an example:
 http://localhost:8983/solr/select?shards=localhost:8983/solr,localhost:7574/solr&indent=true&q=*:*&facet=true&facet.mincount=1&facet.numFacetTerms=2&facet.limit=-1&facet.field=price
 http://localhost:8983/solr/select?shards=localhost:8983/solr,localhost:7574/solr&indent=true&q=*:*&facet=true&facet.mincount=1&facet.numFacetTerms=0&facet.limit=-1&facet.field=price
 http://localhost:8983/solr/select?shards=localhost:8983/solr,localhost:7574/solr&indent=true&q=*:*&facet=true&facet.mincount=1&facet.numFacetTerms=1&facet.limit=-1&facet.field=price
 This currently only works on facet.field.
 {code}
 <lst name="facet_fields">
   <lst name="price">
     <int name="numFacetTerms">14</int>
     <int name="0.0">3</int><int name="11.5">1</int><int name="19.95">1</int>
     <int name="74.99">1</int><int name="92.0">1</int><int name="179.99">1</int>
     <int name="185.0">1</int><int name="279.95">1</int><int name="329.95">1</int>
     <int name="350.0">1</int><int name="399.0">1</int><int name="479.95">1</int>
     <int name="649.99">1</int><int name="2199.0">1</int>
   </lst>
 </lst>
 {code} 
 Several people use this to get the group.field count (the # of groups).
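The numFacetTerms value can be derived from the raw facet counts alone. Here is a minimal client-side sketch of that computation (hypothetical illustration, not part of the patch):

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Given the term -> count entries of a facet.field response retrieved
// with facet.limit=-1, the distinct-value count reported above as
// "numFacetTerms" is the number of entries whose count meets mincount.
public class DistinctFacetCount {

    static int numFacetTerms(Map<String, Integer> facetCounts, int mincount) {
        int distinct = 0;
        for (int count : facetCounts.values()) {
            if (count >= mincount) {
                distinct++;
            }
        }
        return distinct;
    }

    public static void main(String[] args) {
        // Term counts from the example price-facet response above.
        Map<String, Integer> price = new LinkedHashMap<>();
        price.put("0.0", 3);
        String[] ones = { "11.5", "19.95", "74.99", "92.0", "179.99", "185.0",
                "279.95", "329.95", "350.0", "399.0", "479.95", "649.99", "2199.0" };
        for (String term : ones) {
            price.put(term, 1);
        }
        System.out.println(numFacetTerms(price, 1)); // prints 14
    }
}
```

Note this only works when facet.limit=-1, since the response must contain every distinct term.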

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira



-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Updated] (SOLR-2921) Make any Filters, Tokenizers and CharFilters implement MultiTermAwareComponent if they should

2012-03-20 Thread Erick Erickson (Updated) (JIRA)

 [ 
https://issues.apache.org/jira/browse/SOLR-2921?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Erick Erickson updated SOLR-2921:
-

Affects Version/s: (was: 3.6)

 Make any Filters, Tokenizers and CharFilters implement 
 MultiTermAwareComponent if they should
 -

 Key: SOLR-2921
 URL: https://issues.apache.org/jira/browse/SOLR-2921
 Project: Solr
  Issue Type: Improvement
  Components: Schema and Analysis
Affects Versions: 4.0
 Environment: All
Reporter: Erick Erickson
Assignee: Erick Erickson
Priority: Minor

 SOLR-2438 creates a new MultiTermAwareComponent interface. This allows Solr 
 to automatically assemble a multiterm analyzer that does the right thing 
 vis-a-vis transforming the individual terms of a multi-term query at query 
 time. Examples are: lower casing, folding accents, etc. Currently 
 (27-Nov-2011), the following classes implement MultiTermAwareComponent:
  * ASCIIFoldingFilterFactory
  * LowerCaseFilterFactory
  * LowerCaseTokenizerFactory
  * MappingCharFilterFactory
  * PersianCharFilterFactory
 When users put any of the above in their query analyzer, Solr will do the 
 right thing at query time, and the perennial user question, "why didn't 
 my wildcard query automatically lower-case (or accent-fold, etc.) my terms?", 
 will be gone. Die question die!
 But taking a quick look, for instance, at the various FilterFactories that 
 exist, there are a number of possibilities that *might* be good candidates 
 for implementing MultiTermAwareComponent. But I really don't understand the 
 correct behavior here well enough to know whether these should implement the 
 interface or not. And this doesn't include other CharFilters or Tokenizers.
 Actually implementing the interface is often trivial, see the classes above 
 for examples. Note that LowerCaseTokenizerFactory returns a *Filter*, which 
 is the right thing in this case.
 Here is a quick cull of the Filters that, just from their names, might be 
 candidates. If anyone wants to take any of them on, that would be great. If 
 all you can do is provide test cases, I could probably do the code part, just 
 let me know.
 ArabicNormalizationFilterFactory
 GreekLowerCaseFilterFactory
 HindiNormalizationFilterFactory
 ICUFoldingFilterFactory
 ICUNormalizer2FilterFactory
 ICUTransformFilterFactory
 IndicNormalizationFilterFactory
 ISOLatin1AccentFilterFactory
 PersianNormalizationFilterFactory
 RussianLowerCaseFilterFactory
 TurkishLowerCaseFilterFactory

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira



-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Assigned] (SOLR-3182) If there is only one core, let it be the default without specifying a default in solr.xml

2012-03-20 Thread Erick Erickson (Assigned) (JIRA)

 [ 
https://issues.apache.org/jira/browse/SOLR-3182?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Erick Erickson reassigned SOLR-3182:


Assignee: (was: Erick Erickson)

Don't have time to get to this in 3.6, does someone else want to push this 
forward?

 If there is only one core, let it be the default without specifying a default 
 in solr.xml
 -

 Key: SOLR-3182
 URL: https://issues.apache.org/jira/browse/SOLR-3182
 Project: Solr
  Issue Type: Improvement
  Components: multicore
Affects Versions: 3.6, 4.0
Reporter: Russell Black
Priority: Minor
  Labels: patch
 Attachments: SOLR-3182-default-core.patch

   Original Estimate: 10m
  Remaining Estimate: 10m

 Our particular need for this is as follows.  We operate in a sharded 
 environment with one core per server.  Each shard also acts as a collator.  
 We want to use a hardware load balancer to choose which shard will do the 
 collation for each query.  But in order to do that, each server's single core 
 would have to carry the same name so that it could be accessed by the same 
 url across servers.  However we name the cores by their shard number 
 (query0,query1,...) because it parallels with the way we name our 
 indexing/master cores (index0, index1,...).  This naming convention also 
 gives us the flexibility of moving to a multicore environment in the future 
 without having to rename the cores, although admittedly that would complicate 
 load balancing.  
 In a system with a large number of shards and the anticipation of adding more 
 going forward, setting a defaultCoreName attribute in each solr.xml file 
 becomes inconvenient, especially since there is no Solr admin API for setting 
 defaultCoreName.  It would have to be done by hand or with some automated 
 tool we would write in house.  Even if there were an API, logically it seems 
 unnecessary to have to declare the only core to be the default. 
 Fortunately this behavior can be implemented with the following simple patch:
 {code}
 Index: solr/core/src/java/org/apache/solr/core/CoreContainer.java
 ===================================================================
 --- solr/core/src/java/org/apache/solr/core/CoreContainer.java	(revision 1295229)
 +++ solr/core/src/java/org/apache/solr/core/CoreContainer.java	(working copy)
 @@ -870,6 +870,10 @@
    }
  
    private String checkDefault(String name) {
 +    // if there is only one core, let it be the default without specifying a default in solr.xml
 +    if (defaultCoreName.trim().length() == 0 && name.trim().length() == 0 && cores.size() == 1) {
 +      return cores.values().iterator().next().getName();
 +    }
      return name.length() == 0 || defaultCoreName.equals(name) || name.trim().length() == 0 ? "" : name;
    }
 {code}
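To see what the patch changes, here is a standalone sketch of the checkDefault logic (hypothetical and simplified: a plain Map of names stands in for the core registry, whereas the real CoreContainer holds SolrCore objects):

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Illustration only, not Solr itself: with an empty defaultCoreName and
// exactly one registered core, an empty requested name now resolves to
// that core's name instead of the empty string.
public class CheckDefaultSketch {

    static String checkDefault(String name, String defaultCoreName,
                               Map<String, String> cores) {
        // the branch added by the patch
        if (defaultCoreName.trim().length() == 0 && name.trim().length() == 0
                && cores.size() == 1) {
            return cores.values().iterator().next();
        }
        // pre-existing behavior: the default core is addressed as ""
        return name.length() == 0 || defaultCoreName.equals(name)
                || name.trim().length() == 0 ? "" : name;
    }

    public static void main(String[] args) {
        Map<String, String> cores = new LinkedHashMap<>();
        cores.put("query0", "query0"); // the single core on this server
        System.out.println(checkDefault("", "", cores));       // prints query0
        System.out.println(checkDefault("query0", "", cores)); // prints query0
    }
}
```

With two or more cores registered the new branch is skipped, so existing multicore deployments keep their current behavior.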

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira



-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Commented] (SOLR-445) Update Handlers abort with bad documents

2012-03-20 Thread Erick Erickson (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-445?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13233357#comment-13233357
 ] 

Erick Erickson commented on SOLR-445:
-

Well, it's clear I won't get to this in the 3.6 time frame, so if someone else 
wants to pick it up, feel free. However, I also wonder whether, with 4.0 and 
SolrCloud, we have to approach this differently to accommodate how documents 
are passed around there.

 Update Handlers abort with bad documents
 

 Key: SOLR-445
 URL: https://issues.apache.org/jira/browse/SOLR-445
 Project: Solr
  Issue Type: Bug
  Components: update
Affects Versions: 1.3
Reporter: Will Johnson
Assignee: Erick Erickson
 Fix For: 4.0

 Attachments: SOLR-445-3_x.patch, SOLR-445.patch, SOLR-445.patch, 
 SOLR-445.patch, SOLR-445.patch, SOLR-445_3x.patch, solr-445.xml


 Has anyone run into the problem of handling bad documents / failures mid 
 batch.  Ie:
 <add>
   <doc>
     <field name="id">1</field>
   </doc>
   <doc>
     <field name="id">2</field>
     <field name="myDateField">I_AM_A_BAD_DATE</field>
   </doc>
   <doc>
     <field name="id">3</field>
   </doc>
 </add>
 Right now solr adds the first doc and then aborts.  It would seem like it 
 should either fail the entire batch or log a message/return a code and then 
 continue on to add doc 3.  Option 1 would seem to be much harder to 
 accomplish and possibly require more memory while Option 2 would require more 
 information to come back from the API.  I'm about to dig into this but I 
 thought I'd ask to see if anyone had any suggestions, thoughts or comments.   
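Until something like Option 2 exists in Solr itself, the same effect can be approximated on the client. A sketch of that workaround (hypothetical code: the `Indexer` interface stands in for whatever client call, e.g. a SolrJ add, actually performs the update):

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.Map;

// Client-side fallback: send the whole batch, and if it is rejected,
// retry the documents one at a time so the good ones still get indexed
// and the bad ones are reported instead of aborting the rest.
public class PerDocFallback {

    interface Indexer {
        void add(List<Map<String, String>> docs) throws Exception;
    }

    static List<Map<String, String>> addWithFallback(
            Indexer indexer, List<Map<String, String>> batch) {
        List<Map<String, String>> failed = new ArrayList<>();
        try {
            indexer.add(batch);                 // fast path: one round trip
        } catch (Exception batchError) {
            for (Map<String, String> doc : batch) {
                try {
                    // isolate each document so one bad field value
                    // cannot abort the documents after it
                    indexer.add(Collections.singletonList(doc));
                } catch (Exception docError) {
                    failed.add(doc);            // report instead of aborting
                }
            }
        }
        return failed;
    }
}
```

The cost is one request per document in the failure case, which is why a server-side fix reporting per-document errors would be preferable.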
  

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira



-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Updated] (SOLR-445) Update Handlers abort with bad documents

2012-03-20 Thread Erick Erickson (Updated) (JIRA)

 [ 
https://issues.apache.org/jira/browse/SOLR-445?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Erick Erickson updated SOLR-445:


  Assignee: (was: Erick Erickson)
Issue Type: Improvement  (was: Bug)

 Update Handlers abort with bad documents
 

 Key: SOLR-445
 URL: https://issues.apache.org/jira/browse/SOLR-445
 Project: Solr
  Issue Type: Improvement
  Components: update
Affects Versions: 1.3
Reporter: Will Johnson
 Fix For: 4.0

 Attachments: SOLR-445-3_x.patch, SOLR-445.patch, SOLR-445.patch, 
 SOLR-445.patch, SOLR-445.patch, SOLR-445_3x.patch, solr-445.xml


 Has anyone run into the problem of handling bad documents / failures mid 
 batch.  Ie:
 <add>
   <doc>
     <field name="id">1</field>
   </doc>
   <doc>
     <field name="id">2</field>
     <field name="myDateField">I_AM_A_BAD_DATE</field>
   </doc>
   <doc>
     <field name="id">3</field>
   </doc>
 </add>
 Right now solr adds the first doc and then aborts.  It would seem like it 
 should either fail the entire batch or log a message/return a code and then 
 continue on to add doc 3.  Option 1 would seem to be much harder to 
 accomplish and possibly require more memory while Option 2 would require more 
 information to come back from the API.  I'm about to dig into this but I 
 thought I'd ask to see if anyone had any suggestions, thoughts or comments.   
  

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira



-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



Re: Using term offsets for hit highlighting

2012-03-20 Thread Erick Erickson
Yep, the first challenge is always getting the old patch(es) to apply.

On Tue, Mar 20, 2012 at 4:09 AM, Alan Woodward
alan.woodw...@romseysoftware.co.uk wrote:
 Thanks for all the offers of help!  It looks as though most of the hard work 
 has already been done, which is exactly where I like to pick up projects.  :-)

 Maybe the best place to start would be for me to rebase the branch against 
 trunk, and see what still fits?  I think there have been some fairly major 
 changes in the internals since July last year.

 On 19 Mar 2012, at 17:07, Mike Sokolov wrote:

 I posted a patch with a Collector somewhat similar to what you described, 
 Alan - it's attached to one of the sub-issues 
 https://issues.apache.org/jira/browse/LUCENE-3318.   It is in a fairly 
 complete alpha state, but has seen no production use of course, since it 
 relies on the remainder of the unfinished work in that branch.  It works by 
 creating a TokenStream based on match positions returned from the query and 
 passing that to the existing Highlighter.  Please feel free to get in touch 
 if you decide to look into that and have questions.


 -Mike

 On 03/19/2012 11:51 AM, Simon Willnauer wrote:
 On Mon, Mar 19, 2012 at 4:50 PM, Uwe Schindler <u...@thetaphi.de> wrote:

 Have you marked that for GSOC? Would be a good idea!

  yes I did

 -
 Uwe Schindler
 H.-H.-Meier-Allee 63, D-28213 Bremen
 http://www.thetaphi.de
 eMail: u...@thetaphi.de



 -Original Message-
 From: Simon Willnauer [mailto:simon.willna...@googlemail.com]
 Sent: Monday, March 19, 2012 4:43 PM
 To: dev@lucene.apache.org
 Subject: Re: Using term offsets for hit highlighting

 Alan, you made my day!

 The branch is kind of outdated but I looked at it lately and I can 
 certainly help
 to get it up to speed. The feature in that branch is quite a big one and 
 it's in a
 very early stage. Still I want to encourage you to take a look and work 
 on it. I
 promise all my help with the issues!

 let me know if you have questions!

 simon

 On Mon, Mar 19, 2012 at 3:52 PM, Alan Woodward
 alan.woodw...@romseysoftware.co.uk  wrote:

 Cool, thanks Robert.  I'll take a look at the JIRA ticket.

 On 19 Mar 2012, at 14:44, Robert Muir wrote:


 On Mon, Mar 19, 2012 at 10:38 AM, Alan Woodward
 alan.woodw...@romseysoftware.co.uk  wrote:

 Hello,

 The project I'm currently working on requires the reporting of exact
 hit positions from some pretty hairy queries, not all of which are
 covered by the existing highlighter modules.  I'm working round this
 by translating everything into SpanQueries, and using the getSpans()
 method to locate hits (I've extended the Spans interface to make
 term offsets available - see
 https://issues.apache.org/jira/browse/LUCENE-3826).  This works for
 our use-case, but isn't terribly efficient, and obviously isn't applicable to 
 non-Span queries.

 I've seen a bit of chatter on the list about using term offsets to
 provide accurate highlighting in Lucene.  I'm going to have a couple
 of weeks free in April, and I thought I might have a go at
 implementing this.  Mainly I'm wondering if there's already been
 thoughts about how to do it.  My current thoughts are to somehow
 extend the Weight and Scorer interface to make term offsets
 available; to get highlights for a given set of documents, you'd
 essentially run the query again, with a filter on just the documents
 you want highlighted, and have a custom collector that gets the term 
 offsets in place of the scores.


 Hi Alan, Simon started some initial work on
 https://issues.apache.org/jira/browse/LUCENE-2878

 Some work and prototypes were done in a branch, but it might be
 lagging behind trunk a bit.

 Additionally at the time it was first done, I think we didn't yet
 support offsets in the postings lists.
 We've since added this and several codecs support it.

 --
 lucidimagination.com

 -
 To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For
 additional commands, e-mail: dev-h...@lucene.apache.org



 -
 To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For
 additional commands, e-mail: dev-h...@lucene.apache.org


 -
 To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional
 commands, e-mail: dev-h...@lucene.apache.org


 -
 To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
 For additional commands, e-mail: dev-h...@lucene.apache.org



 -
 To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
 For additional commands, e-mail: dev-h...@lucene.apache.org



 -
 To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
 For additional 

[jira] [Updated] (SOLR-3256) Distributed search throws NPE when using fl=score

2012-03-20 Thread Tomás Fernández Löbbe (Updated) (JIRA)

 [ 
https://issues.apache.org/jira/browse/SOLR-3256?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Tomás Fernández Löbbe updated SOLR-3256:


Attachment: SOLR-3256.patch

It's strange: it seems to depend on the order of the fl parameters.
http://localhost:8983/solr/select?q=*:*&shards=localhost:8983/solr&fl=id&fl=cat&fl=price

shows only the id,

http://localhost:8983/solr/select?q=*:*&shards=localhost:8983/solr&fl=cat&fl=id&fl=price

shows id and cat and
http://localhost:8983/solr/select?q=*:*&shards=localhost:8983/solr&fl=price&fl=cat&fl=id
shows price and id.

I'm attaching a patch that demonstrates the failure with a test case.

 Distributed search throws NPE when using fl=score
 -

 Key: SOLR-3256
 URL: https://issues.apache.org/jira/browse/SOLR-3256
 Project: Solr
  Issue Type: Bug
Reporter: Tomás Fernández Löbbe
Priority: Minor
 Fix For: 4.0

 Attachments: SOLR-3256.patch


 Steps to reproduce the problem:
 Start two Solr instances (may use the example configuration)
 add some documents to both instances
 execute a query like: 
 http://localhost:8983/solr/select?shards=localhost:8983/solr,localhost:8984/solr&q=(ipod%20OR%20display)&*fl=score*
 Expected result:
 List of scores or at least an exception saying that this request is not 
 supported (may not make too much sense to do fl=score, but a descriptive 
 exception can help debug the problem)
 Getting:
 SEVERE: null:java.lang.NullPointerException
   at 
 org.apache.solr.handler.component.QueryComponent.returnFields(QueryComponent.java:985)
   at 
 org.apache.solr.handler.component.QueryComponent.handleRegularResponses(QueryComponent.java:637)
   at 
 org.apache.solr.handler.component.QueryComponent.handleResponses(QueryComponent.java:612)
   at 
 org.apache.solr.handler.component.SearchHandler.handleRequestBody(SearchHandler.java:307)
   at 
 org.apache.solr.handler.RequestHandlerBase.handleRequest(RequestHandlerBase.java:129)
   at org.apache.solr.core.SolrCore.execute(SolrCore.java:1540)
   at 
 org.apache.solr.servlet.SolrDispatchFilter.execute(SolrDispatchFilter.java:435)
   at 
 org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:256)
   at 
 org.eclipse.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1337)
   at 
 org.eclipse.jetty.servlet.ServletHandler.doHandle(ServletHandler.java:484)
   at 
 org.eclipse.jetty.server.handler.ScopedHandler.handle(ScopedHandler.java:119)
   at 
 org.eclipse.jetty.security.SecurityHandler.handle(SecurityHandler.java:524)
   at 
 org.eclipse.jetty.server.session.SessionHandler.doHandle(SessionHandler.java:233)
   at 
 org.eclipse.jetty.server.handler.ContextHandler.doHandle(ContextHandler.java:1065)
   at 
 org.eclipse.jetty.servlet.ServletHandler.doScope(ServletHandler.java:413)
   at 
 org.eclipse.jetty.server.session.SessionHandler.doScope(SessionHandler.java:192)
   at 
 org.eclipse.jetty.server.handler.ContextHandler.doScope(ContextHandler.java:999)
   at 
 org.eclipse.jetty.server.handler.ScopedHandler.handle(ScopedHandler.java:117)
   at 
 org.eclipse.jetty.server.handler.ContextHandlerCollection.handle(ContextHandlerCollection.java:250)
   at 
 org.eclipse.jetty.server.handler.HandlerCollection.handle(HandlerCollection.java:149)
   at 
 org.eclipse.jetty.server.handler.HandlerWrapper.handle(HandlerWrapper.java:111)
   at org.eclipse.jetty.server.Server.handle(Server.java:351)
   at 
 org.eclipse.jetty.server.AbstractHttpConnection.handleRequest(AbstractHttpConnection.java:454)
   at 
 org.eclipse.jetty.server.BlockingHttpConnection.handleRequest(BlockingHttpConnection.java:47)
   at 
 org.eclipse.jetty.server.AbstractHttpConnection.headerComplete(AbstractHttpConnection.java:890)
   at 
 org.eclipse.jetty.server.AbstractHttpConnection$RequestHandler.headerComplete(AbstractHttpConnection.java:944)
   at org.eclipse.jetty.http.HttpParser.parseNext(HttpParser.java:634)
   at org.eclipse.jetty.http.HttpParser.parseAvailable(HttpParser.java:230)
   at 
 org.eclipse.jetty.server.BlockingHttpConnection.handle(BlockingHttpConnection.java:66)
   at 
 org.eclipse.jetty.server.bio.SocketConnector$ConnectorEndPoint.run(SocketConnector.java:254)
   at 
 org.eclipse.jetty.util.thread.QueuedThreadPool.runJob(QueuedThreadPool.java:599)
   at 
 org.eclipse.jetty.util.thread.QueuedThreadPool$3.run(QueuedThreadPool.java:534)
   at java.lang.Thread.run(Thread.java:636)

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: 

[jira] [Created] (SOLR-3257) Dedupe update.chain example should include DistribtedUpdateProcessorFactory

2012-03-20 Thread Markus Jelsma (Created) (JIRA)
Dedupe update.chain example should include DistribtedUpdateProcessorFactory
---

 Key: SOLR-3257
 URL: https://issues.apache.org/jira/browse/SOLR-3257
 Project: Solr
  Issue Type: Bug
 Environment: solr-impl 4.0-SNAPSHOT 1302403 - markus - 2012-03-19 
13:55:51 
Reporter: Markus Jelsma
Priority: Trivial
 Fix For: 4.0


Enabling the default dedupe update processor chain breaks distributed indexing 
because DistributedUpdateProcessorFactory is missing in the update chain.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira



-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Updated] (SOLR-3257) Dedupe update.chain example should include DistribtedUpdateProcessorFactory

2012-03-20 Thread Markus Jelsma (Updated) (JIRA)

 [ 
https://issues.apache.org/jira/browse/SOLR-3257?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Markus Jelsma updated SOLR-3257:


Attachment: SOLR-3257-4.0-1.patch

Patch for trunk adding the update processor to the chain in solrconfig.

 Dedupe update.chain example should include DistribtedUpdateProcessorFactory
 ---

 Key: SOLR-3257
 URL: https://issues.apache.org/jira/browse/SOLR-3257
 Project: Solr
  Issue Type: Bug
 Environment: solr-impl 4.0-SNAPSHOT 1302403 - markus - 2012-03-19 
 13:55:51 
Reporter: Markus Jelsma
Priority: Trivial
 Fix For: 4.0

 Attachments: SOLR-3257-4.0-1.patch


 Enabling the default dedupe update processor chain breaks distributed 
 indexing because DistributedUpdateProcessorFactory is missing in the update 
 chain.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira



-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Resolved] (SOLR-3073) Distributed Grouping fails if the uniqueKey is a UUID

2012-03-20 Thread Martijn van Groningen (Resolved) (JIRA)

 [ 
https://issues.apache.org/jira/browse/SOLR-3073?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Martijn van Groningen resolved SOLR-3073.
-

Resolution: Fixed

Actual error is fixed, so this issue is resolved.

 Distributed Grouping fails if the uniqueKey is a UUID
 -

 Key: SOLR-3073
 URL: https://issues.apache.org/jira/browse/SOLR-3073
 Project: Solr
  Issue Type: Bug
Affects Versions: 3.5, 4.0
Reporter: Devon Krisman
Assignee: Martijn van Groningen
Priority: Minor
 Fix For: 3.6, 4.0

 Attachments: SOLR-3073-3x.patch, SOLR-3073-3x.patch


 Attempting to use distributed grouping (using a StrField as the 
 group.fieldname) with a UUID as the uniqueKey results in an error because the 
 classname (java.util.UUID) is prepended to the field value during the second 
 phase of the grouping.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira



-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Commented] (SOLR-3256) Distributed search throws NPE when using fl=score

2012-03-20 Thread Luca Cavanna (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-3256?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13233381#comment-13233381
 ] 

Luca Cavanna commented on SOLR-3256:


Regarding the legacy behavior fl=score, which was equal to fl=*,score: it was 
removed from trunk a few weeks ago (SOLR-2712).

 Distributed search throws NPE when using fl=score
 -

 Key: SOLR-3256
 URL: https://issues.apache.org/jira/browse/SOLR-3256
 Project: Solr
  Issue Type: Bug
Reporter: Tomás Fernández Löbbe
Priority: Minor
 Fix For: 4.0

 Attachments: SOLR-3256.patch


 Steps to reproduce the problem:
 Start two Solr instances (may use the example configuration)
 add some documents to both instances
 execute a query like: 
 http://localhost:8983/solr/select?shards=localhost:8983/solr,localhost:8984/solr&q=(ipod%20OR%20display)&*fl=score*
 Expected result:
 List of scores or at least an exception saying that this request is not 
 supported (may not make too much sense to do fl=score, but a descriptive 
 exception can help debug the problem)
 Getting:
 SEVERE: null:java.lang.NullPointerException
   at 
 org.apache.solr.handler.component.QueryComponent.returnFields(QueryComponent.java:985)
   at 
 org.apache.solr.handler.component.QueryComponent.handleRegularResponses(QueryComponent.java:637)
   at 
 org.apache.solr.handler.component.QueryComponent.handleResponses(QueryComponent.java:612)
   at 
 org.apache.solr.handler.component.SearchHandler.handleRequestBody(SearchHandler.java:307)
   at 
 org.apache.solr.handler.RequestHandlerBase.handleRequest(RequestHandlerBase.java:129)
   at org.apache.solr.core.SolrCore.execute(SolrCore.java:1540)
   at 
 org.apache.solr.servlet.SolrDispatchFilter.execute(SolrDispatchFilter.java:435)
   at 
 org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:256)
   at 
 org.eclipse.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1337)
   at 
 org.eclipse.jetty.servlet.ServletHandler.doHandle(ServletHandler.java:484)
   at 
 org.eclipse.jetty.server.handler.ScopedHandler.handle(ScopedHandler.java:119)
   at 
 org.eclipse.jetty.security.SecurityHandler.handle(SecurityHandler.java:524)
   at 
 org.eclipse.jetty.server.session.SessionHandler.doHandle(SessionHandler.java:233)
   at 
 org.eclipse.jetty.server.handler.ContextHandler.doHandle(ContextHandler.java:1065)
   at 
 org.eclipse.jetty.servlet.ServletHandler.doScope(ServletHandler.java:413)
   at 
 org.eclipse.jetty.server.session.SessionHandler.doScope(SessionHandler.java:192)
   at 
 org.eclipse.jetty.server.handler.ContextHandler.doScope(ContextHandler.java:999)
   at 
 org.eclipse.jetty.server.handler.ScopedHandler.handle(ScopedHandler.java:117)
   at 
 org.eclipse.jetty.server.handler.ContextHandlerCollection.handle(ContextHandlerCollection.java:250)
   at 
 org.eclipse.jetty.server.handler.HandlerCollection.handle(HandlerCollection.java:149)
   at 
 org.eclipse.jetty.server.handler.HandlerWrapper.handle(HandlerWrapper.java:111)
   at org.eclipse.jetty.server.Server.handle(Server.java:351)
   at 
 org.eclipse.jetty.server.AbstractHttpConnection.handleRequest(AbstractHttpConnection.java:454)
   at 
 org.eclipse.jetty.server.BlockingHttpConnection.handleRequest(BlockingHttpConnection.java:47)
   at 
 org.eclipse.jetty.server.AbstractHttpConnection.headerComplete(AbstractHttpConnection.java:890)
   at 
 org.eclipse.jetty.server.AbstractHttpConnection$RequestHandler.headerComplete(AbstractHttpConnection.java:944)
   at org.eclipse.jetty.http.HttpParser.parseNext(HttpParser.java:634)
   at org.eclipse.jetty.http.HttpParser.parseAvailable(HttpParser.java:230)
   at 
 org.eclipse.jetty.server.BlockingHttpConnection.handle(BlockingHttpConnection.java:66)
   at 
 org.eclipse.jetty.server.bio.SocketConnector$ConnectorEndPoint.run(SocketConnector.java:254)
   at 
 org.eclipse.jetty.util.thread.QueuedThreadPool.runJob(QueuedThreadPool.java:599)
   at 
 org.eclipse.jetty.util.thread.QueuedThreadPool$3.run(QueuedThreadPool.java:534)
   at java.lang.Thread.run(Thread.java:636)

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira



-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Updated] (SOLR-2747) Include formatted Changes.html for release

2012-03-20 Thread Martijn van Groningen (Updated) (JIRA)

 [ 
https://issues.apache.org/jira/browse/SOLR-2747?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Martijn van Groningen updated SOLR-2747:


Fix Version/s: (was: 3.6)

Removed 3.6 version.

 Include formatted Changes.html for release
 --

 Key: SOLR-2747
 URL: https://issues.apache.org/jira/browse/SOLR-2747
 Project: Solr
  Issue Type: Improvement
Reporter: Martijn van Groningen
Priority: Minor
 Fix For: 4.0


 Just like when releasing Lucene, Solr should also have an HTML-formatted 
 changes file.
 The Lucene Perl script (lucene/src/site/changes/changes2html.pl) should be 
 reused.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira



-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Commented] (SOLR-2712) Deprecate fl=score behavior.

2012-03-20 Thread Mark Miller (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-2712?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13233390#comment-13233390
 ] 

Mark Miller commented on SOLR-2712:
---

There was something missed here. I'll fix it when I fix the multiple fl's not 
being treated right in distributed search; around that same code there is 
still logic that expects this.

 Deprecate fl=score behavior.  
 --

 Key: SOLR-2712
 URL: https://issues.apache.org/jira/browse/SOLR-2712
 Project: Solr
  Issue Type: Task
Reporter: Ryan McKinley
Assignee: Ryan McKinley
 Fix For: 3.6, 4.0


 SOLR-2657 points out that all fields show up when you request score and 
 something besides a 'normal' field.  To support the strange behavior and 
 avoid it when unnecessary we have this:
 {code:java}
 if( fields.size() == 1 && _wantsScore && augmenters.size() == 1 && globs.isEmpty() ) {
   _wantsAllFields = true;
 }
 {code}
 I suggest we advertise in 3.x that expecting *fl=score* to return all fields 
 is deprecated, and remove this bit of crazy code from 4.x

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira



-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Updated] (SOLR-2725) TieredMergePolicy and expungeDeletes behaviour

2012-03-20 Thread Martijn van Groningen (Updated) (JIRA)

 [ 
https://issues.apache.org/jira/browse/SOLR-2725?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Martijn van Groningen updated SOLR-2725:


Affects Version/s: 3.6
   3.4
   3.5
Fix Version/s: (was: 3.6)

Removed 3.6 from fix versions.

 TieredMergePolicy and expungeDeletes behaviour
 --

 Key: SOLR-2725
 URL: https://issues.apache.org/jira/browse/SOLR-2725
 Project: Solr
  Issue Type: Bug
Affects Versions: 3.3, 3.4, 3.5, 3.6
Reporter: Martijn van Groningen
 Fix For: 4.0


 While executing a commit with expungeDeletes I noticed that ~30 segments 
 with deletes were still left after the commit finished.
 After looking in the SolrIndexConfig class I noticed that 
 TieredMergePolicy#setExpungeDeletesPctAllowed isn't invoked.
 I think the following statement in the SolrIndexConfig#buildMergePolicy 
 method would purge all deletes:
 {code}
 tieredMergePolicy.setExpungeDeletesPctAllowed(0);
 {code} 
 This also reflects the behavior of Solr 3.1 / 3.2.
 After some discussion on IRC it turned out that always setting 
 expungeDeletesPctAllowed to zero isn't best for performance:
 http://colabti.org/irclogger/irclogger_log/lucene-dev?date=2011-08-20#l120
 I think we should add an option to solrconfig.xml that allows users to set 
 this option to whatever value is best for them:
 {code:xml}
 <expungeDeletesPctAllowed>0</expungeDeletesPctAllowed>
 {code}
 Also having a expungeDeletesPctAllowed per commit command would be great:
 {code:xml}
 <commit waitFlush="false" waitSearcher="false" expungeDeletes="true" 
 expungeDeletesPctAllowed="0"/>
 {code}
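To make the proposed threshold concrete, here is a hedged Python sketch (not Lucene's implementation) of how an expungeDeletesPctAllowed value would decide which segments get rewritten:

```python
# Hedged sketch (not Lucene's code): effect of expungeDeletesPctAllowed.
# A segment is only rewritten by expungeDeletes when its percentage of
# deleted documents exceeds the allowed threshold; a threshold of 0 forces
# every segment with any deletes to be rewritten.
def segments_to_expunge(segments, pct_allowed):
    """segments: list of (num_docs, num_deleted) tuples."""
    selected = []
    for num_docs, num_deleted in segments:
        pct_deleted = 100.0 * num_deleted / num_docs
        if num_deleted > 0 and pct_deleted > pct_allowed:
            selected.append((num_docs, num_deleted))
    return selected

segs = [(1000, 5), (1000, 200), (1000, 0)]
print(len(segments_to_expunge(segs, 10.0)))  # 1: only the 20%-deleted segment
print(len(segments_to_expunge(segs, 0.0)))   # 2: every segment with deletes
```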




[jira] [Created] (SOLR-3258) Ping query caused exception..Invalid version (expected 2, but 60) or the data in not in 'javabin' format

2012-03-20 Thread Markus Jelsma (Created) (JIRA)
Ping query caused exception..Invalid version (expected 2, but 60) or the data 
in not in 'javabin' format


 Key: SOLR-3258
 URL: https://issues.apache.org/jira/browse/SOLR-3258
 Project: Solr
  Issue Type: Bug
 Environment: solr-impl 4.0-SNAPSHOT 1302403 - markus - 2012-03-19 
13:55:51 
Reporter: Markus Jelsma
 Fix For: 4.0


In a test set-up with nodes=2, shards=3 and cores=6 we often see this exception 
in the logs. Once every few ping requests this is thrown; other requests return 
a proper OK.

Ping request handler:

{code}
<requestHandler name="/admin/ping" class="solr.PingRequestHandler">
  <lst name="invariants">
    <str name="qt">select</str>
    <str name="q">*:*</str>
    <int name="rows">0</int>
  </lst>
  <lst name="defaults">
    <str name="wt">json</str>
    <str name="echoParams">all</str>
    <bool name="omitHeader">true</bool>
  </lst>
</requestHandler>
{code}

Exception:

{code}
2012-03-20 13:16:06,405 INFO [solr.core.SolrCore] - [http-80-18] - : [core_a] 
webapp=/solr path=/admin/ping params={} status=500 QTime=7 
2012-03-20 13:16:06,406 ERROR [solr.servlet.SolrDispatchFilter] - [http-80-18] 
- : null:org.apache.solr.common.SolrException: Ping query caused exception: 
org.apache.solr.client.solrj.SolrServerException: java.lang.RuntimeException: 
Invalid version (expected 2, but 60) or the data in not in 'javabin' format
at 
org.apache.solr.handler.PingRequestHandler.handleRequestBody(PingRequestHandler.java:77)
at 
org.apache.solr.handler.RequestHandlerBase.handleRequest(RequestHandlerBase.java:129)
at org.apache.solr.core.SolrCore.execute(SolrCore.java:1540)
at 
org.apache.solr.servlet.SolrDispatchFilter.execute(SolrDispatchFilter.java:435)
at 
org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:256)
at 
org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:235)
at 
org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:206)
at 
org.apache.catalina.core.StandardWrapperValve.invoke(StandardWrapperValve.java:233)
at 
org.apache.catalina.core.StandardContextValve.invoke(StandardContextValve.java:191)
at 
org.apache.catalina.core.StandardHostValve.invoke(StandardHostValve.java:127)
at 
org.apache.catalina.valves.ErrorReportValve.invoke(ErrorReportValve.java:102)
at 
org.apache.catalina.core.StandardEngineValve.invoke(StandardEngineValve.java:109)
at 
org.apache.catalina.connector.CoyoteAdapter.service(CoyoteAdapter.java:293)
at 
org.apache.coyote.http11.Http11Processor.process(Http11Processor.java:859)
at 
org.apache.coyote.http11.Http11Protocol$Http11ConnectionHandler.process(Http11Protocol.java:602)
at 
org.apache.tomcat.util.net.JIoEndpoint$Worker.run(JIoEndpoint.java:489)
at java.lang.Thread.run(Thread.java:662)
Caused by: org.apache.solr.common.SolrException: 
org.apache.solr.client.solrj.SolrServerException: java.lang.RuntimeException: 
Invalid version (expected 2, but 60) or the data in not in 'javabin' format
at 
org.apache.solr.handler.component.SearchHandler.handleRequestBody(SearchHandler.java:298)
at 
org.apache.solr.handler.RequestHandlerBase.handleRequest(RequestHandlerBase.java:129)
at org.apache.solr.core.SolrCore.execute(SolrCore.java:1540)
at 
org.apache.solr.handler.PingRequestHandler.handleRequestBody(PingRequestHandler.java:68)
... 16 more
Caused by: org.apache.solr.client.solrj.SolrServerException: 
java.lang.RuntimeException: Invalid version (expected 2, but 60) or the data in 
not in 'javabin' format
at 
org.apache.solr.client.solrj.impl.LBHttpSolrServer.request(LBHttpSolrServer.java:278)
at 
org.apache.solr.handler.component.HttpShardHandler$1.call(HttpShardHandler.java:158)
at 
org.apache.solr.handler.component.HttpShardHandler$1.call(HttpShardHandler.java:123)
at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:303)
at java.util.concurrent.FutureTask.run(FutureTask.java:138)
at 
java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:441)
at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:303)
at java.util.concurrent.FutureTask.run(FutureTask.java:138)
at 
java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
at 
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
... 1 more
Caused by: java.lang.RuntimeException: Invalid version (expected 2, but 60) or 
the data in not in 'javabin' format
at 
org.apache.solr.common.util.JavaBinCodec.unmarshal(JavaBinCodec.java:109)
at 

[jira] [Resolved] (SOLR-2764) Create a NorwegianLightStemmer and NorwegianMinimalStemmer

2012-03-20 Thread Jan Høydahl (Resolved) (JIRA)

 [ 
https://issues.apache.org/jira/browse/SOLR-2764?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jan Høydahl resolved SOLR-2764.
---

Resolution: Fixed

Committed to trunk and branch_3x

 Create a NorwegianLightStemmer and NorwegianMinimalStemmer
 --

 Key: SOLR-2764
 URL: https://issues.apache.org/jira/browse/SOLR-2764
 Project: Solr
  Issue Type: New Feature
  Components: Schema and Analysis
Reporter: Jan Høydahl
Assignee: Jan Høydahl
 Fix For: 3.6, 4.0

 Attachments: SOLR-2764.patch, SOLR-2764.patch, SOLR-2764.patch, 
 SOLR-2764.patch, SOLR-2764.patch


 We need a simple light-weight stemmer and a minimal stemmer for 
 plural/singular only in Norwegian.




[jira] [Updated] (SOLR-3258) Ping query caused exception..Invalid version (expected 2, but 60) or the data in not in 'javabin' format

2012-03-20 Thread Dawid Weiss (Updated) (JIRA)

 [ 
https://issues.apache.org/jira/browse/SOLR-3258?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Dawid Weiss updated SOLR-3258:
--

Attachment: debugging.patch

I once tried to debug it but couldn't reproduce it. It does happen from time to 
time on my build machine though.

'60' is ASCII for '<' so I guess something weird is being emitted. Can you 
apply the attached patch and try to trigger this, Markus?
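A hedged Python sketch (not Solr's actual JavaBinCodec) of why the message says "expected 2, but 60": a javabin payload starts with a one-byte version marker, and an HTML error page starts with '<', whose ASCII code is 60:

```python
# Hedged sketch (not Solr's actual JavaBinCodec): a javabin response begins
# with a one-byte version marker (2). If a shard answers with an HTML error
# page instead, the first byte is '<', i.e. ASCII 60 - exactly the number
# seen in the exception message.
JAVABIN_VERSION = 2

def check_version(payload: bytes) -> None:
    version = payload[0]
    if version != JAVABIN_VERSION:
        raise RuntimeError(
            "Invalid version (expected %d, but %d) "
            "or the data in not in 'javabin' format"
            % (JAVABIN_VERSION, version)
        )

check_version(bytes([2]) + b"payload")  # OK: looks like javabin
try:
    check_version(b"<html><head>...")   # Tomcat error page
except RuntimeError as e:
    print(e)
```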

 Ping query caused exception..Invalid version (expected 2, but 60) or the data 
 in not in 'javabin' format
 

 Key: SOLR-3258
 URL: https://issues.apache.org/jira/browse/SOLR-3258
 Project: Solr
  Issue Type: Bug
 Environment: solr-impl 4.0-SNAPSHOT 1302403 - markus - 2012-03-19 
 13:55:51 
Reporter: Markus Jelsma
 Fix For: 4.0

 Attachments: debugging.patch


 In a test set-up with nodes=2, shards=3 and cores=6 we often see this 
 exception in the logs. Once every few ping requests this is thrown, other 
 request return a proper OK.
 Ping request handler:
 {code}
 <requestHandler name="/admin/ping" class="solr.PingRequestHandler">
   <lst name="invariants">
     <str name="qt">select</str>
     <str name="q">*:*</str>
     <int name="rows">0</int>
   </lst>
   <lst name="defaults">
     <str name="wt">json</str>
     <str name="echoParams">all</str>
     <bool name="omitHeader">true</bool>
   </lst>
 </requestHandler>
 {code}
 Exception:
 {code}
 2012-03-20 13:16:06,405 INFO [solr.core.SolrCore] - [http-80-18] - : [core_a] 
 webapp=/solr path=/admin/ping params={} status=500 QTime=7 
 2012-03-20 13:16:06,406 ERROR [solr.servlet.SolrDispatchFilter] - 
 [http-80-18] - : null:org.apache.solr.common.SolrException: Ping query caused 
 exception: org.apache.solr.client.solrj.SolrServerException: 
 java.lang.RuntimeException: Invalid version (expected 2, but 60) or the data 
 in not in 'javabin' format
 at 
 org.apache.solr.handler.PingRequestHandler.handleRequestBody(PingRequestHandler.java:77)
 at 
 org.apache.solr.handler.RequestHandlerBase.handleRequest(RequestHandlerBase.java:129)
 at org.apache.solr.core.SolrCore.execute(SolrCore.java:1540)
 at 
 org.apache.solr.servlet.SolrDispatchFilter.execute(SolrDispatchFilter.java:435)
 at 
 org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:256)
 at 
 org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:235)
 at 
 org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:206)
 at 
 org.apache.catalina.core.StandardWrapperValve.invoke(StandardWrapperValve.java:233)
 at 
 org.apache.catalina.core.StandardContextValve.invoke(StandardContextValve.java:191)
 at 
 org.apache.catalina.core.StandardHostValve.invoke(StandardHostValve.java:127)
 at 
 org.apache.catalina.valves.ErrorReportValve.invoke(ErrorReportValve.java:102)
 at 
 org.apache.catalina.core.StandardEngineValve.invoke(StandardEngineValve.java:109)
 at 
 org.apache.catalina.connector.CoyoteAdapter.service(CoyoteAdapter.java:293)
 at 
 org.apache.coyote.http11.Http11Processor.process(Http11Processor.java:859)
 at 
 org.apache.coyote.http11.Http11Protocol$Http11ConnectionHandler.process(Http11Protocol.java:602)
 at 
 org.apache.tomcat.util.net.JIoEndpoint$Worker.run(JIoEndpoint.java:489)
 at java.lang.Thread.run(Thread.java:662)
 Caused by: org.apache.solr.common.SolrException: 
 org.apache.solr.client.solrj.SolrServerException: java.lang.RuntimeException: 
 Invalid version (expected 2, but 60) or the data in not in 'javabin' format
 at 
 org.apache.solr.handler.component.SearchHandler.handleRequestBody(SearchHandler.java:298)
 at 
 org.apache.solr.handler.RequestHandlerBase.handleRequest(RequestHandlerBase.java:129)
 at org.apache.solr.core.SolrCore.execute(SolrCore.java:1540)
 at 
 org.apache.solr.handler.PingRequestHandler.handleRequestBody(PingRequestHandler.java:68)
 ... 16 more
 Caused by: org.apache.solr.client.solrj.SolrServerException: 
 java.lang.RuntimeException: Invalid version (expected 2, but 60) or the data 
 in not in 'javabin' format
 at 
 org.apache.solr.client.solrj.impl.LBHttpSolrServer.request(LBHttpSolrServer.java:278)
 at 
 org.apache.solr.handler.component.HttpShardHandler$1.call(HttpShardHandler.java:158)
 at 
 org.apache.solr.handler.component.HttpShardHandler$1.call(HttpShardHandler.java:123)
 at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:303)
 at java.util.concurrent.FutureTask.run(FutureTask.java:138)
 at 
 java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:441)
 at 

[jira] [Created] (SOLR-3259) Solr 4 aesthetics

2012-03-20 Thread Yonik Seeley (Created) (JIRA)
Solr 4 aesthetics
-

 Key: SOLR-3259
 URL: https://issues.apache.org/jira/browse/SOLR-3259
 Project: Solr
  Issue Type: New Feature
Reporter: Yonik Seeley
 Fix For: 4.0


Solr 4 will be a huge new release... we should take this opportunity to improve 
the out-of-the-box experience.




[jira] [Commented] (SOLR-3259) Solr 4 aesthetics

2012-03-20 Thread Yonik Seeley (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-3259?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13233402#comment-13233402
 ] 

Yonik Seeley commented on SOLR-3259:


Some ideas:
 - our fieldType list has grown *huge*... we should probably move the field 
list to the top of the file where it's easier to find
 - the preference for JSON over XML seems to be continuing - we should make 
things more JSON oriented by adding a /query handler that defaults to wt=json 
and perhaps indent=true
 - the concept of an example server that you must configure yourself has 
become less than ideal... perhaps we should just create a server directory 
(but leave things like exampledocs under example)
 - some new JSON based example docs that aren't based on electronics from '05  
(or as an alternative for certain quickstart guides, start off with a curl 
command to add some data rather than trying to shove it all in exampledocs)

 Solr 4 aesthetics
 -

 Key: SOLR-3259
 URL: https://issues.apache.org/jira/browse/SOLR-3259
 Project: Solr
  Issue Type: New Feature
Reporter: Yonik Seeley
 Fix For: 4.0


 Solr 4 will be a huge new release... we should take this opportunity to 
 improve the out-of-the-box experience.




[jira] [Commented] (SOLR-3258) Ping query caused exception..Invalid version (expected 2, but 60) or the data in not in 'javabin' format

2012-03-20 Thread Markus Jelsma (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-3258?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13233405#comment-13233405
 ] 

Markus Jelsma commented on SOLR-3258:
-

I've redeployed with your patch. This is very peculiar indeed! The stack trace 
shows a ping failing and, at the bottom, some that work well. I've also noticed 
the /select handler not being there, so I did a manual request on 
/select?q=*:* and I _sometimes_ get the same error. Some work, some don't.

Does this help a bit?

{code}
2012-03-20 13:45:30,352 INFO [solr.core.SolrCore] - [http-80-17] - : [] 
webapp=/solr path=/admin/ping params={} status=500 QTime=7 
2012-03-20 13:45:30,352 ERROR [solr.servlet.SolrDispatchFilter] - [http-80-17] 
- : null:org.apache.solr.common.SolrException: Ping query caused exception: 
org.apache.solr.client.solrj.SolrServerException: java.lang.RuntimeException: 
Invalid version (expected 2, but 60) or the data in not in 'javabin' format, 
input: <html><head><title>Apache Tomcat/6.0.35 - Error 
report</title><style><!--H1 
{font-family:Tahoma,Arial,sans-serif;color:white;background-color:#525D76;font-size:22px;}
 H2 
{font-family:Tahoma,Arial,sans-serif;color:white;background-color:#525D76;font-size:16px;}
 H3 
{font-family:Tahoma,Arial,sans-serif;color:white;background-color:#525D76;font-size:14px;}
 BODY {font-family:Tahoma,Arial,sans-serif;color:black;background-color:white;} 
B {font-family:Tahoma,Arial,sans-serif;color:white;background-color:#525D76;} P 
{font-family:Tahoma,Arial,sans-serif;background:white;color:black;font-size:12px;}A
 {color : black;}A.name {color : black;}HR {color : #525D76;}--></style> 
</head><body><h1>HTTP Status 404 - /solr/select</h1><HR size="1" 
noshade="noshade"><p><b>type</b> Status report</p><p><b>message</b> 
<u>/solr/select</u></p><p><b>description</b> <u>The requested resource 
(/solr/select) is not available.</u></p><HR size="1" 
noshade="noshade"><h3>Apache Tomcat/6.0.35</h3></body></html>
at 
org.apache.solr.handler.PingRequestHandler.handleRequestBody(PingRequestHandler.java:77)
at 
org.apache.solr.handler.RequestHandlerBase.handleRequest(RequestHandlerBase.java:129)
at org.apache.solr.core.SolrCore.execute(SolrCore.java:1540)
at 
org.apache.solr.servlet.SolrDispatchFilter.execute(SolrDispatchFilter.java:435)
at 
org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:256)
at 
org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:235)
at 
org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:206)
at 
org.apache.catalina.core.StandardWrapperValve.invoke(StandardWrapperValve.java:233)
at 
org.apache.catalina.core.StandardContextValve.invoke(StandardContextValve.java:191)
at 
org.apache.catalina.core.StandardHostValve.invoke(StandardHostValve.java:127)
at 
org.apache.catalina.valves.ErrorReportValve.invoke(ErrorReportValve.java:102)
at 
org.apache.catalina.core.StandardEngineValve.invoke(StandardEngineValve.java:109)
at 
org.apache.catalina.connector.CoyoteAdapter.service(CoyoteAdapter.java:293)
at 
org.apache.coyote.http11.Http11Processor.process(Http11Processor.java:859)
at 
org.apache.coyote.http11.Http11Protocol$Http11ConnectionHandler.process(Http11Protocol.java:602)
at 
org.apache.tomcat.util.net.JIoEndpoint$Worker.run(JIoEndpoint.java:489)
at java.lang.Thread.run(Thread.java:662)
Caused by: org.apache.solr.common.SolrException: 
org.apache.solr.client.solrj.SolrServerException: java.lang.RuntimeException: 
Invalid version (expected 2, but 60) or the data in not in 'javabin' format, 
input: <html><head><title>Apache Tomcat/6.0.35 - Error 
report</title><style><!--H1 
{font-family:Tahoma,Arial,sans-serif;color:white;background-color:#525D76;font-size:22px;}
 H2 
{font-family:Tahoma,Arial,sans-serif;color:white;background-color:#525D76;font-size:16px;}
 H3 
{font-family:Tahoma,Arial,sans-serif;color:white;background-color:#525D76;font-size:14px;}
 BODY {font-family:Tahoma,Arial,sans-serif;color:black;background-color:white;} 
B {font-family:Tahoma,Arial,sans-serif;color:white;background-color:#525D76;} P 
{font-family:Tahoma,Arial,sans-serif;background:white;color:black;font-size:12px;}A
 {color : black;}A.name {color : black;}HR {color : #525D76;}--></style> 
</head><body><h1>HTTP Status 404 - /solr/select</h1><HR size="1" 
noshade="noshade"><p><b>type</b> Status report</p><p><b>message</b> 
<u>/solr/select</u></p><p><b>description</b> <u>The requested resource 
(/solr/select) is not available.</u></p><HR size="1" 
noshade="noshade"><h3>Apache Tomcat/6.0.35</h3></body></html>
at 
org.apache.solr.handler.component.SearchHandler.handleRequestBody(SearchHandler.java:298)
at 
org.apache.solr.handler.RequestHandlerBase.handleRequest(RequestHandlerBase.java:129)
at org.apache.solr.core.SolrCore.execute(SolrCore.java:1540)
at 

[jira] [Commented] (SOLR-3258) Ping query caused exception..Invalid version (expected 2, but 60) or the data in not in 'javabin' format

2012-03-20 Thread Dawid Weiss (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-3258?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13233408#comment-13233408
 ] 

Dawid Weiss commented on SOLR-3258:
---

And here comes the moment where my knowledge of Solr ends :) I'd say there is 
definitely a bug in the handling of the HTTP response status (and this should 
be fixed), unless there is a filter somewhere that emits this HTML and fakes 
an HTTP 200... But as for why this happens in general -- no idea.

 Ping query caused exception..Invalid version (expected 2, but 60) or the data 
 in not in 'javabin' format
 

 Key: SOLR-3258
 URL: https://issues.apache.org/jira/browse/SOLR-3258
 Project: Solr
  Issue Type: Bug
 Environment: solr-impl 4.0-SNAPSHOT 1302403 - markus - 2012-03-19 
 13:55:51 
Reporter: Markus Jelsma
 Fix For: 4.0

 Attachments: debugging.patch


 In a test set-up with nodes=2, shards=3 and cores=6 we often see this 
 exception in the logs. Once every few ping requests this is thrown, other 
 request return a proper OK.
 Ping request handler:
 {code}
 <requestHandler name="/admin/ping" class="solr.PingRequestHandler">
   <lst name="invariants">
     <str name="qt">select</str>
     <str name="q">*:*</str>
     <int name="rows">0</int>
   </lst>
   <lst name="defaults">
     <str name="wt">json</str>
     <str name="echoParams">all</str>
     <bool name="omitHeader">true</bool>
   </lst>
 </requestHandler>
 {code}
 Exception:
 {code}
 2012-03-20 13:16:06,405 INFO [solr.core.SolrCore] - [http-80-18] - : [core_a] 
 webapp=/solr path=/admin/ping params={} status=500 QTime=7 
 2012-03-20 13:16:06,406 ERROR [solr.servlet.SolrDispatchFilter] - 
 [http-80-18] - : null:org.apache.solr.common.SolrException: Ping query caused 
 exception: org.apache.solr.client.solrj.SolrServerException: 
 java.lang.RuntimeException: Invalid version (expected 2, but 60) or the data 
 in not in 'javabin' format
 at 
 org.apache.solr.handler.PingRequestHandler.handleRequestBody(PingRequestHandler.java:77)
 at 
 org.apache.solr.handler.RequestHandlerBase.handleRequest(RequestHandlerBase.java:129)
 at org.apache.solr.core.SolrCore.execute(SolrCore.java:1540)
 at 
 org.apache.solr.servlet.SolrDispatchFilter.execute(SolrDispatchFilter.java:435)
 at 
 org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:256)
 at 
 org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:235)
 at 
 org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:206)
 at 
 org.apache.catalina.core.StandardWrapperValve.invoke(StandardWrapperValve.java:233)
 at 
 org.apache.catalina.core.StandardContextValve.invoke(StandardContextValve.java:191)
 at 
 org.apache.catalina.core.StandardHostValve.invoke(StandardHostValve.java:127)
 at 
 org.apache.catalina.valves.ErrorReportValve.invoke(ErrorReportValve.java:102)
 at 
 org.apache.catalina.core.StandardEngineValve.invoke(StandardEngineValve.java:109)
 at 
 org.apache.catalina.connector.CoyoteAdapter.service(CoyoteAdapter.java:293)
 at 
 org.apache.coyote.http11.Http11Processor.process(Http11Processor.java:859)
 at 
 org.apache.coyote.http11.Http11Protocol$Http11ConnectionHandler.process(Http11Protocol.java:602)
 at 
 org.apache.tomcat.util.net.JIoEndpoint$Worker.run(JIoEndpoint.java:489)
 at java.lang.Thread.run(Thread.java:662)
 Caused by: org.apache.solr.common.SolrException: 
 org.apache.solr.client.solrj.SolrServerException: java.lang.RuntimeException: 
 Invalid version (expected 2, but 60) or the data in not in 'javabin' format
 at 
 org.apache.solr.handler.component.SearchHandler.handleRequestBody(SearchHandler.java:298)
 at 
 org.apache.solr.handler.RequestHandlerBase.handleRequest(RequestHandlerBase.java:129)
 at org.apache.solr.core.SolrCore.execute(SolrCore.java:1540)
 at 
 org.apache.solr.handler.PingRequestHandler.handleRequestBody(PingRequestHandler.java:68)
 ... 16 more
 Caused by: org.apache.solr.client.solrj.SolrServerException: 
 java.lang.RuntimeException: Invalid version (expected 2, but 60) or the data 
 in not in 'javabin' format
 at 
 org.apache.solr.client.solrj.impl.LBHttpSolrServer.request(LBHttpSolrServer.java:278)
 at 
 org.apache.solr.handler.component.HttpShardHandler$1.call(HttpShardHandler.java:158)
 at 
 org.apache.solr.handler.component.HttpShardHandler$1.call(HttpShardHandler.java:123)
 at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:303)
 at java.util.concurrent.FutureTask.run(FutureTask.java:138)
 at 
 

Re: [JENKINS] Lucene-Solr-tests-only-trunk-java7 - Build # 2022 - Failure

2012-03-20 Thread Michael McCandless
I opened https://issues.apache.org/jira/browse/LUCENE-3890 for this...

Mike McCandless

http://blog.mikemccandless.com

On Mon, Mar 19, 2012 at 8:23 AM, Apache Jenkins Server
jenk...@builds.apache.org wrote:
 Build: https://builds.apache.org/job/Lucene-Solr-tests-only-trunk-java7/2022/

 1 tests failed.
 REGRESSION:  
 org.apache.lucene.search.grouping.GroupFacetCollectorTest.testRandom

 Error Message:
 null

 Stack Trace:
 java.lang.NullPointerException
        at 
 org.apache.lucene.search.grouping.term.TermGroupFacetCollector$MV.setNextReader(TermGroupFacetCollector.java:249)
        at 
 org.apache.lucene.search.IndexSearcher.search(IndexSearcher.java:505)
        at 
 org.apache.lucene.search.IndexSearcher.search(IndexSearcher.java:297)
        at 
 org.apache.lucene.search.grouping.GroupFacetCollectorTest.testRandom(GroupFacetCollectorTest.java:259)
        at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
        at 
 sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
        at 
 sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
        at java.lang.reflect.Method.invoke(Method.java:601)
        at 
 org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:45)
        at 
 org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:15)
        at 
 org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:42)
        at 
 org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:20)
        at 
 org.junit.internal.runners.statements.RunBefores.evaluate(RunBefores.java:28)
        at 
 org.junit.internal.runners.statements.RunAfters.evaluate(RunAfters.java:30)
        at 
 org.apache.lucene.util.LuceneTestCase$SubclassSetupTeardownRule$1.evaluate(LuceneTestCase.java:729)
        at 
 org.apache.lucene.util.LuceneTestCase$InternalSetupTeardownRule$1.evaluate(LuceneTestCase.java:645)
        at 
 org.apache.lucene.util.SystemPropertiesInvariantRule$1.evaluate(SystemPropertiesInvariantRule.java:22)
        at 
 org.apache.lucene.util.LuceneTestCase$TestResultInterceptorRule$1.evaluate(LuceneTestCase.java:556)
        at 
 org.apache.lucene.util.UncaughtExceptionsRule$1.evaluate(UncaughtExceptionsRule.java:51)
        at 
 org.apache.lucene.util.LuceneTestCase$RememberThreadRule$1.evaluate(LuceneTestCase.java:618)
        at org.junit.rules.RunRules.evaluate(RunRules.java:18)
        at org.junit.runners.ParentRunner.runLeaf(ParentRunner.java:263)
        at 
 org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:68)
        at 
 org.apache.lucene.util.LuceneTestCaseRunner.runChild(LuceneTestCaseRunner.java:164)
        at 
 org.apache.lucene.util.LuceneTestCaseRunner.runChild(LuceneTestCaseRunner.java:57)
        at org.junit.runners.ParentRunner$3.run(ParentRunner.java:231)
        at org.junit.runners.ParentRunner$1.schedule(ParentRunner.java:60)
        at org.junit.runners.ParentRunner.runChildren(ParentRunner.java:229)
        at org.junit.runners.ParentRunner.access$000(ParentRunner.java:50)
        at org.junit.runners.ParentRunner$2.evaluate(ParentRunner.java:222)
        at 
 org.junit.internal.runners.statements.RunBefores.evaluate(RunBefores.java:28)
        at 
 org.junit.internal.runners.statements.RunAfters.evaluate(RunAfters.java:30)
        at 
 org.apache.lucene.util.UncaughtExceptionsRule$1.evaluate(UncaughtExceptionsRule.java:51)
        at 
 org.apache.lucene.util.StoreClassNameRule$1.evaluate(StoreClassNameRule.java:21)
        at 
 org.apache.lucene.util.SystemPropertiesInvariantRule$1.evaluate(SystemPropertiesInvariantRule.java:22)
        at org.junit.rules.RunRules.evaluate(RunRules.java:18)
        at org.junit.runners.ParentRunner.run(ParentRunner.java:300)
        at junit.framework.JUnit4TestAdapter.run(JUnit4TestAdapter.java:39)
        at 
 org.apache.tools.ant.taskdefs.optional.junit.JUnitTestRunner.run(JUnitTestRunner.java:420)
        at 
 org.apache.tools.ant.taskdefs.optional.junit.JUnitTestRunner.launch(JUnitTestRunner.java:911)
        at 
 org.apache.tools.ant.taskdefs.optional.junit.JUnitTestRunner.main(JUnitTestRunner.java:768)




 Build Log (for compile errors):
 [...truncated 5557 lines...]







[jira] [Created] (LUCENE-3890) GroupFacetCollectorTest nightly build failure

2012-03-20 Thread Michael McCandless (Created) (JIRA)
GroupFacetCollectorTest nightly build failure
-

 Key: LUCENE-3890
 URL: https://issues.apache.org/jira/browse/LUCENE-3890
 Project: Lucene - Java
  Issue Type: Bug
Reporter: Michael McCandless
 Fix For: 4.0


Failure from nightly build:

https://builds.apache.org/job/Lucene-Solr-tests-only-trunk-java7/2022/testReport/junit/org.apache.lucene.search.grouping/GroupFacetCollectorTest/testRandom/

It reproduces for me with:
{noformat}
 ant test -Dtestcase=GroupFacetCollectorTest -Dtestmethod=testRandom 
-Dtests.seed=7d227aa075b7bfb8:550d2a0828ce2537:-3553c99f6a4d293e 
-Dtests.multiplier=3 -Dargs=-Dfile.encoding=US-ASCII
{noformat}





[jira] [Commented] (SOLR-3258) Ping query caused exception..Invalid version (expected 2, but 60) or the data in not in 'javabin' format

2012-03-20 Thread Markus Jelsma (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-3258?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13233412#comment-13233412
 ] 

Markus Jelsma commented on SOLR-3258:
-

Nasty!

With this issue indexing fails, although some documents seem to be added. 
Custom request handlers still work, but the default /select handler, which is 
used by our ping handler, gives the trouble. Manual requests to 
/select?distrib=false do work without trouble.

I also know that this happens with an empty index.

I'd love to provide more details but I don't have any. For now the issue is 
here, but it just might disappear as suddenly as it appeared.

 Ping query caused exception..Invalid version (expected 2, but 60) or the data 
 in not in 'javabin' format
 

 Key: SOLR-3258
 URL: https://issues.apache.org/jira/browse/SOLR-3258
 Project: Solr
  Issue Type: Bug
 Environment: solr-impl 4.0-SNAPSHOT 1302403 - markus - 2012-03-19 
 13:55:51 
Reporter: Markus Jelsma
 Fix For: 4.0

 Attachments: debugging.patch


 In a test set-up with nodes=2, shards=3 and cores=6 we often see this 
 exception in the logs. Once every few ping requests this is thrown, other 
 request return a proper OK.
 Ping request handler:
 {code}
 <requestHandler name="/admin/ping" class="solr.PingRequestHandler">
   <lst name="invariants">
     <str name="qt">select</str>
     <str name="q">*:*</str>
     <int name="rows">0</int>
   </lst>
   <lst name="defaults">
     <str name="wt">json</str>
     <str name="echoParams">all</str>
     <bool name="omitHeader">true</bool>
   </lst>
 </requestHandler>
 {code}
 Exception:
 {code}
 2012-03-20 13:16:06,405 INFO [solr.core.SolrCore] - [http-80-18] - : [core_a] 
 webapp=/solr path=/admin/ping params={} status=500 QTime=7 
 2012-03-20 13:16:06,406 ERROR [solr.servlet.SolrDispatchFilter] - 
 [http-80-18] - : null:org.apache.solr.common.SolrException: Ping query caused 
 exception: org.apache.solr.client.solrj.SolrServerException: 
 java.lang.RuntimeException: Invalid version (expected 2, but 60) or the data 
 in not in 'javabin' format
 at 
 org.apache.solr.handler.PingRequestHandler.handleRequestBody(PingRequestHandler.java:77)
 at 
 org.apache.solr.handler.RequestHandlerBase.handleRequest(RequestHandlerBase.java:129)
 at org.apache.solr.core.SolrCore.execute(SolrCore.java:1540)
 at 
 org.apache.solr.servlet.SolrDispatchFilter.execute(SolrDispatchFilter.java:435)
 at 
 org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:256)
 at 
 org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:235)
 at 
 org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:206)
 at 
 org.apache.catalina.core.StandardWrapperValve.invoke(StandardWrapperValve.java:233)
 at 
 org.apache.catalina.core.StandardContextValve.invoke(StandardContextValve.java:191)
 at 
 org.apache.catalina.core.StandardHostValve.invoke(StandardHostValve.java:127)
 at 
 org.apache.catalina.valves.ErrorReportValve.invoke(ErrorReportValve.java:102)
 at 
 org.apache.catalina.core.StandardEngineValve.invoke(StandardEngineValve.java:109)
 at 
 org.apache.catalina.connector.CoyoteAdapter.service(CoyoteAdapter.java:293)
 at 
 org.apache.coyote.http11.Http11Processor.process(Http11Processor.java:859)
 at 
 org.apache.coyote.http11.Http11Protocol$Http11ConnectionHandler.process(Http11Protocol.java:602)
 at 
 org.apache.tomcat.util.net.JIoEndpoint$Worker.run(JIoEndpoint.java:489)
 at java.lang.Thread.run(Thread.java:662)
 Caused by: org.apache.solr.common.SolrException: 
 org.apache.solr.client.solrj.SolrServerException: java.lang.RuntimeException: 
 Invalid version (expected 2, but 60) or the data in not in 'javabin' format
 at 
 org.apache.solr.handler.component.SearchHandler.handleRequestBody(SearchHandler.java:298)
 at 
 org.apache.solr.handler.RequestHandlerBase.handleRequest(RequestHandlerBase.java:129)
 at org.apache.solr.core.SolrCore.execute(SolrCore.java:1540)
 at 
 org.apache.solr.handler.PingRequestHandler.handleRequestBody(PingRequestHandler.java:68)
 ... 16 more
 Caused by: org.apache.solr.client.solrj.SolrServerException: 
 java.lang.RuntimeException: Invalid version (expected 2, but 60) or the data 
 in not in 'javabin' format
 at 
 org.apache.solr.client.solrj.impl.LBHttpSolrServer.request(LBHttpSolrServer.java:278)
 at 
 org.apache.solr.handler.component.HttpShardHandler$1.call(HttpShardHandler.java:158)
 at 
 org.apache.solr.handler.component.HttpShardHandler$1.call(HttpShardHandler.java:123)
 at 

[jira] [Updated] (SOLR-3258) Ping query caused exception..Invalid version (expected 2, but 60) or the data in not in 'javabin' format

2012-03-20 Thread Markus Jelsma (Updated) (JIRA)

 [ 
https://issues.apache.org/jira/browse/SOLR-3258?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Markus Jelsma updated SOLR-3258:


Attachment: zkdump.txt

 Ping query caused exception..Invalid version (expected 2, but 60) or the data 
 in not in 'javabin' format
 

 Key: SOLR-3258
 URL: https://issues.apache.org/jira/browse/SOLR-3258
 Project: Solr
  Issue Type: Bug
 Environment: solr-impl 4.0-SNAPSHOT 1302403 - markus - 2012-03-19 
 13:55:51 
Reporter: Markus Jelsma
 Fix For: 4.0

 Attachments: debugging.patch, zkdump.txt


 In a test set-up with nodes=2, shards=3 and cores=6 we often see this 
 exception in the logs. Once every few ping requests this is thrown, other 
 request return a proper OK.
 Ping request handler:
 {code}
 <requestHandler name="/admin/ping" class="solr.PingRequestHandler">
   <lst name="invariants">
     <str name="qt">select</str>
     <str name="q">*:*</str>
     <int name="rows">0</int>
   </lst>
   <lst name="defaults">
     <str name="wt">json</str>
     <str name="echoParams">all</str>
     <bool name="omitHeader">true</bool>
   </lst>
 </requestHandler>
 {code}
 Exception:
 {code}
 2012-03-20 13:16:06,405 INFO [solr.core.SolrCore] - [http-80-18] - : [core_a] 
 webapp=/solr path=/admin/ping params={} status=500 QTime=7 
 2012-03-20 13:16:06,406 ERROR [solr.servlet.SolrDispatchFilter] - 
 [http-80-18] - : null:org.apache.solr.common.SolrException: Ping query caused 
 exception: org.apache.solr.client.solrj.SolrServerException: 
 java.lang.RuntimeException: Invalid version (expected 2, but 60) or the data 
 in not in 'javabin' format
 at 
 org.apache.solr.handler.PingRequestHandler.handleRequestBody(PingRequestHandler.java:77)
 at 
 org.apache.solr.handler.RequestHandlerBase.handleRequest(RequestHandlerBase.java:129)
 at org.apache.solr.core.SolrCore.execute(SolrCore.java:1540)
 at 
 org.apache.solr.servlet.SolrDispatchFilter.execute(SolrDispatchFilter.java:435)
 at 
 org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:256)
 at 
 org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:235)
 at 
 org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:206)
 at 
 org.apache.catalina.core.StandardWrapperValve.invoke(StandardWrapperValve.java:233)
 at 
 org.apache.catalina.core.StandardContextValve.invoke(StandardContextValve.java:191)
 at 
 org.apache.catalina.core.StandardHostValve.invoke(StandardHostValve.java:127)
 at 
 org.apache.catalina.valves.ErrorReportValve.invoke(ErrorReportValve.java:102)
 at 
 org.apache.catalina.core.StandardEngineValve.invoke(StandardEngineValve.java:109)
 at 
 org.apache.catalina.connector.CoyoteAdapter.service(CoyoteAdapter.java:293)
 at 
 org.apache.coyote.http11.Http11Processor.process(Http11Processor.java:859)
 at 
 org.apache.coyote.http11.Http11Protocol$Http11ConnectionHandler.process(Http11Protocol.java:602)
 at 
 org.apache.tomcat.util.net.JIoEndpoint$Worker.run(JIoEndpoint.java:489)
 at java.lang.Thread.run(Thread.java:662)
 Caused by: org.apache.solr.common.SolrException: 
 org.apache.solr.client.solrj.SolrServerException: java.lang.RuntimeException: 
 Invalid version (expected 2, but 60) or the data in not in 'javabin' format
 at 
 org.apache.solr.handler.component.SearchHandler.handleRequestBody(SearchHandler.java:298)
 at 
 org.apache.solr.handler.RequestHandlerBase.handleRequest(RequestHandlerBase.java:129)
 at org.apache.solr.core.SolrCore.execute(SolrCore.java:1540)
 at 
 org.apache.solr.handler.PingRequestHandler.handleRequestBody(PingRequestHandler.java:68)
 ... 16 more
 Caused by: org.apache.solr.client.solrj.SolrServerException: 
 java.lang.RuntimeException: Invalid version (expected 2, but 60) or the data 
 in not in 'javabin' format
 at 
 org.apache.solr.client.solrj.impl.LBHttpSolrServer.request(LBHttpSolrServer.java:278)
 at 
 org.apache.solr.handler.component.HttpShardHandler$1.call(HttpShardHandler.java:158)
 at 
 org.apache.solr.handler.component.HttpShardHandler$1.call(HttpShardHandler.java:123)
 at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:303)
 at java.util.concurrent.FutureTask.run(FutureTask.java:138)
 at 
 java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:441)
 at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:303)
 at java.util.concurrent.FutureTask.run(FutureTask.java:138)
 at 
 java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
 at 
 

[jira] [Commented] (SOLR-3259) Solr 4 aesthetics

2012-03-20 Thread Jan Høydahl (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-3259?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13233414#comment-13233414
 ] 

Jan Høydahl commented on SOLR-3259:
---

+1 to the general idea of lifting the first-time experience of Solr. I like all 
your proposals except...

I'm not sure we gain much by moving the example to a "server" folder. I think 
it's a Good Thing™ that we make it clear that what's provided is just an 
example, not for production. Another name for the example folder could be 
"jetty", because that's what it really is - which confuses many today: they 
think the lib and etc folders below example belong to Solr...

If anything, I'd vote for making the distro closer to what people would want in 
production. You could then have a pure solr/jetty folder with ONLY Jetty, a 
solr/example-home folder holding today's example/solr (making it more obvious 
which folder is actually the SOLR_HOME), and finally a top-level start script, 
start-solr.[cmd|sh], which copies the war from dist to jetty/webapps, sets 
-Dsolr.solr.home, and starts Jetty. By default start-solr.sh would log to 
stdout, but a param could have it log to a file.

 Solr 4 aesthetics
 -

 Key: SOLR-3259
 URL: https://issues.apache.org/jira/browse/SOLR-3259
 Project: Solr
  Issue Type: New Feature
Reporter: Yonik Seeley
 Fix For: 4.0


 Solr 4 will be a huge new release... we should take this opportunity to 
 improve the out-of-the-box experience.




[jira] [Commented] (LUCENE-3890) GroupFacetCollectorTest nightly build failure

2012-03-20 Thread Martijn van Groningen (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-3890?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13233415#comment-13233415
 ] 

Martijn van Groningen commented on LUCENE-3890:
---

Thanks for noticing this! I'll take a look at it.

 GroupFacetCollectorTest nightly build failure
 -

 Key: LUCENE-3890
 URL: https://issues.apache.org/jira/browse/LUCENE-3890
 Project: Lucene - Java
  Issue Type: Bug
Reporter: Michael McCandless
 Fix For: 4.0


 Failure from nightly build:
 https://builds.apache.org/job/Lucene-Solr-tests-only-trunk-java7/2022/testReport/junit/org.apache.lucene.search.grouping/GroupFacetCollectorTest/testRandom/
 It reproduces for me with:
 {noformat}
  ant test -Dtestcase=GroupFacetCollectorTest -Dtestmethod=testRandom 
 -Dtests.seed=7d227aa075b7bfb8:550d2a0828ce2537:-3553c99f6a4d293e 
 -Dtests.multiplier=3 -Dargs=-Dfile.encoding=US-ASCII
 {noformat}




[jira] [Updated] (LUCENE-3846) Fuzzy suggester

2012-03-20 Thread Michael McCandless (Updated) (JIRA)

 [ 
https://issues.apache.org/jira/browse/LUCENE-3846?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Michael McCandless updated LUCENE-3846:
---

Fix Version/s: (was: 3.6)

 Fuzzy suggester
 ---

 Key: LUCENE-3846
 URL: https://issues.apache.org/jira/browse/LUCENE-3846
 Project: Lucene - Java
  Issue Type: Improvement
Reporter: Michael McCandless
Assignee: Michael McCandless
 Fix For: 4.0

 Attachments: LUCENE-3846.patch, LUCENE-3846.patch


 Would be nice to have a suggester that can handle some fuzziness (like spell 
 correction) so that it's able to suggest completions that are near what you 
 typed.
 As a first go at this, I implemented 1T (ie up to 1 edit, including a 
 transposition), except the first letter must be correct.
 But there is a penalty, ie, the corrected suggestion needs to have a much 
 higher freq than the exact match suggestion before it can compete.
 Still tons of nocommits, and somehow we should merge this / make it work with 
 analyzing suggester too (LUCENE-3842).




[jira] [Updated] (LUCENE-3564) rename IndexWriter.rollback to .rollbackAndClose

2012-03-20 Thread Michael McCandless (Updated) (JIRA)

 [ 
https://issues.apache.org/jira/browse/LUCENE-3564?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Michael McCandless updated LUCENE-3564:
---

Fix Version/s: (was: 3.6)

 rename IndexWriter.rollback to .rollbackAndClose
 

 Key: LUCENE-3564
 URL: https://issues.apache.org/jira/browse/LUCENE-3564
 Project: Lucene - Java
  Issue Type: Improvement
Reporter: Michael McCandless
Assignee: Michael McCandless
 Fix For: 4.0


 Spinoff from LUCENE-3454, where Shai noticed that rollback is trappy since it 
 [unexpected] closes the IW.
 I think we should rename it to rollbackAndClose.




[jira] [Commented] (SOLR-3258) Ping query caused exception..Invalid version (expected 2, but 60) or the data in not in 'javabin' format

2012-03-20 Thread Markus Jelsma (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-3258?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13233420#comment-13233420
 ] 

Markus Jelsma commented on SOLR-3258:
-

I suspected Solr's distributed capabilities because the error occurs with 
distrib=true. So I stopped Zookeeper, removed the data directory, and restarted 
Zookeeper and the Solr nodes. I attached a Zookeeper dump I took just moments 
before removing the data directory.



 Ping query caused exception..Invalid version (expected 2, but 60) or the data 
 in not in 'javabin' format
 

 Key: SOLR-3258
 URL: https://issues.apache.org/jira/browse/SOLR-3258
 Project: Solr
  Issue Type: Bug
 Environment: solr-impl 4.0-SNAPSHOT 1302403 - markus - 2012-03-19 
 13:55:51 
Reporter: Markus Jelsma
 Fix For: 4.0

 Attachments: debugging.patch, zkdump.txt


 In a test set-up with nodes=2, shards=3 and cores=6 we often see this 
 exception in the logs. Once every few ping requests this is thrown, other 
 request return a proper OK.
 Ping request handler:
 {code}
 <requestHandler name="/admin/ping" class="solr.PingRequestHandler">
   <lst name="invariants">
     <str name="qt">select</str>
     <str name="q">*:*</str>
     <int name="rows">0</int>
   </lst>
   <lst name="defaults">
     <str name="wt">json</str>
     <str name="echoParams">all</str>
     <bool name="omitHeader">true</bool>
   </lst>
 </requestHandler>
 {code}
 Exception:
 {code}
 2012-03-20 13:16:06,405 INFO [solr.core.SolrCore] - [http-80-18] - : [core_a] 
 webapp=/solr path=/admin/ping params={} status=500 QTime=7 
 2012-03-20 13:16:06,406 ERROR [solr.servlet.SolrDispatchFilter] - 
 [http-80-18] - : null:org.apache.solr.common.SolrException: Ping query caused 
 exception: org.apache.solr.client.solrj.SolrServerException: 
 java.lang.RuntimeException: Invalid version (expected 2, but 60) or the data 
 in not in 'javabin' format
 at 
 org.apache.solr.handler.PingRequestHandler.handleRequestBody(PingRequestHandler.java:77)
 at 
 org.apache.solr.handler.RequestHandlerBase.handleRequest(RequestHandlerBase.java:129)
 at org.apache.solr.core.SolrCore.execute(SolrCore.java:1540)
 at 
 org.apache.solr.servlet.SolrDispatchFilter.execute(SolrDispatchFilter.java:435)
 at 
 org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:256)
 at 
 org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:235)
 at 
 org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:206)
 at 
 org.apache.catalina.core.StandardWrapperValve.invoke(StandardWrapperValve.java:233)
 at 
 org.apache.catalina.core.StandardContextValve.invoke(StandardContextValve.java:191)
 at 
 org.apache.catalina.core.StandardHostValve.invoke(StandardHostValve.java:127)
 at 
 org.apache.catalina.valves.ErrorReportValve.invoke(ErrorReportValve.java:102)
 at 
 org.apache.catalina.core.StandardEngineValve.invoke(StandardEngineValve.java:109)
 at 
 org.apache.catalina.connector.CoyoteAdapter.service(CoyoteAdapter.java:293)
 at 
 org.apache.coyote.http11.Http11Processor.process(Http11Processor.java:859)
 at 
 org.apache.coyote.http11.Http11Protocol$Http11ConnectionHandler.process(Http11Protocol.java:602)
 at 
 org.apache.tomcat.util.net.JIoEndpoint$Worker.run(JIoEndpoint.java:489)
 at java.lang.Thread.run(Thread.java:662)
 Caused by: org.apache.solr.common.SolrException: 
 org.apache.solr.client.solrj.SolrServerException: java.lang.RuntimeException: 
 Invalid version (expected 2, but 60) or the data in not in 'javabin' format
 at 
 org.apache.solr.handler.component.SearchHandler.handleRequestBody(SearchHandler.java:298)
 at 
 org.apache.solr.handler.RequestHandlerBase.handleRequest(RequestHandlerBase.java:129)
 at org.apache.solr.core.SolrCore.execute(SolrCore.java:1540)
 at 
 org.apache.solr.handler.PingRequestHandler.handleRequestBody(PingRequestHandler.java:68)
 ... 16 more
 Caused by: org.apache.solr.client.solrj.SolrServerException: 
 java.lang.RuntimeException: Invalid version (expected 2, but 60) or the data 
 in not in 'javabin' format
 at 
 org.apache.solr.client.solrj.impl.LBHttpSolrServer.request(LBHttpSolrServer.java:278)
 at 
 org.apache.solr.handler.component.HttpShardHandler$1.call(HttpShardHandler.java:158)
 at 
 org.apache.solr.handler.component.HttpShardHandler$1.call(HttpShardHandler.java:123)
 at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:303)
 at java.util.concurrent.FutureTask.run(FutureTask.java:138)
 at 
 

[jira] [Updated] (LUCENE-2686) DisjunctionSumScorer should not call .score on sub scorers until consumer calls .score

2012-03-20 Thread Michael McCandless (Updated) (JIRA)

 [ 
https://issues.apache.org/jira/browse/LUCENE-2686?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Michael McCandless updated LUCENE-2686:
---

Fix Version/s: (was: 3.6)

 DisjunctionSumScorer should not call .score on sub scorers until consumer 
 calls .score
 --

 Key: LUCENE-2686
 URL: https://issues.apache.org/jira/browse/LUCENE-2686
 Project: Lucene - Java
  Issue Type: Bug
  Components: core/search
Reporter: Michael McCandless
Assignee: Michael McCandless
 Fix For: 4.0

 Attachments: LUCENE-2686.patch, LUCENE-2686.patch, 
 Test2LUCENE2590.java


 Spinoff from java-user thread question about Scorer.freq() from Koji...
 BooleanScorer2 uses DisjunctionSumScorer to score only-SHOULD-clause boolean 
 queries.
 But, this scorer does too much work for collectors that never call .score, 
 because it scores while it's matching.  It should only call .score on the 
 subs when the caller calls its .score.
 This also has the side effect of messing up advanced collectors that gather 
 the freq() of the subs (using LUCENE-2590).




[jira] [Commented] (SOLR-3220) RecoveryZkTest test failure

2012-03-20 Thread Markus Jelsma (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-3220?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13233423#comment-13233423
 ] 

Markus Jelsma commented on SOLR-3220:
-

In SOLR-3258 I included a zkdump file _while_ this issue was occurring. The 
problem vanished after removing the Zookeeper data directory and restarting, so 
I hope someone can find useful information in the dump file.

 RecoveryZkTest test failure
 ---

 Key: SOLR-3220
 URL: https://issues.apache.org/jira/browse/SOLR-3220
 Project: Solr
  Issue Type: Bug
Reporter: Hoss Man
 Attachments: TEST-org.apache.solr.cloud.RecoveryZkTest.xml


 observed a failure in RecoveryZkTest.testDistribSearch using r1298661 that 
 had some odd looking (to me) log info. 
 could not reproduce with identical seed




[jira] [Commented] (SOLR-3258) Ping query caused exception..Invalid version (expected 2, but 60) or the data in not in 'javabin' format

2012-03-20 Thread Yonik Seeley (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-3258?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13233426#comment-13233426
 ] 

Yonik Seeley commented on SOLR-3258:


bq. I suspected Solr's distributed capabilities because of the error occurring 
with distrib=true.

I was going to ask... do you mean for the ping query to be distributed?

 Ping query caused exception..Invalid version (expected 2, but 60) or the data 
 in not in 'javabin' format
 

 Key: SOLR-3258
 URL: https://issues.apache.org/jira/browse/SOLR-3258
 Project: Solr
  Issue Type: Bug
 Environment: solr-impl 4.0-SNAPSHOT 1302403 - markus - 2012-03-19 
 13:55:51 
Reporter: Markus Jelsma
 Fix For: 4.0

 Attachments: debugging.patch, zkdump.txt


 In a test set-up with nodes=2, shards=3 and cores=6 we often see this 
 exception in the logs. Once every few ping requests this is thrown, other 
 request return a proper OK.
 Ping request handler:
 {code}
 <requestHandler name="/admin/ping" class="solr.PingRequestHandler">
   <lst name="invariants">
     <str name="qt">select</str>
     <str name="q">*:*</str>
     <int name="rows">0</int>
   </lst>
   <lst name="defaults">
     <str name="wt">json</str>
     <str name="echoParams">all</str>
     <bool name="omitHeader">true</bool>
   </lst>
 </requestHandler>
 {code}
 Exception:
 {code}
 2012-03-20 13:16:06,405 INFO [solr.core.SolrCore] - [http-80-18] - : [core_a] 
 webapp=/solr path=/admin/ping params={} status=500 QTime=7 
 2012-03-20 13:16:06,406 ERROR [solr.servlet.SolrDispatchFilter] - 
 [http-80-18] - : null:org.apache.solr.common.SolrException: Ping query caused 
 exception: org.apache.solr.client.solrj.SolrServerException: 
 java.lang.RuntimeException: Invalid version (expected 2, but 60) or the data 
 in not in 'javabin' format
 at 
 org.apache.solr.handler.PingRequestHandler.handleRequestBody(PingRequestHandler.java:77)
 at 
 org.apache.solr.handler.RequestHandlerBase.handleRequest(RequestHandlerBase.java:129)
 at org.apache.solr.core.SolrCore.execute(SolrCore.java:1540)
 at 
 org.apache.solr.servlet.SolrDispatchFilter.execute(SolrDispatchFilter.java:435)
 at 
 org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:256)
 at 
 org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:235)
 at 
 org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:206)
 at 
 org.apache.catalina.core.StandardWrapperValve.invoke(StandardWrapperValve.java:233)
 at 
 org.apache.catalina.core.StandardContextValve.invoke(StandardContextValve.java:191)
 at 
 org.apache.catalina.core.StandardHostValve.invoke(StandardHostValve.java:127)
 at 
 org.apache.catalina.valves.ErrorReportValve.invoke(ErrorReportValve.java:102)
 at 
 org.apache.catalina.core.StandardEngineValve.invoke(StandardEngineValve.java:109)
 at 
 org.apache.catalina.connector.CoyoteAdapter.service(CoyoteAdapter.java:293)
 at 
 org.apache.coyote.http11.Http11Processor.process(Http11Processor.java:859)
 at 
 org.apache.coyote.http11.Http11Protocol$Http11ConnectionHandler.process(Http11Protocol.java:602)
 at 
 org.apache.tomcat.util.net.JIoEndpoint$Worker.run(JIoEndpoint.java:489)
 at java.lang.Thread.run(Thread.java:662)
 Caused by: org.apache.solr.common.SolrException: 
 org.apache.solr.client.solrj.SolrServerException: java.lang.RuntimeException: 
 Invalid version (expected 2, but 60) or the data in not in 'javabin' format
 at 
 org.apache.solr.handler.component.SearchHandler.handleRequestBody(SearchHandler.java:298)
 at 
 org.apache.solr.handler.RequestHandlerBase.handleRequest(RequestHandlerBase.java:129)
 at org.apache.solr.core.SolrCore.execute(SolrCore.java:1540)
 at 
 org.apache.solr.handler.PingRequestHandler.handleRequestBody(PingRequestHandler.java:68)
 ... 16 more
 Caused by: org.apache.solr.client.solrj.SolrServerException: 
 java.lang.RuntimeException: Invalid version (expected 2, but 60) or the data 
 in not in 'javabin' format
 at 
 org.apache.solr.client.solrj.impl.LBHttpSolrServer.request(LBHttpSolrServer.java:278)
 at 
 org.apache.solr.handler.component.HttpShardHandler$1.call(HttpShardHandler.java:158)
 at 
 org.apache.solr.handler.component.HttpShardHandler$1.call(HttpShardHandler.java:123)
 at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:303)
 at java.util.concurrent.FutureTask.run(FutureTask.java:138)
 at 
 java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:441)
 at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:303)
 

[jira] [Commented] (SOLR-3220) RecoveryZkTest test failure

2012-03-20 Thread Mark Miller (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-3220?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13233431#comment-13233431
 ] 

Mark Miller commented on SOLR-3220:
---

Sorry - missed these issues going by - been busy with other things for a bit.

Yeah, I've seen this before. It happens when an error is returned rather than a 
javabin response. I've seen it with 404s, and I'm sure it happens with other 
errors at the container level. The '60' is the start of the HTML error response 
(if I remember right).

It makes debugging a bitch sometimes. 

So for instance, for the ping handler, perhaps it wasn't found, or it errored. 
Here, it could be an issue during startup or shutdown, when a 404 can be 
returned.

We should open a new issue for the problem - offhand I don't have a solution, 
though. Adding structured error support to Solr might help.

For this issue I first need to see if that exception even relates to the 
failure - there is a good chance it does not. A server is stopped and started 
in this test, and a query or update at the wrong time can return a 404 or some 
other non-success code. So you are likely to see this exception even if the 
test passes.
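The '60' observation checks out: '<' is ASCII 60, so an HTML error page fed to 
a javabin reader trips on its very first byte. A minimal Python sketch of that 
version check (a hypothetical stand-in, not Solr's actual JavaBinCodec code):

```python
# The javabin codec reads a one-byte version marker before anything else.
# If the servlet container answers with an HTML error page (404, 500, ...)
# instead, the first byte is '<' (ASCII 60) -- hence "expected 2, but 60".
JAVABIN_VERSION = 2

def read_javabin_version(payload: bytes) -> int:
    version = payload[0]
    if version != JAVABIN_VERSION:
        raise RuntimeError(
            f"Invalid version (expected {JAVABIN_VERSION}, but {version}) "
            "or the data is not in 'javabin' format")
    return version

read_javabin_version(b"\x02rest-of-stream")   # fine: real javabin marker
try:
    read_javabin_version(b"<html><h1>404 Not Found</h1></html>")
except RuntimeError as err:
    print(err)  # reports: expected 2, but 60
```

This also explains why the parser cannot say anything more useful: by the time 
the version byte mismatches, the actual container error text has already been 
discarded.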

 RecoveryZkTest test failure
 ---

 Key: SOLR-3220
 URL: https://issues.apache.org/jira/browse/SOLR-3220
 Project: Solr
  Issue Type: Bug
Reporter: Hoss Man
 Attachments: TEST-org.apache.solr.cloud.RecoveryZkTest.xml


 observed a failure in RecoveryZkTest.testDistribSearch using r1298661 that 
 had some odd looking (to me) log info. 
 could not reproduce with identical seed




[jira] [Commented] (SOLR-2020) HttpComponentsSolrServer

2012-03-20 Thread Sami Siren (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-2020?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13233434#comment-13233434
 ] 

Sami Siren commented on SOLR-2020:
--

bq. I assume this means we'll be able to switch to using NIO for the 
distributed search sub-requests!

Yeah, that should be possible.

 HttpComponentsSolrServer
 

 Key: SOLR-2020
 URL: https://issues.apache.org/jira/browse/SOLR-2020
 Project: Solr
  Issue Type: New Feature
  Components: clients - java
Affects Versions: 1.4.1
 Environment: Any
Reporter: Chantal Ackermann
Priority: Minor
 Fix For: 4.0

 Attachments: HttpComponentsSolrServer.java, 
 HttpComponentsSolrServerTest.java, SOLR-2020-HttpSolrServer.patch, 
 SOLR-2020.patch, SOLR-2020.patch


 Implementation of SolrServer that uses the Apache Http Components framework.
 Http Components (http://hc.apache.org/) is the successor of Commons 
 HttpClient and thus HttpComponentsSolrServer would be a successor of 
 CommonsHttpSolrServer, in the future.




[jira] [Commented] (SOLR-3258) Ping query caused exception..Invalid version (expected 2, but 60) or the data in not in 'javabin' format

2012-03-20 Thread Markus Jelsma (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-3258?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13233436#comment-13233436
 ] 

Markus Jelsma commented on SOLR-3258:
-

It seems it is. The ping query is just /select?q=*:*&rows=0, but it yields 
different results when distrib=false is specified.
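The two requests being compared can be sketched as URL construction; host, 
port, and core path below are assumptions for illustration, not taken from the 
report:

```python
from urllib.parse import urlencode

base = "http://localhost:8983/solr/select"   # assumed host and path
params = {"q": "*:*", "rows": 0}

# What the ping invariants effectively run (distributed by default):
distributed = f"{base}?{urlencode(params)}"

# The manual check that works reliably (single-core, no shard fan-out):
non_distributed = f"{base}?{urlencode({**params, 'distrib': 'false'})}"

print(distributed)
print(non_distributed)
```

Since only the distributed form fails, the shard sub-request (which is what 
travels in javabin) is the suspect, consistent with the stack traces above.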

 Ping query caused exception..Invalid version (expected 2, but 60) or the data 
 in not in 'javabin' format
 

 Key: SOLR-3258
 URL: https://issues.apache.org/jira/browse/SOLR-3258
 Project: Solr
  Issue Type: Bug
 Environment: solr-impl 4.0-SNAPSHOT 1302403 - markus - 2012-03-19 
 13:55:51 
Reporter: Markus Jelsma
 Fix For: 4.0

 Attachments: debugging.patch, zkdump.txt


 In a test set-up with nodes=2, shards=3 and cores=6 we often see this 
 exception in the logs. Once every few ping requests this is thrown, other 
 request return a proper OK.
 Ping request handler:
 {code}
 <requestHandler name="/admin/ping" class="solr.PingRequestHandler">
   <lst name="invariants">
     <str name="qt">select</str>
     <str name="q">*:*</str>
     <int name="rows">0</int>
   </lst>
   <lst name="defaults">
     <str name="wt">json</str>
     <str name="echoParams">all</str>
     <bool name="omitHeader">true</bool>
   </lst>
 </requestHandler>
 {code}
 Exception:
 {code}
 2012-03-20 13:16:06,405 INFO [solr.core.SolrCore] - [http-80-18] - : [core_a] webapp=/solr path=/admin/ping params={} status=500 QTime=7 
 2012-03-20 13:16:06,406 ERROR [solr.servlet.SolrDispatchFilter] - [http-80-18] - : null:org.apache.solr.common.SolrException: Ping query caused exception: org.apache.solr.client.solrj.SolrServerException: java.lang.RuntimeException: Invalid version (expected 2, but 60) or the data in not in 'javabin' format
 at org.apache.solr.handler.PingRequestHandler.handleRequestBody(PingRequestHandler.java:77)
 at org.apache.solr.handler.RequestHandlerBase.handleRequest(RequestHandlerBase.java:129)
 at org.apache.solr.core.SolrCore.execute(SolrCore.java:1540)
 at org.apache.solr.servlet.SolrDispatchFilter.execute(SolrDispatchFilter.java:435)
 at org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:256)
 at org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:235)
 at org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:206)
 at org.apache.catalina.core.StandardWrapperValve.invoke(StandardWrapperValve.java:233)
 at org.apache.catalina.core.StandardContextValve.invoke(StandardContextValve.java:191)
 at org.apache.catalina.core.StandardHostValve.invoke(StandardHostValve.java:127)
 at org.apache.catalina.valves.ErrorReportValve.invoke(ErrorReportValve.java:102)
 at org.apache.catalina.core.StandardEngineValve.invoke(StandardEngineValve.java:109)
 at org.apache.catalina.connector.CoyoteAdapter.service(CoyoteAdapter.java:293)
 at org.apache.coyote.http11.Http11Processor.process(Http11Processor.java:859)
 at org.apache.coyote.http11.Http11Protocol$Http11ConnectionHandler.process(Http11Protocol.java:602)
 at org.apache.tomcat.util.net.JIoEndpoint$Worker.run(JIoEndpoint.java:489)
 at java.lang.Thread.run(Thread.java:662)
 Caused by: org.apache.solr.common.SolrException: org.apache.solr.client.solrj.SolrServerException: java.lang.RuntimeException: Invalid version (expected 2, but 60) or the data in not in 'javabin' format
 at org.apache.solr.handler.component.SearchHandler.handleRequestBody(SearchHandler.java:298)
 at org.apache.solr.handler.RequestHandlerBase.handleRequest(RequestHandlerBase.java:129)
 at org.apache.solr.core.SolrCore.execute(SolrCore.java:1540)
 at org.apache.solr.handler.PingRequestHandler.handleRequestBody(PingRequestHandler.java:68)
 ... 16 more
 Caused by: org.apache.solr.client.solrj.SolrServerException: java.lang.RuntimeException: Invalid version (expected 2, but 60) or the data in not in 'javabin' format
 at org.apache.solr.client.solrj.impl.LBHttpSolrServer.request(LBHttpSolrServer.java:278)
 at org.apache.solr.handler.component.HttpShardHandler$1.call(HttpShardHandler.java:158)
 at org.apache.solr.handler.component.HttpShardHandler$1.call(HttpShardHandler.java:123)
 at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:303)
 at java.util.concurrent.FutureTask.run(FutureTask.java:138)
 at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:441)
 at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:303)
 at 

[jira] [Commented] (SOLR-3220) RecoveryZkTest test failure

2012-03-20 Thread Mark Miller (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-3220?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13233441#comment-13233441
 ] 

Mark Miller commented on SOLR-3220:
---

Also, as this and a couple of other classes are expected to throw various nasty 
exceptions, I had them all ignored in the base class - but Yonik unignored them 
at some point when he was debugging. I think we should turn that ignore back on - 
the ant test output is a mess otherwise.

 RecoveryZkTest test failure
 ---

 Key: SOLR-3220
 URL: https://issues.apache.org/jira/browse/SOLR-3220
 Project: Solr
  Issue Type: Bug
Reporter: Hoss Man
 Attachments: TEST-org.apache.solr.cloud.RecoveryZkTest.xml


 observed a failure in RecoveryZkTest.testDistribSearch using r1298661 that 
 had some odd looking (to me) log info. 
 could not reproduce with identical seed

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira



-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Commented] (SOLR-1052) Deprecate/Remove indexDefaults and mainIndex in favor of indexConfig in solrconfig.xml

2012-03-20 Thread Commented

[ 
https://issues.apache.org/jira/browse/SOLR-1052?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13233446#comment-13233446
 ] 

Jan Høydahl commented on SOLR-1052:
---

Last call before commit to branch_3x. Speak now or be forever silent :)

 Deprecate/Remove indexDefaults and mainIndex in favor of indexConfig in 
 solrconfig.xml
 

 Key: SOLR-1052
 URL: https://issues.apache.org/jira/browse/SOLR-1052
 Project: Solr
  Issue Type: Improvement
Reporter: Grant Ingersoll
Assignee: Jan Høydahl
  Labels: solrconfig.xml
 Fix For: 3.6, 4.0

 Attachments: SOLR-1052-3x.patch, SOLR-1052-3x.patch, 
 SOLR-1052-3x.patch, SOLR-1052-3x.patch


 Given that we now handle multiple cores via solr.xml, and given the discussion 
 around indexDefaults and mainIndex at 
 http://www.lucidimagination.com/search/p:solr?q=mainIndex+vs.+indexDefaults
 we should deprecate the old <indexDefaults> and <mainIndex> sections and only use 
 a new <indexConfig> section.
 3.6: Deprecation warning if an old section is used
 4.0: If LuceneMatchVersion is before LUCENE_40 then warn (so old configs will 
 work), else fail fast
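Illustratively, the migration would look something like the following hypothetical solrconfig.xml fragment (the settings shown inside the sections are just examples, not a complete list):

{code}
<!-- Old style, deprecated: two separate sections -->
<indexDefaults>
  <ramBufferSizeMB>32</ramBufferSizeMB>
</indexDefaults>
<mainIndex>
  <unlockOnStartup>false</unlockOnStartup>
</mainIndex>

<!-- New style: one merged section -->
<indexConfig>
  <ramBufferSizeMB>32</ramBufferSizeMB>
  <unlockOnStartup>false</unlockOnStartup>
</indexConfig>
{code}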

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira



-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



Re: [jira] [Commented] (SOLR-3259) Solr 4 aesthetics

2012-03-20 Thread Bill Bell
+1 to all folder suggestions

Bill Bell
Sent from mobile


On Mar 20, 2012, at 8:07 AM, Jan Høydahl (Commented) (JIRA)j...@apache.org 
wrote:

 
[ 
 https://issues.apache.org/jira/browse/SOLR-3259?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13233414#comment-13233414
  ] 
 
 Jan Høydahl commented on SOLR-3259:
 ---
 
 +1 to the general idea of lifting the first-time experience of Solr. I like 
 all your proposals except...
 
 I'm not sure we gain much by moving the example to a server folder. I 
 think it's a Good Thing™ that we make it clear that what's provided is just 
 an example, not for production. Another name for the example folder could be 
 jetty, because that's what it really is - which many are confused by today: 
 they think that the lib and etc folders below example belong to Solr...
 
 If anything I'd vote for making the distro closer to what people would want 
 in production. You could then have a pure solr/jetty folder with ONLY 
 Jetty, a solr/example-home folder which holds today's example/solr, making 
 it more obvious which folder is actually the SOLR_HOME, and finally a start 
 script on the top level, start-solr.[cmd|sh], which copies the war from dist to 
 jetty/webapps, sets -Dsolr.solr.home and starts Jetty. By default 
 start-solr.sh would log to stdout, but a param could have it log to file.
 
 Solr 4 aesthetics
 -
 
Key: SOLR-3259
URL: https://issues.apache.org/jira/browse/SOLR-3259
Project: Solr
 Issue Type: New Feature
   Reporter: Yonik Seeley
Fix For: 4.0
 
 
 Solr 4 will be a huge new release... we should take this opportunity to 
 improve the out-of-the-box experience.
 
 --
 This message is automatically generated by JIRA.
 If you think it was sent incorrectly, please contact your JIRA 
 administrators: 
 https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
 For more information on JIRA, see: http://www.atlassian.com/software/jira
 
 
 
 -
 To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
 For additional commands, e-mail: dev-h...@lucene.apache.org
 

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Updated] (LUCENE-3830) MappingCharFilter could be improved by switching to an FST.

2012-03-20 Thread Michael McCandless (Updated) (JIRA)

 [ 
https://issues.apache.org/jira/browse/LUCENE-3830?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Michael McCandless updated LUCENE-3830:
---

Labels: gsoc2012 lucene-gsoc-12  (was: )

 MappingCharFilter could be improved by switching to an FST.
 ---

 Key: LUCENE-3830
 URL: https://issues.apache.org/jira/browse/LUCENE-3830
 Project: Lucene - Java
  Issue Type: Improvement
Reporter: Dawid Weiss
Assignee: Dawid Weiss
Priority: Minor
  Labels: gsoc2012, lucene-gsoc-12
 Fix For: 4.0


 MappingCharFilter stores an overly complex tree-like structure for matching 
 input patterns. The input is a union of fixed strings mapped to a set of 
 fixed strings; an FST matcher would be ideal here and would, I bet, provide 
 both memory and speed improvements.
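The matching operation itself is simple. Here is a toy Python sketch (unrelated to Lucene's actual FST classes; the sample mappings are invented) of the longest-match fixed-string replacement the filter performs, which an FST would encode compactly:

```python
# Toy longest-match character mapping: the operation an FST would encode compactly.
mappings = {"ae": "ä", "oe": "ö", "ss": "ß"}

def map_chars(text, mappings):
    out, i = [], 0
    max_len = max(map(len, mappings))
    while i < len(text):
        # Try the longest possible match first, then shorter ones.
        for l in range(min(max_len, len(text) - i), 0, -1):
            if text[i:i + l] in mappings:
                out.append(mappings[text[i:i + l]])
                i += l
                break
        else:
            # No pattern matches here; copy the character through unchanged.
            out.append(text[i])
            i += 1
    return "".join(out)

print(map_chars("strasse", mappings))  # straße
```

An FST replaces the per-position dictionary probes with a single walk over shared prefixes, which is where the memory and speed win would come from.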

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira



-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Created] (LUCENE-3891) Documents loaded at search time (IndexReader.document) should be a different class from the index-time Document

2012-03-20 Thread Michael McCandless (Created) (JIRA)
Documents loaded at search time (IndexReader.document) should be a different 
class from the index-time Document
---

 Key: LUCENE-3891
 URL: https://issues.apache.org/jira/browse/LUCENE-3891
 Project: Lucene - Java
  Issue Type: Improvement
Reporter: Michael McCandless


The fact that the Document you can load at search time is the same Document 
class you had indexed is horribly trappy in Lucene because the loaded 
document necessarily loses information like field boost, whether a field was 
tokenized, etc.  (See LUCENE-3854 for a recent example).

We should fix this, statically, so that it's an entirely different class at 
search time vs index time.


--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira



-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Updated] (LUCENE-3891) Documents loaded at search time (IndexReader.document) should be a different class from the index-time Document

2012-03-20 Thread Michael McCandless (Updated) (JIRA)

 [ 
https://issues.apache.org/jira/browse/LUCENE-3891?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Michael McCandless updated LUCENE-3891:
---

Fix Version/s: 4.0
   Labels: gsoc2012 lucene-gsoc-12  (was: )

 Documents loaded at search time (IndexReader.document) should be a different 
 class from the index-time Document
 ---

 Key: LUCENE-3891
 URL: https://issues.apache.org/jira/browse/LUCENE-3891
 Project: Lucene - Java
  Issue Type: Improvement
Reporter: Michael McCandless
  Labels: gsoc2012, lucene-gsoc-12
 Fix For: 4.0


 The fact that the Document you can load at search time is the same Document 
 class you had indexed is horribly trappy in Lucene because the loaded 
 document necessarily loses information like field boost, whether a field was 
 tokenized, etc.  (See LUCENE-3854 for a recent example).
 We should fix this, statically, so that it's an entirely different class at 
 search time vs index time.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira



-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Commented] (SOLR-3258) Ping query caused exception..Invalid version (expected 2, but 60) or the data in not in 'javabin' format

2012-03-20 Thread Markus Jelsma (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-3258?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13233451#comment-13233451
 ] 

Markus Jelsma commented on SOLR-3258:
-

It seems I found the problem. The solr.xml file on one of the nodes contained a 
typo: instead of shard="shard1" it had shard_a="shard1". It's pretty hard to 
reproduce, but after removing the ZK data directories you can start the nodes 
with one core having a bad shard parameter. Originally only one node had a 
corrupt solr.xml file, but I could only reproduce it by corrupting the file on 
both nodes and starting Solr.
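For illustration, the fix amounts to correcting one attribute name in the core definition. A hypothetical solr.xml fragment (core and collection names are made up):

{code}
<!-- Broken: the misspelled attribute meant the core never registered under shard1 -->
<core name="core_a" instanceDir="core_a" collection="collection1" shard_a="shard1"/>

<!-- Fixed -->
<core name="core_a" instanceDir="core_a" collection="collection1" shard="shard1"/>
{code}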



 Ping query caused exception..Invalid version (expected 2, but 60) or the data 
 in not in 'javabin' format
 

 Key: SOLR-3258
 URL: https://issues.apache.org/jira/browse/SOLR-3258
 Project: Solr
  Issue Type: Bug
 Environment: solr-impl 4.0-SNAPSHOT 1302403 - markus - 2012-03-19 
 13:55:51 
Reporter: Markus Jelsma
 Fix For: 4.0

 Attachments: debugging.patch, zkdump.txt


 In a test set-up with nodes=2, shards=3 and cores=6 we often see this 
 exception in the logs. Once every few ping requests this is thrown; other 
 requests return a proper OK.
 Ping request handler:
 {code}
 <requestHandler name="/admin/ping" class="solr.PingRequestHandler">
   <lst name="invariants">
     <str name="qt">select</str>
     <str name="q">*:*</str>
     <int name="rows">0</int>
   </lst>
   <lst name="defaults">
     <str name="wt">json</str>
     <str name="echoParams">all</str>
     <bool name="omitHeader">true</bool>
   </lst>
 </requestHandler>
 {code}
 Exception:
 {code}
 2012-03-20 13:16:06,405 INFO [solr.core.SolrCore] - [http-80-18] - : [core_a] webapp=/solr path=/admin/ping params={} status=500 QTime=7 
 2012-03-20 13:16:06,406 ERROR [solr.servlet.SolrDispatchFilter] - [http-80-18] - : null:org.apache.solr.common.SolrException: Ping query caused exception: org.apache.solr.client.solrj.SolrServerException: java.lang.RuntimeException: Invalid version (expected 2, but 60) or the data in not in 'javabin' format
 at org.apache.solr.handler.PingRequestHandler.handleRequestBody(PingRequestHandler.java:77)
 at org.apache.solr.handler.RequestHandlerBase.handleRequest(RequestHandlerBase.java:129)
 at org.apache.solr.core.SolrCore.execute(SolrCore.java:1540)
 at org.apache.solr.servlet.SolrDispatchFilter.execute(SolrDispatchFilter.java:435)
 at org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:256)
 at org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:235)
 at org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:206)
 at org.apache.catalina.core.StandardWrapperValve.invoke(StandardWrapperValve.java:233)
 at org.apache.catalina.core.StandardContextValve.invoke(StandardContextValve.java:191)
 at org.apache.catalina.core.StandardHostValve.invoke(StandardHostValve.java:127)
 at org.apache.catalina.valves.ErrorReportValve.invoke(ErrorReportValve.java:102)
 at org.apache.catalina.core.StandardEngineValve.invoke(StandardEngineValve.java:109)
 at org.apache.catalina.connector.CoyoteAdapter.service(CoyoteAdapter.java:293)
 at org.apache.coyote.http11.Http11Processor.process(Http11Processor.java:859)
 at org.apache.coyote.http11.Http11Protocol$Http11ConnectionHandler.process(Http11Protocol.java:602)
 at org.apache.tomcat.util.net.JIoEndpoint$Worker.run(JIoEndpoint.java:489)
 at java.lang.Thread.run(Thread.java:662)
 Caused by: org.apache.solr.common.SolrException: org.apache.solr.client.solrj.SolrServerException: java.lang.RuntimeException: Invalid version (expected 2, but 60) or the data in not in 'javabin' format
 at org.apache.solr.handler.component.SearchHandler.handleRequestBody(SearchHandler.java:298)
 at org.apache.solr.handler.RequestHandlerBase.handleRequest(RequestHandlerBase.java:129)
 at org.apache.solr.core.SolrCore.execute(SolrCore.java:1540)
 at org.apache.solr.handler.PingRequestHandler.handleRequestBody(PingRequestHandler.java:68)
 ... 16 more
 Caused by: org.apache.solr.client.solrj.SolrServerException: java.lang.RuntimeException: Invalid version (expected 2, but 60) or the data in not in 'javabin' format
 at org.apache.solr.client.solrj.impl.LBHttpSolrServer.request(LBHttpSolrServer.java:278)
 at org.apache.solr.handler.component.HttpShardHandler$1.call(HttpShardHandler.java:158)
 at org.apache.solr.handler.component.HttpShardHandler$1.call(HttpShardHandler.java:123)
 at 

[jira] [Commented] (SOLR-2242) Get distinct count of names for a facet field

2012-03-20 Thread Antoine Le Floc'h (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-2242?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13233453#comment-13233453
 ] 

Antoine Le Floc'h commented on SOLR-2242:
-

Bill,

Just a thought: how are you going to plug in 
[SOLR-3134|https://issues.apache.org/jira/browse/SOLR-3134] then?
Since we are not able to aggregate distinct counts over shards, shouldn't you do 
something like:
{code}
<lst name="facet_numTerms">
  <lst name="localhost:/solr">
    <int name="cat">15</int>
    <int name="price">14</int>
  </lst>
  <lst name="localhost:/solr">
    <int name="cat">3</int>
    <int name="price">23</int>
  </lst>
</lst>
{code}
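The reason per-shard reporting is needed can be shown with a toy Python example (invented data, not Solr code): per-shard distinct counts cannot simply be summed, because shards may share terms.

```python
# Distinct facet terms seen on each shard (toy data).
shard1_terms = {"cat": ["a", "b", "c"], "price": ["1.0", "2.0"]}
shard2_terms = {"cat": ["b", "d"], "price": ["2.0", "3.0"]}

# What each shard can report locally: its own distinct count per field.
per_shard = {
    "shard1": {f: len(t) for f, t in shard1_terms.items()},
    "shard2": {f: len(t) for f, t in shard2_terms.items()},
}

# Naive aggregation double-counts terms present on both shards.
naive_sum = {f: per_shard["shard1"][f] + per_shard["shard2"][f] for f in shard1_terms}
# The true global distinct count needs the union of the actual terms.
true_distinct = {f: len(set(shard1_terms[f]) | set(shard2_terms[f])) for f in shard1_terms}

print(per_shard)
print(naive_sum)       # {'cat': 5, 'price': 4} -- overcounts shared terms
print(true_distinct)   # {'cat': 4, 'price': 3}
```

Since only the counts (not the terms) travel in the response, the coordinator cannot compute the union, hence the per-shard layout above.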


 Get distinct count of names for a facet field
 -

 Key: SOLR-2242
 URL: https://issues.apache.org/jira/browse/SOLR-2242
 Project: Solr
  Issue Type: New Feature
  Components: Response Writers
Affects Versions: 4.0
Reporter: Bill Bell
Priority: Minor
 Fix For: 4.0

 Attachments: NumFacetTermsFacetsTest.java, 
 SOLR-2242-notworkingtest.patch, SOLR-2242-solr40.patch, SOLR-2242.patch, 
 SOLR-2242.patch, SOLR-2242.patch, SOLR-2242.shard.patch, 
 SOLR-2242.shard.patch, SOLR-2242.shard.withtests.patch, 
 SOLR-2242.solr3.1-fix.patch, SOLR-2242.solr3.1.patch, 
 SOLR.2242.solr3.1.patch, SOLR.2242.v2.patch


 When returning facet.field=<name of field> you will get a list of matches for 
 distinct values. This is normal behavior. This patch tells you how many 
 distinct values you have (# of rows). Use with limit=-1 and mincount=1.
 The feature is called namedistinct. Here is an example:
 http://localhost:8983/solr/select?shards=localhost:8983/solr,localhost:7574/solr&indent=true&q=*:*&facet=true&facet.mincount=1&facet.numFacetTerms=2&facet.limit=-1&facet.field=price
 http://localhost:8983/solr/select?shards=localhost:8983/solr,localhost:7574/solr&indent=true&q=*:*&facet=true&facet.mincount=1&facet.numFacetTerms=0&facet.limit=-1&facet.field=price
 http://localhost:8983/solr/select?shards=localhost:8983/solr,localhost:7574/solr&indent=true&q=*:*&facet=true&facet.mincount=1&facet.numFacetTerms=1&facet.limit=-1&facet.field=price
 This currently only works on facet.field.
 {code}
 <lst name="facet_fields">
   <lst name="price">
     <int name="numFacetTerms">14</int>
     <int name="0.0">3</int><int name="11.5">1</int><int name="19.95">1</int>
     <int name="74.99">1</int><int name="92.0">1</int><int name="179.99">1</int>
     <int name="185.0">1</int><int name="279.95">1</int><int name="329.95">1</int>
     <int name="350.0">1</int><int name="399.0">1</int><int name="479.95">1</int>
     <int name="649.99">1</int><int name="2199.0">1</int>
   </lst>
 </lst>
 {code} 
 Several people use this to get the group.field count (the # of groups).

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira



-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Updated] (LUCENE-3729) Allow using FST to hold terms data in DocValues.BYTES_*_SORTED

2012-03-20 Thread Michael McCandless (Updated) (JIRA)

 [ 
https://issues.apache.org/jira/browse/LUCENE-3729?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Michael McCandless updated LUCENE-3729:
---

Labels: gsoc2012 lucene-gsoc-11  (was: )

 Allow using FST to hold terms data in DocValues.BYTES_*_SORTED
 --

 Key: LUCENE-3729
 URL: https://issues.apache.org/jira/browse/LUCENE-3729
 Project: Lucene - Java
  Issue Type: Improvement
Reporter: Michael McCandless
Assignee: Michael McCandless
  Labels: gsoc2012, lucene-gsoc-11
 Attachments: LUCENE-3729.patch, LUCENE-3729.patch, LUCENE-3729.patch




--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira



-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Commented] (SOLR-3220) RecoveryZkTest test failure

2012-03-20 Thread Markus Jelsma (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-3220?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13233455#comment-13233455
 ] 

Markus Jelsma commented on SOLR-3220:
-

In my case I had a typo in a solr.xml file on one node: shard_a="shard1" was 
specified for one of three cores. Fixing it and removing the ZK data directories 
solved the issue.

 RecoveryZkTest test failure
 ---

 Key: SOLR-3220
 URL: https://issues.apache.org/jira/browse/SOLR-3220
 Project: Solr
  Issue Type: Bug
Reporter: Hoss Man
 Attachments: TEST-org.apache.solr.cloud.RecoveryZkTest.xml


 observed a failure in RecoveryZkTest.testDistribSearch using r1298661 that 
 had some odd looking (to me) log info. 
 could not reproduce with identical seed

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira



-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



Maven artifacts not working?

2012-03-20 Thread Jason Rutherglen
This link seems to not work:

https://builds.apache.org/job/Lucene-Solr-Maven-trunk/lastSuccessfulBuild/artifact/maven_artifacts

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Updated] (LUCENE-3514) deep paging with Sort

2012-03-20 Thread Michael McCandless (Updated) (JIRA)

 [ 
https://issues.apache.org/jira/browse/LUCENE-3514?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Michael McCandless updated LUCENE-3514:
---

Labels: gsoc2012 lucene-gsoc-12  (was: )

 deep paging with Sort
 -

 Key: LUCENE-3514
 URL: https://issues.apache.org/jira/browse/LUCENE-3514
 Project: Lucene - Java
  Issue Type: Improvement
Affects Versions: 3.4, 4.0
Reporter: Robert Muir
  Labels: gsoc2012, lucene-gsoc-12

 We added IS.searchAfter(Query, Filter) but we don't support Sort yet with 
 this API.
 I think it might be overkill, at least at first, to try to implement 12 
 collector variants for this.
 I put the following idea on SOLR-1726:
 One idea would be to start with one or two implementations (maybe in/out of 
 order) for the sorting case, and don't overspecialize it yet.
 * for page 1, the ScoreDoc (FieldDoc really) will be null, so we just return 
 the normal impl anyway.
 * even if our searchAfter isn't super-duper fast, the user can always make the 
 tradeoff like with page-by-score: they can always just pass null until, say, 
 page 10 if they compute that it only starts to 'help' then.
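As a language-neutral sketch of the searchAfter idea (toy data and a toy in-memory "index", not Lucene's API): each page re-runs the search but keeps only hits that sort strictly after the last hit of the previous page, so no page ever has to collect page*size hits.

```python
# Docs as (sort_value, doc_id); sort is by value descending, doc_id ascending.
docs = sorted(
    [(score, i) for i, score in enumerate([9, 3, 9, 7, 1, 7, 5])],
    key=lambda d: (-d[0], d[1]),
)

def search_after(after, page_size):
    """Return the next page of hits strictly after the 'after' cursor (None = page 1)."""
    if after is None:
        return docs[:page_size]
    cursor = (-after[0], after[1])
    # Keep only hits that sort strictly after the cursor, then take one page.
    return [d for d in docs if (-d[0], d[1]) > cursor][:page_size]

page1 = search_after(None, 3)
page2 = search_after(page1[-1], 3)  # pass the last hit of page 1 as the cursor
print(page1)  # [(9, 0), (9, 2), (7, 3)]
print(page2)  # [(7, 5), (5, 6), (3, 1)]
```

The tie-break on doc_id is what makes the cursor unambiguous when several hits share a sort value, which is the tricky part a FieldDoc-based searchAfter has to get right.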

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira



-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Updated] (LUCENE-3475) ShingleFilter should handle positionIncrement of zero, e.g. synonyms

2012-03-20 Thread Michael McCandless (Updated) (JIRA)

 [ 
https://issues.apache.org/jira/browse/LUCENE-3475?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Michael McCandless updated LUCENE-3475:
---

Labels: gsoc2012 lucene-gsoc-12  (was: )

I think this is important, now that we have graph analyzers (like Kuromoji).

So ShingleFilter should pay attention to posInc as well as posLength...

 ShingleFilter should handle positionIncrement of zero, e.g. synonyms
 

 Key: LUCENE-3475
 URL: https://issues.apache.org/jira/browse/LUCENE-3475
 Project: Lucene - Java
  Issue Type: New Feature
  Components: modules/analysis
Affects Versions: 3.4
Reporter: Cameron
Priority: Minor
  Labels: gsoc2012, lucene-gsoc-12

 ShingleFilter is creating shingles for a single term that has been expanded 
 by synonyms, when it shouldn't. The position increment is 0.
 As an example, I have an Analyzer with a SynonymFilter followed by a 
 ShingleFilter. Assuming car and auto are synonyms, the SynonymFilter produces 
 two tokens at position 1: car, auto. The ShingleFilter is then producing 3 
 tokens (car, car auto, auto) when there should only be two: car, auto. This 
 behavior seems incorrect.
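A toy sketch of the expected behavior (plain Python, not the Lucene analysis API): shingles should only join tokens at adjacent positions, so synonyms stacked at the same position never pair with each other.

```python
# Tokens as (text, position); "car" and "auto" are synonyms sharing position 1.
def bigram_shingles(tokens):
    # Only join tokens at adjacent positions, mimicking a posInc-aware ShingleFilter.
    return [f"{a} {b}" for a, pa in tokens for b, pb in tokens if pb == pa + 1]

print(bigram_shingles([("car", 1), ("auto", 1)]))
# [] -- no shingle between two synonyms at the same position

print(bigram_shingles([("big", 1), ("car", 2), ("auto", 2)]))
# ['big car', 'big auto'] -- both synonyms shingle with the preceding token
```

A fix in ShingleFilter would effectively need this position awareness, which is why tracking posInc (and posLength for graph analyzers) matters.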

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira



-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Commented] (SOLR-3214) If you use multiple fl entries rather than a comma separated list, all but the first entry can be ignored if you are using distributed search.

2012-03-20 Thread Mark Miller (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-3214?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13233461#comment-13233461
 ] 

Mark Miller commented on SOLR-3214:
---

bq. It appears that currently, "score" is synonymous with "*,score"

That is just not true currently.

This was recently changed by SOLR-2712 - this part of it just was missed.

 If you use multiple fl entries rather than a comma separated list, all but 
 the first entry can be ignored if you are using distributed search.
 --

 Key: SOLR-3214
 URL: https://issues.apache.org/jira/browse/SOLR-3214
 Project: Solr
  Issue Type: Bug
  Components: search
Reporter: Mark Miller
Assignee: Mark Miller
 Fix For: 4.0


 I have not checked yet, but prob in 3.x too.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira



-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Resolved] (LUCENE-3422) IndeIndexWriter.optimize() throws FileNotFoundException and IOException

2012-03-20 Thread Michael McCandless (Resolved) (JIRA)

 [ 
https://issues.apache.org/jira/browse/LUCENE-3422?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Michael McCandless resolved LUCENE-3422.


Resolution: Incomplete

 IndeIndexWriter.optimize() throws FileNotFoundException and IOException
 ---

 Key: LUCENE-3422
 URL: https://issues.apache.org/jira/browse/LUCENE-3422
 Project: Lucene - Java
  Issue Type: Bug
Reporter: Elizabeth Nisha

 I am using Lucene 3.0.2 search APIs for my application. 
 Indexed data is about 350MB and the time taken for indexing is 25 hrs. Search 
 indexing and optimization run in two different threads. Optimization runs 
 every 1 hour and doesn't run while indexing is going on, and vice 
 versa. When optimization is going on via IndexWriter.optimize(), 
 FileNotFoundException and IOException are seen in my log and the index file 
 is getting corrupted. The log says:
 1. java.io.IOException: No sub-file with id _5r8.fdt found 
 [The file name in this message changes over time (_5r8.fdt, _6fa.fdt, 
 _6uh.fdt, ..., _emv.fdt) ]
 2. java.io.FileNotFoundException: 
 /local/groups/necim/index_5.3/index/_bdx.cfs (No such file or directory)  
 3. java.io.FileNotFoundException: 
 /local/groups/necim/index_5.3/index/_hkq.cfs (No such file or directory)
   Stack trace: java.io.IOException: background merge hit exception: 
 _hkp:c100-_hkp _hkq:c100-_hkp _hkr:c100-_hkr _hks:c100-_hkr _hxb:c5500 
 _hx5:c1000 _hxc:c198
 84 into _hxd [optimize] [mergeDocStores]
at org.apache.lucene.index.IndexWriter.optimize(IndexWriter.java:2359)
at org.apache.lucene.index.IndexWriter.optimize(IndexWriter.java:2298)
at org.apache.lucene.index.IndexWriter.optimize(IndexWriter.java:2268)
at com.telelogic.cs.search.SearchIndex.doOptimize(SearchIndex.java:130)
at com.telelogic.cs.search.SearchIndexerThread$1.run(SearchIndexerThread.java:337)
at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
at java.lang.Thread.run(Thread.java:662)
Caused by: java.io.FileNotFoundException: /local/groups/necim/index_5.3/index/_hkq.cfs (No such file or directory)
at java.io.RandomAccessFile.open(Native Method)
at java.io.RandomAccessFile.<init>(RandomAccessFile.java:212)
at org.apache.lucene.store.SimpleFSDirectory$SimpleFSIndexInput$Descriptor.<init>(SimpleFSDirectory.java:76)
at org.apache.lucene.store.SimpleFSDirectory$SimpleFSIndexInput.<init>(SimpleFSDirectory.java:97)
at org.apache.lucene.store.NIOFSDirectory$NIOFSIndexInput.<init>(NIOFSDirectory.java:87)
at org.apache.lucene.store.NIOFSDirectory.openInput(NIOFSDirectory.java:67)
at org.apache.lucene.index.CompoundFileReader.<init>(CompoundFileReader.java:67)
at org.apache.lucene.index.SegmentReader$CoreReaders.<init>(SegmentReader.java:114)
at org.apache.lucene.index.SegmentReader.get(SegmentReader.java:590)
at org.apache.lucene.index.IndexWriter$ReaderPool.get(IndexWriter.java:616)
at org.apache.lucene.index.IndexWriter.mergeMiddle(IndexWriter.java:4309)
at org.apache.lucene.index.IndexWriter.merge(IndexWriter.java:3965)
at org.apache.lucene.index.ConcurrentMergeScheduler.doMerge(ConcurrentMergeScheduler.java:231)
at org.apache.lucene.index.ConcurrentMergeScheduler$MergeThread.run(ConcurrentMergeScheduler.java:288)

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira



-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Commented] (LUCENE-3883) Analysis for Irish

2012-03-20 Thread Jim Regan (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-3883?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13233465#comment-13233465
 ] 

Jim Regan commented on LUCENE-3883:
---

Wow! Thanks Robert!

There isn't usually a hyphen with 'h' before a vowel, but I've started to see 
it recently -- there are no native Irish words beginning with 'h', so it used 
to be relatively unambiguous that a 'h' was a mutation, but with an increase of 
scientific literature in Irish, there are more Greek and Latin loan words being 
added which do begin with 'h', so it's no longer clear.

 Analysis for Irish
 --

 Key: LUCENE-3883
 URL: https://issues.apache.org/jira/browse/LUCENE-3883
 Project: Lucene - Java
  Issue Type: New Feature
  Components: modules/analysis
Reporter: Jim Regan
Priority: Trivial
  Labels: analysis, newbie
 Attachments: LUCENE-3883.patch, LUCENE-3883.patch, irish.sbl


 Adds analysis for Irish.
 The stemmer is generated from a snowball stemmer. I've sent it to Martin 
 Porter, who says it will be added during the week.

--



[jira] [Updated] (LUCENE-3333) Specialize DisjunctionScorer if all clauses are TermQueries

2012-03-20 Thread Michael McCandless (Updated) (JIRA)

 [ 
https://issues.apache.org/jira/browse/LUCENE-3333?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Michael McCandless updated LUCENE-3333:
---

Labels: gsoc2012 lucene-gsoc-12  (was: )

 Specialize DisjunctionScorer if all clauses are TermQueries
 ---

 Key: LUCENE-3333
 URL: https://issues.apache.org/jira/browse/LUCENE-3333
 Project: Lucene - Java
  Issue Type: Improvement
  Components: core/search
Affects Versions: 4.0
Reporter: Simon Willnauer
  Labels: gsoc2012, lucene-gsoc-12
 Fix For: 4.0


 Spinoff from LUCENE-3328: since we have a specialized conjunction scorer, we
 should also investigate whether this pays off in disjunction scoring.

--



[jira] [Resolved] (LUCENE-3272) Consolidate Lucene's QueryParsers into a module

2012-03-20 Thread Michael McCandless (Resolved) (JIRA)

 [ 
https://issues.apache.org/jira/browse/LUCENE-3272?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Michael McCandless resolved LUCENE-3272.


   Resolution: Fixed
Fix Version/s: 4.0

 Consolidate Lucene's QueryParsers into a module
 ---

 Key: LUCENE-3272
 URL: https://issues.apache.org/jira/browse/LUCENE-3272
 Project: Lucene - Java
  Issue Type: Improvement
  Components: modules/queryparser
Reporter: Chris Male
 Fix For: 4.0


 Lucene has a lot of QueryParsers and we should have them all in a single 
 consistent place.  
 The following are QueryParsers I can find that warrant moving to the new 
 module:
 - Lucene Core's QueryParser
 - AnalyzingQueryParser
 - ComplexPhraseQueryParser
 - ExtendableQueryParser
 - Surround's QueryParser
 - PrecedenceQueryParser
 - StandardQueryParser
 - XML-Query-Parser's CoreParser
 All seem to do a good job at their kind of parsing with extensive tests.
 One challenge of consolidating these is that many tests use Lucene Core's 
 QueryParser.  One option is to just replicate this class in src/test and call 
 it TestingQueryParser.  Another option is to convert all tests over to
 programmatically building their queries (which seems like a lot of work).

--



[jira] [Updated] (LUCENE-3312) Break out StorableField from IndexableField

2012-03-20 Thread Michael McCandless (Updated) (JIRA)

 [ 
https://issues.apache.org/jira/browse/LUCENE-3312?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Michael McCandless updated LUCENE-3312:
---

Labels: gsoc2012 lucene-gsoc-12  (was: )

 Break out StorableField from IndexableField
 ---

 Key: LUCENE-3312
 URL: https://issues.apache.org/jira/browse/LUCENE-3312
 Project: Lucene - Java
  Issue Type: Improvement
  Components: core/index
Reporter: Michael McCandless
  Labels: gsoc2012, lucene-gsoc-12
 Fix For: Field Type branch


 In the field type branch we have strongly decoupled
 Document/Field/FieldType impl from the indexer, by having only a
 narrow API (IndexableField) passed to IndexWriter.  This frees apps up to
 use their own documents instead of the user-space impls we provide
 in oal.document.
 Similarly, with LUCENE-3309, we've done the same thing on the
 doc/field retrieval side (from IndexReader), with the
 StoredFieldsVisitor.
 But, maybe we should break out StorableField from IndexableField,
 such that when you index a doc you provide two Iterables -- one for the
 IndexableFields and one for the StorableFields.  Either can be null.
 One downside is a possible perf hit for fields that are both indexed and
 stored (ie, we visit them twice, look up their name in a hash twice,
 etc.).  But the upside is a cleaner separation of concerns in the API.
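The two-Iterables idea above can be sketched as follows; the interface and class names here are hypothetical stand-ins, not Lucene's actual API, and the "writer" only counts fields to keep the sketch self-contained.

```java
// Hypothetical sketch (not the actual Lucene API): indexing a document as
// two separate iterables, one for inverted fields and one for stored fields.
import java.util.List;

interface IndexableField { String name(); String stringValue(); }
interface StorableField  { String name(); String stringValue(); }

// A field that is both indexed and stored implements both interfaces.
record SimpleField(String name, String stringValue)
        implements IndexableField, StorableField {}

class SketchIndexWriter {
    int indexed, stored;

    // Either iterable may be null: index-only or store-only documents.
    void addDocument(Iterable<? extends IndexableField> toIndex,
                     Iterable<? extends StorableField> toStore) {
        if (toIndex != null) for (IndexableField f : toIndex) indexed++;
        if (toStore != null) for (StorableField f : toStore) stored++;
    }
}

public class TwoIterablesSketch {
    public static void main(String[] args) {
        SketchIndexWriter w = new SketchIndexWriter();
        SimpleField title = new SimpleField("title", "Lucene in Action");
        // "title" is both indexed and stored, so it appears in both
        // iterables -- the perf downside noted above: it is visited twice.
        w.addDocument(List.of(title), List.of(title));
        System.out.println(w.indexed + " " + w.stored);
    }
}
```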

--



[jira] [Commented] (LUCENE-3883) Analysis for Irish

2012-03-20 Thread Robert Muir (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-3883?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13233470#comment-13233470
 ] 

Robert Muir commented on LUCENE-3883:
-

Thanks Jim. Personally I think this patch is ready to be committed. 

I'm just going to wait a bit in case you get any feedback from Martin or other 
snowball developers,
but I won't wait too long :) 

 Analysis for Irish
 --

 Key: LUCENE-3883
 URL: https://issues.apache.org/jira/browse/LUCENE-3883
 Project: Lucene - Java
  Issue Type: New Feature
  Components: modules/analysis
Reporter: Jim Regan
Priority: Trivial
  Labels: analysis, newbie
 Attachments: LUCENE-3883.patch, LUCENE-3883.patch, irish.sbl


 Adds analysis for Irish.
 The stemmer is generated from a snowball stemmer. I've sent it to Martin 
 Porter, who says it will be added during the week.

--



[jira] [Assigned] (LUCENE-3883) Analysis for Irish

2012-03-20 Thread Robert Muir (Assigned) (JIRA)

 [ 
https://issues.apache.org/jira/browse/LUCENE-3883?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Robert Muir reassigned LUCENE-3883:
---

Assignee: Robert Muir

 Analysis for Irish
 --

 Key: LUCENE-3883
 URL: https://issues.apache.org/jira/browse/LUCENE-3883
 Project: Lucene - Java
  Issue Type: New Feature
  Components: modules/analysis
Reporter: Jim Regan
Assignee: Robert Muir
Priority: Trivial
  Labels: analysis, newbie
 Attachments: LUCENE-3883.patch, LUCENE-3883.patch, irish.sbl


 Adds analysis for Irish.
 The stemmer is generated from a snowball stemmer. I've sent it to Martin 
 Porter, who says it will be added during the week.

--



[jira] [Updated] (LUCENE-3178) Native MMapDir

2012-03-20 Thread Michael McCandless (Updated) (JIRA)

 [ 
https://issues.apache.org/jira/browse/LUCENE-3178?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Michael McCandless updated LUCENE-3178:
---

Labels: gsoc2012 lucene-gsoc-12  (was: )

 Native MMapDir
 --

 Key: LUCENE-3178
 URL: https://issues.apache.org/jira/browse/LUCENE-3178
 Project: Lucene - Java
  Issue Type: Improvement
  Components: core/store
Reporter: Michael McCandless
  Labels: gsoc2012, lucene-gsoc-12

 Spinoff from LUCENE-2793.
 Just like we will create native Dir impl (UnixDirectory) to pass the right OS 
 level IO flags depending on the IOContext, we could in theory do something 
 similar with MMapDir.
 The problem is MMap is apparently quite hairy... and to pass the flags the 
 native code would need to invoke mmap (I think?), unlike UnixDir where the 
 code only has to open the file handle.

--



[jira] [Resolved] (LUCENE-3177) Decouple indexer from Document/Field impls

2012-03-20 Thread Michael McCandless (Resolved) (JIRA)

 [ 
https://issues.apache.org/jira/browse/LUCENE-3177?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Michael McCandless resolved LUCENE-3177.


Resolution: Fixed

 Decouple indexer from Document/Field impls
 --

 Key: LUCENE-3177
 URL: https://issues.apache.org/jira/browse/LUCENE-3177
 Project: Lucene - Java
  Issue Type: Improvement
Reporter: Michael McCandless
Assignee: Michael McCandless
 Fix For: 4.0

 Attachments: LUCENE-3177.patch, LUCENE-3177.patch


 I think we should define minimal iterator interfaces,
 IndexableDocument/Field, that indexer requires to index documents.
 Indexer would consume only these bare minimum interfaces, not the
 concrete Document/Field/FieldType classes from oal.document package.
 Then, the Document/Field/FieldType hierarchy is one concrete impl of
 these interfaces. Apps are free to make their own impls as well.
 Maybe eventually we make another impl that enforces a global schema,
 eg factored out of Solr's impl.
 I think this frees design pressure on our Document/Field/FieldType
 hierarchy, ie, these classes are free to become concrete
 fully-featured user-space classes with all sorts of friendly sugar
 APIs for adding/removing fields, getting/setting values, types, etc.,
 but they don't need substantial extensibility/hierarchy. Ie, the
 extensibility point shifts to IndexableDocument/Field interface.
 I think this means we can collapse the three classes we now have for a
 Field (Fieldable/AbstracField/Field) down to a single concrete class
 (well, except for LUCENE-2308 where we want to break out dedicated
 classes for different field types...).

--



[jira] [Commented] (SOLR-2764) Create a NorwegianLightStemmer and NorwegianMinimalStemmer

2012-03-20 Thread Robert Muir (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-2764?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13233472#comment-13233472
 ] 

Robert Muir commented on SOLR-2764:
---

Very nice work Jan!

 Create a NorwegianLightStemmer and NorwegianMinimalStemmer
 --

 Key: SOLR-2764
 URL: https://issues.apache.org/jira/browse/SOLR-2764
 Project: Solr
  Issue Type: New Feature
  Components: Schema and Analysis
Reporter: Jan Høydahl
Assignee: Jan Høydahl
 Fix For: 3.6, 4.0

 Attachments: SOLR-2764.patch, SOLR-2764.patch, SOLR-2764.patch, 
 SOLR-2764.patch, SOLR-2764.patch


 We need a simple light-weight stemmer, and a minimal stemmer for
 plural/singular only, in Norwegian.
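A "minimal" stemmer of the kind described could look like the sketch below; the suffix list is an illustrative assumption, not the exact rule set of the SOLR-2764 patch.

```java
// Illustrative sketch of a minimal Norwegian stemmer that only folds
// plural forms back toward the singular. The suffix list here is an
// assumption for demonstration, not the committed patch's rules.
public class MinimalNorwegianStemSketch {
    static final String[] PLURAL_SUFFIXES = { "ene", "ane", "er", "ar" };

    static String stem(String word) {
        if (word.length() <= 4) return word;   // leave short words alone
        for (String suf : PLURAL_SUFFIXES) {
            if (word.endsWith(suf)) {
                return word.substring(0, word.length() - suf.length());
            }
        }
        return word;
    }

    public static void main(String[] args) {
        System.out.println(stem("bilene")); // "bil" (the cars -> car)
        System.out.println(stem("biler"));  // "bil" (cars -> car)
        System.out.println(stem("hus"));    // too short, unchanged
    }
}
```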

--



[jira] [Updated] (LUCENE-3122) Cascaded grouping

2012-03-20 Thread Michael McCandless (Updated) (JIRA)

 [ 
https://issues.apache.org/jira/browse/LUCENE-3122?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Michael McCandless updated LUCENE-3122:
---

Fix Version/s: (was: 3.6)
   Labels: gsoc2012 lucene-gsoc-12  (was: )

 Cascaded grouping
 -

 Key: LUCENE-3122
 URL: https://issues.apache.org/jira/browse/LUCENE-3122
 Project: Lucene - Java
  Issue Type: Improvement
  Components: modules/grouping
Reporter: Michael McCandless
  Labels: gsoc2012, lucene-gsoc-12
 Fix For: 4.0


 Similar to SOLR-2526, in that you are grouping on 2 separate fields, but 
 instead of treating those fields as a single grouping by a compound key, this 
 change would let you first group on key1 for the primary groups and then 
 secondarily on key2 within the primary groups.
 Ie, the result you get back would have groups A, B, C (grouped by key1), but
 the documents within group A would then be grouped by key2.
 I think this will be important for apps whose documents are the product of 
 denormalizing, ie where the Lucene document is really a sub-document of a 
 different identifier field.  Borrowing an example from LUCENE-3097, you have 
 doctors but each doctor may have multiple offices (addresses) where they 
 practice and so you index doctor X address as your lucene documents.  In this 
 case, your identifier field (that which counts for facets, and should be 
 grouped for presentation) is doctorid.  When you offer users search over 
 this index, you'd likely want to 1) group by distance (ie, < 0.1 miles, < 0.2
 miles, etc., as a function query), but 2) also group by doctorid, ie cascaded
 grouping.
 I suspect this would be easier to implement than it sounds: the per-group
 collector used by the 2nd pass grouping collector for key1's grouping just
 needs to be another grouping collector.  Spookily, though, that collection
 would also have to be 2-pass, so it could get tricky since grouping is sort
 of recursing on itself.  Once we have LUCENE-3112, though, that should
 enable efficient single pass grouping by the identifier (doctorid).

--



[jira] [Updated] (LUCENE-3069) Lucene should have an entirely memory resident term dictionary

2012-03-20 Thread Michael McCandless (Updated) (JIRA)

 [ 
https://issues.apache.org/jira/browse/LUCENE-3069?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Michael McCandless updated LUCENE-3069:
---

Labels: gsoc2012 lucene-gsoc-12  (was: )

 Lucene should have an entirely memory resident term dictionary
 --

 Key: LUCENE-3069
 URL: https://issues.apache.org/jira/browse/LUCENE-3069
 Project: Lucene - Java
  Issue Type: Improvement
  Components: core/index, core/search
Affects Versions: 4.0
Reporter: Simon Willnauer
Assignee: Simon Willnauer
  Labels: gsoc2012, lucene-gsoc-12
 Fix For: 4.0


 The FST-based TermDictionary has been a great improvement, yet it still uses a
 delta codec file for scanning to terms. Some environments have enough memory
 available to keep the entire FST-based term dict in memory. We should add a
 TermDictionary implementation that encodes all needed information for each
 term into the FST (custom fst.Output) and builds an FST from the entire term,
 not just the delta.

--



[jira] [Updated] (LUCENE-3013) I wish Lucene query explanations were easier to localise

2012-03-20 Thread Michael McCandless (Updated) (JIRA)

 [ 
https://issues.apache.org/jira/browse/LUCENE-3013?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Michael McCandless updated LUCENE-3013:
---

Labels: gsoc2012 lucene-gsoc-12  (was: )

 I wish Lucene query explanations were easier to localise
 

 Key: LUCENE-3013
 URL: https://issues.apache.org/jira/browse/LUCENE-3013
 Project: Lucene - Java
  Issue Type: Wish
  Components: core/query/scoring
Reporter: Trejkaz
  Labels: gsoc2012, lucene-gsoc-12

 Often users ask us to provide a nice UI to explain why a document matched 
 their query.  Currently the strings output by Explanation are very advanced, 
 and probably only understandable to those who have worked on Lucene.  I took 
 a shot at trying to make them friendlier, but it basically came down to 
 parsing the strings it output and trying to figure out what kind of query was 
 at each point (the inability to get to a Query from the Explanation is a 
 small part of the problem here), then formulating the result into readable
 English.  In the end it seems a bit too hard.
 The solution to this could be done in at least two ways:
 1. Add getLocalizedSummary() / getLocalizedDescription() method(s) and use 
 resource bundles internally.  Projects wishing to localise these could add 
 their own resource bundles to the classpath and/or get them contributed to 
 Lucene.
 2. Add subclasses of Explanation with enough methods for callers to 
 interrogate the individual details of the explanation instead of outputting 
 it as a monolithic string.
 I do like the tree structure of explanations a lot (as it resembles the query 
 tree), I just think there is work to be done splitting up the strings into 
 usable fragments of information.

--



[jira] [Resolved] (LUCENE-2948) Make var gap terms index a partial prefix trie

2012-03-20 Thread Michael McCandless (Resolved) (JIRA)

 [ 
https://issues.apache.org/jira/browse/LUCENE-2948?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Michael McCandless resolved LUCENE-2948.


Resolution: Won't Fix

I think BlockTree terms dict accomplished the same thing.

 Make var gap terms index a partial prefix trie
 --

 Key: LUCENE-2948
 URL: https://issues.apache.org/jira/browse/LUCENE-2948
 Project: Lucene - Java
  Issue Type: Improvement
  Components: core/index
Reporter: Michael McCandless
Assignee: Michael McCandless
 Fix For: 4.0

 Attachments: LUCENE-2948.patch, LUCENE-2948.patch, LUCENE-2948.patch, 
 LUCENE-2948_automaton.patch, Results.png


 Var gap stores (in an FST) the indexed terms (every 32nd term, by
 default), minus their non-distinguishing suffixes.
 However, often times the resulting FST is close to a prefix trie in
 some portion of the terms space.
 By allowing some nodes of the FST to store all outgoing edges,
 including ones that do not lead to an indexed term, and by recording
 that this node is then authoritative as to what terms exist in the
 terms dict from that prefix, we can get some important benefits:
   * It becomes possible to know that a certain term prefix cannot
 exist in the terms index, which means we can save a disk seek in
 some cases (like PK lookup, docFreq, etc.)
   * We can query for the next possible prefix in the index, allowing
 some MTQs (eg FuzzyQuery) to save disk seeks.
 Basically, the terms index is able to answer questions that previously
 required seeking/scanning in the terms dict file.
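The "authoritative node" benefit described above can be illustrated with a toy sketch; the data structure and names below are hypothetical stand-ins for the FST, used only to show how a negative answer avoids a disk seek.

```java
// Toy sketch of the "authoritative node" idea: if the index records every
// outgoing label from some prefix, it can answer definitively that a term
// prefix does NOT exist, with no disk seek into the terms dict.
import java.util.Map;
import java.util.Set;

public class AuthoritativePrefixSketch {
    // Hypothetical: for prefix "ab", the index knows ALL next characters
    // that occur in the terms dict ('c' and 'e' only).
    static final Map<String, Set<Character>> AUTHORITATIVE =
            Map.of("ab", Set.of('c', 'e'));

    /** Returns false only when the index can prove the prefix cannot exist. */
    static boolean mayExist(String termPrefix) {
        for (Map.Entry<String, Set<Character>> e : AUTHORITATIVE.entrySet()) {
            String p = e.getKey();
            if (termPrefix.length() > p.length()
                    && termPrefix.startsWith(p)
                    && !e.getValue().contains(termPrefix.charAt(p.length()))) {
                return false; // authoritative node rules this prefix out
            }
        }
        return true; // would need to consult the terms dict on disk
    }

    public static void main(String[] args) {
        System.out.println(mayExist("abc")); // true: 'c' is a known edge
        System.out.println(mayExist("abz")); // false: 'z' is not an edge
    }
}
```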

--



[jira] [Updated] (LUCENE-2929) all postings enums must explicitly declare what they need up-front.

2012-03-20 Thread Michael McCandless (Updated) (JIRA)

 [ 
https://issues.apache.org/jira/browse/LUCENE-2929?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Michael McCandless updated LUCENE-2929:
---

Labels: gsoc2012 lucene-gsoc-12  (was: )

Still need requiresPayloads boolean.

 all postings enums must explicitly declare what they need up-front.
 ---

 Key: LUCENE-2929
 URL: https://issues.apache.org/jira/browse/LUCENE-2929
 Project: Lucene - Java
  Issue Type: Task
Reporter: Robert Muir
  Labels: gsoc2012, lucene-gsoc-12
 Fix For: 4.0

 Attachments: LUCENE-2929.patch, LUCENE-2929.patch, LUCENE-2929.patch


 Currently, the DocsEnum api assumes you *might* consume freqs at any time.
 Additionally the DocsAndPositionsEnum api assumes you *might* consume a 
 payload at any time.
 High-level things such as queries know what kinds of data they need from the
 index up-front, and the current APIs are limiting to codecs (other than
 Standard, which has these intertwined).
 So, we either need DocsAndFreqsEnum, DocsPositionsAndPayloadsEnum, or at
 least booleans in the methods that create these, to specify whether you want
 freqs or payloads.
 We did this for freqs in the bulkpostings API, which is good, but these
 DocsEnum apis are also new in 4.0 and there's no reason to introduce
 non-performant APIs.  Additionally, when/if we add payloads to the
 bulkpostings API, we should make sure we keep the same trend and require you
 to specify whether you want payloads up-front.

--



[jira] [Resolved] (LUCENE-2530) rename docsEnum.getBulkResult() to make its role clearer

2012-03-20 Thread Michael McCandless (Resolved) (JIRA)

 [ 
https://issues.apache.org/jira/browse/LUCENE-2530?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Michael McCandless resolved LUCENE-2530.


Resolution: Won't Fix

We removed bulk API in 4.0.

 rename docsEnum.getBulkResult() to make its role clearer
 

 Key: LUCENE-2530
 URL: https://issues.apache.org/jira/browse/LUCENE-2530
 Project: Lucene - Java
  Issue Type: Bug
  Components: core/index
Affects Versions: 4.0
Reporter: Andi Vajda
Assignee: Michael McCandless
Priority: Minor
 Fix For: 4.0


 Before docsEnum.read() can be called a BulkResult instance must be allocated 
 for it (it == the default implementation of that method).
 This is done by calling docsEnum.getBulkResult(). Failure to call this method 
 before read() is called results in a NullPointerException.
 It is somewhat counterintuitive to get the results of an operation before 
 calling said operation.
 Maybe this method should be renamed to something more definite-sounding, like
 obtainBulkResult() or prepareBulkResult()?

--



[jira] [Resolved] (LUCENE-2505) The system cannot find the file specified - _0.fdt

2012-03-20 Thread Michael McCandless (Resolved) (JIRA)

 [ 
https://issues.apache.org/jira/browse/LUCENE-2505?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Michael McCandless resolved LUCENE-2505.


Resolution: Incomplete

 The system cannot find the file specified - _0.fdt
 --

 Key: LUCENE-2505
 URL: https://issues.apache.org/jira/browse/LUCENE-2505
 Project: Lucene - Java
  Issue Type: Bug
  Components: core/index
Affects Versions: 2.4.1
Reporter: Tej Kiran Sharma

 Hi,
 I am using Lucene version 2.4.1, and while indexing my files I got the
 following exception.
 I set up the IndexWriter as follows:
 Directory lucDirectory = FSDirectory.getDirectory(_sIndexPath);
 lucDirectory.setLockFactory(new SimpleFSLockFactory(_sIndexPath));
 lucWriter = new IndexWriter(lucDirectory, true, new KeywordAnalyzer(), true);
 lucWriter.setMergeFactor(10);
 lucWriter.setMaxMergeDocs(2147483647);
 lucWriter.setMaxBufferedDocs(1);
 lucWriter.setRAMBufferSizeMB(32);
 lucWriter.setUseCompoundFile(false);
 I am indexing and searching simultaneously, and I am getting the following
 exception: "the system cannot find the file specified".
 ERROR Exception while checking size -
 C:\00scripts\Temp\TempIndex\20104261030775\_0.fdt (The system cannot find the
 file specified)
 Stacktrace: java.io.FileNotFoundException:
 C:\00scripts\Temp\TempIndex\20104261030775\_0.fdt (The system cannot find the
 file specified)
 at java.io.RandomAccessFile.open(Native Method)
 at java.io.RandomAccessFile.<init>(Unknown Source)
 at org.apache.lucene.store.FSDirectory$FSIndexInput$Descriptor.<init>(Unknown Source)
 at org.apache.lucene.store.FSDirectory$FSIndexInput.<init>(Unknown Source)
 at org.apache.lucene.store.FSDirectory.openInput(Unknown Source)
 at org.apache.lucene.index.FieldsReader.<init>(Unknown Source)
 at org.apache.lucene.index.SegmentReader.initialize(Unknown Source)
 at org.apache.lucene.index.SegmentReader.get(Unknown Source)
 at org.apache.lucene.index.SegmentReader.get(Unknown Source)
 at org.apache.lucene.index.DirectoryIndexReader$1.doBody(Unknown Source)
 at org.apache.lucene.index.SegmentInfos$FindSegmentsFile.run(Unknown Source)
 at org.apache.lucene.index.DirectoryIndexReader.open(Unknown Source)
 at org.apache.lucene.index.IndexReader.open(Unknown Source)
 at org.apache.lucene.index.IndexReader.open(Unknown Source)
 at org.apache.lucene.search.IndexSearcher.<init>(Unknown Source)
 at com..main.apu.d(Unknown Source)
 at com..main.apu.a(Unknown Source)
 at com.main.arn.a(Unknown Source)
 at com.main.abh.b(Unknown Source)
 at com.main.abh.a(Unknown Source)
 at com..main.abh.f(Unknown Source)
 at com.main.eu.run(Unknown Source)

--



[jira] [Resolved] (LUCENE-2441) Create 3.x -> 4.0 index migration tool

2012-03-20 Thread Michael McCandless (Resolved) (JIRA)

 [ 
https://issues.apache.org/jira/browse/LUCENE-2441?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Michael McCandless resolved LUCENE-2441.


Resolution: Duplicate

We already have IndexUpgrader now.

 Create 3.x -> 4.0 index migration tool
 --

 Key: LUCENE-2441
 URL: https://issues.apache.org/jira/browse/LUCENE-2441
 Project: Lucene - Java
  Issue Type: New Feature
  Components: core/index
Reporter: Michael McCandless
 Fix For: 4.0


 We need a tool to upgrade an index so that 4.0 can read it.  I think the only 
 change right now is the cutover to flex's standard codec format, but with 
 LUCENE-2426 we also need to correct the term sort order to be true unicode 
 code point order.

--



[jira] [Resolved] (LUCENE-2445) Perf improvements for the DocsEnum bulk read API

2012-03-20 Thread Michael McCandless (Resolved) (JIRA)

 [ 
https://issues.apache.org/jira/browse/LUCENE-2445?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Michael McCandless resolved LUCENE-2445.


Resolution: Won't Fix

We removed bulk API in 4.0.

 Perf improvements for the DocsEnum bulk read API
 

 Key: LUCENE-2445
 URL: https://issues.apache.org/jira/browse/LUCENE-2445
 Project: Lucene - Java
  Issue Type: Bug
  Components: core/index
Reporter: Michael McCandless
 Fix For: 4.0


 I started to work on LUCENE-2443, to create a test showing the
 problems, but it turns out none of the core codecs (even sep/intblock)
 ever set a non-zero offset.
 So I set forth to fix sep to do so, but ran into some issues w/ the
 current bulk-read API that we should fix to make it higher
 performance:
   * Filtering of deleted docs should be the caller's job (saves an
 extra pass through the docs)
   * Probably docs should arrive as deltas and caller sums these up to
 get the actual docID
   * Whether to load freqs or not should be separately controllable
   * We may want to require that the int[] for docs and freqs are
 aligned, ie the offset into each is the same
   * Maybe we should separate out a BulkDocsEnum from DocsEnum.  We can
 make it optional for codecs (ie, we can emulate BulkDocsEnum from
 the DocsEnum)
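The first two bullets (caller filters deletions, docs arrive as deltas) can be sketched together; the method and array names below are hypothetical, not Lucene's actual bulk API.

```java
// Sketch of the proposed bulk-read contract: the codec hands back raw
// docID deltas, and the *caller* sums them into absolute IDs and skips
// deleted docs. Names here are hypothetical illustrations.
import java.util.Arrays;

public class BulkDeltaReadSketch {
    // Pretend codec output: docID deltas encoding docs 2, 5, 6, 9.
    static final int[] DELTAS = { 2, 3, 1, 3 };

    static int[] readLiveDocs(int[] deltas, boolean[] deleted) {
        int[] out = new int[deltas.length];
        int n = 0, doc = 0;
        for (int delta : deltas) {
            doc += delta;                      // caller sums deltas
            if (!deleted[doc]) out[n++] = doc; // caller filters deletions
        }
        return Arrays.copyOf(out, n);
    }

    public static void main(String[] args) {
        boolean[] deleted = new boolean[10];
        deleted[5] = true; // doc 5 is deleted
        System.out.println(Arrays.toString(readLiveDocs(DELTAS, deleted)));
    }
}
```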

--



[jira] [Resolved] (LUCENE-2364) Add support for terms in BytesRef format to Term, TermQuery, TermRangeQuery & Co.

2012-03-20 Thread Michael McCandless (Resolved) (JIRA)

 [ 
https://issues.apache.org/jira/browse/LUCENE-2364?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Michael McCandless resolved LUCENE-2364.


Resolution: Fixed

Term now stores BytesRef internally...

 Add support for terms in BytesRef format to Term, TermQuery, TermRangeQuery & Co.
 -

 Key: LUCENE-2364
 URL: https://issues.apache.org/jira/browse/LUCENE-2364
 Project: Lucene - Java
  Issue Type: Improvement
  Components: core/search
Affects Versions: 4.0
Reporter: Uwe Schindler
Assignee: Michael McCandless
 Fix For: 4.0


 It would be good to directly allow BytesRefs in TermQuery and TermRangeQuery
 (as both queries convert the strings to BytesRef internally). For
 NumericRange support in Solr, it will be necessary to support numerics as
 BytesRef in single-term queries.
 When this will be added, don't forget to change TestNumericRangeQueryXX to 
 use the BytesRef ctor of TRQ.




[jira] [Updated] (LUCENE-2357) Reduce transient RAM usage while merging by using packed ints array for docID re-mapping

2012-03-20 Thread Michael McCandless (Updated) (JIRA)

 [ 
https://issues.apache.org/jira/browse/LUCENE-2357?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Michael McCandless updated LUCENE-2357:
---

Labels: gsoc2012 lucene-gsoc-12  (was: )

 Reduce transient RAM usage while merging by using packed ints array for docID 
 re-mapping
 

 Key: LUCENE-2357
 URL: https://issues.apache.org/jira/browse/LUCENE-2357
 Project: Lucene - Java
  Issue Type: Improvement
  Components: core/index
Reporter: Michael McCandless
Priority: Minor
  Labels: gsoc2012, lucene-gsoc-12
 Fix For: 4.0


 We allocate this int[] to remap docIDs due to compaction of deleted ones.
 This uses alot of RAM for large segment merges, and can fail to allocate due 
 to fragmentation on 32 bit JREs.
 Now that we have packed ints, a simple fix would be to use a packed int 
 array... and maybe instead of storing abs docID in the mapping, we could 
 store the number of del docs seen so far (so the remap would do a lookup then 
 a subtract).  This may add some CPU cost to merging but should bring down 
 transient RAM usage quite a bit.
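The lookup-then-subtract remap described above can be sketched with a plain int[] standing in for the packed ints array. This is illustrative only, not the actual merge code; `DocMapSketch` and its method names are invented:

```java
public final class DocMapSketch {
    /**
     * Instead of storing each absolute new docID, store the count of
     * deleted docs seen up to and including each old docID. These counts
     * are small, so a packed ints array could hold them in few bits.
     */
    public static int[] buildDelCounts(boolean[] deleted) {
        int[] delCount = new int[deleted.length];
        int seen = 0;
        for (int d = 0; d < deleted.length; d++) {
            if (deleted[d]) seen++;
            delCount[d] = seen;
        }
        return delCount;
    }

    /** Remap a live old docID: a lookup, then a subtract. */
    public static int remap(int oldDoc, int[] delCount) {
        return oldDoc - delCount[oldDoc];
    }

    public static void main(String[] args) {
        boolean[] deleted = {false, true, false};  // doc 1 deleted
        int[] delCount = buildDelCounts(deleted);
        System.out.println(remap(0, delCount) + " " + remap(2, delCount)); // 0 1
    }
}
```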




[jira] [Resolved] (LUCENE-2334) IndexReader.close() should call IndexReader.decRef() unconditionally ??

2012-03-20 Thread Michael McCandless (Resolved) (JIRA)

 [ 
https://issues.apache.org/jira/browse/LUCENE-2334?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Michael McCandless resolved LUCENE-2334.


Resolution: Won't Fix

 IndexReader.close() should call IndexReader.decRef() unconditionally ??
 ---

 Key: LUCENE-2334
 URL: https://issues.apache.org/jira/browse/LUCENE-2334
 Project: Lucene - Java
  Issue Type: Improvement
  Components: core/index
Affects Versions: 3.0.1
Reporter: Mike Hanafey
Priority: Minor

 IndexReader.close() is defined:
 {code}  /**
* Closes files associated with this index.
* Also saves any new deletions to disk.
* No other methods should be called after this has been called.
* @throws IOException if there is a low-level IO error
*/
   public final synchronized void close() throws IOException {
 if (!closed) {
   decRef();
   closed = true;
 }
   }
 {code}
 This means that if the refCount is bigger than one, close() does not 
 actually close, but it is also true that calling close() again has no effect.
 Why does close() not simply call decRef() unconditionally? That way, if 
 incRef() is called each time an instance of IndexReader is handed out and 
 close() is called by each recipient when they are done, the last one to call 
 close() will actually close the index. As written, the API seems very 
 confusing -- the first close() does one thing, but the next close() does 
 something different.
 At a minimum the JavaDoc should clarify the behavior.
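A toy illustration of the proposed semantics -- close() decrements unconditionally and the last reference actually releases the resource. This is not Lucene's code; `RefCountedSketch` is a made-up class showing only the counting behavior:

```java
public final class RefCountedSketch {
    private int refCount = 1;      // the creator holds one reference
    private boolean closed = false;

    public synchronized void incRef() { refCount++; }

    // Proposed semantics: every close() balances one incRef(), and the
    // resource is really released only when the count reaches zero.
    public synchronized void close() {
        if (--refCount == 0) closed = true;
    }

    public synchronized boolean isClosed() { return closed; }

    public static void main(String[] args) {
        RefCountedSketch r = new RefCountedSketch();
        r.incRef();                           // handed to a second consumer
        r.close();                            // first consumer done
        System.out.println(r.isClosed());     // false
        r.close();                            // last consumer done
        System.out.println(r.isClosed());     // true
    }
}
```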




[jira] [Resolved] (LUCENE-2310) Reduce Fieldable, AbstractField and Field complexity

2012-03-20 Thread Michael McCandless (Resolved) (JIRA)

 [ 
https://issues.apache.org/jira/browse/LUCENE-2310?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Michael McCandless resolved LUCENE-2310.


   Resolution: Fixed
Fix Version/s: 4.0

 Reduce Fieldable, AbstractField and Field complexity
 

 Key: LUCENE-2310
 URL: https://issues.apache.org/jira/browse/LUCENE-2310
 Project: Lucene - Java
  Issue Type: Sub-task
  Components: core/index
Reporter: Chris Male
 Fix For: 4.0

 Attachments: LUCENE-2310-Deprecate-AbstractField-CleanField.patch, 
 LUCENE-2310-Deprecate-AbstractField.patch, 
 LUCENE-2310-Deprecate-AbstractField.patch, 
 LUCENE-2310-Deprecate-AbstractField.patch, 
 LUCENE-2310-Deprecate-DocumentGetFields-core.patch, 
 LUCENE-2310-Deprecate-DocumentGetFields.patch, 
 LUCENE-2310-Deprecate-DocumentGetFields.patch, LUCENE-2310.patch


 In order to move field type like functionality into its own class, we really 
 need to try to tackle the hierarchy of Fieldable, AbstractField and Field.  
 Currently AbstractField depends on Field, and does not provide much more 
 functionality that storing fields, most of which are being moved over to 
 FieldType.  Therefore it seems ideal to try to deprecate AbstractField (and 
 possible Fieldable), moving much of the functionality into Field and 
 FieldType.




[jira] [Commented] (LUCENE-2338) Some tests catch Exceptions in separate threads and just print a stack trace - the test does not fail

2012-03-20 Thread Uwe Schindler (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-2338?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13233498#comment-13233498
 ] 

Uwe Schindler commented on LUCENE-2338:
---

Were all tests already converted to not suppress exceptions in threads? This is 
why the issue is still open...

 Some tests catch Exceptions in separate threads and just print a stack trace 
 - the test does not fail
 -

 Key: LUCENE-2338
 URL: https://issues.apache.org/jira/browse/LUCENE-2338
 Project: Lucene - Java
  Issue Type: Test
  Components: general/build
Reporter: Uwe Schindler
 Fix For: 3.6, 4.0


 Some tests catch Exceptions in separate threads and just print a stack trace 
 - the test does not fail. The test should fail. Since LUCENE-2274, the 
 LuceneTestCase(J4) class installs an UncaughtExceptionHandler, so this type 
 of catching and solely printing a Stack trace is a bad idea. Problem is, that 
 the run() method of threads is not allowed to throw checked Exceptions.
 Two possibilities:
 - Catch checked Exceptions in the run() method and wrap into RuntimeException 
 or call Assert.fail() instead
 - Use Executors
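The first option above can be sketched as follows. This is illustrative only; `riskyIo()` is a made-up stand-in for test work that throws a checked exception:

```java
import java.io.IOException;

public final class ThreadFailureSketch {
    // Stand-in for test work that throws a checked exception.
    static void riskyIo() throws IOException { throw new IOException("boom"); }

    public static void main(String[] args) throws InterruptedException {
        final Throwable[] caught = new Throwable[1];
        Thread t = new Thread(() -> {
            try {
                riskyIo();
            } catch (IOException e) {
                // run() cannot throw checked exceptions, so wrap: the
                // UncaughtExceptionHandler now sees the failure instead
                // of a silently printed stack trace.
                throw new RuntimeException(e);
            }
        });
        t.setUncaughtExceptionHandler((th, e) -> caught[0] = e);
        t.start();
        t.join();
        System.out.println(caught[0].getClass().getSimpleName()); // RuntimeException
    }
}
```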




[jira] [Resolved] (LUCENE-2276) Add IndexReader.document(int, Document, FieldSelector)

2012-03-20 Thread Michael McCandless (Resolved) (JIRA)

 [ 
https://issues.apache.org/jira/browse/LUCENE-2276?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Michael McCandless resolved LUCENE-2276.


   Resolution: Duplicate
Fix Version/s: 4.0

The StoredFieldVisitor API (4.0) makes this possible...

 Add IndexReader.document(int, Document, FieldSelector)
 --

 Key: LUCENE-2276
 URL: https://issues.apache.org/jira/browse/LUCENE-2276
 Project: Lucene - Java
  Issue Type: Wish
  Components: core/search
Reporter: Tim Smith
 Fix For: 4.0

 Attachments: LUCENE-2276+2539.patch, LUCENE-2276.patch


 The Document object passed in would be populated with the fields identified 
 by the FieldSelector for the specified internal document id
 This method would allow reuse of Document objects when retrieving stored 
 fields from the index




[jira] [Resolved] (LUCENE-2120) Possible file handle leak in near real-time reader

2012-03-20 Thread Michael McCandless (Resolved) (JIRA)

 [ 
https://issues.apache.org/jira/browse/LUCENE-2120?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Michael McCandless resolved LUCENE-2120.


Resolution: Cannot Reproduce

 Possible file handle leak in near real-time reader
 --

 Key: LUCENE-2120
 URL: https://issues.apache.org/jira/browse/LUCENE-2120
 Project: Lucene - Java
  Issue Type: Bug
  Components: core/index
Affects Versions: 3.1
Reporter: Michael McCandless
Assignee: Michael McCandless
 Fix For: 4.0


 Spinoff of LUCENE-1526: Jake/John hit file descriptor exhaustion when testing 
 NRT.
 I've tried to repro this, stress testing NRT, saturating reopens, indexing, 
 searching, but haven't found any issue.
 Let's try to get to the bottom of it, here...




[jira] [Updated] (LUCENE-2082) Performance improvement for merging posting lists

2012-03-20 Thread Michael McCandless (Updated) (JIRA)

 [ 
https://issues.apache.org/jira/browse/LUCENE-2082?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Michael McCandless updated LUCENE-2082:
---

Labels: gsoc2012 lucene-gsoc-12  (was: )

 Performance improvement for merging posting lists
 -

 Key: LUCENE-2082
 URL: https://issues.apache.org/jira/browse/LUCENE-2082
 Project: Lucene - Java
  Issue Type: Improvement
  Components: core/index
Reporter: Michael Busch
Priority: Minor
  Labels: gsoc2012, lucene-gsoc-12
 Fix For: 4.0


 A while ago I had an idea about how to improve the merge performance
 for posting lists. This is currently by far the most expensive part of
 segment merging due to all the VInt de-/encoding. Not sure if an idea
 for improving this was already mentioned in the past?
 So the basic idea is it to perform a raw copy of as much posting data
 as possible. The reason why this is difficult is that we have to
 remove deleted documents. But often the fraction of deleted docs in a
 segment is rather low (<10%?), so it's likely that there are quite
 long consecutive sections without any deletions.
 To find these sections we could use the skip lists. Basically at any
 point during the merge we would find the skip entry before the next
 deleted doc. All entries to this point can be copied without
 de-/encoding of the VInts. Then for the section that has deleted docs
 we perform the normal way of merging to remove the deletes. Then we
 check again with the skip lists if we can raw copy the next section.
 To make this work there are a few different necessary changes:
 1) Currently the multilevel skiplist reader/writer can only deal with 
 fixed-size
 skips (16 on the lowest level). It would be an easy change to allow
 variable-size skips, but then the MultiLevelSkipListReader can't
 return numSkippedDocs anymore, which SegmentTermDocs needs - change 2)
 2) Store the last docID in which a term occurred in the term
 dictionary. This would also be beneficial for other use cases. By
 doing that the SegmentTermDocs#next(), #read() and #skipTo() know when
 the end of the postinglist is reached. Currently they have to track
 the df, which is why after a skip it's important to take the
 numSkippedDocs into account.
 3) Change the merging algorithm according to my description above. It's
 important to create a new skiplist entry at the beginning of every
 block that is copied in raw mode, because its next skip entry's values
 are deltas from the beginning of the block. Also the very first posting, and
 that one only, needs to be decoded/encoded to make sure that the
 payload length is explicitly written (i.e. must not depend on the
 previous length). Also such a skip entry has to be created at the
 beginning of each source segment's posting list. With change 2) we don't
 have to worry about the positions of the skip entries. And having a few
 extra skip entries in merged segments won't hurt much.
 If a segment has no deletions at all this will avoid any
 decoding/encoding of VInts (best case). I think it will also work
 great for segments with a rather low amount of deletions. We should
 probably then have a threshold: if the number of deletes exceeds this
 threshold we should fall back to old style merging.
 I haven't implemented any of this, so there might be complications I
 haven't thought about. Please let me know if you can think of reasons
 why this wouldn't work or if you think more changes are necessary.
 I will probably not have time to work on this soon, but I wanted to
 open this issue to not forget about it :). Anyone should feel free to
 take this!
 Btw: I think the flex-indexing branch would be a great place to try this
 out as a new codec. This would also be good to figure out what APIs
 are needed to make merging fully flexible as well.
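The block-planning part of the idea can be sketched as follows. This is purely illustrative (real skip lists and postings are far more involved); `RawCopyMergeSketch` and `planBlocks` are invented names:

```java
import java.util.BitSet;

public final class RawCopyMergeSketch {
    /**
     * Given skip-block start docIDs and the deletes for a segment, decide
     * per block whether its postings can be raw-copied (no deletes in the
     * block) or must fall back to decode/re-encode merging.
     */
    public static boolean[] planBlocks(int[] blockStarts, int maxDoc, BitSet deletes) {
        boolean[] rawCopy = new boolean[blockStarts.length];
        for (int b = 0; b < blockStarts.length; b++) {
            int start = blockStarts[b];
            int end = (b + 1 < blockStarts.length) ? blockStarts[b + 1] : maxDoc;
            int firstDel = deletes.nextSetBit(start);
            // Raw copy is safe only if no deleted doc falls in [start, end).
            rawCopy[b] = firstDel < 0 || firstDel >= end;
        }
        return rawCopy;
    }

    public static void main(String[] args) {
        BitSet deletes = new BitSet();
        deletes.set(20);                                   // one delete in block 1
        boolean[] plan = planBlocks(new int[]{0, 16, 32}, 48, deletes);
        System.out.println(plan[0] + " " + plan[1] + " " + plan[2]); // true false true
    }
}
```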




[jira] [Resolved] (LUCENE-1948) Deprecating InstantiatedIndexWriter

2012-03-20 Thread Michael McCandless (Resolved) (JIRA)

 [ 
https://issues.apache.org/jira/browse/LUCENE-1948?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Michael McCandless resolved LUCENE-1948.


Resolution: Fixed

 Deprecating InstantiatedIndexWriter
 ---

 Key: LUCENE-1948
 URL: https://issues.apache.org/jira/browse/LUCENE-1948
 Project: Lucene - Java
  Issue Type: Task
  Components: modules/other
Affects Versions: 2.9
Reporter: Karl Wettin
Assignee: Karl Wettin
 Fix For: 4.0

 Attachments: LUCENE-1948.patch


 http://markmail.org/message/j6ip266fpzuaibf7
 I suppose that should have been suggested before 2.9 rather than  
 after...
 There are at least three reasons to why I want to do this:
 The code is based on the behaviour of the Directory IndexWriter as of  
 2.3 and I have not been touching it since then. If there will be  
 changes in the future one will have to keep IIW in sync, something  
 that's easy to forget.
 There is no locking which will cause concurrent modification  
 exceptions when accessing the index via searcher/reader while  
 committing.
 It uses the old token stream API so it has to be upgraded in case it  
 should stay.
 The java- and package level docs have since it was committed been  
 suggesting that one should consider using II as if it was immutable  
 due to the locklessness. My suggestion is that we make it immutable  
 for real.
 Since II is meant for small corpora there is very little time lost by  
 using the constructor that builds the index from an IndexReader. I.e.  
 rather than using InstantiatedIndexWriter one would have to use a  
 Directory and an IndexWriter and then pass an IndexReader to a new  
 InstantiatedIndex.
 Any objections?




[jira] [Resolved] (LUCENE-1922) exposing the ability to get the number of unique term count per field

2012-03-20 Thread Michael McCandless (Resolved) (JIRA)

 [ 
https://issues.apache.org/jira/browse/LUCENE-1922?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Michael McCandless resolved LUCENE-1922.


   Resolution: Duplicate
Fix Version/s: 2.9

Fixed in LUCENE-1586.

 exposing the ability to get the number of unique term count per field
 -

 Key: LUCENE-1922
 URL: https://issues.apache.org/jira/browse/LUCENE-1922
 Project: Lucene - Java
  Issue Type: New Feature
  Components: core/index
Affects Versions: 4.0
Reporter: John Wang
 Fix For: 4.0, 2.9


 Add an api to get the number of unique term count given a field name, e.g.:
 IndexReader.getUniqueTermCount(String field)
 This issue has a dependency on LUCENE-1458




[jira] [Updated] (LUCENE-1761) low level Field metadata is never removed from index

2012-03-20 Thread Michael McCandless (Updated) (JIRA)

 [ 
https://issues.apache.org/jira/browse/LUCENE-1761?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Michael McCandless updated LUCENE-1761:
---

Labels: gsoc2012 lucene-gsoc-12  (was: )

 low level Field metadata is never removed from index
 

 Key: LUCENE-1761
 URL: https://issues.apache.org/jira/browse/LUCENE-1761
 Project: Lucene - Java
  Issue Type: Bug
  Components: core/index
Affects Versions: 2.2, 2.3, 2.3.1, 2.3.2, 2.4, 2.4.1
Reporter: Hoss Man
Priority: Minor
  Labels: gsoc2012, lucene-gsoc-12
 Attachments: LUCENE-1761.patch


 with heterogeneous docs, or an index whose fields evolve over time, field 
 names that are no longer used (ie: all docs that ever referenced them have 
 been deleted) still show up when you use IndexReader.getFieldNames.
 It seems logical that segment merging should only preserve metadata about 
 fields that actually exist in the new segment, but even after deleting all 
 documents from an index and optimizing, the old field names are still present.




[jira] [Resolved] (LUCENE-1750) Create a MergePolicy that limits the maximum size of its segments

2012-03-20 Thread Michael McCandless (Resolved) (JIRA)

 [ 
https://issues.apache.org/jira/browse/LUCENE-1750?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Michael McCandless resolved LUCENE-1750.


   Resolution: Duplicate
Fix Version/s: 3.2

TieredMergePolicy does this...

 Create a MergePolicy that limits the maximum size of its segments
 --

 Key: LUCENE-1750
 URL: https://issues.apache.org/jira/browse/LUCENE-1750
 Project: Lucene - Java
  Issue Type: Improvement
  Components: core/index
Affects Versions: 2.4.1
Reporter: Jason Rutherglen
Priority: Minor
 Fix For: 4.0, 3.2

 Attachments: LUCENE-1750.patch

   Original Estimate: 48h
  Remaining Estimate: 48h

 Basically I'm trying to create largish 2-4GB shards using
 LogByteSizeMergePolicy, however I've found in the attached unit
 test segments that exceed maxMergeMB.
 The goal is for segments to be merged up to 2GB, then all
 merging to that segment stops, and then another 2GB segment is
 created. This helps when replicating in Solr where if a single
 optimized 60GB segment is created, the machine stops working due
 to IO and CPU starvation. 




[jira] [Updated] (LUCENE-1252) Avoid using positions when not all required terms are present

2012-03-20 Thread Michael McCandless (Updated) (JIRA)

 [ 
https://issues.apache.org/jira/browse/LUCENE-1252?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Michael McCandless updated LUCENE-1252:
---

Labels: gsoc2012 lucene-gsoc-12  (was: )

 Avoid using positions when not all required terms are present
 -

 Key: LUCENE-1252
 URL: https://issues.apache.org/jira/browse/LUCENE-1252
 Project: Lucene - Java
  Issue Type: Wish
  Components: core/search
Reporter: Paul Elschot
Priority: Minor
  Labels: gsoc2012, lucene-gsoc-12

 In the Scorers of queries with (lots of) Phrases and/or (nested) Spans, 
 currently next() and skipTo() will use position information even when other 
 parts of the query cannot match because some required terms are not present.
 This could be avoided by adding some methods to Scorer that relax the 
 postcondition of next() and skipTo() to something like all required terms 
 are present, but no position info was checked yet, and implementing these 
 methods for Scorers that do conjunctions: BooleanScorer, PhraseScorer, and 
 SpanScorer/NearSpans.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira



-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Updated] (LUCENE-1000) queryparsersyntax.html escaping section needs beefed up

2012-03-20 Thread Michael McCandless (Updated) (JIRA)

 [ 
https://issues.apache.org/jira/browse/LUCENE-1000?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Michael McCandless updated LUCENE-1000:
---

Labels: newdev  (was: )

 queryparsersyntax.html escaping section needs beefed up
 ---

 Key: LUCENE-1000
 URL: https://issues.apache.org/jira/browse/LUCENE-1000
 Project: Lucene - Java
  Issue Type: Improvement
  Components: general/website
Reporter: Hoss Man
  Labels: newdev
 Fix For: 4.0


 the query syntax documentation is currently lacking several key pieces of 
 info:
  1) that unicode style escapes are valid
  2) that any character can be escaped with a backslash, not just special 
 chars.
 ..we should probably beef up the Escaping Special Characters section

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira



-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Created] (SOLR-3260) Improve exception handling / logging for ScriptTransformer.init()

2012-03-20 Thread James Dyer (Created) (JIRA)
Improve exception handling / logging for ScriptTransformer.init()


 Key: SOLR-3260
 URL: https://issues.apache.org/jira/browse/SOLR-3260
 Project: Solr
  Issue Type: Bug
  Components: contrib - DataImportHandler
Affects Versions: 3.5, 4.0
Reporter: James Dyer
Assignee: James Dyer
Priority: Trivial
 Fix For: 3.6, 4.0


This came up on the user-list.  ScriptTransformer logs the same "need a >=1.6 
jre" message for several problems, making debugging difficult for users.



