Re: ivy.

2012-03-31 Thread Dawid Weiss
Thanks guys. I'll try it out later and let you know how it works.

Dawid

On Fri, Mar 30, 2012 at 11:50 PM, Uwe Schindler u...@thetaphi.de wrote:
 It can also go into ivysettings.xml (see example: 
 http://ant.apache.org/ivy/history/latest-milestone/settings.html), and you can 
 pass via properties where this file is, if it is not in the classpath.

 Uwe

 -
 Uwe Schindler
 H.-H.-Meier-Allee 63, D-28213 Bremen
 http://www.thetaphi.de
 eMail: u...@thetaphi.de


 -Original Message-
 From: Greg Bowyer [mailto:gbow...@fastmail.co.uk]
 Sent: Friday, March 30, 2012 11:43 PM
 To: dev@lucene.apache.org
 Subject: Re: ivy.

 I am pretty sure this needs to go into the ivy.xml in lucene proper

 On 30/03/12 13:46, Dawid Weiss wrote:
  Ah, that's what I was looking for. But where do I put it? Can we set
  it globally in Lucene so that others (who have .m2 repos) can make
  immediate use of their preloaded artefacts?
 
  D.
 
  On Fri, Mar 30, 2012 at 10:44 PM, Greg Bowyergbow...@fastmail.co.uk
 wrote:
  You can get ivy to treat the local maven repo as a resolver. I
  think the required config is along the lines of:

  <resolvers>
    <filesystem name="local-maven-2" m2compatible="true" force="false"
                local="true">
      <artifact
          pattern="${gerald.repo.dir}/[organisation]/[module]/[revision]/[module]-[revision].[ext]"/>
      <ivy
          pattern="${gerald.repo.dir}/[organisation]/[module]/[revision]/[module]-[revision].pom"/>
    </filesystem>
  </resolvers>
  ...
  </settings>

  <chain name="whatever" dual="true"
         checkmodified="true" changingPattern=".*SNAPSHOT">
    <resolver ref="local-maven-2"/>
    <resolver ref="apache-snapshot"/>
    <resolver ref="maven2"/>
    ...
  </chain>
 
  -- Greg
 
 
  On 30/03/12 13:27, Dawid Weiss wrote:
  But honestly, i have no idea how ivy works. its just like ant to
  me. i just hack and hack and hack until it works.
  You're a live randomized solver!
 
  Dawid
 
  
 





[jira] [Updated] (LUCENE-3940) When Japanese (Kuromoji) tokenizer removes a punctuation token it should leave a hole

2012-03-31 Thread Michael McCandless (Updated) (JIRA)

 [ 
https://issues.apache.org/jira/browse/LUCENE-3940?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Michael McCandless updated LUCENE-3940:
---

Attachment: LUCENE-3940.patch

New patch, fixing a bug in the last one, and adding a few more test cases.  I 
also made the "print curious string on exception" from BTSTC more 
ASCII-friendly.

I think it's ready.

 When Japanese (Kuromoji) tokenizer removes a punctuation token it should 
 leave a hole
 -

 Key: LUCENE-3940
 URL: https://issues.apache.org/jira/browse/LUCENE-3940
 Project: Lucene - Java
  Issue Type: Bug
Reporter: Michael McCandless
Assignee: Michael McCandless
Priority: Minor
 Fix For: 4.0

 Attachments: LUCENE-3940.patch, LUCENE-3940.patch


 I modified BaseTokenStreamTestCase to assert that the start/end
 offsets match for graph (posLen > 1) tokens, and this caught a bug in
 Kuromoji when the decompounding of a compound token has a punctuation
 token that's dropped.
 In this case we should leave hole(s) so that the graph is intact, ie,
 the graph should look the same as if the punctuation tokens were not
 initially removed, but then a StopFilter had removed them.
 This also affects tokens that have no compound over them, ie we fail
 to leave a hole today when we remove the punctuation tokens.
 I'm not sure this is serious enough to warrant fixing in 3.6 at the
 last minute...
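
 For illustration, here is roughly how the hole shows up in a test (a hedged
 sketch with a made-up input, assuming the posIncrements overload of
 BaseTokenStreamTestCase.assertAnalyzesTo; this is not from the patch):

   // If the analyzer drops the punctuation token between "foo" and "bar",
   // "bar" should keep a position increment of 2 -- the hole -- exactly as
   // if StopFilter had removed the token.
   assertAnalyzesTo(a, "foo . bar",
       new String[] { "foo", "bar" },  // surviving tokens
       new int[]    { 1, 2 });         // position increments; 2 marks the hole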




[jira] [Commented] (LUCENE-3932) Improve load time of .tii files

2012-03-31 Thread Michael McCandless (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-3932?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13243102#comment-13243102
 ] 

Michael McCandless commented on LUCENE-3932:


bq. Is the space savings of delta encoding worth the processing time? You could 
write the .tii file to disk such that on open you could read it straight into a 
byte[].

This is actually what we do in 4.0's default codec (the index is an FST).

It is tempting to do that in 3.x (if we were to do another 3.x release after 
3.6) ... we'd need to alter other things as well, eg the term bytes are also 
delta-coded in the file but not in RAM.

I'm curious how much larger it'd be if we stopped delta coding... for your 
case, how large is the byte[] in RAM (just call dataPagedBytes.getPointer(), 
just before we freeze it, and print that result) vs the tii on disk...?
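
For example (a hedged sketch; dataPagedBytes is the field Mike refers to in
TermInfosReaderIndex):

  // just before the index data is frozen:
  System.out.println("tii in RAM: " + dataPagedBytes.getPointer() + " bytes");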

 Improve load time of .tii files
 ---

 Key: LUCENE-3932
 URL: https://issues.apache.org/jira/browse/LUCENE-3932
 Project: Lucene - Java
  Issue Type: Improvement
Affects Versions: 3.5
 Environment: Linux
Reporter: Sean Bridges
 Attachments: LUCENE-3932.trunk.patch, perf.csv


 We have a large 50 gig index which is optimized as one segment, with a 66 MEG 
 .tii file.  This index has no norms, and no field cache.
 It takes about 5 seconds to load this index, profiling reveals that 60% of 
 the time is spent in GrowableWriter.set(index, value), and most of the time in 
 set(...) is spent resizing the PackedInts.Mutable current.
 In the constructor for TermInfosReaderIndex, you initialize the writer with 
 the line,
 {quote}GrowableWriter indexToTerms = new GrowableWriter(4, indexSize, 
 false);{quote}
 For our index, using 4 as the bit estimate results in 27 resizes.
 The last value in indexToTerms is going to be ~ tiiFileLength, and if instead 
 you use,
 {quote}int bitEstimate = (int) Math.ceil(Math.log10(tiiFileLength) / 
 Math.log10(2));
 GrowableWriter indexToTerms = new GrowableWriter(bitEstimate, indexSize, 
 false);{quote}
 Load time improves to ~ 2 seconds.
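
 For concreteness, a quick check of the arithmetic behind the suggested
 estimate (a hedged sketch using the ~66 MB figure above):

   long tiiFileLength = 66L * 1024 * 1024;  // ~66 MB .tii from this report
   int bitEstimate = (int) Math.ceil(Math.log10(tiiFileLength) / Math.log10(2));
   System.out.println(bitEstimate);  // 27 -- wide enough up front, vs. the default of 4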




Re: conditional High Freq Terms in Lucene index

2012-03-31 Thread Michael McCandless
One big problem is that your collector (which gathers all type "A" doc IDs) is
not mapping the per-segment docID to the top-level global docID space.

You need to save the docBase that was passed to setNextReader, and
then add it back in on each collect call.
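
A minimal sketch of that fix (hedged; it assumes the same 3.x Collector API and
the doc_id list used in the code below):

    private int docBase;

    @Override
    public void setNextReader(IndexReader reader, int docBase) throws IOException {
        this.docBase = docBase;  // remember this segment's offset
    }

    @Override
    public void collect(int doc) throws IOException {
        // doc is segment-relative; add docBase to get the global docID
        doc_id.add(Integer.toString(docBase + doc));
    }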

Mike McCandless

http://blog.mikemccandless.com

On Fri, Mar 30, 2012 at 7:23 PM, starz10de farag_ah...@yahoo.com wrote:
 Thanks for your hint.

 I tried a simple solution, as follows:
 First I determined the documents of type “A” and stored them in an array by
 searching the document-type field in the index:
 public static void doStreamingSearch(final Searcher searcher, Query query)
         throws IOException {

     Collector streamingHitCollector = new Collector() {
         // simply print docId and score of every matching document
         @Override
         public void collect(int doc) throws IOException {
             c++;
             // System.out.println("doc=" + doc);
             doc_id.add(doc + "");
             // scorer.score());
         }

         @Override
         public boolean acceptsDocsOutOfOrder() {
             return true;
         }

         @Override
         public void setNextReader(IndexReader arg0, int arg1)
                 throws IOException {
             // TODO Auto-generated method stub
         }

         @Override
         public void setScorer(Scorer arg0) throws IOException {
             // TODO Auto-generated method stub
         }
     };

     searcher.search(query, streamingHitCollector);
 }
 Then I modified HighFreqTerms in Lucene as follows:

 while (terms.next()) {
     dok.seek(terms);
     while (dok.next()) {
         for (int i = 0; i < doc_id.size(); ++i) {
             if (doc_id.get(i).equals(dok.doc() + "")) {
                 if (terms.term().field().equals(field)) {
                     tiq.insertWithOverflow(new TermInfo(terms.term(), dok.freq()));
                 }
             }
         }
 I could verify that I correctly get only documents of type „A“. However, the
 result is not correct, because I see a few terms twice in the ordered
 high-frequency list.

 Any hints on where the problem is?

 Michael McCandless-2 wrote

 You'd have to modify HighFreqTerm's sources...

 Roughly...

 First, make a bitset recording which docs are type A (eg, use
 FieldCache), second, change HighFreqTerms so that for each term, it
 walks the postings, counting how many type A docs there were, then...
 just use the rest of HighFreqTerms (priority queue, etc.).

 Mike McCandless

 http://blog.mikemccandless.com

 On Thu, Mar 29, 2012 at 11:33 AM, starz10de farag_ahmed@ wrote:
 HI,

 I am using the HighFreqTerms class to compute the most frequent terms in the
 Lucene index and it works well. However, I am interested in computing the
 most frequent terms under a condition: I would like to compute the most
 frequent terms not for all documents in the index but only for documents
 of type “A”. Besides the “contents” field, I also have a “DocType”
 (document type) field in the index as an extra field.
 So I should compute the most frequent terms only if DocType=”A”.

 Any idea how to do this?

 Thanks

 --
 View this message in context:
 http://lucene.472066.n3.nabble.com/conditional-High-Freq-Terms-in-Lucene-index-tp3868066p3868066.html
 Sent from the Lucene - Java Developer mailing list archive at Nabble.com.


[jira] [Commented] (LUCENE-3939) ClassCastException thrown in the map(String,int,TermVectorOffsetInfo[],int[]) method in org.apache.lucene.index.SortedTermVectorMapper

2012-03-31 Thread Michael McCandless (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-3939?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13243108#comment-13243108
 ] 

Michael McCandless commented on LUCENE-3939:


I'm confused about how something that's not a TermVectorEntry can get into the 
termToTVE map... can you post a small test case showing this problem?

 ClassCastException thrown in the map(String,int,TermVectorOffsetInfo[],int[]) 
 method in org.apache.lucene.index.SortedTermVectorMapper
 --

 Key: LUCENE-3939
 URL: https://issues.apache.org/jira/browse/LUCENE-3939
 Project: Lucene - Java
  Issue Type: Bug
  Components: core/index
Affects Versions: 3.0.2, 3.1, 3.4, 3.5
Reporter: SHIN HWEI TAN
   Original Estimate: 0.05h
  Remaining Estimate: 0.05h

 The method map in the SortedTermVectorMapper class does not check the 
 parameter "term" for valid values. It throws ClassCastException when 
 called with an invalid string for the parameter "term" (i.e., var3.map("*", 
 (-1), null, null)). The exception thrown is due to an explicit cast (i.e., 
 casting the return value of termToTVE.get(term) to type TermVectorEntry). 
 Suggested fix: replace the beginning of the method body for the class 
 SortedTermVectorMapper by changing it like this:
 public void map(String term, int frequency, TermVectorOffsetInfo[] offsets, 
 int[] positions) {
   if(termToTVE.get(term) instanceof TermVectorEntry){
   TermVectorEntry entry = (TermVectorEntry) termToTVE.get(term);
   ...
   }
 }




[jira] [Updated] (LUCENE-3738) Be consistent about negative vInt/vLong

2012-03-31 Thread Uwe Schindler (Updated) (JIRA)

 [ 
https://issues.apache.org/jira/browse/LUCENE-3738?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Uwe Schindler updated LUCENE-3738:
--

Attachment: LUCENE-3738-improvement.patch

After looking at the code a while longer, I have a further minor improvement. The most 
common case (int < 128) now exits directly after reading the byte, without any << or 
variable-assignment operations.

Mike: Can you look at it and maybe do a quick test? I would like to commit this to 
both branches this evening.
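
For reference, the shape of that fast path (a hedged sketch, not the attached
patch): a byte with the high bit clear is a complete value and returns
immediately, with no shifts and no temporary assignments.

  public int readVInt() throws IOException {
    byte b = readByte();
    if (b >= 0) return b;  // common case: int < 128 is a single byte -- exit directly
    int i = b & 0x7F;
    for (int shift = 7; ; shift += 7) {
      b = readByte();
      i |= (b & 0x7F) << shift;
      if (b >= 0) return i;  // a byte with the high bit clear terminates the vInt
    }
  }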

 Be consistent about negative vInt/vLong
 ---

 Key: LUCENE-3738
 URL: https://issues.apache.org/jira/browse/LUCENE-3738
 Project: Lucene - Java
  Issue Type: Bug
Reporter: Michael McCandless
Assignee: Uwe Schindler
 Fix For: 3.6, 4.0

 Attachments: ByteArrayDataInput.java.patch, 
 LUCENE-3738-improvement.patch, LUCENE-3738.patch, LUCENE-3738.patch, 
 LUCENE-3738.patch, LUCENE-3738.patch, LUCENE-3738.patch


 Today, write/readVInt allows a negative int, in that it will encode and 
 decode correctly, just horribly inefficiently (5 bytes).
 However, read/writeVLong fails (trips an assert).
 I'd prefer that both vInt/vLong trip an assert if you ever try to write a 
 negative number... it's badly trappy today.  But, unfortunately, we sometimes 
 rely on this... had we had this assert in 'since the beginning' we could have 
 avoided that.
 So, if we can't add that assert in today, I think we should at least fix 
 readVLong to handle negative longs... but then you quietly spend 9 bytes 
 (even more trappy!).




[jira] [Updated] (LUCENE-3738) Be consistent about negative vInt/vLong

2012-03-31 Thread Uwe Schindler (Updated) (JIRA)

 [ 
https://issues.apache.org/jira/browse/LUCENE-3738?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Uwe Schindler updated LUCENE-3738:
--

Attachment: (was: LUCENE-3738.patch)

 Be consistent about negative vInt/vLong
 ---

 Key: LUCENE-3738
 URL: https://issues.apache.org/jira/browse/LUCENE-3738
 Project: Lucene - Java
  Issue Type: Bug
Reporter: Michael McCandless
Assignee: Uwe Schindler
 Fix For: 3.6, 4.0

 Attachments: ByteArrayDataInput.java.patch, 
 LUCENE-3738-improvement.patch, LUCENE-3738.patch, LUCENE-3738.patch, 
 LUCENE-3738.patch, LUCENE-3738.patch, LUCENE-3738.patch


 Today, write/readVInt allows a negative int, in that it will encode and 
 decode correctly, just horribly inefficiently (5 bytes).
 However, read/writeVLong fails (trips an assert).
 I'd prefer that both vInt/vLong trip an assert if you ever try to write a 
 negative number... it's badly trappy today.  But, unfortunately, we sometimes 
 rely on this... had we had this assert in 'since the beginning' we could have 
 avoided that.
 So, if we can't add that assert in today, I think we should at least fix 
 readVLong to handle negative longs... but then you quietly spend 9 bytes 
 (even more trappy!).




[jira] [Reopened] (LUCENE-3738) Be consistent about negative vInt/vLong

2012-03-31 Thread Uwe Schindler (Reopened) (JIRA)

 [ 
https://issues.apache.org/jira/browse/LUCENE-3738?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Uwe Schindler reopened LUCENE-3738:
---


 Be consistent about negative vInt/vLong
 ---

 Key: LUCENE-3738
 URL: https://issues.apache.org/jira/browse/LUCENE-3738
 Project: Lucene - Java
  Issue Type: Bug
Reporter: Michael McCandless
Assignee: Uwe Schindler
 Fix For: 3.6, 4.0

 Attachments: ByteArrayDataInput.java.patch, 
 LUCENE-3738-improvement.patch, LUCENE-3738.patch, LUCENE-3738.patch, 
 LUCENE-3738.patch, LUCENE-3738.patch, LUCENE-3738.patch


 Today, write/readVInt allows a negative int, in that it will encode and 
 decode correctly, just horribly inefficiently (5 bytes).
 However, read/writeVLong fails (trips an assert).
 I'd prefer that both vInt/vLong trip an assert if you ever try to write a 
 negative number... it's badly trappy today.  But, unfortunately, we sometimes 
 rely on this... had we had this assert in 'since the beginning' we could have 
 avoided that.
 So, if we can't add that assert in today, I think we should at least fix 
 readVLong to handle negative longs... but then you quietly spend 9 bytes 
 (even more trappy!).




[jira] [Commented] (LUCENE-3738) Be consistent about negative vInt/vLong

2012-03-31 Thread Michael McCandless (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-3738?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13243125#comment-13243125
 ] 

Michael McCandless commented on LUCENE-3738:


Thanks Uwe, I'll test!

 Be consistent about negative vInt/vLong
 ---

 Key: LUCENE-3738
 URL: https://issues.apache.org/jira/browse/LUCENE-3738
 Project: Lucene - Java
  Issue Type: Bug
Reporter: Michael McCandless
Assignee: Uwe Schindler
Priority: Blocker
 Fix For: 3.6, 4.0

 Attachments: ByteArrayDataInput.java.patch, 
 LUCENE-3738-improvement.patch, LUCENE-3738.patch, LUCENE-3738.patch, 
 LUCENE-3738.patch, LUCENE-3738.patch, LUCENE-3738.patch


 Today, write/readVInt allows a negative int, in that it will encode and 
 decode correctly, just horribly inefficiently (5 bytes).
 However, read/writeVLong fails (trips an assert).
 I'd prefer that both vInt/vLong trip an assert if you ever try to write a 
 negative number... it's badly trappy today.  But, unfortunately, we sometimes 
 rely on this... had we had this assert in 'since the beginning' we could have 
 avoided that.
 So, if we can't add that assert in today, I think we should at least fix 
 readVLong to handle negative longs... but then you quietly spend 9 bytes 
 (even more trappy!).




[jira] [Commented] (LUCENE-3940) When Japanese (Kuromoji) tokenizer removes a punctuation token it should leave a hole

2012-03-31 Thread Robert Muir (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-3940?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13243126#comment-13243126
 ] 

Robert Muir commented on LUCENE-3940:
-

I don't think we should do this. StandardTokenizer doesn't leave holes when it 
drops punctuation; I think holes should only be left for real 'words', for the 
most part.

 When Japanese (Kuromoji) tokenizer removes a punctuation token it should 
 leave a hole
 -

 Key: LUCENE-3940
 URL: https://issues.apache.org/jira/browse/LUCENE-3940
 Project: Lucene - Java
  Issue Type: Bug
Reporter: Michael McCandless
Assignee: Michael McCandless
Priority: Minor
 Fix For: 4.0

 Attachments: LUCENE-3940.patch, LUCENE-3940.patch


 I modified BaseTokenStreamTestCase to assert that the start/end
 offsets match for graph (posLen > 1) tokens, and this caught a bug in
 Kuromoji when the decompounding of a compound token has a punctuation
 token that's dropped.
 In this case we should leave hole(s) so that the graph is intact, ie,
 the graph should look the same as if the punctuation tokens were not
 initially removed, but then a StopFilter had removed them.
 This also affects tokens that have no compound over them, ie we fail
 to leave a hole today when we remove the punctuation tokens.
 I'm not sure this is serious enough to warrant fixing in 3.6 at the
 last minute...




[JENKINS] Solr-trunk - Build # 1811 - Still Failing

2012-03-31 Thread Apache Jenkins Server
Build: https://builds.apache.org/job/Solr-trunk/1811/

1 tests failed.
FAILED:  org.apache.solr.TestDistributedSearch.testDistribSearch

Error Message:
Uncaught exception by thread: Thread[Thread-733,5,]

Stack Trace:
org.apache.lucene.util.UncaughtExceptionsRule$UncaughtExceptionsInBackgroundThread:
 Uncaught exception by thread: Thread[Thread-733,5,]
at 
org.apache.lucene.util.UncaughtExceptionsRule$1.evaluate(UncaughtExceptionsRule.java:84)
at 
org.apache.lucene.util.LuceneTestCase$SaveThreadAndTestNameRule$1.evaluate(LuceneTestCase.java:642)
at org.junit.rules.RunRules.evaluate(RunRules.java:18)
at org.junit.runners.ParentRunner.runLeaf(ParentRunner.java:263)
at 
org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:68)
at 
org.apache.lucene.util.LuceneTestCaseRunner.runChild(LuceneTestCaseRunner.java:164)
at 
org.apache.lucene.util.LuceneTestCaseRunner.runChild(LuceneTestCaseRunner.java:57)
at org.junit.runners.ParentRunner$3.run(ParentRunner.java:231)
at org.junit.runners.ParentRunner$1.schedule(ParentRunner.java:60)
at org.junit.runners.ParentRunner.runChildren(ParentRunner.java:229)
at org.junit.runners.ParentRunner.access$000(ParentRunner.java:50)
at org.junit.runners.ParentRunner$2.evaluate(ParentRunner.java:222)
at 
org.junit.internal.runners.statements.RunBefores.evaluate(RunBefores.java:28)
at 
org.junit.internal.runners.statements.RunAfters.evaluate(RunAfters.java:30)
at 
org.apache.lucene.util.SystemPropertiesRestoreRule$1.evaluate(SystemPropertiesRestoreRule.java:63)
at 
org.apache.lucene.util.UncaughtExceptionsRule$1.evaluate(UncaughtExceptionsRule.java:75)
at 
org.apache.lucene.util.StoreClassNameRule$1.evaluate(StoreClassNameRule.java:38)
at 
org.apache.lucene.util.SystemPropertiesInvariantRule$1.evaluate(SystemPropertiesInvariantRule.java:69)
at org.junit.rules.RunRules.evaluate(RunRules.java:18)
at org.junit.runners.ParentRunner.run(ParentRunner.java:300)
at junit.framework.JUnit4TestAdapter.run(JUnit4TestAdapter.java:39)
at 
org.apache.tools.ant.taskdefs.optional.junit.JUnitTestRunner.run(JUnitTestRunner.java:420)
at 
org.apache.tools.ant.taskdefs.optional.junit.JUnitTestRunner.launch(JUnitTestRunner.java:911)
at 
org.apache.tools.ant.taskdefs.optional.junit.JUnitTestRunner.main(JUnitTestRunner.java:743)
Caused by: java.lang.RuntimeException: 
org.apache.solr.client.solrj.SolrServerException: IOException occured when 
talking to server at: http://localhost:32132/solr
at 
org.apache.solr.TestDistributedSearch$1.run(TestDistributedSearch.java:396)
Caused by: org.apache.solr.client.solrj.SolrServerException: IOException 
occured when talking to server at: http://localhost:32132/solr
at 
org.apache.solr.client.solrj.impl.HttpSolrServer.request(HttpSolrServer.java:361)
at 
org.apache.solr.client.solrj.impl.HttpSolrServer.request(HttpSolrServer.java:209)
at 
org.apache.solr.client.solrj.request.QueryRequest.process(QueryRequest.java:90)
at org.apache.solr.client.solrj.SolrServer.query(SolrServer.java:312)
at 
org.apache.solr.TestDistributedSearch$1.run(TestDistributedSearch.java:391)
Caused by: org.apache.http.conn.ConnectTimeoutException: Connect to 
localhost:32132 timed out
at 
org.apache.http.conn.scheme.PlainSocketFactory.connectSocket(PlainSocketFactory.java:125)
at 
org.apache.http.impl.conn.DefaultClientConnectionOperator.openConnection(DefaultClientConnectionOperator.java:148)
at 
org.apache.http.impl.conn.AbstractPoolEntry.open(AbstractPoolEntry.java:150)
at 
org.apache.http.impl.conn.AbstractPooledConnAdapter.open(AbstractPooledConnAdapter.java:121)
at 
org.apache.http.impl.client.DefaultRequestDirector.tryConnect(DefaultRequestDirector.java:575)
at 
org.apache.http.impl.client.DefaultRequestDirector.execute(DefaultRequestDirector.java:425)
at 
org.apache.http.impl.client.AbstractHttpClient.execute(AbstractHttpClient.java:820)
at 
org.apache.http.impl.client.AbstractHttpClient.execute(AbstractHttpClient.java:754)
at 
org.apache.http.impl.client.AbstractHttpClient.execute(AbstractHttpClient.java:732)
at 
org.apache.solr.client.solrj.impl.HttpSolrServer.request(HttpSolrServer.java:304)
... 4 more




Build Log (for compile errors):
[...truncated 10479 lines...]




Re: conditional High Freq Terms in Lucene index

2012-03-31 Thread starz10de
I revised it including your comment:



private Scorer scorer;
private int docBase;

// simply print docId and score of every matching 
document
@Override
public void collect(int doc) throws IOException {

String k = doc + "";
String k1 = docBase + "";


  doc_ids.add(k+k1);

 

}

@Override
public boolean acceptsDocsOutOfOrder() {
  return true;
}

@Override
public void setNextReader(IndexReader reader, int 
docBase)
throws IOException {
  this.docBase = docBase;
}

@Override
public void setScorer(Scorer scorer) throws IOException 
{
  this.scorer = scorer;
}

  
I could see in HighFreqTerms that the condition for document
type "A" is applied. However, the high-frequency terms are not computed
correctly: I still see duplicate terms in the list, besides wrong occurrence counts.

Here is how I do it:

TermInfoQueue tiq = new TermInfoQueue(numTerms);
TermEnum terms = reader.terms();
TermDocs dok = null;
int k = 0;
dok = reader.termDocs();
if (field != null) {
    while (terms.next()) {
        k = 0;
        dok.seek(terms);
        while (dok.next()) {
            // System.out.println(dok.doc());
            for (int i = 0; i < doc_ids.size(); ++i) {
                if (categorization_based_on_year.doc_ids.get(i).equals(dok.doc() + "")) {
                    // here I can see that only doc ids of type "A" are printed
                    System.out.println(dok.doc());
                    if (terms.term().field().equals(field)) {
                        tiq.insertWithOverflow(new TermInfo(terms.term(), dok.freq()));
                    }
                    i = 1;
                }
            }
        }
.
.
.

Any hint?

--
View this message in context: 
http://lucene.472066.n3.nabble.com/conditional-High-Freq-Terms-in-Lucene-index-tp3868066p3873362.html
Sent from the Lucene - Java Developer mailing list archive at Nabble.com.




[jira] [Commented] (LUCENE-3738) Be consistent about negative vInt/vLong

2012-03-31 Thread Michael McCandless (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-3738?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13243138#comment-13243138
 ] 

Michael McCandless commented on LUCENE-3738:


Alas, the results are now all over the place!  And I went back to the prior 
patch and tried to reproduce the above results... and the results are still all 
over the place.  I think we are chasing Java ghosts at this point...

 Be consistent about negative vInt/vLong
 ---

 Key: LUCENE-3738
 URL: https://issues.apache.org/jira/browse/LUCENE-3738
 Project: Lucene - Java
  Issue Type: Bug
Reporter: Michael McCandless
Assignee: Uwe Schindler
Priority: Blocker
 Fix For: 3.6, 4.0

 Attachments: ByteArrayDataInput.java.patch, 
 LUCENE-3738-improvement.patch, LUCENE-3738.patch, LUCENE-3738.patch, 
 LUCENE-3738.patch, LUCENE-3738.patch, LUCENE-3738.patch


 Today, write/readVInt allows a negative int, in that it will encode and 
 decode correctly, just horribly inefficiently (5 bytes).
 However, read/writeVLong fails (trips an assert).
 I'd prefer that both vInt/vLong trip an assert if you ever try to write a 
 negative number... it's badly trappy today.  But, unfortunately, we sometimes 
 rely on this... had we had this assert in 'since the beginning' we could have 
 avoided that.
 So, if we can't add that assert in today, I think we should at least fix 
 readVLong to handle negative longs... but then you quietly spend 9 bytes 
 (even more trappy!).




Re: conditional High Freq Terms in Lucene index

2012-03-31 Thread Michael McCandless
Hmm, you are adding two strings.  You should first add the two ints
(docBase + doc), then convert that to a string.
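
In code (a hedged sketch of the fix):

    @Override
    public void collect(int doc) throws IOException {
        // doc + "" followed by docBase + "" concatenates e.g. "5" + "10" = "510",
        // not "15"; add the ints first, then stringify:
        doc_ids.add(Integer.toString(docBase + doc));
    }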

Mike McCandless

http://blog.mikemccandless.com

On Sat, Mar 31, 2012 at 8:56 AM, starz10de farag_ah...@yahoo.com wrote:
 I revised it including your comment:



                        private Scorer scorer;
                        private int docBase;

                        // simply print docId and score of every matching 
 document
                        @Override
                        public void collect(int doc) throws IOException {

  String k = doc + "";
  String k1 = docBase + "";


                                  doc_ids.add(k+k1);



                        }

                        @Override
                        public boolean acceptsDocsOutOfOrder() {
                          return true;
                        }

                        @Override
                        public void setNextReader(IndexReader reader, int 
 docBase)
                            throws IOException {
                          this.docBase = docBase;
                        }

                        @Override
                        public void setScorer(Scorer scorer) throws 
 IOException {
                          this.scorer = scorer;
                        }


         I could see in HighFreqTerms that the condition for document
  type "A" is applied. However, the high-frequency terms are not computed
  correctly: I still see duplicate terms in the list, besides wrong occurrence counts.

 here is how I do it:

 TermInfoQueue tiq = new TermInfoQueue(numTerms);
    TermEnum terms = reader.terms();
    TermDocs dok =null;
    int k=0;
    dok = reader.termDocs();
    if (field != null) {
      while (terms.next()) {


          k=0;

      dok.seek(terms);

        while (dok.next()) {



                //System.out.println(dok.doc());
                  for(int i=0;i < doc_ids.size();++i)
                         {


 if(categorization_based_on_year.doc_ids.get(i).equals(dok.doc()+""))
                    {

 // here I can see that only doc ids for the type A is printed

 System.out.println(dok.doc());

                         if (terms.term().field().equals(field)) {
                       tiq.insertWithOverflow(new TermInfo(terms.term(),
 dok.freq()));
                                }

               i=1;
                    }

                 }
 .
 .
 .

 any hint ?

 --
 View this message in context: 
 http://lucene.472066.n3.nabble.com/conditional-High-Freq-Terms-in-Lucene-index-tp3868066p3873362.html
 Sent from the Lucene - Java Developer mailing list archive at Nabble.com.




[jira] [Commented] (LUCENE-3738) Be consistent about negative vInt/vLong

2012-03-31 Thread Uwe Schindler (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-3738?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13243143#comment-13243143
 ] 

Uwe Schindler commented on LUCENE-3738:
---

What does your comment mean? Good or bad?

 Be consistent about negative vInt/vLong
 ---

 Key: LUCENE-3738
 URL: https://issues.apache.org/jira/browse/LUCENE-3738
 Project: Lucene - Java
  Issue Type: Bug
Reporter: Michael McCandless
Assignee: Uwe Schindler
Priority: Blocker
 Fix For: 3.6, 4.0

 Attachments: ByteArrayDataInput.java.patch, 
 LUCENE-3738-improvement.patch, LUCENE-3738.patch, LUCENE-3738.patch, 
 LUCENE-3738.patch, LUCENE-3738.patch, LUCENE-3738.patch


 Today, write/readVInt allows a negative int, in that it will encode and 
 decode correctly, just horribly inefficiently (5 bytes).
 However, read/writeVLong fails (trips an assert).
 I'd prefer that both vInt/vLong trip an assert if you ever try to write a 
 negative number... it's badly trappy today.  But, unfortunately, we sometimes 
 rely on this... had we had this assert in 'since the beginning' we could have 
 avoided that.
 So, if we can't add that assert in today, I think we should at least fix 
 readVLong to handle negative longs... but then you quietly spend 9 bytes 
 (even more trappy!).




[jira] [Commented] (LUCENE-3774) check-legal isn't doing its job

2012-03-31 Thread Yonik Seeley (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-3774?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13243153#comment-13243153
 ] 

Yonik Seeley commented on LUCENE-3774:
--

bq. I have a different view on this. Things like this (license checking) are 
typically integration tests. Having them per-module only complicates build 
files and is an unnecessary overhead for running normal tests (because 
dependencies change rarely).

+1

Having been bit by the changes in this issue dozens of times already, we 
shouldn't be doing these checks on a normal "ant test".  Seems like it should 
be fine to let Jenkins test it.  Some examples of how I've been bitten:

* SolrCloud demo instructions that have you make a copy of example to example2, 
etc.
* mv build build.old so I could compare two runs
* trying out a new jar locally w/o dotting all the i's

I've seen users report these errors on the mailing list too, and it's not 
apparent to them what the issue is.

 check-legal isn't doing its job
 ---

 Key: LUCENE-3774
 URL: https://issues.apache.org/jira/browse/LUCENE-3774
 Project: Lucene - Java
  Issue Type: Improvement
  Components: general/build
Affects Versions: 3.6, 4.0
Reporter: Steven Rowe
Assignee: Dawid Weiss
 Fix For: 3.6, 4.0

 Attachments: LUCENE-3774.patch, LUCENE-3774.patch, LUCENE-3774.patch, 
 LUCENE-3774.patch, LUCENE-3774.patch, LUCENE-3774.patch, LUCENE3774.patch, 
 backport.patch


 In trunk, the {{check-legal-lucene}} ant target is not checking any 
 {{lucene/contrib/\*\*/lib/}} directories; the {{modules/**/lib/}} directories 
 are not being checked; and {{check-legal-solr}} can't be checking 
 {{solr/example/lib/\*\*/\*.jar}}, because there are currently {{.jar}} files 
 in there that don't have a license.
 These targets are set up to take in a full list of {{lib/}} directories in 
 which to check, but modules move around, and these lists are not being kept 
 up-to-date.
 Instead, {{check-legal-\*}} should run for each module, if the module has a 
 {{lib/}} directory, and it should be specialized for modules that have more 
 than one ({{solr/core/}}) or that have a {{lib/}} directory in a non-standard 
 place ({{lucene/core/}}).




[jira] [Commented] (LUCENE-3774) check-legal isn't doing its job

2012-03-31 Thread Robert Muir (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-3774?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13243160#comment-13243160
 ] 

Robert Muir commented on LUCENE-3774:
-

I agree too, I think: it's worse now that checking licenses means we have to 
resolve first, to ensure the jars actually exist. This adds overhead; maybe 
Jenkins is good enough? It runs many times a day, and we don't actually change 
jars that often: most of the time when developing we are just changing code...

 check-legal isn't doing its job
 ---

 Key: LUCENE-3774
 URL: https://issues.apache.org/jira/browse/LUCENE-3774
 Project: Lucene - Java
  Issue Type: Improvement
  Components: general/build
Affects Versions: 3.6, 4.0
Reporter: Steven Rowe
Assignee: Dawid Weiss
 Fix For: 3.6, 4.0

 Attachments: LUCENE-3774.patch, LUCENE-3774.patch, LUCENE-3774.patch, 
 LUCENE-3774.patch, LUCENE-3774.patch, LUCENE-3774.patch, LUCENE3774.patch, 
 backport.patch


 In trunk, the {{check-legal-lucene}} ant target is not checking any 
 {{lucene/contrib/\*\*/lib/}} directories; the {{modules/**/lib/}} directories 
 are not being checked; and {{check-legal-solr}} can't be checking 
 {{solr/example/lib/\*\*/\*.jar}}, because there are currently {{.jar}} files 
 in there that don't have a license.
 These targets are set up to take in a full list of {{lib/}} directories in 
 which to check, but modules move around, and these lists are not being kept 
 up-to-date.
 Instead, {{check-legal-\*}} should run for each module, if the module has a 
 {{lib/}} directory, and it should be specialized for modules that have more 
 than one ({{solr/core/}}) or that have a {{lib/}} directory in a non-standard 
 place ({{lucene/core/}}).




[jira] [Commented] (LUCENE-3774) check-legal isn't doing its job

2012-03-31 Thread Dawid Weiss (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-3774?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13243162#comment-13243162
 ] 

Dawid Weiss commented on LUCENE-3774:
-

I'm for pushing it to the top level. This will simplify handling of exceptional 
patterns and such too. Shouldn't be much of a problem to move it, either.

 check-legal isn't doing its job
 ---

 Key: LUCENE-3774
 URL: https://issues.apache.org/jira/browse/LUCENE-3774
 Project: Lucene - Java
  Issue Type: Improvement
  Components: general/build
Affects Versions: 3.6, 4.0
Reporter: Steven Rowe
Assignee: Dawid Weiss
 Fix For: 3.6, 4.0

 Attachments: LUCENE-3774.patch, LUCENE-3774.patch, LUCENE-3774.patch, 
 LUCENE-3774.patch, LUCENE-3774.patch, LUCENE-3774.patch, LUCENE3774.patch, 
 backport.patch


 In trunk, the {{check-legal-lucene}} ant target is not checking any 
 {{lucene/contrib/\*\*/lib/}} directories; the {{modules/**/lib/}} directories 
 are not being checked; and {{check-legal-solr}} can't be checking 
 {{solr/example/lib/\*\*/\*.jar}}, because there are currently {{.jar}} files 
 in there that don't have a license.
 These targets are set up to take in a full list of {{lib/}} directories in 
 which to check, but modules move around, and these lists are not being kept 
 up-to-date.
 Instead, {{check-legal-\*}} should run for each module, if the module has a 
 {{lib/}} directory, and it should be specialized for modules that have more 
 than one ({{solr/core/}}) or that have a {{lib/}} directory in a non-standard 
 place ({{lucene/core/}}).




RE: [JENKINS-MAVEN] Lucene-Solr-Maven-3.x #443: POMs out of sync

2012-03-31 Thread Dyer, James
I tried this seed on my 4-core Windows machine several times but got no failure.  
This test failure might indicate that the DIH threading bugs aren't really 
fixed in 3.6.  On the other hand, users of DIH threads on 3.6 will get a 
deprecation warning, the wiki discourages the feature, and it is gone in 4.0.

James Dyer
E-Commerce Systems
Ingram Content Group
(615) 213-4311


-Original Message-
From: Apache Jenkins Server [mailto:jenk...@builds.apache.org] 
Sent: Saturday, March 31, 2012 8:45 AM
To: dev@lucene.apache.org
Subject: [JENKINS-MAVEN] Lucene-Solr-Maven-3.x #443: POMs out of sync

Build: https://builds.apache.org/job/Lucene-Solr-Maven-3.x/443/

1 tests failed.
REGRESSION:  
org.apache.solr.handler.dataimport.TestThreaded.testCachedThread_FullImport

Error Message:
Exception during query

Stack Trace:
java.lang.RuntimeException: Exception during query
at org.apache.solr.SolrTestCaseJ4.assertQ(SolrTestCaseJ4.java:409)
at org.apache.solr.SolrTestCaseJ4.assertQ(SolrTestCaseJ4.java:376)
at 
org.apache.solr.handler.dataimport.TestThreaded.verify(TestThreaded.java:73)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at 
sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
at 
sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:616)
at 
org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:45)
at 
org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:15)
at 
org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:42)
at 
org.junit.internal.runners.statements.RunAfters.evaluate(RunAfters.java:36)
at 
org.apache.lucene.util.SystemPropertiesRestoreRule$1.evaluate(SystemPropertiesRestoreRule.java:61)
at 
org.apache.lucene.util.LuceneTestCase$SubclassSetupTeardownRule$1.evaluate(LuceneTestCase.java:630)
at 
org.apache.lucene.util.LuceneTestCase$InternalSetupTeardownRule$1.evaluate(LuceneTestCase.java:536)
at 
org.apache.lucene.util.SystemPropertiesInvariantRule$1.evaluate(SystemPropertiesInvariantRule.java:67)
at 
org.apache.lucene.util.LuceneTestCase$TestResultInterceptorRule$1.evaluate(LuceneTestCase.java:457)
at 
org.apache.lucene.util.UncaughtExceptionsRule$1.evaluate(UncaughtExceptionsRule.java:74)
at 
org.apache.lucene.util.LuceneTestCase$SaveThreadAndTestNameRule$1.evaluate(LuceneTestCase.java:508)
at org.junit.rules.RunRules.evaluate(RunRules.java:18)
at org.junit.runners.ParentRunner.runLeaf(ParentRunner.java:263)
at 
org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:68)
at 
org.apache.lucene.util.LuceneTestCaseRunner.runChild(LuceneTestCaseRunner.java:146)
at 
org.apache.lucene.util.LuceneTestCaseRunner.runChild(LuceneTestCaseRunner.java:50)
at org.junit.runners.ParentRunner$3.run(ParentRunner.java:231)
at org.junit.runners.ParentRunner$1.schedule(ParentRunner.java:60)
at org.junit.runners.ParentRunner.runChildren(ParentRunner.java:229)
at org.junit.runners.ParentRunner.access$000(ParentRunner.java:50)
at org.junit.runners.ParentRunner$2.evaluate(ParentRunner.java:222)
at 
org.junit.internal.runners.statements.RunBefores.evaluate(RunBefores.java:28)
at 
org.junit.internal.runners.statements.RunAfters.evaluate(RunAfters.java:30)
at 
org.apache.lucene.util.SystemPropertiesRestoreRule$1.evaluate(SystemPropertiesRestoreRule.java:61)
at 
org.apache.lucene.util.UncaughtExceptionsRule$1.evaluate(UncaughtExceptionsRule.java:74)
at 
org.apache.lucene.util.StoreClassNameRule$1.evaluate(StoreClassNameRule.java:36)
at 
org.apache.lucene.util.SystemPropertiesInvariantRule$1.evaluate(SystemPropertiesInvariantRule.java:67)
at org.junit.rules.RunRules.evaluate(RunRules.java:18)
at org.junit.runners.ParentRunner.run(ParentRunner.java:300)
at 
org.apache.maven.surefire.junit4.JUnit4TestSet.execute(JUnit4TestSet.java:53)
at 
org.apache.maven.surefire.junit4.JUnit4Provider.executeTestSet(JUnit4Provider.java:123)
at 
org.apache.maven.surefire.junit4.JUnit4Provider.invoke(JUnit4Provider.java:104)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at 
sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
at 
sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:616)
at 
org.apache.maven.surefire.util.ReflectionUtils.invokeMethodWithArray(ReflectionUtils.java:164)
at 
org.apache.maven.surefire.booter.ProviderFactory$ProviderProxy.invoke(ProviderFactory.java:110)
at 

Re: conditional High Freq Terms in Lucene index

2012-03-31 Thread starz10de
I did as you mentioned and the problem is still the same; I think the problem is in
the HighFreqTerms part. There I see duplicate words in the produced
high-frequency list. The comparison itself is ok, because I can see that only
terms belonging to document type "A" are added to the TermInfoQueue. However,
the frequency is not counted correctly for each term, and there are also some
duplicate words in the list. Is something wrong with TermDocs dok and dok.freq()?
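
One more thing to check (a hedged sketch along the lines Mike outlined earlier):
insert each term into the queue once, after accumulating its frequency over the
matching docs, rather than inside the per-document loop -- inserting there
produces one queue entry per matching document, i.e. duplicate terms:

  while (terms.next()) {
      if (!terms.term().field().equals(field)) continue;
      dok.seek(terms);
      int freqInA = 0;
      while (dok.next()) {
          if (doc_ids.contains(dok.doc() + "")) {  // doc is of type "A"
              freqInA += dok.freq();
          }
      }
      if (freqInA > 0) {
          tiq.insertWithOverflow(new TermInfo(terms.term(), freqInA));
      }
  }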

--
View this message in context: 
http://lucene.472066.n3.nabble.com/conditional-High-Freq-Terms-in-Lucene-index-tp3868066p3873567.html
Sent from the Lucene - Java Developer mailing list archive at Nabble.com.




[jira] [Commented] (SOLR-2202) Money/Currency FieldType

2012-03-31 Thread Commented

[ 
https://issues.apache.org/jira/browse/SOLR-2202?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13243209#comment-13243209
 ] 

Jan Høydahl commented on SOLR-2202:
---

Thanks for sorting this out!

 Money/Currency FieldType
 

 Key: SOLR-2202
 URL: https://issues.apache.org/jira/browse/SOLR-2202
 Project: Solr
  Issue Type: New Feature
  Components: Schema and Analysis
Affects Versions: 1.5
Reporter: Greg Fodor
Assignee: Jan Høydahl
 Fix For: 3.6, 4.0

 Attachments: SOLR-2022-solr-3.patch, 
 SOLR-2202-3x-stabilize-provider-interface.patch, 
 SOLR-2202-fix-NPE-if-no-tlong-fieldType.patch, SOLR-2202-lucene-1.patch, 
 SOLR-2202-no-fieldtype-deps.patch, SOLR-2202-solr-1.patch, 
 SOLR-2202-solr-10.patch, SOLR-2202-solr-2.patch, SOLR-2202-solr-4.patch, 
 SOLR-2202-solr-5.patch, SOLR-2202-solr-6.patch, SOLR-2202-solr-7.patch, 
 SOLR-2202-solr-8.patch, SOLR-2202-solr-9.patch, SOLR-2202.patch, 
 SOLR-2202.patch, SOLR-2202.patch, SOLR-2202.patch, SOLR-2202.patch, 
 SOLR-2202.patch, SOLR-2202.patch, SOLR-2202.patch


 Provides support for monetary values to Solr/Lucene with query-time currency 
 conversion. The following features are supported:
 - Point queries
 - Range queries
 - Sorting
 - Currency parsing by either currency code or symbol.
 - Symmetric & asymmetric exchange rates. (Asymmetric exchange rates are 
 useful if there are fees associated with exchanging the currency.)
 At indexing time, money fields can be indexed in a native currency. For 
 example, if a product on an e-commerce site is listed in Euros, indexing the 
 price field as "1000,EUR" will index it appropriately. By altering the 
 currency.xml file, the sorting and querying against Solr can take into 
 account fluctuations in currency exchange rates without having to re-index 
 the documents.
 The new money field type is a polyfield which indexes two fields, one which 
 contains the amount of the value and another which contains the currency code 
 or symbol. The currency metadata (names, symbols, codes, and exchange rates) 
 are expected to be in an xml file which is pointed to by the field type 
 declaration in the schema.xml.
 The current patch is factored such that Money utility functions and 
 configuration metadata lie in Lucene (see MoneyUtil and CurrencyConfig), 
 while the MoneyType and MoneyValueSource lie in Solr. This was meant to 
 mirror the work being done on the spatial field types.
 This patch will be getting used to power the international search 
 capabilities of the search engine at Etsy.
 Also see WIKI page: http://wiki.apache.org/solr/MoneyFieldType




[jira] [Commented] (SOLR-1052) Deprecate/Remove indexDefaults and mainIndex in favor of indexConfig in solrconfig.xml

2012-03-31 Thread Commented

[ 
https://issues.apache.org/jira/browse/SOLR-1052?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13243219#comment-13243219
 ] 

Jan Høydahl commented on SOLR-1052:
---

The patch uses {{<lockType>single</lockType>}} for all tests, which should be 
suitable for all platforms.

Tests pass for me, would love to have another pair of eyes on it too.


 Deprecate/Remove indexDefaults and mainIndex in favor of indexConfig in 
 solrconfig.xml
 

 Key: SOLR-1052
 URL: https://issues.apache.org/jira/browse/SOLR-1052
 Project: Solr
  Issue Type: Improvement
Reporter: Grant Ingersoll
Assignee: Jan Høydahl
  Labels: solrconfig.xml
 Fix For: 3.6, 4.0

 Attachments: SOLR-1052-3x-fix-tests.patch, SOLR-1052-3x.patch, 
 SOLR-1052-3x.patch, SOLR-1052-3x.patch, SOLR-1052-3x.patch, SOLR-1052.patch


 Given that we now handle multiple cores via the solr.xml and the discussion 
 around indexDefaults and mainIndex at 
 http://www.lucidimagination.com/search/p:solr?q=mainIndex+vs.+indexDefaults
 We should deprecate old indexDefaults and mainIndex sections and only use 
 a new indexConfig section.
 3.6: Deprecation warning if old section used
 4.0: If LuceneMatchVersion before LUCENE_40 then warn (so old configs will 
 work), else fail fast




[jira] [Commented] (LUCENE-3939) ClassCastException thrown in the map(String,int,TermVectorOffsetInfo[],int[]) method in org.apache.lucene.index.SortedTermVectorMapper

2012-03-31 Thread SHIN HWEI TAN (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-3939?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13243243#comment-13243243
 ] 

SHIN HWEI TAN commented on LUCENE-3939:
---

Thanks for the comment. 

Below is a test case that illustrates the problem: the second invocation of the 
method map throws ClassCastException, although it is expected to run normally 
without any exception.
 
 org.apache.lucene.index.SortedTermVectorMapper var3 = new 
org.apache.lucene.index.SortedTermVectorMapper(false, 
false, (java.util.Comparator) null);
 var3.setExpectations("", 0, false, false);
 org.apache.lucene.index.TermVectorOffsetInfo[] var11 = new 
org.apache.lucene.index.TermVectorOffsetInfo[] { };
 var3.map("", (-1), var11, (int[]) null);
 var3.map("*", (-1), (org.apache.lucene.index.TermVectorOffsetInfo[]) null, 
(int[]) null);

 ClassCastException thrown in the map(String,int,TermVectorOffsetInfo[],int[]) 
 method in org.apache.lucene.index.SortedTermVectorMapper
 --

 Key: LUCENE-3939
 URL: https://issues.apache.org/jira/browse/LUCENE-3939
 Project: Lucene - Java
  Issue Type: Bug
  Components: core/index
Affects Versions: 3.0.2, 3.1, 3.4, 3.5
Reporter: SHIN HWEI TAN
   Original Estimate: 0.05h
  Remaining Estimate: 0.05h

 The method map in the SortedTermVectorMapper class does not check the 
 parameter "term" for valid values. It throws ClassCastException when 
 called with an invalid string for the parameter "term" (i.e., var3.map("*", 
 (-1), null, null)). The exception thrown is due to an explicit cast (i.e., 
 casting the return value of termToTVE.get(term) to type TermVectorEntry). 
 Suggested fix: replace the beginning of the method body for the class 
 SortedTermVectorMapper by changing it like this:
 public void map(String term, int frequency, TermVectorOffsetInfo[] offsets, 
 int[] positions) {
   if(termToTVE.get(term) instanceof TermVectorEntry){
   TermVectorEntry entry = (TermVectorEntry) termToTVE.get(term);
   ...
   }
 }




[jira] [Commented] (LUCENE-3738) Be consistent about negative vInt/vLong

2012-03-31 Thread Uwe Schindler (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-3738?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13243265#comment-13243265
 ] 

Uwe Schindler commented on LUCENE-3738:
---

Mike, I was away from home and did not understand your comment; now it's clear: 
you cannot reproduce the speedup from the last patch, nor can you see a 
difference with the current patch.

I would suggest that I commit this now to trunk, we test for a few nights, and 
then commit it to 3.x (Robert needs to backport Ivy to 3.6, so we have some time).

I will commit this later, before going to sleep, so we see results tomorrow.

 Be consistent about negative vInt/vLong
 ---

 Key: LUCENE-3738
 URL: https://issues.apache.org/jira/browse/LUCENE-3738
 Project: Lucene - Java
  Issue Type: Bug
Reporter: Michael McCandless
Assignee: Uwe Schindler
Priority: Blocker
 Fix For: 3.6, 4.0

 Attachments: ByteArrayDataInput.java.patch, 
 LUCENE-3738-improvement.patch, LUCENE-3738.patch, LUCENE-3738.patch, 
 LUCENE-3738.patch, LUCENE-3738.patch, LUCENE-3738.patch


 Today, write/readVInt allows a negative int, in that it will encode and 
 decode correctly, just horribly inefficiently (5 bytes).
 However, read/writeVLong fails (trips an assert).
 I'd prefer that both vInt/vLong trip an assert if you ever try to write a 
 negative number... it's badly trappy today.  But, unfortunately, we sometimes 
 rely on this... had this assert been in place since the beginning, we could 
 have avoided that.
 So, if we can't add that assert today, I think we should at least fix 
 readVLong to handle negative longs... but then you quietly spend 9 bytes 
 (even more trappy!).
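
 For reference, here is a self-contained sketch of why a negative int costs 
 5 bytes; it mirrors the 7-bits-per-byte loop in DataOutput.writeVInt, but 
 counts bytes instead of writing them:

 {code}
 public class VIntCost {

   // Same shape as the loop in DataOutput.writeVInt, counting instead of writing.
   static int vIntByteCount(int i) {
     int bytes = 1;
     while ((i & ~0x7F) != 0) { // more than 7 significant bits remain
       bytes++;
       i >>>= 7;                // unsigned shift: a negative int drags its
     }                          // high bits through four extra rounds
     return bytes;
   }

   public static void main(String[] args) {
     System.out.println(vIntByteCount(127)); // 1
     System.out.println(vIntByteCount(128)); // 2
     System.out.println(vIntByteCount(-1));  // 5
   }
 }
 {code}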

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira



-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Commented] (LUCENE-3930) nuke jars from source tree and use ivy

2012-03-31 Thread Hoss Man (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-3930?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13243266#comment-13243266
 ] 

Hoss Man commented on LUCENE-3930:
--

I did some testing of the packages built using trunk (circa r1307608)...

* we don't ship solr's build.xml (or any of the sub-build.xml files) in the 
binary artifacts, and with these changes most of the new ivy.xml files are 
also excluded -- but for some reason these newly added files are showing up; we 
should probably figure out why and exclude them as well, since they aren't 
usable and could easily confuse people...
** ./example/example-DIH/ivy.xml
** ./example/example-DIH/build.xml
** ./example/ivy.xml
** ./example/build.xml
* the libs for test-framework (ant, ant-junit, and junit) aren't being 
included in the lucene binary artifacts ... for the ant jars this might be OK 
(test-framework doesn't actually have any run-time deps on anything in ant, 
does it?) but it seems like the junit jar should be included, since including 
lucene-test-framework.jar in your classpath is useless w/o also including junit
* ant ivy-bootstrap followed by ant test using the lucene source package 
(lucene-4.0-SNAPSHOT-src.tgz) produces a build failure -- but this may have 
been a problem even before ivy (note the working dir and the final error)...

{noformat}
hossman@bester:~/tmp/ivy-pck-testing/lu/src/lucene-4.0-SNAPSHOT$ ant test
...
[junit] Testsuite: org.apache.lucene.util.junitcompat.TestReproduceMessage
[junit] Tests run: 12, Failures: 0, Errors: 0, Time elapsed: 0.114 sec
[junit] 

test:

compile-lucene-core:

jflex-uptodate-check:

jflex-notice:

javacc-uptodate-check:

javacc-notice:

ivy-availability-check:

ivy-fail:

resolve:
[ivy:retrieve] :: loading settings :: url = 
jar:file:/home/hossman/.ant/lib/ivy-2.2.0.jar!/org/apache/ivy/core/settings/ivysettings.xml

init:

clover.setup:

clover.info:
 [echo] 
 [echo]   Clover not found. Code coverage reports disabled.
 [echo] 

clover:

common.compile-core:
[javac] Compiling 1 source file to 
/home/hossman/tmp/ivy-pck-testing/lu/src/lucene-4.0-SNAPSHOT/build/core/classes/java

compile-core:

compile-test-framework:

ivy-availability-check:

ivy-fail:

resolve:
[ivy:retrieve] :: loading settings :: url = 
jar:file:/home/hossman/.ant/lib/ivy-2.2.0.jar!/org/apache/ivy/core/settings/ivysettings.xml

init:

compile-lucene-core:

compile-core:

compile-test:
 [echo] Building demo...

ivy-availability-check:

ivy-fail:

resolve:
[ivy:retrieve] :: loading settings :: url = 
jar:file:/home/hossman/.ant/lib/ivy-2.2.0.jar!/org/apache/ivy/core/settings/ivysettings.xml

common.init:

compile-lucene-core:

contrib-build.init:

check-lucene-core-uptodate:

jar-lucene-core:

jflex-uptodate-check:

jflex-notice:

javacc-uptodate-check:

javacc-notice:

ivy-availability-check:

ivy-fail:

resolve:
[ivy:retrieve] :: loading settings :: url = 
jar:file:/home/hossman/.ant/lib/ivy-2.2.0.jar!/org/apache/ivy/core/settings/ivysettings.xml

init:

clover.setup:

clover.info:
 [echo] 
 [echo]   Clover not found. Code coverage reports disabled.
 [echo] 

clover:

common.compile-core:
[javac] Compiling 1 source file to 
/home/hossman/tmp/ivy-pck-testing/lu/src/lucene-4.0-SNAPSHOT/build/core/classes/java

compile-core:

jar-core:
  [jar] Building jar: 
/home/hossman/tmp/ivy-pck-testing/lu/src/lucene-4.0-SNAPSHOT/build/core/lucene-core-4.0-SNAPSHOT.jar

init:

compile-test:
 [echo] Building demo...

check-analyzers-common-uptodate:

jar-analyzers-common:

BUILD FAILED
/home/hossman/tmp/ivy-pck-testing/lu/src/lucene-4.0-SNAPSHOT/build.xml:487: The 
following error occurred while executing this line:
/home/hossman/tmp/ivy-pck-testing/lu/src/lucene-4.0-SNAPSHOT/common-build.xml:1026:
 The following error occurred while executing this line:
/home/hossman/tmp/ivy-pck-testing/lu/src/lucene-4.0-SNAPSHOT/contrib/contrib-build.xml:58:
 The following error occurred while executing this line:
/home/hossman/tmp/ivy-pck-testing/lu/src/lucene-4.0-SNAPSHOT/common-build.xml:551:
 Basedir /home/hossman/tmp/ivy-pck-testing/lu/src/modules/analysis/common does 
not exist

Total time: 5 minutes 10 seconds
{noformat}

...it's trying to reach back up out of the working directory into ../modules

 nuke jars from source tree and use ivy
 --

 Key: LUCENE-3930
 URL: https://issues.apache.org/jira/browse/LUCENE-3930
 Project: Lucene - Java
  Issue Type: Task
  Components: general/build
Reporter: Robert Muir
Assignee: Robert Muir
Priority: Blocker
 Fix For: 3.6

 Attachments: LUCENE-3930-skip-sources-javadoc.patch, 
 LUCENE-3930-solr-example.patch, LUCENE-3930-solr-example.patch, 
 LUCENE-3930.patch, LUCENE-3930.patch, LUCENE-3930.patch, 
 

[jira] [Updated] (LUCENE-3930) nuke jars from source tree and use ivy

2012-03-31 Thread Hoss Man (Updated) (JIRA)

 [ 
https://issues.apache.org/jira/browse/LUCENE-3930?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Hoss Man updated LUCENE-3930:
-

Attachment: LUCENE-3930_includetestlibs_excludeexamplexml.patch

patch fixing the first two problems I mentioned above:
* categorically exclude build.xml and ivy.xml files from solr binary packages 
(to prevent the ones under example from being included)
* bring the files included from test-framework into line with how contrib is 
treated (the new patterns try to match some things that don't exist in 
test-framework, but I don't think that's bad -- it future-proofs us)

 nuke jars from source tree and use ivy
 --

 Key: LUCENE-3930
 URL: https://issues.apache.org/jira/browse/LUCENE-3930
 Project: Lucene - Java
  Issue Type: Task
  Components: general/build
Reporter: Robert Muir
Assignee: Robert Muir
Priority: Blocker
 Fix For: 3.6

 Attachments: LUCENE-3930-skip-sources-javadoc.patch, 
 LUCENE-3930-solr-example.patch, LUCENE-3930-solr-example.patch, 
 LUCENE-3930.patch, LUCENE-3930.patch, LUCENE-3930.patch, 
 LUCENE-3930__ivy_bootstrap_target.patch, 
 LUCENE-3930_includetestlibs_excludeexamplexml.patch, 
 ant_-verbose_clean_test.out.txt, noggit-commons-csv.patch, 
 patch-jetty-build.patch


 As mentioned on the ML thread: "switch jars to ivy mechanism?".

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira



-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Commented] (LUCENE-3738) Be consistent about negative vInt/vLong

2012-03-31 Thread Michael McCandless (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-3738?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13243293#comment-13243293
 ] 

Michael McCandless commented on LUCENE-3738:


Sorry Uwe, that was exactly it: I don't know what to conclude from the perf 
runs anymore.

But +1 for your new patch: it ought to be better since the code is simpler.

 Be consistent about negative vInt/vLong
 ---

 Key: LUCENE-3738
 URL: https://issues.apache.org/jira/browse/LUCENE-3738
 Project: Lucene - Java
  Issue Type: Bug
Reporter: Michael McCandless
Assignee: Uwe Schindler
Priority: Blocker
 Fix For: 3.6, 4.0

 Attachments: ByteArrayDataInput.java.patch, 
 LUCENE-3738-improvement.patch, LUCENE-3738.patch, LUCENE-3738.patch, 
 LUCENE-3738.patch, LUCENE-3738.patch, LUCENE-3738.patch


 Today, write/readVInt allows a negative int, in that it will encode and 
 decode correctly, just horribly inefficiently (5 bytes).
 However, read/writeVLong fails (trips an assert).
 I'd prefer that both vInt/vLong trip an assert if you ever try to write a 
 negative number... it's badly trappy today.  But, unfortunately, we sometimes 
 rely on this... had this assert been in place since the beginning, we could 
 have avoided that.
 So, if we can't add that assert today, I think we should at least fix 
 readVLong to handle negative longs... but then you quietly spend 9 bytes 
 (even more trappy!).

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira



-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Commented] (LUCENE-3940) When Japanese (Kuromoji) tokenizer removes a punctuation token it should leave a hole

2012-03-31 Thread Michael McCandless (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-3940?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13243299#comment-13243299
 ] 

Michael McCandless commented on LUCENE-3940:


bq.  StandardTokenizer doesn't leave holes when it drops punctuation,

But is that really good?

This means a PhraseQuery will match across end-of-sentence (.),
semicolon, colon, comma, etc. (English examples..).

I think tokenizers should throw away as little information as
possible... we can always filter out such tokens in a later stage?

For example, if a tokenizer created punct tokens (instead of silently
discarding them), other token filters could make use of them in the
meantime, e.g. a synonym rule for "u.s.a." -> "usa" or maybe a dedicated
English acronyms filter.  We could then later filter them out, even
not leaving holes, and have the same behavior that we have now?

Are there non-English examples where you would want the PhraseQuery to
match over punctuation...?  EG, for Japanese, I assume we don't want
PhraseQuery applying across periods/commas, like it will now? (Not
sure about middle dot...?  Others...?).
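
To make the "leave a hole" behavior concrete, here is a minimal sketch against 
the 3.x analysis API (package locations differ a bit on trunk): when StopFilter 
removes a token, it bumps the next token's position increment, and that gap is 
exactly what stops a PhraseQuery from matching across the removed word:

{code}
import java.io.StringReader;

import org.apache.lucene.analysis.StopAnalyzer;
import org.apache.lucene.analysis.StopFilter;
import org.apache.lucene.analysis.TokenStream;
import org.apache.lucene.analysis.standard.StandardTokenizer;
import org.apache.lucene.analysis.tokenattributes.CharTermAttribute;
import org.apache.lucene.analysis.tokenattributes.PositionIncrementAttribute;
import org.apache.lucene.util.Version;

public class HoleDemo {

  public static void main(String[] args) throws Exception {
    TokenStream ts = new StandardTokenizer(Version.LUCENE_35,
        new StringReader("fox jumped over the lazy dog"));
    ts = new StopFilter(Version.LUCENE_35, ts,
        StopAnalyzer.ENGLISH_STOP_WORDS_SET);
    CharTermAttribute term = ts.addAttribute(CharTermAttribute.class);
    PositionIncrementAttribute posInc =
        ts.addAttribute(PositionIncrementAttribute.class);
    ts.reset();
    while (ts.incrementToken()) {
      // "lazy" prints with increment 2: the hole left where "the" was removed
      System.out.println(term + " +" + posInc.getPositionIncrement());
    }
    ts.end();
    ts.close();
  }
}
{code}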

 When Japanese (Kuromoji) tokenizer removes a punctuation token it should 
 leave a hole
 -

 Key: LUCENE-3940
 URL: https://issues.apache.org/jira/browse/LUCENE-3940
 Project: Lucene - Java
  Issue Type: Bug
Reporter: Michael McCandless
Assignee: Michael McCandless
Priority: Minor
 Fix For: 4.0

 Attachments: LUCENE-3940.patch, LUCENE-3940.patch


 I modified BaseTokenStreamTestCase to assert that the start/end
 offsets match for graph (posLen > 1) tokens, and this caught a bug in
 Kuromoji when the decompounding of a compound token has a punctuation
 token that's dropped.
 In this case we should leave hole(s) so that the graph is intact, ie,
 the graph should look the same as if the punctuation tokens were not
 initially removed, but then a StopFilter had removed them.
 This also affects tokens that have no compound over them, ie we fail
 to leave a hole today when we remove the punctuation tokens.
 I'm not sure this is serious enough to warrant fixing in 3.6 at the
 last minute...

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira



-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Commented] (LUCENE-3930) nuke jars from source tree and use ivy

2012-03-31 Thread Jan Høydahl (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-3930?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13243598#comment-13243598
 ] 

Jan Høydahl commented on LUCENE-3930:
-

We have a 7 MB jar which is included in the binary distro twice. Any way to get 
rid of one?
{code}
./contrib/analysis-extras/lib/icu4j-4.8.1.1.jar
./contrib/extraction/lib/icu4j-4.8.1.1.jar
{code}

Also, from what I can see, the {{solr/contrib/extraction/lib/xml-apis-1.0.b2.jar}} 
dependency is redundant -- tests pass without it.
See https://issues.apache.org/jira/browse/TIKA-412 and 
https://issues.apache.org/jira/browse/LUCENE-2961

 nuke jars from source tree and use ivy
 --

 Key: LUCENE-3930
 URL: https://issues.apache.org/jira/browse/LUCENE-3930
 Project: Lucene - Java
  Issue Type: Task
  Components: general/build
Reporter: Robert Muir
Assignee: Robert Muir
Priority: Blocker
 Fix For: 3.6

 Attachments: LUCENE-3930-skip-sources-javadoc.patch, 
 LUCENE-3930-solr-example.patch, LUCENE-3930-solr-example.patch, 
 LUCENE-3930.patch, LUCENE-3930.patch, LUCENE-3930.patch, 
 LUCENE-3930__ivy_bootstrap_target.patch, 
 LUCENE-3930_includetestlibs_excludeexamplexml.patch, 
 ant_-verbose_clean_test.out.txt, noggit-commons-csv.patch, 
 patch-jetty-build.patch


 As mentioned on the ML thread: "switch jars to ivy mechanism?".

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira



-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Assigned] (SOLR-3254) Upgrade Solr to Tika 1.1

2012-03-31 Thread Jan Høydahl (Assigned) (JIRA)

 [ 
https://issues.apache.org/jira/browse/SOLR-3254?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jan Høydahl reassigned SOLR-3254:
-

Assignee: Jan Høydahl

 Upgrade Solr to Tika 1.1
 

 Key: SOLR-3254
 URL: https://issues.apache.org/jira/browse/SOLR-3254
 Project: Solr
  Issue Type: Improvement
  Components: contrib - LangId, contrib - Solr Cell (Tika extraction)
Reporter: Jan Høydahl
Assignee: Jan Høydahl
 Fix For: 4.0

 Attachments: SOLR-3254.patch


 Tika 1.1 is being released soon. It features some new parsers, the ability to 
 extract text from password-protected PDFs and office docs, and several bug 
 fixes. See 
 http://people.apache.org/~mattmann/apache-tika-1.1/rc1/CHANGES-1.1.txt
 We should upgrade as soon as it is released.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira



-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Updated] (SOLR-1929) Index encrypted pdf files

2012-03-31 Thread Jan Høydahl (Updated) (JIRA)

 [ 
https://issues.apache.org/jira/browse/SOLR-1929?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jan Høydahl updated SOLR-1929:
--

Fix Version/s: 4.0
 Assignee: Jan Høydahl

 Index encrypted pdf files
 -

 Key: SOLR-1929
 URL: https://issues.apache.org/jira/browse/SOLR-1929
 Project: Solr
  Issue Type: Improvement
  Components: contrib - Solr Cell (Tika extraction)
Reporter: Yiannis Pericleous
Assignee: Jan Høydahl
Priority: Minor
 Fix For: 4.0

 Attachments: SOLR-1929.patch


 SolrCell is not able to index encrypted PDFs.
 This is easily fixed by supplying the password in the metadata passed on to 
 Tika.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira



-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Commented] (LUCENE-3939) ClassCastException thrown in the map(String,int,TermVectorOffsetInfo[],int[]) method in org.apache.lucene.index.SortedTermVectorMapper

2012-03-31 Thread SHIN HWEI TAN (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-3939?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13243607#comment-13243607
 ] 

SHIN HWEI TAN commented on LUCENE-3939:
---

Thanks for the quick response.

I don't think that passing null as the Comparator is the problem. For example, if 
the first invocation of the method map is commented out (as below), then no 
exception is thrown. In this case, the Comparator is still null.

   org.apache.lucene.index.SortedTermVectorMapper var3 = new
org.apache.lucene.index.SortedTermVectorMapper(false, 
false, (java.util.Comparator) null);
   var3.setExpectations("", 0, false, false);
   var3.map("*:", (-1), (org.apache.lucene.index.TermVectorOffsetInfo[]) null, 
(int[]) null);
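
For what it's worth, that "first call succeeds, second call throws" pattern is 
exactly how a TreeSet constructed with a null comparator behaves on pre-Java 7 
JDKs when its elements are not Comparable -- and if I read 
SortedTermVectorMapper right, it keeps its entries in such a TreeSet. A 
standalone sketch of just the JDK behavior, using a stand-in class (not Lucene 
code):

{code}
import java.util.Comparator;
import java.util.TreeSet;

public class CceDemo {

  // Stand-in for TermVectorEntry: note it does NOT implement Comparable.
  static class Entry {
    final String term;
    Entry(String term) { this.term = term; }
  }

  public static void main(String[] args) {
    // A null comparator makes TreeSet fall back to natural ordering,
    // i.e. it casts elements to Comparable whenever it must compare them.
    TreeSet<Entry> set = new TreeSet<Entry>((Comparator<Entry>) null);
    set.add(new Entry(""));  // ok on pre-Java 7: nothing to compare against yet
    set.add(new Entry("*")); // ClassCastException: Entry is not Comparable
  }
}
{code}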

 ClassCastException thrown in the map(String,int,TermVectorOffsetInfo[],int[]) 
 method in org.apache.lucene.index.SortedTermVectorMapper
 --

 Key: LUCENE-3939
 URL: https://issues.apache.org/jira/browse/LUCENE-3939
 Project: Lucene - Java
  Issue Type: Bug
  Components: core/index
Affects Versions: 3.0.2, 3.1, 3.4, 3.5
Reporter: SHIN HWEI TAN
   Original Estimate: 0.05h
  Remaining Estimate: 0.05h

 The method map in the SortedTermVectorMapper class does not check the 
 parameter "term" for valid values. It throws ClassCastException when 
 called with an invalid string for the parameter "term" (i.e., var3.map("*", 
 (-1), null, null)). The exception thrown is due to an explicit cast (i.e., 
 casting the return value of termToTVE.get(term) to type TermVectorEntry). 
 Suggested fix: replace the beginning of the map method body in 
 SortedTermVectorMapper like this:
 public void map(String term, int frequency, TermVectorOffsetInfo[] offsets, 
 int[] positions) {
   if (termToTVE.get(term) instanceof TermVectorEntry) {
     TermVectorEntry entry = (TermVectorEntry) termToTVE.get(term);
     ...
   }
 }

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira



-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Updated] (SOLR-1856) In Solr Cell, literals should override Tika-parsed values

2012-03-31 Thread Jan Høydahl (Updated) (JIRA)

 [ 
https://issues.apache.org/jira/browse/SOLR-1856?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jan Høydahl updated SOLR-1856:
--

Affects Version/s: (was: 1.4)
Fix Version/s: 4.0
 Assignee: Jan Høydahl

 In Solr Cell, literals should override Tika-parsed values
 -

 Key: SOLR-1856
 URL: https://issues.apache.org/jira/browse/SOLR-1856
 Project: Solr
  Issue Type: Improvement
  Components: contrib - Solr Cell (Tika extraction)
Reporter: Chris Harris
Assignee: Jan Høydahl
 Fix For: 4.0

 Attachments: SOLR-1856.patch


 I propose that ExtractingRequestHandler / SolrCell literals should take 
 precedence over Tika-parsed metadata in all situations, including where 
 multiValued=true. (Compare SOLR-1633?)
 My personal motivation is that I have several fields (e.g. title, date) 
 where my own metadata is much superior to what Tika offers, and I want to 
 throw those Tika values away. (I actually wouldn't mind throwing away _all_ 
 Tika-parsed values, but let's set that aside.) SOLR-1634 is one potential 
 approach to this, but the fix here might be simpler.
 I'll attach a patch shortly.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira



-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Updated] (SOLR-2649) MM ignored in edismax queries with operators

2012-03-31 Thread Jan Høydahl (Updated) (JIRA)

 [ 
https://issues.apache.org/jira/browse/SOLR-2649?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jan Høydahl updated SOLR-2649:
--

Affects Version/s: (was: 3.3)
Fix Version/s: 4.0

 MM ignored in edismax queries with operators
 

 Key: SOLR-2649
 URL: https://issues.apache.org/jira/browse/SOLR-2649
 Project: Solr
  Issue Type: Bug
  Components: query parsers
Reporter: Magnus Bergmark
Priority: Minor
 Fix For: 4.0


 Hypothetical scenario:
   1. User searches for "stocks oil gold" with MM set to 50%
   2. User adds -stockings to the query: "stocks oil gold -stockings"
   3. User gets no hits since MM was ignored and all terms were AND-ed 
 together
 The behavior seems to be intentional, although the reason why is never 
 explained:
   // For correct lucene queries, turn off mm processing if there
   // were explicit operators (except for AND).
   boolean doMinMatched = (numOR + numNOT + numPluses + numMinuses) == 0; 
 (lines 232-234 taken from 
 tags/lucene_solr_3_3/solr/src/java/org/apache/solr/search/ExtendedDismaxQParserPlugin.java)
 This makes edismax unsuitable as a replacement for dismax; mm is one of the 
 primary features of dismax.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira



-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Commented] (SOLR-2366) Facet Range Gaps

2012-03-31 Thread Jan Høydahl (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-2366?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13243611#comment-13243611
 ] 

Jan Høydahl commented on SOLR-2366:
---

Note to self: catch up on this again :)

 Facet Range Gaps
 

 Key: SOLR-2366
 URL: https://issues.apache.org/jira/browse/SOLR-2366
 Project: Solr
  Issue Type: Improvement
Reporter: Grant Ingersoll
Priority: Minor
 Fix For: 4.0

 Attachments: SOLR-2366.patch, SOLR-2366.patch


 There really is no reason why the range gaps for date and numeric faceting 
 need to be evenly spaced.  For instance, if and when SOLR-1581 is completed 
 and one were doing spatial distance calculations, one could facet by function 
 into three differently sized buckets: walking distance (0-5KM), driving 
 distance (5KM-150KM), and everything else (150KM+).  We should be able to 
 quantize the results into arbitrarily sized buckets.
 (Original syntax proposal removed, see discussion for concrete syntax)

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira



-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Closed] (SOLR-1895) ManifoldCF SearchComponent plugin for enforcing ManifoldCF security at search time

2012-03-31 Thread Jan Høydahl (Closed) (JIRA)

 [ 
https://issues.apache.org/jira/browse/SOLR-1895?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jan Høydahl closed SOLR-1895.
-

   Resolution: Won't Fix
Fix Version/s: (was: 4.0)

Closing this as Won't Fix, since the fix is checked in to MCF's source tree.

 ManifoldCF SearchComponent plugin for enforcing ManifoldCF security at search 
 time
 --

 Key: SOLR-1895
 URL: https://issues.apache.org/jira/browse/SOLR-1895
 Project: Solr
  Issue Type: New Feature
  Components: SearchComponents - other
Reporter: Karl Wright
  Labels: document, security, solr
 Attachments: LCFSecurityFilter.java, LCFSecurityFilter.java, 
 LCFSecurityFilter.java, LCFSecurityFilter.java, SOLR-1895-queries.patch, 
 SOLR-1895-queries.patch, SOLR-1895-queries.patch, SOLR-1895-queries.patch, 
 SOLR-1895-queries.patch, SOLR-1895-service-plugin.patch, 
 SOLR-1895-service-plugin.patch, SOLR-1895.patch, SOLR-1895.patch, 
 SOLR-1895.patch, SOLR-1895.patch, SOLR-1895.patch, SOLR-1895.patch


 I've written an LCF SearchComponent which filters returned results based on 
 access tokens provided by LCF's authority service.  The component requires 
 you to configure the appropriate authority service URL base, e.g.:
   <!-- LCF document security enforcement component -->
   <searchComponent name="lcfSecurity" class="LCFSecurityFilter">
     <str name="AuthorityServiceBaseURL">http://localhost:8080/lcf-authority-service</str>
   </searchComponent>
 Also required are the following schema.xml additions:
   <!-- Security fields -->
   <field name="allow_token_document" type="string" indexed="true" 
    stored="false" multiValued="true"/>
   <field name="deny_token_document" type="string" indexed="true" 
    stored="false" multiValued="true"/>
   <field name="allow_token_share" type="string" indexed="true" 
    stored="false" multiValued="true"/>
   <field name="deny_token_share" type="string" indexed="true" stored="false" 
    multiValued="true"/>
 Finally, to tie it into the standard request handler, it seems to need to run 
 last:
   <requestHandler name="standard" class="solr.SearchHandler" default="true">
     <arr name="last-components">
       <str>lcfSecurity</str>
     </arr>
   ...
 I have not set a package for this code.  Nor have I been able to get it 
 reviewed by someone as conversant with Solr as I would prefer.  It is my 
 hope, however, that this module will become part of the standard Solr 1.5 
 suite of search components, since that would tie it in with LCF nicely.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira



-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Commented] (SOLR-1758) schema definition for configuration files (validation, XSD)

2012-03-31 Thread Jan Høydahl (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-1758?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13243613#comment-13243613
 ] 

Jan Høydahl commented on SOLR-1758:
---

Mike, do you have an updated patch for this? What do you think about holding 
the XSDs inside the war?

 schema definition for configuration files (validation, XSD)
 ---

 Key: SOLR-1758
 URL: https://issues.apache.org/jira/browse/SOLR-1758
 Project: Solr
  Issue Type: New Feature
Reporter: Jorg Heymans
  Labels: configuration, schema.xml, solrconfig.xml, validation, 
 xsd
 Fix For: 4.0

 Attachments: config-validation-20110523.patch


 It is too easy to make configuration errors in Solr without getting warnings. 
 We should explore ways of validating configurations. See the mailing list 
 discussion at http://search-lucene.com/m/h6xKf1EShE6

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira



-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Updated] (SOLR-2934) Problem with Solr Hunspell with French Dictionary

2012-03-31 Thread Jan Høydahl (Updated) (JIRA)

 [ 
https://issues.apache.org/jira/browse/SOLR-2934?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jan Høydahl updated SOLR-2934:
--

Fix Version/s: 4.0

 Problem with Solr Hunspell with French Dictionary
 -

 Key: SOLR-2934
 URL: https://issues.apache.org/jira/browse/SOLR-2934
 Project: Solr
  Issue Type: Bug
  Components: Schema and Analysis
Affects Versions: 3.5
 Environment: Windows 7
Reporter: Nathan Castelein
Assignee: Chris Male
 Fix For: 4.0

 Attachments: en_GB.aff, en_GB.dic


 I'm trying to add the HunspellStemFilterFactory to my Solr project. 
 I'm trying this on a fresh new download of Solr 3.5.
 I downloaded a French dictionary from here: 
 http://www.dicollecte.org/download/fr/hunspell-fr-moderne-v4.3.zip
 But when I start Solr and go to the Solr Analysis page, an error occurs in Solr.
 Here is the trace: 
 java.lang.RuntimeException: Unable to load hunspell data! 
 [dictionary=en_GB.dic,affix=fr-moderne.aff]
   at 
 org.apache.solr.analysis.HunspellStemFilterFactory.inform(HunspellStemFilterFactory.java:82)
   at 
 org.apache.solr.core.SolrResourceLoader.inform(SolrResourceLoader.java:546)
   at org.apache.solr.schema.IndexSchema.init(IndexSchema.java:126)
   at org.apache.solr.core.CoreContainer.create(CoreContainer.java:461)
   at org.apache.solr.core.CoreContainer.load(CoreContainer.java:316)
   at org.apache.solr.core.CoreContainer.load(CoreContainer.java:207)
   at 
 org.apache.solr.core.CoreContainer$Initializer.initialize(CoreContainer.java:130)
   at 
 org.apache.solr.servlet.SolrDispatchFilter.init(SolrDispatchFilter.java:94)
   at org.mortbay.jetty.servlet.FilterHolder.doStart(FilterHolder.java:97)
   at 
 org.mortbay.component.AbstractLifeCycle.start(AbstractLifeCycle.java:50)
   at 
 org.mortbay.jetty.servlet.ServletHandler.initialize(ServletHandler.java:713)
   at org.mortbay.jetty.servlet.Context.startContext(Context.java:140)
   at 
 org.mortbay.jetty.webapp.WebAppContext.startContext(WebAppContext.java:1282)
   at 
 org.mortbay.jetty.handler.ContextHandler.doStart(ContextHandler.java:518)
   at 
 org.mortbay.jetty.webapp.WebAppContext.doStart(WebAppContext.java:499)
   at 
 org.mortbay.component.AbstractLifeCycle.start(AbstractLifeCycle.java:50)
   at 
 org.mortbay.jetty.handler.HandlerCollection.doStart(HandlerCollection.java:152)
   at 
 org.mortbay.jetty.handler.ContextHandlerCollection.doStart(ContextHandlerCollection.java:156)
   at 
 org.mortbay.component.AbstractLifeCycle.start(AbstractLifeCycle.java:50)
   at 
 org.mortbay.jetty.handler.HandlerCollection.doStart(HandlerCollection.java:152)
   at 
 org.mortbay.component.AbstractLifeCycle.start(AbstractLifeCycle.java:50)
   at 
 org.mortbay.jetty.handler.HandlerWrapper.doStart(HandlerWrapper.java:130)
   at org.mortbay.jetty.Server.doStart(Server.java:224)
   at 
 org.mortbay.component.AbstractLifeCycle.start(AbstractLifeCycle.java:50)
   at org.mortbay.xml.XmlConfiguration.main(XmlConfiguration.java:985)
   at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
   at sun.reflect.NativeMethodAccessorImpl.invoke(Unknown Source)
   at sun.reflect.DelegatingMethodAccessorImpl.invoke(Unknown Source)
   at java.lang.reflect.Method.invoke(Unknown Source)
   at org.mortbay.start.Main.invokeMain(Main.java:194)
   at org.mortbay.start.Main.start(Main.java:534)
   at org.mortbay.start.Main.start(Main.java:441)
   at org.mortbay.start.Main.main(Main.java:119)
 Caused by: java.lang.StringIndexOutOfBoundsException: String index out of 
 range: 3
   at java.lang.String.charAt(Unknown Source)
   at 
 org.apache.lucene.analysis.hunspell.HunspellDictionary$DoubleASCIIFlagParsingStrategy.parseFlags(HunspellDictionary.java:382)
   at 
 org.apache.lucene.analysis.hunspell.HunspellDictionary.parseAffix(HunspellDictionary.java:165)
   at 
 org.apache.lucene.analysis.hunspell.HunspellDictionary.readAffixFile(HunspellDictionary.java:121)
   at 
 org.apache.lucene.analysis.hunspell.HunspellDictionary.init(HunspellDictionary.java:64)
   at 
 org.apache.solr.analysis.HunspellStemFilterFactory.inform(HunspellStemFilterFactory.java:46)
 I can't find where the problem is. It seems like my dictionary isn't well 
 formed for hunspell, but I tried with two different dictionaries, and I had 
 the same problem.
 I also tried with an English dictionary, and... it works!
 So I think that my French dictionary is wrong for hunspell, but I don't know 
 why...
 Can you help me?
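
 FWIW, the stack trace points at DoubleASCIIFlagParsingStrategy.parseFlags, 
 which apparently consumes the flag field two ASCII characters at a time. 
 Treat the following as a hedged reconstruction rather than the verbatim 3.5 
 source, but it shows how an odd-length flag field -- which French 
 dictionaries using long/numeric FLAG types can produce -- reads one character 
 past the end:

 {code}
 // Hedged reconstruction of the failing parse step, not the verbatim source.
 public class DoubleFlagSketch {

   static char[] parseDoubleAsciiFlags(String rawFlags) {
     char[] flags = new char[rawFlags.length() / 2];
     for (int i = 0, j = 0; i < rawFlags.length(); i += 2, j++) {
       // Assumes flags come in pairs: with rawFlags.length() == 3 the last
       // pass calls charAt(3) -> "String index out of range: 3", as in the trace.
       flags[j] = (char) (rawFlags.charAt(i) + rawFlags.charAt(i + 1));
     }
     return flags;
   }

   public static void main(String[] args) {
     parseDoubleAsciiFlags("abcd"); // fine: two 2-char flags
     parseDoubleAsciiFlags("abc");  // StringIndexOutOfBoundsException: 3
   }
 }
 {code}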

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

[jira] [Closed] (SOLR-435) QParser must validate existence/absence of q parameter

2012-03-31 Thread David Smiley (Closed) (JIRA)

 [ 
https://issues.apache.org/jira/browse/SOLR-435?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

David Smiley closed SOLR-435.
-

Resolution: Fixed

Re-committed to 4.x, and I moved the CHANGES.txt entry from the v4 to the v3 
section on both branches.  Closing issue.

 QParser must validate existence/absence of q parameter
 

 Key: SOLR-435
 URL: https://issues.apache.org/jira/browse/SOLR-435
 Project: Solr
  Issue Type: Bug
  Components: search
Affects Versions: 1.3
Reporter: Ryan McKinley
Assignee: David Smiley
 Fix For: 3.6, 4.0

 Attachments: 
 SOLR-2001_3x_backport_with_empty_string_check_and_test.patch, SOLR-435.patch, 
 SOLR-435_3x_consistent_errors.patch, SOLR-435_q_defaults_to_all-docs.patch


 Each QParser should check if "q" exists or not.  For some it will be required, 
 for others not.
 Currently it throws a NullPointerException:
 {code}
 java.lang.NullPointerException
   at org.apache.solr.common.util.StrUtils.splitSmart(StrUtils.java:36)
   at 
 org.apache.solr.search.OldLuceneQParser.parse(LuceneQParserPlugin.java:104)
   at org.apache.solr.search.QParser.getQuery(QParser.java:80)
   at 
 org.apache.solr.handler.component.QueryComponent.prepare(QueryComponent.java:67)
   at 
 org.apache.solr.handler.SearchHandler.handleRequestBody(SearchHandler.java:150)
 ...
 {code}
 see:
 http://www.nabble.com/query-parsing-error-to14124285.html#a14140108

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira



-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[JENKINS] Lucene-Solr-tests-only-trunk - Build # 12928 - Failure

2012-03-31 Thread Apache Jenkins Server
Build: https://builds.apache.org/job/Lucene-Solr-tests-only-trunk/12928/

2 tests failed.
REGRESSION:  org.apache.solr.cloud.BasicDistributedZkTest.testDistribSearch

Error Message:
java.lang.AssertionError: 
org.apache.solr.cloud.BasicDistributedZkTest.testDistribSearch: Insane 
FieldCache usage(s) found expected:<0> but was:<1>

Stack Trace:
java.lang.RuntimeException: java.lang.AssertionError: 
org.apache.solr.cloud.BasicDistributedZkTest.testDistribSearch: Insane 
FieldCache usage(s) found expected:<0> but was:<1>
at 
org.apache.lucene.util.LuceneTestCase.tearDownInternal(LuceneTestCase.java:819)
at 
org.apache.lucene.util.LuceneTestCase.access$900(LuceneTestCase.java:138)
at 
org.apache.lucene.util.LuceneTestCase$InternalSetupTeardownRule$1.evaluate(LuceneTestCase.java:676)
at 
org.apache.lucene.util.SystemPropertiesInvariantRule$1.evaluate(SystemPropertiesInvariantRule.java:69)
at 
org.apache.lucene.util.LuceneTestCase$TestResultInterceptorRule$1.evaluate(LuceneTestCase.java:591)
at 
org.apache.lucene.util.UncaughtExceptionsRule$1.evaluate(UncaughtExceptionsRule.java:75)
at 
org.apache.lucene.util.LuceneTestCase$SaveThreadAndTestNameRule$1.evaluate(LuceneTestCase.java:642)
at org.junit.rules.RunRules.evaluate(RunRules.java:18)
at org.junit.runners.ParentRunner.runLeaf(ParentRunner.java:263)
at 
org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:68)
at 
org.apache.lucene.util.LuceneTestCaseRunner.runChild(LuceneTestCaseRunner.java:164)
at 
org.apache.lucene.util.LuceneTestCaseRunner.runChild(LuceneTestCaseRunner.java:57)
at org.junit.runners.ParentRunner$3.run(ParentRunner.java:231)
at org.junit.runners.ParentRunner$1.schedule(ParentRunner.java:60)
at org.junit.runners.ParentRunner.runChildren(ParentRunner.java:229)
at org.junit.runners.ParentRunner.access$000(ParentRunner.java:50)
at org.junit.runners.ParentRunner$2.evaluate(ParentRunner.java:222)
at 
org.junit.internal.runners.statements.RunBefores.evaluate(RunBefores.java:28)
at 
org.junit.internal.runners.statements.RunAfters.evaluate(RunAfters.java:30)
at 
org.apache.lucene.util.SystemPropertiesRestoreRule$1.evaluate(SystemPropertiesRestoreRule.java:63)
at 
org.apache.lucene.util.UncaughtExceptionsRule$1.evaluate(UncaughtExceptionsRule.java:75)
at 
org.apache.lucene.util.StoreClassNameRule$1.evaluate(StoreClassNameRule.java:38)
at 
org.apache.lucene.util.SystemPropertiesInvariantRule$1.evaluate(SystemPropertiesInvariantRule.java:69)
at org.junit.rules.RunRules.evaluate(RunRules.java:18)
at org.junit.runners.ParentRunner.run(ParentRunner.java:300)
at junit.framework.JUnit4TestAdapter.run(JUnit4TestAdapter.java:39)
at 
org.apache.tools.ant.taskdefs.optional.junit.JUnitTestRunner.run(JUnitTestRunner.java:420)
at 
org.apache.tools.ant.taskdefs.optional.junit.JUnitTestRunner.launch(JUnitTestRunner.java:911)
at 
org.apache.tools.ant.taskdefs.optional.junit.JUnitTestRunner.main(JUnitTestRunner.java:743)
Caused by: java.lang.AssertionError: 
org.apache.solr.cloud.BasicDistributedZkTest.testDistribSearch: Insane 
FieldCache usage(s) found expected:<0> but was:<1>
at org.junit.Assert.fail(Assert.java:93)
at org.junit.Assert.failNotEquals(Assert.java:647)
at org.junit.Assert.assertEquals(Assert.java:128)
at org.junit.Assert.assertEquals(Assert.java:472)
at 
org.apache.lucene.util.LuceneTestCase.assertSaneFieldCaches(LuceneTestCase.java:930)
at 
org.apache.lucene.util.LuceneTestCase.tearDownInternal(LuceneTestCase.java:809)
... 28 more


FAILED:  junit.framework.TestSuite.org.apache.solr.cloud.BasicDistributedZkTest

Error Message:
ERROR: SolrIndexSearcher opens=93 closes=91

Stack Trace:
junit.framework.AssertionFailedError: ERROR: SolrIndexSearcher opens=93 
closes=91
at 
org.apache.tools.ant.taskdefs.optional.junit.JUnitTestRunner$3.addError(JUnitTestRunner.java:974)
at junit.framework.TestResult.addError(TestResult.java:38)
at 
junit.framework.JUnit4TestAdapterCache$1.testFailure(JUnit4TestAdapterCache.java:51)
at 
org.junit.runner.notification.RunNotifier$4.notifyListener(RunNotifier.java:100)
at 
org.junit.runner.notification.RunNotifier$SafeNotifier.run(RunNotifier.java:41)
at 
org.junit.runner.notification.RunNotifier.fireTestFailure(RunNotifier.java:97)
at 
org.junit.internal.runners.model.EachTestNotifier.addFailure(EachTestNotifier.java:26)
at org.junit.runners.ParentRunner.run(ParentRunner.java:306)
at junit.framework.JUnit4TestAdapter.run(JUnit4TestAdapter.java:39)
at 
org.apache.tools.ant.taskdefs.optional.junit.JUnitTestRunner.run(JUnitTestRunner.java:420)
at 
org.apache.tools.ant.taskdefs.optional.junit.JUnitTestRunner.launch(JUnitTestRunner.java:911)