date:20110503


[ 
https://issues.apache.org/jira/browse/LUCENE-3058?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13028105#comment-13028105
 ] 

Uwe Schindler commented on LUCENE-3058:
---

bq. null passes all instanceofs

Definitely NOT! 
[http://stackoverflow.com/questions/2950319/is-null-check-needed-before-calling-instanceof]

 FST should allow more than one output for the same input
 

 Key: LUCENE-3058
 URL: https://issues.apache.org/jira/browse/LUCENE-3058
 Project: Lucene - Java
  Issue Type: Improvement
Reporter: Michael McCandless
Assignee: Michael McCandless
 Fix For: 4.0

 Attachments: LUCENE-3058.patch, LUCENE-3058.patch


 For the block tree terms dict, it turns out I need this case.

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org

[jira] [Commented] (LUCENE-3058) FST should allow more than one output for the same input

2011-05-03 Thread Dawid Weiss (JIRA)


[ 
https://issues.apache.org/jira/browse/LUCENE-3058?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13028108#comment-13028108
 ] 

Dawid Weiss commented on LUCENE-3058:
-

Handslap! And this is why you should always refresh your memory before posting 
something that lasts for millennia... Crawling back to my cave right now.


[jira] [Issue Comment Edited] (LUCENE-3058) FST should allow more than one output for the same input

2011-05-03 Thread Dawid Weiss (JIRA)


[ 
https://issues.apache.org/jira/browse/LUCENE-3058?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13028104#comment-13028104
 ] 

Dawid Weiss edited comment on LUCENE-3058 at 5/3/11 8:35 AM:
-

Looks good to me. One note: possible NPE here (-null passes all instanceofs-):
{code}
+@Override
+public boolean equals(Object _other) {
+  if (_other instanceof TwoLongs) {
+    final TwoLongs other = (TwoLongs) _other;
+    return first == other.first && second == other.second;
+  } else {
+    return false;
+  }
+}
{code}

  was (Author: dweiss):
Looks good to me. One note: possible NPE here (null passes all instanceofs):
{code}
+@Override
+public boolean equals(Object _other) {
+  if (_other instanceof TwoLongs) {
+    final TwoLongs other = (TwoLongs) _other;
+    return first == other.first && second == other.second;
+  } else {
+    return false;
+  }
+}
{code}
  

[jira] [Commented] (LUCENE-3058) FST should allow more than one output for the same input


[ 
https://issues.apache.org/jira/browse/LUCENE-3058?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13028109#comment-13028109
 ] 

Uwe Schindler commented on LUCENE-3058:
---

:-) It always confuses me, too. But if you think more about it, it makes sense 
to return false. It's always the same for me: whenever I write equals() 
methods, this question pops up, so now I mostly copy code like the one above 
from other classes. One thing to note: the above equals() code is only 100% 
suitable for final classes; otherwise a subclass that adds fields could still 
compare as equal. But that's more of a theoretical discussion. E.g. Lucene's 
Queries always check this.getClass()==other.getClass().
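
To make the two points above concrete, here is a minimal sketch. The TwoLongs class below is a stand-in written for this illustration (it is not the class from the patch), and ThreeLongs is a hypothetical subclass. It shows that instanceof yields false for a null reference, and how a getClass() comparison keeps a subclass with extra fields from comparing as equal.

```java
import java.util.Objects;

final class EqualsSketch {
    // Stand-in for the TwoLongs class in the patch; written for illustration only.
    static class TwoLongs {
        final long first;
        final long second;

        TwoLongs(long first, long second) {
            this.first = first;
            this.second = second;
        }

        @Override
        public boolean equals(Object other) {
            // Comparing getClass() rejects subclasses; the null check is needed
            // here because other.getClass() would throw on null.
            if (other == null || getClass() != other.getClass()) {
                return false;
            }
            TwoLongs that = (TwoLongs) other;
            return first == that.first && second == that.second;
        }

        @Override
        public int hashCode() {
            return Objects.hash(first, second);
        }
    }

    // Hypothetical subclass that adds a field.
    static class ThreeLongs extends TwoLongs {
        final long third;

        ThreeLongs(long first, long second, long third) {
            super(first, second);
            this.third = third;
        }
    }

    // instanceof evaluates to false when the reference is null.
    static boolean nullIsInstance() {
        Object nothing = null;
        return nothing instanceof TwoLongs;
    }

    public static void main(String[] args) {
        System.out.println(nullIsInstance());
        System.out.println(new TwoLongs(1, 2).equals(new ThreeLongs(1, 2, 3)));
    }
}
```

With an instanceof-based equals(), the second comparison would report the two objects as equal, which is the theoretical problem for non-final classes mentioned above.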


Re: MergePolicy Thresholds

2011-05-03 Thread Shai Erera

Hi

I looked into porting it to 3x, and prepared the attached patch. It only
contains the new TieredMP and Test, as well as the necessary changes to
LuceneTestCase and IndexWriter. I guess you can start with it (even just the
MP and IW changes) to test it on your indexes.

Mike, I saw that many more changes were made to the code as part of
LUCENE-1076. In particular, this MP is now the default (on trunk), so I
guess many changes (to tests) were needed because of that. Do you remember
whether, apart from the changes I've included in the patch, there were other
important changes w.r.t. this code?

As we won't change the default MP on 3x, I'm guessing I don't need to port
all the changes to 3x.

Shai

On Mon, May 2, 2011 at 9:41 PM, Burton-West, Tom tburt...@umich.edu wrote:

 Hi Shai and Mike,

 Testing the TieredMP on our large indexes has been on my todo list since I
 read Mike's blog post
 http://blog.mikemccandless.com/2011/02/visualizing-lucenes-segment-merges.html.

 If you port it to the 3.x branch Shai, I'll be more than happy to test it
 with our very large (300GB+) indexes.  Besides being able to set the max
 merged segment size, I'm especially interested in using the
  maxSegmentsPerTier parameter.

 From Mike's blog post:
  ...maxSegmentsPerTier that lets you set the allowed width (number of
 segments) of each stair in the staircase. This is nice because it decouples
 how many segments to merge at a time from how wide the staircase can be.

 Tom Burton-West
 http://www.hathitrust.org/blogs/large-scale-search

 -Original Message-
 From: Michael McCandless [mailto:luc...@mikemccandless.com]
 Sent: Monday, May 02, 2011 2:19 PM
 To: dev@lucene.apache.org
 Subject: Re: MergePolicy Thresholds

 I think it should be an easy port...

 Mike

 http://blog.mikemccandless.com

 On Mon, May 2, 2011 at 2:16 PM, Shai Erera ser...@gmail.com wrote:
  Thanks Mike. I'll take a look at TieredMP. Does it depend on trunk in any
  way, or do you think it can easily be ported to 3x?
  Shai
 






tieredmp.patch
Description: Binary data


[jira] [Commented] (LUCENE-3054) SorterTemplate.quickSort stack overflows on broken comparators that produce only few disticnt values in large arrays


[ 
https://issues.apache.org/jira/browse/LUCENE-3054?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13028125#comment-13028125
 ] 

Michael McCandless commented on LUCENE-3054:


Patch looks good!  I like the 2*log_2(N) dynamic cutover; this means we can 
tolerate somewhat lopsided QS recursion and still keep using QS.
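
A rough sketch of what such a depth-limited quicksort can look like, written for this note and not taken from the LUCENE-3054 patch. It sorts a plain int[] and allows 2*floor(log2(N)) levels of recursion. Falling back to merge sort when the budget runs out is an assumption of the sketch, as are all class and method names.

```java
import java.util.Arrays;

final class DepthLimitedQuickSort {
    // Entry point: allow about 2*log2(N) levels of quicksort recursion.
    static void sort(int[] a) {
        if (a.length < 2) {
            return;
        }
        int log2 = 31 - Integer.numberOfLeadingZeros(a.length); // floor(log2(N))
        quickSort(a, 0, a.length - 1, 2 * log2);
    }

    private static void quickSort(int[] a, int lo, int hi, int depthLeft) {
        if (lo >= hi) {
            return;
        }
        if (depthLeft == 0) {
            // Recursion became too lopsided: cut over to merge sort (assumed fallback).
            mergeSort(a, lo, hi);
            return;
        }
        int pivot = a[lo + (hi - lo) / 2];
        int i = lo;
        int j = hi;
        while (i <= j) { // Hoare-style partition around the pivot value
            while (a[i] < pivot) i++;
            while (a[j] > pivot) j--;
            if (i <= j) {
                int tmp = a[i];
                a[i] = a[j];
                a[j] = tmp;
                i++;
                j--;
            }
        }
        quickSort(a, lo, j, depthLeft - 1);
        quickSort(a, i, hi, depthLeft - 1);
    }

    private static void mergeSort(int[] a, int lo, int hi) {
        if (lo >= hi) {
            return;
        }
        int mid = lo + (hi - lo) / 2;
        mergeSort(a, lo, mid);
        mergeSort(a, mid + 1, hi);
        int[] left = Arrays.copyOfRange(a, lo, mid + 1); // buffer the left half
        int i = 0;
        int j = mid + 1;
        int k = lo;
        while (i < left.length && j <= hi) {
            a[k++] = (left[i] <= a[j]) ? left[i++] : a[j++];
        }
        while (i < left.length) {
            a[k++] = left[i++];
        }
    }

    public static void main(String[] args) {
        int[] data = {5, 1, 4, 1, 5, 9, 2, 6};
        sort(data);
        System.out.println(Arrays.toString(data));
    }
}
```

Because the budget is twice the ideal depth, moderately unbalanced partitions never trigger the fallback; only pathological inputs do.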

 SorterTemplate.quickSort stack overflows on broken comparators that produce 
 only few disticnt values in large arrays
 

 Key: LUCENE-3054
 URL: https://issues.apache.org/jira/browse/LUCENE-3054
 Project: Lucene - Java
  Issue Type: Task
Affects Versions: 3.1
Reporter: Robert Muir
Assignee: Uwe Schindler
Priority: Critical
 Fix For: 3.1.1, 3.2, 4.0

 Attachments: LUCENE-3054-dynamic.patch, 
 LUCENE-3054-stackoverflow.patch, LUCENE-3054.patch, LUCENE-3054.patch, 
 LUCENE-3054.patch, LUCENE-3054.patch, LUCENE-3054.patch, LUCENE-3054.patch


 Looking at Otis's sort problem on the mailing list, he said:
 {noformat}
 * looked for other places where this call is made - found it in
 MultiPhraseQuery$MultiPhraseWeight and changed that call from
 ArrayUtil.quickSort to ArrayUtil.mergeSort
 * now we no longer see SorterTemplate.quickSort in deep recursion when we do a
 thread dump
 {noformat}
 I thought this was interesting because PostingsAndFreq's comparator
 looks like it needs a tiebreaker.
 I think in our sorts we should add some asserts to try to catch some of these 
 broken comparators.
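
 As a hedged illustration of the tiebreaker idea and of the kind of assert
 mentioned above (the Entry class, its fields, and the helper are hypothetical
 and are not Lucene's PostingsAndFreq):

```java
import java.util.Comparator;

final class TiebreakerSketch {
    // Hypothetical record with a primary key (docFreq) and a secondary key (position).
    static final class Entry {
        final int docFreq;
        final int position;

        Entry(int docFreq, int position) {
            this.docFreq = docFreq;
            this.position = position;
        }
    }

    // Orders by docFreq first; position breaks ties so fewer elements compare as equal.
    static final Comparator<Entry> BY_FREQ_THEN_POSITION = (x, y) -> {
        int c = Integer.compare(x.docFreq, y.docFreq);
        return c != 0 ? c : Integer.compare(x.position, y.position);
    };

    // A check a sort could assert on: compare(x, y) and compare(y, x) must have opposite signs.
    static boolean isAntisymmetric(Comparator<Entry> cmp, Entry x, Entry y) {
        return Integer.signum(cmp.compare(x, y)) == -Integer.signum(cmp.compare(y, x));
    }

    public static void main(String[] args) {
        Entry a = new Entry(5, 1);
        Entry b = new Entry(5, 2);
        System.out.println(BY_FREQ_THEN_POSITION.compare(a, b));
        System.out.println(isAntisymmetric(BY_FREQ_THEN_POSITION, a, b));
    }
}
```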


Re: MergePolicy Thresholds

Looks good Shai!

Comments below too:

On Tue, May 3, 2011 at 5:29 AM, Shai Erera ser...@gmail.com wrote:
 Hi

 I looked into porting it to 3x, and prepared the attached patch. It only
 contains the new TieredMP and Test, as well as the necessary changes to
 LuceneTestCase and IndexWriter. I guess you can start with it (even just the
 MP and IW changes) to test it on your indexes.

 Mike, I saw that there were many more changes, as part of LUCENE-1076, done
 to the code. In particular, this MP is now the default (on trunk), so I
 guess many changes (to tests) were needed because of that. Do you remember,
 if apart from the changes I've included in the patch, other important
 changes w.r.t. this code?

The only other changes I can think of were some verbosity improvements
to IndexWriter, to support the python script that can make a merge
movie from an infoStream output; but that can wait for when I
back-port to 3.x...

 As we won't change the default MP on 3x, I'm guessing I don't need to port
 all the changes to 3x.

Right, I think.

Mike


Re: MergePolicy Thresholds

2011-05-03 Thread Shai Erera

Mike, if you want, I can back-port it, as I've already started this when
preparing the patch.

I noticed that you added a throws IOE to IW.setInfoStream -- is it ok on
3x too? It'll be a backwards-compatibility break.

Maybe we should iterate on the issue? I can reopen.

Shai


Re: MergePolicy Thresholds

That'd be great, thanks :)

Yes, let's iterate on the issue!  But: it should still be open, I hope
(I didn't mean to close it yet, since it's not back ported)...

Mike

http://blog.mikemccandless.com


[jira] [Created] (SOLR-2487) Do not include slf4j-jdk14 jar in WAR

Do not include slf4j-jdk14 jar in WAR
-

 Key: SOLR-2487
 URL: https://issues.apache.org/jira/browse/SOLR-2487
 Project: Solr
  Issue Type: Improvement
  Components: Build
Affects Versions: 3.2, 4.0
Reporter: Jan Høydahl


I know we've intentionally bundled slf4j-jdk14-1.5.5.jar in the war to help 
newbies get up and running. But I find myself re-packaging the war for every 
customer when adapting to their choice of logger framework, which is 
counter-productive.

It would be sufficient to have the jdk-logging binding in example/lib to let 
the example and tutorial still work OOTB but as soon as you deploy solr.war to 
production you're forced to explicitly decide what logging to use.


[jira] [Created] (SOLR-2488) README.TXT mixes Unix and Windows path styles

README.TXT mixes Unix and Windows path styles
-

 Key: SOLR-2488
 URL: https://issues.apache.org/jira/browse/SOLR-2488
 Project: Solr
  Issue Type: Improvement
  Components: documentation
Affects Versions: 3.2, 4.0
Reporter: Jan Høydahl
Priority: Minor


README.TXT mixes Unix- and Windows-style syntaxes without further comments.
Propose to change e.g. %JAVA_HOME%\bin -> $JAVA_HOME/bin to be consistent and 
add a comment about Windows elsewhere


[jira] [Created] (SOLR-2489) Remove old lucene.apache.org/solr/who page

Remove old lucene.apache.org/solr/who page
--

 Key: SOLR-2489
 URL: https://issues.apache.org/jira/browse/SOLR-2489
 Project: Solr
  Issue Type: Bug
Affects Versions: 3.1, 3.2
Reporter: Jan Høydahl
Priority: Minor


In the distribution, docs/who.html is old - refers to the old Solr committers 
list at http://lucene.apache.org/solr/who

Fix would be to simply delete the old page


[JENKINS-MAVEN] Lucene-Solr-Maven-3.x #111: POMs out of sync

2011-05-03 Thread Apache Jenkins Server

Build: https://builds.apache.org/hudson/job/Lucene-Solr-Maven-3.x/111/

No tests ran.

Build Log (for compile errors):
[...truncated 13339 lines...]




[jira] [Updated] (LUCENE-1076) Allow MergePolicy to select non-contiguous merges

2011-05-03 Thread Shai Erera (JIRA)


 [ 
https://issues.apache.org/jira/browse/LUCENE-1076?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Shai Erera updated LUCENE-1076:
---

Attachment: LUCENE-1076-3x.patch

Patch against 3x. This is not ready to commit yet, as many tests fail on 
exceptions like this:

{noformat}
[junit] java.lang.IndexOutOfBoundsException
[junit] at java.util.AbstractList.subList(AbstractList.java:763)
[junit] at java.util.Vector.subList(Vector.java:975)
[junit] at 
org.apache.lucene.index.IndexWriter.commitMerge(IndexWriter.java:3550)
[junit] at 
org.apache.lucene.index.IndexWriter.mergeMiddle(IndexWriter.java:4057)
[junit] at 
org.apache.lucene.index.IndexWriter.merge(IndexWriter.java:3631)
{noformat}

Mike says there was an earlier commit (changing how deletes are flushed) that 
this depends on, and that I can continue only after he back-ports it.

In the meantime, I've fixed tests that assumed LogMP (for setting compound and 
mergeFactor) by adding LTC.setUseCompoundFile and LTC.setMergeFactor as utility 
methods.

Will continue after Mike back-ports the dependencies.

 Allow MergePolicy to select non-contiguous merges
 -

 Key: LUCENE-1076
 URL: https://issues.apache.org/jira/browse/LUCENE-1076
 Project: Lucene - Java
  Issue Type: Improvement
  Components: Index
Affects Versions: 2.3
Reporter: Michael McCandless
Assignee: Michael McCandless
Priority: Minor
 Fix For: 3.2, 4.0

 Attachments: LUCENE-1076-3x.patch, LUCENE-1076.patch, 
 LUCENE-1076.patch, LUCENE-1076.patch


 I started work on this but with LUCENE-1044 I won't make much progress
 on it for a while, so I want to checkpoint my current state/patch.
 For backwards compatibility we must leave the default MergePolicy as
 selecting contiguous merges.  This is necessary because some
 applications rely on temporal monotonicity of doc IDs, which means
 even though merges can re-number documents, the renumbering will
 always reflect the order in which the documents were added to the
 index.
 Still, for those apps that do not rely on this, we should offer a
 MergePolicy that is free to select the best merges regardless of
 whether they are contiguous.  This requires fixing IndexWriter to
 accept such a merge, and, fixing LogMergePolicy to optionally allow
 it the freedom to do so.


[jira] [Updated] (SOLR-2488) README.TXT mixes Unix and Windows path styles


 [ 
https://issues.apache.org/jira/browse/SOLR-2488?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jan Høydahl updated SOLR-2488:
--

Attachment: SOLR-2488.patch

Proposed changes


Re: (LUCENE-3058) FST should allow more than one output for the same input

2011-05-03 Thread Dawid Weiss

I usually do an explicit check for nulls and that's why I allowed myself to
bring the issue up. It's similar to operator priorities -- I just like to
have explicit brackets instead of relying on my degenerating memory... As
for sorting, I don't like to rely on the default hashCode/equals exactly for
the reasons you mentioned and prefer explicit comparators. It's really a
pity there is no full hashcode/equals delegation model in java util
collections, it would be a nice addition.


[jira] [Commented] (LUCENE-3055) LUCENE-2372, LUCENE-2389 made it impossible to subclass core analyzers

2011-05-03 Thread Robert Muir (JIRA)

[
https://issues.apache.org/jira/browse/LUCENE-3055?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13028189#comment-13028189
]

Robert Muir commented on LUCENE-3055:
-

{quote}
Also, if reusableTokenStream is the only method left standing, isn't it wise to
hide actual reuse somewhere in Lucene internals and turn Analyzer into plain
and dumb factory interface?
{quote}

Hi Earwin: I completely agree that somehow Analyzer should be a plain and
dumb interface, but are you suggesting we should move the responsibility of
reuse onto the consumer? I think this could be challenging. Alternatively, there
might be a way to present a plain and dumb API with the reuse guts buried
inside Analyzer itself (like ReusableAnalyzerBase), and reuse enforced (e.g.
the tokenStream() is final and you cannot disable reuse). The trick would be
handling the special cases such as AnalyzerWrappers, but I feel like we could
still do this.
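
One possible shape for such an API, as a sketch only. Every type here (Tokens, ReusingAnalyzer, WhitespaceTokens, WhitespaceAnalyzer) is hypothetical and does not correspond to Lucene's Analyzer, TokenStream, or ReusableAnalyzerBase. The entry point is final and always reuses a per-thread instance, so subclasses only describe how to build the components.

```java
import java.util.ArrayList;
import java.util.List;

final class ReuseSketch {
    // Hypothetical reusable token source; not Lucene's TokenStream.
    interface Tokens {
        void reset(String text);
        List<String> all();
    }

    // Hypothetical base class: reuse lives here and cannot be switched off.
    abstract static class ReusingAnalyzer {
        private final ThreadLocal<Tokens> cached = new ThreadLocal<>();

        // The only thing a subclass provides: how to build the components once.
        protected abstract Tokens createComponents();

        // Final, so every caller gets the per-thread instance, reset for the new text.
        public final Tokens tokenStream(String text) {
            Tokens tokens = cached.get();
            if (tokens == null) {
                tokens = createComponents();
                cached.set(tokens);
            }
            tokens.reset(text);
            return tokens;
        }
    }

    static final class WhitespaceTokens implements Tokens {
        private String text = "";

        @Override
        public void reset(String text) {
            this.text = text;
        }

        @Override
        public List<String> all() {
            List<String> out = new ArrayList<>();
            for (String s : text.trim().split("\\s+")) {
                if (!s.isEmpty()) {
                    out.add(s);
                }
            }
            return out;
        }
    }

    static final class WhitespaceAnalyzer extends ReusingAnalyzer {
        @Override
        protected Tokens createComponents() {
            return new WhitespaceTokens();
        }
    }

    public static void main(String[] args) {
        WhitespaceAnalyzer analyzer = new WhitespaceAnalyzer();
        System.out.println(analyzer.tokenStream("plain and dumb").all());
    }
}
```

Wrapper-style analyzers are not covered by this sketch; they are exactly the special case the comment above flags as the tricky part.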

Either way, I really think we should try to do this for 4.0. Though I think to
get there it would be safest if we addressed a few issues first:

* LUCENE-2788: make charfilters reusable, otherwise we will make the same
mistake again!
* LUCENE-3064: ensure consumers are properly using the API e.g. calling reset()
* LUCENE-3040: cut all consumers over to reusable API, so it's really the one
left standing

LUCENE-2372, LUCENE-2389 made it impossible to subclass core analyzers
--

Key: LUCENE-3055
URL: https://issues.apache.org/jira/browse/LUCENE-3055
Project: Lucene - Java
Issue Type: Bug
Components: Analysis
Affects Versions: 3.1
Reporter: Ian Soboroff

LUCENE-2372 and LUCENE-2389 marked all analyzers as final. This makes
ReusableAnalyzerBase useless, and makes it impossible to subclass e.g.
StandardAnalyzer to make a small modification e.g. to tokenStream(). These
issues don't indicate a new method of doing this. The issues don't give a
reason except for design considerations, which seems a poor reason to make a
backward-incompatible change


[jira] [Resolved] (SOLR-2488) README.TXT mixes Unix and Windows path styles


 [ 
https://issues.apache.org/jira/browse/SOLR-2488?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Yonik Seeley resolved SOLR-2488.


Resolution: Fixed

Committed.  Thanks Jan!


[jira] [Updated] (LUCENE-3064) add checks to MockTokenizer to enforce proper consumption

2011-05-03 Thread Robert Muir (JIRA)


 [ 
https://issues.apache.org/jira/browse/LUCENE-3064?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Robert Muir updated LUCENE-3064:


Attachment: LUCENE-3064.patch

updated patch with fixes for contrib, though highlighter still remains, and 
some TODOs are not resolved.

 add checks to MockTokenizer to enforce proper consumption
 -

 Key: LUCENE-3064
 URL: https://issues.apache.org/jira/browse/LUCENE-3064
 Project: Lucene - Java
  Issue Type: Test
Reporter: Robert Muir
 Fix For: 4.0

 Attachments: LUCENE-3064.patch, LUCENE-3064.patch


 We can enforce things like the consumer properly iterating through the
 tokenstream lifecycle via MockTokenizer. This could catch bugs in consumers
 that don't call reset(), etc.


[jira] [Resolved] (LUCENE-3054) SorterTemplate.quickSort stack overflows on broken comparators that produce only few disticnt values in large arrays


 [ 
https://issues.apache.org/jira/browse/LUCENE-3054?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Uwe Schindler resolved LUCENE-3054.
---

Resolution: Fixed

Committed trunk revision: 1099041
Merged 3.x revision: 1099045
Merged 3.1 revision: 1099046


[jira] [Commented] (SOLR-2191) Change SolrException cstrs that take Throwable to default to alreadyLogged=false

2011-05-03 Thread David Smiley (JIRA)

[
https://issues.apache.org/jira/browse/SOLR-2191?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13028240#comment-13028240
]

David Smiley commented on SOLR-2191:

Ehh... I kind of like the notion, but whether it is kept or not, I think a
general error/exception strategy needs to be devised.

In code I write, I tend to almost never log exceptions; I let them get to the
highest possible point to ensure they are logged there once, which is usually
one place. Beforehand I might catch an exception to do a log.error() to
provide some context and then rethrow the exception. I also wrap with
RuntimeExceptions.

An alternative is to log exceptions early (with contextual error message), and
then rethrow but don't log it higher up (e.g. earlier up) the stack. But how
can that early point know the exception has been handled? It can't generically
know, making your suggestion of "fix those code paths to be less chatty"
problematic. Perhaps our code will always assume that we logged an exception
before wrapping it in SolrException right before we throw them. I think that's
a reasonable policy and wouldn't require an alreadyLogged flag.
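
A small sketch of the "log once at the top" approach from the second paragraph, under stated assumptions. The class, the method names, and the in-memory LOG list (a stand-in for a real logger) are all hypothetical and not Solr code. In this variant the lower layer adds context by wrapping the exception instead of calling log.error() itself.

```java
import java.util.ArrayList;
import java.util.List;

final class LogOnceSketch {
    // Stand-in for a real logger so the sketch stays self-contained.
    static final List<String> LOG = new ArrayList<>();

    // Lower layer: adds context by wrapping, but does not log.
    static int parsePort(String raw) {
        try {
            return Integer.parseInt(raw);
        } catch (NumberFormatException e) {
            throw new RuntimeException("invalid port setting: '" + raw + "'", e);
        }
    }

    // Highest point: the single place where the exception gets logged.
    static int handleRequest(String rawPort) {
        try {
            return parsePort(rawPort);
        } catch (RuntimeException e) {
            LOG.add("ERROR " + e.getMessage() + " (cause: " + e.getCause() + ")");
            return -1;
        }
    }

    public static void main(String[] args) {
        System.out.println(handleRequest("8983"));
        System.out.println(handleRequest("not-a-port"));
        System.out.println(LOG.size());
    }
}
```

Since only the top level logs, no flag is needed to record whether an exception was already logged.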

Change SolrException cstrs that take Throwable to default to
alreadyLogged=false

Key: SOLR-2191
URL: https://issues.apache.org/jira/browse/SOLR-2191
Project: Solr
Issue Type: Bug
Reporter: Mark Miller
Fix For: Next

Attachments: SOLR-2191.patch

Because of misuse, many exceptions are now not logged at all - can be painful
when doing dev. I think we should flip this setting and work at removing any
double logging - losing logging is worse (and it almost looks like we lose
more logging than we would get in double logging) - and bad
solrexception/logging patterns are proliferating.


[jira] [Commented] (SOLR-2487) Do not include slf4j-jdk14 jar in WAR

2011-05-03 Thread David Smiley (JIRA)


[ 
https://issues.apache.org/jira/browse/SOLR-2487?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13028243#comment-13028243
 ] 

David Smiley commented on SOLR-2487:


I like it Jan!  JDK14 logging sucks, anyway.

 Do not include slf4j-jdk14 jar in WAR
 -

 Key: SOLR-2487
 URL: https://issues.apache.org/jira/browse/SOLR-2487
 Project: Solr
  Issue Type: Improvement
  Components: Build
Affects Versions: 3.2, 4.0
Reporter: Jan Høydahl
  Labels: logging, slf4j

 I know we've intentionally bundled slf4j-jdk14-1.5.5.jar in the war to help 
 newbies get up and running. But I find myself re-packaging the war for every 
 customer when adapting to their choice of logger framework, which is 
 counter-productive.
 It would be sufficient to have the jdk-logging binding in example/lib to let 
 the example and tutorial still work OOTB but as soon as you deploy solr.war 
 to production you're forced to explicitly decide what logging to use.


[jira] [Commented] (SOLR-2191) Change SolrException cstrs that take Throwable to default to alreadyLogged=false

[
https://issues.apache.org/jira/browse/SOLR-2191?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13028246#comment-13028246
]

Yonik Seeley commented on SOLR-2191:

bq. In code I write, I tend to almost never log exceptions; I let them get to
the highest possible point to ensure they are logged there once, which is
usually one place. Beforehand I might catch an exception to do a log.error() to
provide some context and then rethrow the exception.

Right. And logging immediately can be problematic since one may not know if
it's really an error that should be logged since Exceptions can sometimes be
handled (dismax is one example).

Anyway, certainly a +1 from me for changing the default of alreadyLogged and
improving the strategy in general.


[jira] [Commented] (SOLR-236) Field collapsing

2011-05-03 Thread Stephen Weiss (JIRA)


[ 
https://issues.apache.org/jira/browse/SOLR-236?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13028252#comment-13028252
 ] 

Stephen Weiss commented on SOLR-236:


Yes, I've had this too:

https://issues.apache.org/jira/browse/SOLR-236?focusedCommentId=12655750&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-12655750

I'm pretty sure I know the reason for it, but I don't know how to fix it... to 
the best of my knowledge no one on the ticket really said if the problem could 
be fixed or not yet either.  At the moment we just use facet.before and explain 
to our users that the facets are for unfiltered results...  almost no one 
complains once we explain it to them.  However, a fix would be *wonderful*... 
people ask about it often enough that clearly it's not very intuitive.

 Field collapsing
 

 Key: SOLR-236
 URL: https://issues.apache.org/jira/browse/SOLR-236
 Project: Solr
  Issue Type: New Feature
  Components: search
Affects Versions: 1.3
Reporter: Emmanuel Keller
Assignee: Shalin Shekhar Mangar
 Fix For: Next

 Attachments: DocSetScoreCollector.java, 
 NonAdjacentDocumentCollapser.java, NonAdjacentDocumentCollapserTest.java, 
 SOLR-236-1_4_1-NPEfix.patch, SOLR-236-1_4_1-paging-totals-working.patch, 
 SOLR-236-1_4_1.patch, SOLR-236-FieldCollapsing.patch, 
 SOLR-236-FieldCollapsing.patch, SOLR-236-FieldCollapsing.patch, 
 SOLR-236-branch_3x.patch, SOLR-236-distinctFacet.patch, SOLR-236-trunk.patch, 
 SOLR-236-trunk.patch, SOLR-236-trunk.patch, SOLR-236-trunk.patch, 
 SOLR-236-trunk.patch, SOLR-236.patch, SOLR-236.patch, SOLR-236.patch, 
 SOLR-236.patch, SOLR-236.patch, SOLR-236.patch, SOLR-236.patch, 
 SOLR-236.patch, SOLR-236_collapsing.patch, SOLR-236_collapsing.patch, 
 collapsing-patch-to-1.3.0-dieter.patch, collapsing-patch-to-1.3.0-ivan.patch, 
 collapsing-patch-to-1.3.0-ivan_2.patch, 
 collapsing-patch-to-1.3.0-ivan_3.patch, field-collapse-3.patch, 
 field-collapse-4-with-solrj.patch, field-collapse-5.patch, 
 field-collapse-5.patch, field-collapse-5.patch, field-collapse-5.patch, 
 field-collapse-5.patch, field-collapse-5.patch, field-collapse-5.patch, 
 field-collapse-5.patch, field-collapse-5.patch, field-collapse-5.patch, 
 field-collapse-5.patch, field-collapse-5.patch, field-collapse-5.patch, 
 field-collapse-5.patch, field-collapse-5.patch, 
 field-collapse-solr-236-2.patch, field-collapse-solr-236.patch, 
 field-collapsing-extended-592129.patch, field_collapsing_1.1.0.patch, 
 field_collapsing_1.3.patch, field_collapsing_dsteigerwald.diff, 
 field_collapsing_dsteigerwald.diff, field_collapsing_dsteigerwald.diff, 
 quasidistributed.additional.patch, solr-236.patch


 This patch includes a new feature called Field collapsing.
 It is used to collapse a group of results with a similar value for a given 
 field into a single entry in the result set. Site collapsing is a special case 
 of this, where all results for a given web site are collapsed into one or two 
 entries in the result set, typically with an associated "more documents from 
 this site" link. See also Duplicate detection.
 http://www.fastsearch.com/glossary.aspx?m=48&amid=299
 The implementation adds 3 new query parameters (SolrParams):
 - collapse.field to choose the field used to group results
 - collapse.type normal (default value) or adjacent
 - collapse.max to select how many continuous results are allowed before 
   collapsing
 TODO (in progress):
 - More documentation (on source code)
 - Test cases
 Two patches:
 - field_collapsing.patch for current development version
 - field_collapsing_1.1.0.patch for Solr-1.1.0
 P.S.: Feedback and misspelling corrections are welcome ;-)


[jira] [Commented] (SOLR-236) Field collapsing

2011-05-03 Thread Yuriy Akopov (JIRA)


[ 
https://issues.apache.org/jira/browse/SOLR-236?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13028256#comment-13028256
 ] 

Yuriy Akopov commented on SOLR-236:
---

Thanks, Stephen. So it isn't just me doing something else wrong.

I'm thinking of displaying not the actual figures against the facet items but 
something like 100+, 200+, 300+ etc. That should be okay, as the difference is 
not dramatic and seems to remain within a relatively narrow interval.
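
The rounding idea above could be sketched roughly as follows. The class and
method names are made up for illustration, and showing counts below one step
as "<100" is an assumption of this sketch, not something stated in the
comment.

```java
final class FacetCountLabel {
    /**
     * Rounds a facet count down to a multiple of step and appends "+",
     * e.g. 137 becomes "100+" for a step of 100.
     */
    static String approximate(long count, long step) {
        if (step <= 0) {
            throw new IllegalArgumentException("step must be positive");
        }
        if (count < step) {
            return "<" + step;                 // assumption: hide small exact counts too
        }
        return (count / step) * step + "+";    // integer division drops the remainder
    }
}
```

For example, approximate(299, 100) gives "200+" and approximate(300, 100)
gives "300+", so a count that is off by a few documents usually lands in the
same label.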


Re: MergePolicy Thresholds

Thanks Shai!

I'm way behind on my 3.x backports -- I'll try to do this soon.

Mike

http://blog.mikemccandless.com

On Tue, May 3, 2011 at 8:10 AM, Shai Erera ser...@gmail.com wrote:
 I uploaded a patch to LUCENE-1076.

 Tom, apparently the patch I've attached before cannot be used, because there
 are dependencies (in earlier commits on LUCENE-1076) that need to be
 back-ported as well. So stay tuned on LUCENE-1076 for when it is safe to use
 this new MP.

 Shai

 On Tue, May 3, 2011 at 1:00 PM, Michael McCandless
 luc...@mikemccandless.com wrote:

 That'd be great, thanks :)

 Yes, let's iterate on the issue!  But: it should still be open, I hope
 (I didn't mean to close it yet, since it's not back ported)...

 Mike

 http://blog.mikemccandless.com

 On Tue, May 3, 2011 at 5:51 AM, Shai Erera ser...@gmail.com wrote:
  Mike, if you want, I can back-port it, as I've already started this when
  preparing the patch.
 
  I noticed that you added a throws IOE to IW.setInfoStream -- is it ok on
  3x too? It'll be a backwards-incompatible change.
 
  Maybe we should iterate on the issue? I can reopen.
 
  Shai
 
  On Tue, May 3, 2011 at 12:36 PM, Michael McCandless
  luc...@mikemccandless.com wrote:
 
  Looks good Shai!
 
  Comments below too:
 
  On Tue, May 3, 2011 at 5:29 AM, Shai Erera ser...@gmail.com wrote:
   Hi
  
   I looked into porting it to 3x, and prepared the attached patch. It only
   contains the new TieredMP and Test, as well as the necessary changes to
   LuceneTestCase and IndexWriter. I guess you can start with it (even just
   the MP and IW changes) to test it on your indexes.
  
   Mike, I saw that there were many more changes, as part of LUCENE-1076,
   done to the code. In particular, this MP is now the default (on trunk), so
   I guess many changes (to tests) were needed because of that. Do you
   remember if, apart from the changes I've included in the patch, there were
   other important changes w.r.t. this code?
 
  The only other changes I can think of were some verbosity improvements
  to IndexWriter, to support the python script that can make a merge
  movie from an infoStream output; but that can wait for when I
  back-port to 3.x...
 
   As we won't change the default MP on 3x, I'm guessing I don't need to
   port all the changes to 3x.
 
  Right, I think.
 
  Mike
 

[jira] [Resolved] (LUCENE-3028) IW.getReader() returns inconsistent reader on RT Branch

2011-05-03 Thread Simon Willnauer (JIRA)


 [ 
https://issues.apache.org/jira/browse/LUCENE-3028?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Simon Willnauer resolved LUCENE-3028.
-

Resolution: Fixed

fixed in RT

 IW.getReader() returns inconsistent reader on RT Branch
 ---

 Key: LUCENE-3028
 URL: https://issues.apache.org/jira/browse/LUCENE-3028
 Project: Lucene - Java
  Issue Type: Bug
  Components: Index
Affects Versions: Realtime Branch
Reporter: Simon Willnauer
Assignee: Simon Willnauer
 Fix For: Realtime Branch

 Attachments: LUCENE-3028.patch, LUCENE-3028.patch, realtime-1.txt


 I extended the testcase TestRollingUpdates#testUpdateSameDoc to pull an NRT 
 reader after each update and asserted that it always sees only one document. 
 Yet, this fails with the current branch since there is a problem in how we 
 flush in the getReader() case. What happens here is that we flush all threads 
 and then release the lock (letting other flushes, which came in after we 
 entered the flushAllThread context, continue) so that we could concurrently 
 get a new segment that transports global deletes without the corresponding 
 add. They sneak in while we continue to open the NRT reader, which in turn 
 sees inconsistent results.
 I will upload a patch soon


[jira] [Commented] (SOLR-2487) Do not include slf4j-jdk14 jar in WAR

2011-05-03 Thread Hoss Man (JIRA)


[ 
https://issues.apache.org/jira/browse/SOLR-2487?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13028283#comment-13028283
 ] 

Hoss Man commented on SOLR-2487:


bq. It would be sufficient to have the jdk-logging binding in example/lib to 
let the example and tutorial still work OOTB but as soon as you deploy solr.war 
to production you're forced to explicitly decide what logging to use.

Personally, that sounds like a terrible idea to me.

Novice users would try the demo, see that it works, then try deploying to some 
other servlet container and suddenly get errors unless the servlet container 
had already explicitly loaded some slf4j binding jar?

we already have plenty of users who get confused about how (and even *why*) 
they configure the solr home dir when deploying solr to a servlet container -- 
this would make it even harder for beginners.

simple things should be simple -- novice users should be able to copy a jar, 
and copy configs, and be good to go.

for a user who cares about jdk14 logging vs log4j vs whatever, the task of 
customizing the war is simple and straightforward to understand -- but for a 
solr user who doesn't know anything about java, picking an slf4j binding and 
configuring their servlet container to load could easily appear like a daunting 
burden that will make them turn away from even using solr past the tutorial 
stage.

this really seems like a no brainer to me


Re: modularization discussion

Isn't our end goal here a bunch of well factored search modules?  Ie,
fast forward a year or two and I think we should have modules like
these:

  * Faceting

  * Highlighting

  * Suggest (good patch is on LUCENE-2995)

  * Schema

  * Query impls

  * Query parsers

  * Analyzers (good progress here already, thanks Robert!),
incl. factories/XML configuration (still need this)

  * Database import (DIH)

  * Web app

  * Distribution/replication

  * Doc set representations

  * Collapse/grouping

  * Caches

  * Similarity/scoring impls (BM25, etc.)

  * Codecs

  * Joins

  * Lucene core

In this future, much of this code came from what is now Solr and
Lucene, but we should freely and aggressively poach from other
projects when appropriate (and license/provenance is OK).

I keep seeing all these cool compressed int set projects popping
up... surely these are useful for us.  Solr poached a doc set impl
from Nutch; probably there's other stuff to poach from Nutch, Mahout,
etc.

Katta's doing something sweet with distribution/replication; let's
poach & merge w/ Solr's approach.  There are various facet impls out
there (Bobo browse/Zoie; Toke's; Elastic Search); let's poach & merge
with Solr's.

Elastic Search has lots of cool stuff, too, under ASL2.

All these external open-source projects are fair game for poaching and
refactoring into shared modules, along with what is now Solr and
Lucene sources.

In this ideal future, Solr becomes the bundling and default/example
configuration of the Web App and other modules, much like how the
various Linux distros bundle different stuff together around the Linux
kernel.  And if you are an advanced app and don't need the webapp
part, you can cherry pick the super duper modules you do need and
directly embed them into your app.

Isn't this the future we are working towards?

Mike

http://blog.mikemccandless.com


Re: modularization discussion

On the namespace, since Yonik seems concerned about it, and others
aren't (I think?), why don't we leave everything factored out of Solr
under the org.apache.solr namespace?

Anyone object to that approach?

My only concern is that this sends the message that the module depends
on Solr, but this turns into a non-issue once Solr is well
factored into modules, because by the time we arrive at that future,
depending on Solr just means depending on Solr modules, which
resolves my concern!

Mike

http://blog.mikemccandless.com

On Mon, May 2, 2011 at 6:11 PM, Grant Ingersoll gsing...@apache.org wrote:

 On Apr 27, 2011, at 11:45 PM, Greg Stein wrote:

 On Wed, Apr 27, 2011 at 09:25:14AM -0400, Yonik Seeley wrote:
 ...
 But as I said... it seems only fair to meet half way and use the solr 
 namespace
 for some modules and the lucene namespace for others.

 Please explain this part to me... I really don't understand.

 At the risk of speaking for someone else, I think it has to do w/ wanting to 
 maintain brand awareness for Solr.  We, as the PMC, currently produce two 
 products:  Apache Lucene and Apache Solr.  I believe Yonik's concern is that 
 if everything is just labeled Lucene, then Solr is just seen as a very thin 
 shell around Lucene (which, IMO, would still not be the case, since wiring 
 together a server app like Solr is non-trivial, but that is my opinion and 
 I'm not sure if Yonik shares it).  Solr has never been a thin shell around 
 Lucene and never will be.   However, in some ways, this gets at why I believe 
 Yonik was interested in a Solr TLP: so that Solr could stand on its own as a 
 brand and as a first class Apache product steered by a PMC that is aligned 
 solely w/ producing the Solr (i.e. as a TLP) product as opposed to the two 
 products we produce now.  (Note, my vote on such a TLP was -1, so please 
 don't confuse me as arguing for the point, I'm just trying to, hopefully, 
 explain it)

 That being said, 99% of consumers of Solr never even know what is in the 
 underlying namespace b/c they only ever interact w/ Solr via HTTP (which has 
 solr in the namespace by default) at the server API level, so at least in my 
 mind, I don't care what the namespace used underneath is.  Call it lusolr for 
 all I care.


 What does fairness have to do with the codebase?

 I can't speak to this, but perhaps it's just the wrong choice of words and 
 would have been better said: please don't take this as a reason to gut Solr 
 and call everything Lucene.

 Isn't the whole
 point of the Lucene project to create the best code possible, for the
 benefit of our worldwide users?

 It is.  We do that primarily through the release of two products: Lucene and 
 Solr.  Lucene is a Java class library.  A good deal of programming is 
 required to create anything meaningful in terms of a production ready search 
 server.  Solr is a server that takes and makes most things that are 
 programming tasks in Lucene configuration tasks as well as adds a fair bit of 
 functionality (distributed search, replication, faceting, auto-suggest, etc.) 
 and is thus that much easier to put in production (I've seen people be in 
 production on Solr in a matter of days/weeks, I've never seen that in Lucene) 
  The crux of this debate is whether these additional pieces are better served 
 as modules (I think they are) or tightly coupled inside of Solr (which does 
 have a few benefits from a dev. point of view, even though I firmly believe 
 they are outweighed by the positives of modularization.)    And, while I 
 think most of us agree that modularization makes sense, that doesn't mean 
 there aren't reasons against it.  I also believe we need to take it on a case 
 by case basis.  I also don't think every patch has to be in its final place 
 on first commit.  As Otis so often says, it's just software.  If it doesn't 
 work, change it.  Thus, if people contribute and it lands in Solr, the 
 committer who commits it need not immediately move it (although, hopefully 
 they will) or ask the contributor to do so, as that will likely dampen 
 contributions.  Likewise for Lucene.  Along with that, if and when others 
 wish to refactor, then they should by all means be allowed to do so assuming 
 of course, all tests across both products still pass.

 In short, I believe people should still contribute where they see they can 
 add the most value and according to their time schedules.  Additionally, 
 others who have more time or the ability to refactor for reusability should 
 be free to do so as well.

 I don't know what the outcome of this thread should be, so I guess we need to 
 just move forward and keep coding away and working to make things better.  Do 
 others see anything broader here?  A vote?  That would be symbolic, I guess, 
 but doesn't force anyone to do anything since there isn't a specific issue at 
 hand other than a broad concept that is seen as good.

 -Grant

[jira] [Commented] (SOLR-2487) Do not include slf4j-jdk14 jar in WAR


[ 
https://issues.apache.org/jira/browse/SOLR-2487?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13028316#comment-13028316
 ] 

Uwe Schindler commented on SOLR-2487:
-

+1 +1 +1 +1 ...


Re: modularization discussion

2011-05-03 Thread Mark Miller


On May 3, 2011, at 12:49 PM, Michael McCandless wrote:

 Isn't this the future we are working towards?

No, not really. Others perhaps, but not me. I'm on board with some modules. I 
do think there are tradeoffs when considering them and considering Lucene and 
Solr. I'm happy to take everything one issue at a time.

When I voted to merge, no, I certainly was not thinking, "I hope in a year or 
two we have taken everything from Solr and made it a module." I did it for a few 
specific things to start - analyzers for sure, perhaps some other things as 
people did something that made sense. I did it so we could share some code more 
easily - not all code.

Others did it for their own reasons I assume.

But no - I'm not sure I have ever fully subscribed to what you are saying.

- Mark Miller
lucidimagination.com

Lucene/Solr User Conference
May 25-26, San Francisco
www.lucenerevolution.org







[jira] [Commented] (LUCENE-3023) Land DWPT on trunk


[ 
https://issues.apache.org/jira/browse/LUCENE-3023?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13028327#comment-13028327
 ] 

Michael McCandless commented on LUCENE-3023:


This cutover to concurrent flushing (DWPT) produces astounding
increases in indexing throughput:

  http://people.apache.org/~mikemccand/lucenebench/indexing.html

186 GB plain text per hour (from 101 GB/hour the day before)!!!

It's not every day you see an 84% jump in indexing throughput!  Wow.
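
For reference, the 84% figure follows from the two throughput numbers quoted
above. A small sketch of the arithmetic (the class and method names are
illustrative only):

```java
final class ThroughputGain {
    /** Relative increase from before to after, expressed in percent. */
    static double percentIncrease(double before, double after) {
        return (after - before) / before * 100.0;
    }

    public static void main(String[] args) {
        // Figures quoted in the comment: 101 GB/hour before, 186 GB/hour after.
        System.out.println(percentIncrease(101, 186));
    }
}
```

With 101 and 186 as inputs this comes out at roughly 84.2, i.e. (186 - 101) /
101, which matches the jump reported in the comment.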

This is on a machine that has substantial CPU+IO concurrency, ie, it
was bottlenecked by our non-concurrent flush.

Also, I can now tune up the IW settings I use in those nightly
benchmarks; it's now 6 threads and only 512 MB RAM buffer.  I'll
wait a few days and then do that.

Looks like a few queries got a bit slower... I suspect this is because
the index segment count has changed.  Before concurrent flushing it
was this:
{noformat}
_36(4.0):C4977400 _69(4.0):C4977400 _9c(4.0):C4977400 _cf(4.0):C4977400 
_fi(4.0):C4977400 _fq(4.0):C497740 _g1(4.0):C497740 _gc(4.0):C497740 
_gn(4.0):C497740 _gy(4.0):C497740 _gx(4.0):C49774 _gz(4.0):C49774 
_h0(4.0):C49774 _h1(4.0):C49774 _h2(4.0):C49774 _h3(4.0):C468
{noformat}

After concurrent flushing:
{noformat}
_3d(4.0):C4977400 _6h(4.0):C4977400 _9j(4.0):C4977400 _cn(4.0):C4977400 
_fq(4.0):C4977400 _fu(4.0):C497740 _g6(4.0):C497740 _gh(4.0):C497740 
_gs(4.0):C497740 _h2(4.0):C497740 _gy(4.0):C49774 _gz(4.0):C49774 
_h0(4.0):C49774 _h5(4.0):C4105 _1(4.0):C2627 _h4(4.0):C16331 _h3(4.0):C28728 
_h1(4.0):C48225
{noformat}

So we have 2 extra segments... it's interesting how this affects some
queries but not others.
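
To double-check the segment counts, the dumps above can be summarized with a
few lines of code. This is only a sketch: it assumes each entry has the form
name(version):Cdocs as in the {noformat} blocks, and the class name is made
up; it is not a Lucene API.

```java
final class SegmentDump {
    /** Returns {segmentCount, totalDocs} for a dump of "name(version):Cdocs" entries. */
    static long[] summarize(String dump) {
        long segments = 0;
        long docs = 0;
        for (String entry : dump.trim().split("\\s+")) {
            int marker = entry.lastIndexOf(":C");
            if (marker < 0) {
                continue;                      // skip anything that is not a segment entry
            }
            docs += Long.parseLong(entry.substring(marker + 2));
            segments++;
        }
        return new long[] {segments, docs};
    }
}
```

Applied to the two dumps, this gives 16 segments before and 18 after, with
the same total of 27,625,038 docs in both, so the difference is only in how
the documents are spread across segments.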


 Land DWPT on trunk
 --

 Key: LUCENE-3023
 URL: https://issues.apache.org/jira/browse/LUCENE-3023
 Project: Lucene - Java
  Issue Type: Task
Affects Versions: CSF branch, 4.0
Reporter: Simon Willnauer
Assignee: Simon Willnauer
 Fix For: 4.0

 Attachments: LUCENE-3023-quicksort-reincarnation.patch, 
 LUCENE-3023-svn-diff.patch, LUCENE-3023-ws-changes.patch, LUCENE-3023.patch, 
 LUCENE-3023.patch, LUCENE-3023.patch, LUCENE-3023.patch, 
 LUCENE-3023_CHANGES.patch, LUCENE-3023_CHANGES.patch, 
 LUCENE-3023_iw_iwc_jdoc.patch, LUCENE-3023_simonw_review.patch, 
 LUCENE-3023_svndiff.patch, LUCENE-3023_svndiff.patch, diffMccand.py, 
 diffSources.patch, diffSources.patch, realtime-TestAddIndexes-3.txt, 
 realtime-TestAddIndexes-5.txt, 
 realtime-TestIndexWriterExceptions-assert-6.txt, 
 realtime-TestIndexWriterExceptions-npe-1.txt, 
 realtime-TestIndexWriterExceptions-npe-2.txt, 
 realtime-TestIndexWriterExceptions-npe-4.txt, 
 realtime-TestOmitTf-corrupt-0.txt


 With LUCENE-2956 we have resolved the last remaining issue for LUCENE-2324 so 
 we can proceed landing the DWPT development on trunk soon. I think one of the 
 bigger issues here is to make sure that all JavaDocs for IW etc. are still 
 correct though. I will start going through that first.


Re: modularization discussion

2011-05-03 Thread Ryan McKinley

On Tue, May 3, 2011 at 1:11 PM, Mark Miller markrmil...@gmail.com wrote:

 On May 3, 2011, at 12:49 PM, Michael McCandless wrote:

 Isn't this the future we are working towards?

 No, not really. Others perhaps, but not me. I'm on board with some modules. I 
 do think there are tradeoffs when considering them and considering Lucene and 
 Solr. I'm happy to take everything one issue at a time.


I hope the outcome of this discussion is a shared sense of the
relationship between lucene, solr, and modules -- we need some general
guidelines so that every time this comes up we don't have to have the
same discussion over and over.

Mike, I agree with the general vision -- the details on how it would
actually work suggest that we may have to fast forward more than a
year or two for most of these things -- but who knows.

ryan


[jira] [Created] (LUCENE-3065) NumericField should be stored in binary format in index (matching Solr's format)

NumericField should be stored in binary format in index (matching Solr's format)


 Key: LUCENE-3065
 URL: https://issues.apache.org/jira/browse/LUCENE-3065
 Project: Lucene - Java
  Issue Type: Bug
  Components: Index
Reporter: Michael McCandless
Priority: Minor
 Fix For: 3.2, 4.0


(Spinoff of LUCENE-3001)

Today when writing stored fields we don't record that the field was a 
NumericField, and so at IndexReader time you get back an ordinary Field and 
your number has turned into a string.  See 
https://issues.apache.org/jira/browse/LUCENE-1701?focusedCommentId=12721972&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-12721972

We have spare bits already in stored fields, so, we should use one to record 
that the field is numeric, and then encode the numeric field in Solr's 
more-compact binary format.

A nice side-effect is we fix the long standing issue that you don't get a 
NumericField back when loading your document.
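
As a rough illustration of the idea (not the actual Lucene stored-fields
format and not Solr's binary encoding), a spare bit in a per-field flags byte
can record that the value is numeric, so the reader can hand back a number
instead of a string. The flag value and class name below are hypothetical.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;

final class StoredFieldSketch {
    // Hypothetical bit assignment; the real format may use a different bit.
    static final int FLAG_NUMERIC_LONG = 1 << 3;

    /** Writes one flags byte, then the value as 8 binary bytes or as a string. */
    static byte[] encode(Object value) {
        try {
            ByteArrayOutputStream bytes = new ByteArrayOutputStream();
            DataOutputStream out = new DataOutputStream(bytes);
            if (value instanceof Long) {
                out.writeByte(FLAG_NUMERIC_LONG);
                out.writeLong((Long) value);
            } else {
                out.writeByte(0);
                out.writeUTF(value.toString());
            }
            out.flush();
            return bytes.toByteArray();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    /** Reads the flags byte first and uses it to pick the decoding. */
    static Object decode(byte[] stored) {
        try {
            DataInputStream in = new DataInputStream(new ByteArrayInputStream(stored));
            int flags = in.readByte();
            if ((flags & FLAG_NUMERIC_LONG) != 0) {
                return in.readLong();          // comes back as a Long, not a String
            }
            return in.readUTF();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

In this sketch decode(encode(42L)) returns a Long while decode(encode("42"))
returns a String, which is the distinction the issue says is lost today.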


Re: modularization discussion

2011-05-03 Thread Mark Miller


On May 3, 2011, at 1:29 PM, Shai Erera wrote:

 I don't like that approach. Two years from now, if indeed your vision becomes 
 the reality (obviously, not everyone thinks like you), what would o.a.solr 
 mean? Who will remember that 'suggest' (just picking an example) came from 
 Solr? Who'd care?
 
 Why, when I will integrate several modules together, will I need to see 
 o.a.lucene on some, and o.a.solr on others, when both come from the same 
 distro (even same tar.gz file, e.g. modules)?
 
 What makes sense, at least to me, is that either we call everything 
 o.a.lucene and solr becomes o.a.lucene.solr (I know I've probably pissed off 
 some people with that, sorry), or we come up w/ a new namespace (proposed by 
 Grant I think) o.a.lusolr. If we go with the second, then we'll have 3 
 namespaces:
 * o.a.lucene for core Lucene stuff (e.g. Lucene core, benchmark?)
 * o.a.solr for pure/core Solr stuff
 * o.a.lusolr for shared modules.

Honestly, I could go for any of those. I can't bring myself to get caught up 
caring long term what the package names are. You can't even make rules about 
that - they won't and shouldn't stand over time.

 
 Picking a good package name is important. And deciding to call everything 
 that came from Solr o.a.solr, just to not offend someone, is not the right 
 way to do things, at least IMO.

Yeah, it's just not a sustainable idea for an open source project anyway.

 
 Mike, I do share with you the vision you outline, and I believe many of us 
 do. It will become a reality if we factor out modules from Solr and Lucene 
 under /modules. It can also become a reality if someone simply contributes 
 under /modules alternative packages for e.g. faceting, suggest, spellcheck 
 etc. If those are good packages, I doubt Solr would be reluctant to adopt 
 them.

 
 Either way, it's the community that will dictate the future of itself, and 
 not individuals. Perhaps we should stop discussing what can possibly happen, 
 and start doing things. Actions get more results than endless threads. This 
 has been stated on this thread numerous times -- if a contribution is good, 
 well coded, designed, thought of, it will go in. Whether it's a refactoring 
 of something, or completely new code. I doubt there are people in this 
 community that can stand in the way of it.

This is really the crux of it. IMO, people should be much less concerned with 
how they perceive others, and more concerned with just doing things. The Apache 
rules are set up to deal with this type of thing. Those rules can get tricky, 
and nobody likes to fall back on them - but when you have strong disagreement, 
that is what they are there for. Not everyone on a project has to agree - nor 
do they have to have pure open source motives. That's just normal and 
expected. We are a very varied group. The more differences the better IMHO.

Just as a reminder - a couple things I see repeatedly at Apache:

community over code
merit does not expire

Other than that, the doers do, occasionally we vote, and in general things move 
along.

 
 Shai

- Mark Miller
lucidimagination.com

Lucene/Solr User Conference
May 25-26, San Francisco
www.lucenerevolution.org







[jira] [Commented] (LUCENE-3065) NumericField should be stored in binary format in index (matching Solr's format)


[ 
https://issues.apache.org/jira/browse/LUCENE-3065?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13028354#comment-13028354
 ] 

Uwe Schindler commented on LUCENE-3065:
---

Ideally this could be done with the schema-like approach of one of the GSoC 
projects?

We already discussed that: we can use the FieldsReader/FieldsWriter type 
flag (which currently says binary/text and compressed (unused now)) in the 
index file format to mark a field as NumericField. In that case, 
Document.getField() would return the NumericField instance.

For Lucene backwards compatibility we should still support creating text-only 
fields.

The new binary format would also be compatible with Solr: on getField, Solr 
would get a NumericField and can decide using instanceof what to do. Old Solr 
indexes without the NumericField marker flag would return a byte[], in which 
case Solr would do the decoding.

For storing on the index side, Solr could move to NumericField completely (I 
don't like the current approach using NumericTokenStream and to/fromInternal 
wrappers around conventional Field).
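
The instanceof-based dispatch described in this comment could look roughly like the sketch below. It is only an illustration under stated assumptions: Fieldable, BinaryField and NumericFieldStub are stand-in types written for this example, not the real Lucene classes, and the 4-byte big-endian layout for the old binary form is assumed rather than taken from the patch.

```java
// Stand-in types: these are NOT the real Lucene classes, just enough
// structure to illustrate the instanceof check discussed above.
interface Fieldable {
    String name();
}

final class BinaryField implements Fieldable {
    private final String name;
    private final byte[] value;
    BinaryField(String name, byte[] value) { this.name = name; this.value = value; }
    public String name() { return name; }
    byte[] binaryValue() { return value; }
}

final class NumericFieldStub implements Fieldable {
    private final String name;
    private final int value;
    NumericFieldStub(String name, int value) { this.name = name; this.value = value; }
    public String name() { return name; }
    int intValue() { return value; }
}

final class StoredValueReader {
    /** Returns the int value of a stored field, whichever form it came back in. */
    static int readInt(Fieldable f) {
        if (f instanceof NumericFieldStub) {
            // Newer index: the numeric marker flag was set at write time.
            return ((NumericFieldStub) f).intValue();
        }
        if (f instanceof BinaryField) {
            // Older index: decode the raw bytes ourselves (big-endian assumed).
            byte[] b = ((BinaryField) f).binaryValue();
            if (b.length != 4) {
                throw new IllegalArgumentException("expected 4 bytes for an int");
            }
            return ((b[0] & 0xFF) << 24) | ((b[1] & 0xFF) << 16)
                 | ((b[2] & 0xFF) << 8) | (b[3] & 0xFF);
        }
        // null also ends up here, because instanceof is false for null.
        throw new IllegalArgumentException("not a numeric or binary field: " + f);
    }

    public static void main(String[] args) {
        System.out.println(readInt(new NumericFieldStub("price", 42)));
        System.out.println(readInt(new BinaryField("price", new byte[] {0, 0, 1, 2})));
    }
}
```

The caller never needs to know which index version it is reading; the field type that comes back decides the decoding path.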


RE: MergePolicy Thresholds

2011-05-03 Thread Burton-West, Tom

Thanks Shai and Mike!

I'll keep an eye on LUCENE-1076.

Tom

-Original Message-
From: Michael McCandless [mailto:luc...@mikemccandless.com] 
Sent: Tuesday, May 03, 2011 11:15 AM
To: dev@lucene.apache.org
Subject: Re: MergePolicy Thresholds

Thanks Shai!

I'm way behind on my 3.x backports -- I'll try to do this soon.

Mike

http://blog.mikemccandless.com

On Tue, May 3, 2011 at 8:10 AM, Shai Erera ser...@gmail.com wrote:
 I uploaded a patch to LUCENE-1076.

 Tom, apparently the patch I've attached before cannot be used, because there
 are dependencies (in earlier commits on LUCENE-1076) that need to be
 back-ported as well. So stay tuned on LUCENE-1076 for when it is safe to use
 this new MP.

 Shai

 On Tue, May 3, 2011 at 1:00 PM, Michael McCandless
 luc...@mikemccandless.com wrote:

 That'd be great, thanks :)

 Yes, let's iterate on the issue!  But: it should still be open, I hope
 (I didn't mean to close it yet, since it's not back ported)...

 Mike

 http://blog.mikemccandless.com

 On Tue, May 3, 2011 at 5:51 AM, Shai Erera ser...@gmail.com wrote:
  Mike, if you want, I can back-port it, as I've already started this when
  preparing the patch.

  I noticed that you added a throws IOE to IW.setInfoStream -- is it ok on
  3x too? It'll be a backwards-compatibility break.

  Maybe we should iterate on the issue? I can reopen.

  Shai

  On Tue, May 3, 2011 at 12:36 PM, Michael McCandless
  luc...@mikemccandless.com wrote:

  Looks good Shai!

  Comments below too:

  On Tue, May 3, 2011 at 5:29 AM, Shai Erera ser...@gmail.com wrote:
   Hi

   I looked into porting it to 3x, and prepared the attached patch. It only
   contains the new TieredMP and Test, as well as the necessary changes to
   LuceneTestCase and IndexWriter. I guess you can start with it (even just
   the MP and IW changes) to test it on your indexes.

   Mike, I saw that there were many more changes done to the code as part of
   LUCENE-1076. In particular, this MP is now the default (on trunk), so I
   guess many changes (to tests) were needed because of that. Do you
   remember whether, apart from the changes I've included in the patch, there
   were other important changes w.r.t. this code?

  The only other changes I can think of were some verbosity improvements
  to IndexWriter, to support the python script that can make a merge
  movie from an infoStream output; but that can wait for when I
  back-port to 3.x...

   As we won't change the default MP on 3x, I'm guessing I don't need to
   port all the changes to 3x.

  Right, I think.

  Mike


[Lucene.Net] [jira] [Created] (LUCENENET-413) Medium trust security issue

2011-05-03 Thread Digy (JIRA)

 Medium trust security issue 
-

 Key: LUCENENET-413
 URL: https://issues.apache.org/jira/browse/LUCENENET-413
 Project: Lucene.Net
  Issue Type: Improvement
Affects Versions: Lucene.Net 2.9.4
 Environment: Lucene.Net 2.9.4, Lucene.Net 2.9.4g , .Net 4.0
Reporter: Digy
Priority: Minor
 Fix For: Lucene.Net 2.9.4


On behalf of Richard Wilde:
Exceptions in Medium Trust(.NET 4.0)


[Lucene.Net] [jira] [Updated] (LUCENENET-413) Medium trust security issue

2011-05-03 Thread Digy (JIRA)


 [ 
https://issues.apache.org/jira/browse/LUCENENET-413?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Digy updated LUCENENET-413:
---

Attachment: MediumTrust.2.9.4g.patch
MediumTrust.2.9.4.patch


[jira] [Updated] (LUCENE-3065) NumericField should be stored in binary format in index (matching Solr's format)


 [ 
https://issues.apache.org/jira/browse/LUCENE-3065?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Michael McCandless updated LUCENE-3065:
---

Attachment: LUCENE-3065.patch

Patch against 3.x.

I moved the to/from byte[] methods from Solr's TrieField into Lucene's
NumericUtils, and fixed FieldsWriter/Reader to use free bits in the
field's flags to know if the field is Numeric, and which type.

I added a random test case to verify we now get the right NumericField
back, when we stored NumericField during indexing.

Old indices are handled fine (you'll get a String-ified Field back like
you did before).

Spookily, nothing failed in Solr... I assume there's somewhere in Solr
that must now be fixed to handle the fact that a field can come back
as NumericField?  Anyone know where...?
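
To make the "to/from byte[]" idea concrete, here is a minimal round-trip sketch. The class and method names (NumericBytes, intToBytes and so on) are made up for this example and are not the NumericUtils signatures from the patch, and fixed-width big-endian encoding is an assumption; the exact byte layout is not spelled out in this thread.

```java
// Hypothetical helper: fixed-width big-endian encoding of int and long values.
final class NumericBytes {

    static byte[] intToBytes(int v) {
        return new byte[] {
            (byte) (v >>> 24), (byte) (v >>> 16), (byte) (v >>> 8), (byte) v
        };
    }

    static int bytesToInt(byte[] b) {
        if (b.length != 4) throw new IllegalArgumentException("expected 4 bytes");
        return ((b[0] & 0xFF) << 24) | ((b[1] & 0xFF) << 16)
             | ((b[2] & 0xFF) << 8) | (b[3] & 0xFF);
    }

    static byte[] longToBytes(long v) {
        byte[] out = new byte[8];
        // Fill from the least significant byte backwards.
        for (int i = 7; i >= 0; i--) {
            out[i] = (byte) v;
            v >>>= 8;
        }
        return out;
    }

    static long bytesToLong(byte[] b) {
        if (b.length != 8) throw new IllegalArgumentException("expected 8 bytes");
        long v = 0;
        for (int i = 0; i < 8; i++) {
            v = (v << 8) | (b[i] & 0xFF);
        }
        return v;
    }

    public static void main(String[] args) {
        int n = 123456789;
        // The binary form is always 4 bytes; the decimal string form is 9 chars here.
        System.out.println(intToBytes(n).length + " bytes vs " + Integer.toString(n).length() + " chars");
        System.out.println(bytesToInt(intToBytes(n)) == n);
    }
}
```

The point of the binary form is that the stored size no longer depends on how many digits the number has, and no string parsing is needed when the document is loaded.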


[jira] [Commented] (LUCENE-3065) NumericField should be stored in binary format in index (matching Solr's format)

[
https://issues.apache.org/jira/browse/LUCENE-3065?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13028395#comment-13028395
]

Uwe Schindler commented on LUCENE-3065:
---

{quote}
Spookily, nothing failed in Solr... I assume there's somewhere in Solr
that must now be fixed to handle the fact that a field can come back
as NumericField? Anyone know where...?
{quote}

That's easy to understand: Solr does not use NumericField at all. It produces a
NumericTokenStream and indexes it like any other analyzer. The byte[] field is
indexed as a separate Field with only store=true and binary.

This is what I wanted to say with my last comment.


[jira] [Created] (SOLR-2490) PropertiesRequestHandler; encode line.separator

PropertiesRequestHandler; encode line.separator
---

 Key: SOLR-2490
 URL: https://issues.apache.org/jira/browse/SOLR-2490
 Project: Solr
  Issue Type: Improvement
  Components: web gui
Reporter: Stefan Matheis (steffkes)
Priority: Trivial


Currently, the XML looks like this:

{code}<!-- .. -->
<str name="java.io.tmpdir">/tmp</str>
<str name="line.separator">
</str>
<str name="java.vm.specification.vendor">Sun Microsystems Inc.</str>
<!-- .. -->{code}

It would be good to have this instead:

{code}<!-- .. -->
<str name="java.io.tmpdir">/tmp</str>
<str name="line.separator">\n</str>
<str name="java.vm.specification.vendor">Sun Microsystems Inc.</str>
<!-- .. -->{code}

Afterwards we will be able to display the line separator that is in use.
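
One way the requested encoding might be done is sketched below. This is not the PropertiesRequestHandler code; PropertyEscaper and escapeControlChars are hypothetical names, and which characters get escaped is an assumption (the ticket only asks for line.separator).

```java
// Hypothetical helper: makes control characters in a property value visible.
final class PropertyEscaper {

    /**
     * Replaces LF, CR and TAB with the two-character sequences \n, \r and \t.
     * Backslashes are left alone so file paths stay readable, which means the
     * result is for display only and cannot be reliably unescaped.
     */
    static String escapeControlChars(String value) {
        StringBuilder sb = new StringBuilder(value.length() + 4);
        for (int i = 0; i < value.length(); i++) {
            char c = value.charAt(i);
            switch (c) {
                case '\n': sb.append("\\n"); break;
                case '\r': sb.append("\\r"); break;
                case '\t': sb.append("\\t"); break;
                default:   sb.append(c);
            }
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        // Prints the platform's line separator in visible form.
        System.out.println(escapeControlChars(System.getProperty("line.separator")));
    }
}
```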


[jira] [Commented] (LUCENE-3065) NumericField should be stored in binary format in index (matching Solr's format)

[
https://issues.apache.org/jira/browse/LUCENE-3065?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13028398#comment-13028398
]

Michael McCandless commented on LUCENE-3065:

{quote}
That's easy to understand: Solr does not use NumericField at all. It produces a
NumericTokenStream and indexes it like any other analyzer. The byte[] field is
indexed as a separate Field with only store=true and binary.

This is what I wanted to say with my last comment.
{quote}
Ah, OK. So, not spooky.

We should eventually fix that; shouldn't Solr just use NumericField instead of
doing this encode/decode itself? Is there some reason...?


[jira] [Commented] (LUCENE-3065) NumericField should be stored in binary format in index (matching Solr's format)

2011-05-03 Thread Ryan McKinley (JIRA)


[ 
https://issues.apache.org/jira/browse/LUCENE-3065?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13028399#comment-13028399
 ] 

Ryan McKinley commented on LUCENE-3065:
---

bq. Is there some reason...?

Solr did its own encoding/decoding so that it could store a binary field -- 
with this patch, that is not necessary anymore.


[jira] [Commented] (LUCENE-3065) NumericField should be stored in binary format in index (matching Solr's format)

[
https://issues.apache.org/jira/browse/LUCENE-3065?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13028404#comment-13028404
]

Michael McCandless commented on LUCENE-3065:

Uwe: I agree, I'll use BytesRef in trunk.

Ryan: OK. Should we try to fix that w/ this issue? If so, can you take a
crack at it? Thanks. Or, we can postpone... not necessary for this initial
cutover.


[jira] [Commented] (SOLR-2399) Solr Admin Interface, reworked


[ 
https://issues.apache.org/jira/browse/SOLR-2399?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13028402#comment-13028402
 ] 

Stefan Matheis (steffkes) commented on SOLR-2399:
-

bq. Thanks for doing all this, Stefan!
I'm happy to contribute :)

bq. I looked at the Analysis screenshot and found it a bit hard to eyeball 
quickly because the whole thing feels very pale, which makes it hard for an 
eye to quickly jump from tokenizer, to token filter, to next token filter, etc. 
It's also not immediately obvious what left side vs. right side are, so maybe a 
more visible Index-time Analysis and Query-time Analysis may help.
Thanks for the feedback, really appreciated. I tried to focus on the text .. 
maybe there is too much gray around, yes :/ 
Maybe a vertical divider (from top to bottom) would help to distinguish the 
index vs. query thingy? More whitespace between both columns perhaps? What was 
the text you used for analysis? (Just to get a feeling for what your page looks 
like :)

 Solr Admin Interface, reworked
 --

 Key: SOLR-2399
 URL: https://issues.apache.org/jira/browse/SOLR-2399
 Project: Solr
  Issue Type: Improvement
  Components: web gui
Reporter: Stefan Matheis (steffkes)
Priority: Minor
 Fix For: 4.0


 *The idea was to create a new, fresh (and hopefully clean) Solr Admin 
 Interface.* [Based on this 
 [ML-Thread|http://www.lucidimagination.com/search/document/ae35e236d29d225e/solr_admin_interface_reworked_go_on_go_away]]
 I've quickly created a Github-Repository (Just for me, to keep track of the 
 changes)
 » https://github.com/steffkes/solr-admin 
 Quick Tour: [Dashboard|http://files.mathe.is/solr-admin/01_dashboard.png], 
 [Query-Form|http://files.mathe.is/solr-admin/02_query.png], 
 [Plugins|http://files.mathe.is/solr-admin/05_plugins.png], 
 [Logging|http://files.mathe.is/solr-admin/07_logging.png], 
 [Analysis|http://files.mathe.is/solr-admin/04_analysis.png], 
 [Schema-Browser|http://files.mathe.is/solr-admin/06_schema-browser.png], 
 [Dataimport|http://files.mathe.is/solr-admin/08_dataimport.png], 
 [Core-Admin|http://files.mathe.is/solr-admin/09_coreadmin.png]
 Newly created Wiki-Page: http://wiki.apache.org/solr/ReworkedSolrAdminGUI


[jira] [Commented] (SOLR-2399) Solr Admin Interface, reworked


[ 
https://issues.apache.org/jira/browse/SOLR-2399?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13028406#comment-13028406
 ] 

Stefan Matheis (steffkes) commented on SOLR-2399:
-

{quote}Rather than use: java-properties.jsp
can the JS hit: http://localhost:8983/solr/admin/properties{quote}
Ha, nice -- [already 
integrated|https://github.com/steffkes/solr-admin/commit/04af2c51b9f5f364cbbc79d09e42530213c8fb02],
 dropped the .jsp. But I noticed that the line.separator is not 'encoded' and 
have already opened a ticket for this: SOLR-2490

{quote}I like the landing dashboard you have, but it would be nice to have a 
big (optional) link to: 
http://localhost:8983/solr/browse
so that people starting with solr can see solr in action easily{quote}
Hmm, would it be useful to have another admin-extra.html file for the 
global Dashboard too, not only on Core-Level? We could point to the 
Velocity-Thingy by default, and everybody would be able to extend this for 
their own needs.



[jira] [Created] (SOLR-2491) spellcheck.maxCollationTries breaks when using FieldCollapsing

spellcheck.maxCollationTries breaks when using FieldCollapsing
--

 Key: SOLR-2491
 URL: https://issues.apache.org/jira/browse/SOLR-2491
 Project: Solr
  Issue Type: Bug
  Components: spellchecker
Affects Versions: 4.0
Reporter: James Dyer
Priority: Minor
 Fix For: 4.0


If you specify spellcheck.maxCollationTries and group=true on the same 
query, you never get any Spell Check Collations back.  The problem is that 
SpellCheckCollator relies on ResponseBuilder.getToLog().get("hits") to see how 
many results each test query returns.  When group=true, the toLog isn't 
populated, so SpellCheckCollator is unable to find a collation that can return 
results.


[jira] [Commented] (LUCENE-3065) NumericField should be stored in binary format in index (matching Solr's format)

2011-05-03 Thread Ryan McKinley (JIRA)

[
https://issues.apache.org/jira/browse/LUCENE-3065?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13028408#comment-13028408
]

Ryan McKinley commented on LUCENE-3065:
---

bq. If so, can you take a crack at it? Thanks. Or, we can postpone... not
necessary for this initial cutover.

I'll take a crack at it... but I don't think it's necessary in the first pass


[jira] [Updated] (SOLR-2491) spellcheck.maxCollationTries breaks when using FieldCollapsing

[
https://issues.apache.org/jira/browse/SOLR-2491?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]

James Dyer updated SOLR-2491:
-

Attachment: SOLR-2491.patch

This patch fixes the problem and includes a unit test. It simply removes
the group parameter from any test queries prior to running them.

Note that the # of hits for each collation returned will always be the # of
_ungrouped_ hits. This is consistent with the fact that FieldCollapsing is
unable to tell us the number of grouped hits.

It is a bit disturbing to me how brittle getting the # of hits back via toLog
has proven to be. If someone can point to a less breakable way to do this it
would be appreciated.
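
The approach in this patch, dropping the grouping parameter before running each collation test query, can be illustrated with a plain parameter map. This is a sketch only: CollationTestParams and withoutGrouping are names invented here, the map stands in for Solr's request parameters, and also stripping the "group.*" sub-parameters is my assumption, not something the patch description states.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper: builds the parameters for a collation test query.
final class CollationTestParams {

    /** Returns a copy of the parameters with grouping switched off. */
    static Map<String, String[]> withoutGrouping(Map<String, String[]> original) {
        Map<String, String[]> copy = new LinkedHashMap<>();
        for (Map.Entry<String, String[]> e : original.entrySet()) {
            String key = e.getKey();
            // "group" is what the patch description mentions; "group.*" is assumed.
            if (key.equals("group") || key.startsWith("group.")) {
                continue;
            }
            copy.put(key, e.getValue());
        }
        return copy;
    }

    public static void main(String[] args) {
        Map<String, String[]> params = new LinkedHashMap<>();
        params.put("q", new String[] {"teh quick"});
        params.put("group", new String[] {"true"});
        params.put("group.field", new String[] {"category"});
        params.put("spellcheck.maxCollationTries", new String[] {"5"});
        System.out.println(withoutGrouping(params).keySet());
    }
}
```

Working on a copy keeps the user's original request untouched, so the final response is still grouped even though the test queries are not.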


[jira] [Commented] (LUCENE-3065) NumericField should be stored in binary format in index (matching Solr's format)


[ 
https://issues.apache.org/jira/browse/LUCENE-3065?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13028412#comment-13028412
 ] 

Yonik Seeley commented on LUCENE-3065:
--

bq. I'll take a crack at it... but I don't think it's necessary in the first pass

Should we try to accept both (binary or numeric field coming back) so this 
isn't a needless index format break, or is there another lucene index format 
break in the cards soon anyway?


[jira] [Commented] (LUCENE-3065) NumericField should be stored in binary format in index (matching Solr's format)

[
https://issues.apache.org/jira/browse/LUCENE-3065?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13028416#comment-13028416
]

Uwe Schindler commented on LUCENE-3065:
---

Mike: One thing about the bitmask and the 4 values. There is also an issue open
to extend NumericField by byte and short. Maybe we should reserve 3 bits
instead of 2 for the numeric field type - so 0x70 instead of 0x30 as mask? I
just want to reserve this one extra bit, so we don't need to do any dumb masks
and values later, if we extend.

About the index format change:
As described above, for Solr it's not a problem. New fields are always indexed
using NumericField. On the query side, when Document.getField is called, it
could simply check the return value with instanceof. If the getter does not
return a NumericField, Solr knows that it's binary and can decode manually.
This would preserve backwards compatibility.

Otherwise it's no break at all if we support both stored field formats during
indexing somehow (in Lucene it's string, returning a String Field or new binary
NumericField). The index format itself does not change generally (no need to
bump version numbers, as we only use unused bits?)
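
The 0x30 vs. 0x70 question can be shown with a small flag-byte sketch. Only the two mask values come from this comment; the shift of 4 follows from them, while the class name and the type codes (TYPE_INT and so on) are hypothetical and not taken from the patch.

```java
// Hypothetical layout of the numeric-type bits inside a stored-field flag byte.
final class StoredFieldFlags {
    static final int NUMERIC_SHIFT = 4;
    static final int NUMERIC_MASK = 0x70; // bits 4..6: room for codes 0..7

    // Made-up type codes; 0 means "not a numeric field".
    static final int TYPE_NONE = 0;
    static final int TYPE_INT = 1;
    static final int TYPE_LONG = 2;
    static final int TYPE_FLOAT = 3;
    static final int TYPE_DOUBLE = 4;
    static final int TYPE_SHORT = 5;
    static final int TYPE_BYTE = 6;

    /** Returns flags with the numeric-type bits replaced by typeCode. */
    static int withNumericType(int flags, int typeCode) {
        if (typeCode < 0 || typeCode > (NUMERIC_MASK >>> NUMERIC_SHIFT)) {
            throw new IllegalArgumentException("type code out of range: " + typeCode);
        }
        return (flags & ~NUMERIC_MASK) | (typeCode << NUMERIC_SHIFT);
    }

    /** Extracts the numeric type code; TYPE_NONE for non-numeric fields. */
    static int numericType(int flags) {
        return (flags & NUMERIC_MASK) >>> NUMERIC_SHIFT;
    }

    public static void main(String[] args) {
        int flags = withNumericType(0x02, TYPE_BYTE);
        System.out.println(Integer.toHexString(flags) + " -> type " + numericType(flags));
    }
}
```

With a 2-bit mask (0x30) only codes 0 to 3 fit, so adding short and byte later would need a second, non-contiguous mask; reserving the third bit up front avoids that.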


[jira] [Commented] (SOLR-2399) Solr Admin Interface, reworked

2011-05-03 Thread Otis Gospodnetic (JIRA)


[ 
https://issues.apache.org/jira/browse/SOLR-2399?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13028417#comment-13028417
 ] 

Otis Gospodnetic commented on SOLR-2399:


Stefan - I only looked at the screenshot you provided, and now I can't find 
the link; I thought it was in this issue.

 Solr Admin Interface, reworked
 --

 Key: SOLR-2399
 URL: https://issues.apache.org/jira/browse/SOLR-2399
 Project: Solr
  Issue Type: Improvement
  Components: web gui
Reporter: Stefan Matheis (steffkes)
Priority: Minor
 Fix For: 4.0


 *The idea was to create a new, fresh (and hopefully clean) Solr Admin 
 Interface.* [Based on this 
 [ML-Thread|http://www.lucidimagination.com/search/document/ae35e236d29d225e/solr_admin_interface_reworked_go_on_go_away]]
 I've quickly created a Github-Repository (Just for me, to keep track of the 
 changes)
 » https://github.com/steffkes/solr-admin 
 Quick Tour: [Dashboard|http://files.mathe.is/solr-admin/01_dashboard.png], 
 [Query-Form|http://files.mathe.is/solr-admin/02_query.png], 
 [Plugins|http://files.mathe.is/solr-admin/05_plugins.png], 
 [Logging|http://files.mathe.is/solr-admin/07_logging.png], 
 [Analysis|http://files.mathe.is/solr-admin/04_analysis.png], 
 [Schema-Browser|http://files.mathe.is/solr-admin/06_schema-browser.png], 
 [Dataimport|http://files.mathe.is/solr-admin/08_dataimport.png], 
 [Core-Admin|http://files.mathe.is/solr-admin/09_coreadmin.png]
 Newly created Wiki-Page: http://wiki.apache.org/solr/ReworkedSolrAdminGUI


[jira] [Updated] (LUCENE-3066) NullPointerException when calling sizeInBytes and setHasVectors concurrently.

[
https://issues.apache.org/jira/browse/LUCENE-3066?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]

Adrian Nistor updated LUCENE-3066:
--

Attachment: Test.java

NullPointerException when calling sizeInBytes and setHasVectors concurrently.
-

Key: LUCENE-3066
URL: https://issues.apache.org/jira/browse/LUCENE-3066
Project: Lucene - Java
Issue Type: Bug
Affects Versions: 3.1
Environment: java 1.6.0_24
Ubuntu 10.10
Reporter: Adrian Nistor
Attachments: Test.java

Hi,
I am encountering a NullPointerException when using
org.apache.lucene.index.SegmentInfo. It appears in version 3.1.0 and also in
revision 1099085 (May 3rd 2011).
The NullPointerException is thrown by SegmentInfo.sizeInBytes(false) when
calling SegmentInfo.sizeInBytes(false) and SegmentInfo.setHasVectors(true) in
parallel. When these methods are called sequentially, they do not throw any
exception.
I have attached a test that exposes this problem. If you set ExposeBug = true,
the methods are called concurrently and you get the NullPointerException. If
you set ExposeBug = false, the methods are called sequentially, and there is no
exception. Note that, in the sequential version, the methods are called many
times (just like in the parallel version), and in different orders (just like
in the parallel version).
The concurrent test (ExposeBug = true) always throws NullPointerException
under heavy load (ManyIterations = 1). Under small load (e.g., if you
set ManyIterations = 10), the NullPointerException will not manifest. I
suppose you need a certain thread interleaving for the NullPointerException to
happen, and thus you need the heavy load.
Is this a bug? Is there a patch for it?
Thanks!
Adrian


[jira] [Created] (LUCENE-3066) NullPointerException when calling sizeInBytes and setHasVectors concurrently.

NullPointerException when calling sizeInBytes and setHasVectors concurrently.
-

 Key: LUCENE-3066
 URL: https://issues.apache.org/jira/browse/LUCENE-3066
 Project: Lucene - Java
  Issue Type: Bug
Affects Versions: 3.1
 Environment: java 1.6.0_24 
Ubuntu 10.10
Reporter: Adrian Nistor
 Attachments: Test.java

Hi,

I am encountering a NullPointerException when using 
org.apache.lucene.index.SegmentInfo. It appears in version 3.1.0 and also in 
revision 1099085 (May 3rd 2011).

The NullPointerException is thrown by SegmentInfo.sizeInBytes(false) when 
calling SegmentInfo.sizeInBytes(false) and SegmentInfo.setHasVectors(true) in 
parallel. When these methods are called sequentially, they do not throw any 
exception.

I have attached a test that exposes this problem. If you set ExposeBug = true,
the methods are called concurrently and you get the NullPointerException. If you
set  ExposeBug = false, the methods are called sequentially, and there is no
exception. Note that, in the sequential version, the methods are called many
times (just like in the parallel version), and in different orders (just like
in the parallel version).

The concurrent test (ExposeBug = true) always throws NullPointerException 
under heavy load (ManyIterations = 1). Under small load (e.g., if you
set ManyIterations = 10), the NullPointerException will not manifest. I suppose
you need a certain thread interleaving for the NullPointerException to happen,
and thus you need the heavy load.

Is this a bug? Is there a patch for it?

Thanks!

Adrian



[jira] [Commented] (SOLR-2399) Solr Admin Interface, reworked


[ 
https://issues.apache.org/jira/browse/SOLR-2399?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13028419#comment-13028419
 ] 

Stefan Matheis (steffkes) commented on SOLR-2399:
-

bq. Stefan - I only looked at the screenshot you provided. and now I can't 
find the link, I thought it was in this issue.
Ah okay, I thought you had already played around with it. Here is the screen again: 
http://files.mathe.is/solr-admin/04_analysis.png


[jira] [Created] (SOLR-2492) DIH does not commit if only Deletes are processed

DIH does not commit if only Deletes are processed
-

 Key: SOLR-2492
 URL: https://issues.apache.org/jira/browse/SOLR-2492
 Project: Solr
  Issue Type: Bug
  Components: contrib - DataImportHandler
Affects Versions: 3.1, 1.4.1, 4.0
Reporter: James Dyer
Priority: Minor
 Fix For: 3.2, 4.0


If a DIH run processes deletes using the $deleteDocById and/or 
$deleteDocByQuery special commands, and if no adds or updates get processed in 
the same run, then commit is never called.  Also, the # of deleted documents 
does not get incremented.


[jira] [Updated] (SOLR-2492) DIH does not commit if only Deletes are processed

[
https://issues.apache.org/jira/browse/SOLR-2492?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]

James Dyer updated SOLR-2492:
-

Attachment: SOLR-2492.patch

This patch increments the # deleted documents once for each call to
$deleteDocById and/or $deleteDocByQuery. Note that it would be even better
(especially with ..byQuery) to get the actual # of deleted documents and
increment by that many.

By incrementing the # deleted documents, commit is called at the end of the
run as expected. This fixes the issue of commit not being called and also
causes the # of deleted documents to be reported back to the user. While this
is better than current behavior, the actual # of reported deletions may not be
accurate because a call to $deleteDocById may not actually delete a document.
Likewise a call to $deleteDocByQuery could delete more than 1 document (or
none).

A unit test is provided.
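
The behavior described in this comment can be sketched roughly as follows. This is not the DataImportHandler code: the class and its methods are hypothetical, and the sketch assumes that the end-of-run commit is triggered only when some counter is non-zero, as the comment implies.

```java
// Hypothetical stand-in for one import run; not the real DataImportHandler classes.
final class DeleteCountSketch {
    private final boolean countDeletes; // false = behavior before the patch, true = after
    private int added;
    private int deleted;
    private boolean committed;

    DeleteCountSketch(boolean countDeletes) {
        this.countDeletes = countDeletes;
    }

    void add() {
        added++;
    }

    // One increment per command, whether or not a document was actually removed.
    void deleteDocById(String id) {
        if (countDeletes) {
            deleted++;
        }
    }

    // Also one increment per command, even if the query matches many documents or none.
    void deleteDocByQuery(String query) {
        if (countDeletes) {
            deleted++;
        }
    }

    // Assumption: the run commits only when it believes something changed.
    void finish() {
        if (added > 0 || deleted > 0) {
            committed = true;
        }
    }

    boolean isCommitted() {
        return committed;
    }

    int deletedCount() {
        return deleted;
    }

    public static void main(String[] args) {
        DeleteCountSketch run = new DeleteCountSketch(true);
        run.deleteDocById("doc-1");
        run.deleteDocByQuery("category:obsolete");
        run.finish();
        System.out.println("committed=" + run.isCommitted() + ", deleted=" + run.deletedCount());
    }
}
```

The reported count is the number of delete commands, not the number of documents removed, which is the inaccuracy the comment points out.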

DIH does not commit if only Deletes are processed
-

Key: SOLR-2492
URL: https://issues.apache.org/jira/browse/SOLR-2492
Project: Solr
Issue Type: Bug
Components: contrib - DataImportHandler
Affects Versions: 1.4.1, 3.1, 4.0
Reporter: James Dyer
Priority: Minor
Fix For: 3.2, 4.0

Attachments: SOLR-2492.patch

If a DIH run processes deletes using the $deleteDocById and/or
$deleteDocByQuery special commands, and if no adds or updates get processed
in the same run, then commit is never called. Also, the # of deleted
documents does not get incremented.


[jira] [Commented] (SOLR-2399) Solr Admin Interface, reworked

2011-05-03 Thread Otis Gospodnetic (JIRA)


[ 
https://issues.apache.org/jira/browse/SOLR-2399?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13028429#comment-13028429
 ] 

Otis Gospodnetic commented on SOLR-2399:


Right, so if you look at the names of tokenizers and filters there, they are 
super light, almost like the background.  I think making them stand out more 
would be better - darker font, bolder...



[jira] [Commented] (SOLR-2399) Solr Admin Interface, reworked


[ 
https://issues.apache.org/jira/browse/SOLR-2399?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13028439#comment-13028439
 ] 

Stefan Matheis (steffkes) commented on SOLR-2399:
-

bq. Right, so if you look at the names of tokenizers and filters there, they 
are super light, almost like the background. I think making them stand out more 
would be better - darker font, bolder...
Correct - I just checked on another monitor. It depends heavily on the 
brightness/contrast settings. I will put that screen back on the todo list and 
try a few things, and will let you know when it's done.

Any additional thoughts, Otis? It does not matter whether they relate to 
existing screens or other features!


[jira] [Commented] (LUCENE-3065) NumericField should be stored in binary format in index (matching Solr's format)

[
https://issues.apache.org/jira/browse/LUCENE-3065?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13028454#comment-13028454
]

Uwe Schindler commented on LUCENE-3065:
---

There is still a problem - first the good news:

- If the user calls Document.get(field), the returned string is as before, so
there is no break at all. The reason is the implementation of
NumericField.stringValue(): it returns what the user is used to from 3.0.
- If a user calls getFieldable(field) all is fine, too. The only change is that
it could now return a NumericField. If the user simply calls stringValue() all
is identical to 3.0.

Problems start with:

- If the user calls Document.getField(name) it returns Field (internally it
casts the getFieldable() result to Field). But NumericField does not subclass
Field, so this throws ClassCastException.

How to handle this?

- Maybe change those methods to return AbstractField, but that's a binary break
and users will complain, because not everything works as expected
- Make NumericField subclass Field (and Field is unfinalized) - that's a bad
idea, because Field has too many methods / members that are out of scope
- Deprecate Document.getField() and make it internally do an instanceof check;
if it gets a NumericField, transform it to a backwards-compatible Field? - This
method is already broken. If you request lazy field loading it also throws
ClassCastException (e.g. LUCENE-609).

Not sure how to proceed. Otherwise the patch looks fine. I think simply ignoring
LazyField loading is fine, as numeric fields are a maximum of 8 bytes. Otherwise
we would need LazyNumericField :(
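
A minimal sketch of the failure mode and of the instanceof-based option from the list above. The types below are simplified stand-ins that only borrow the names Fieldable, Field, NumericField and Document; they are not the Lucene classes, and getFieldCompat is a hypothetical name for the backwards-compatible variant.

```java
import java.util.HashMap;
import java.util.Map;

// Simplified stand-ins; only the class relationships matter for this sketch.
final class GetFieldSketch {
    interface Fieldable {
        String stringValue();
    }

    static class Field implements Fieldable {
        private final String value;
        Field(String value) { this.value = value; }
        public String stringValue() { return value; }
    }

    // Like the real situation described above: implements Fieldable, does NOT extend Field.
    static class NumericField implements Fieldable {
        private final Number value;
        NumericField(Number value) { this.value = value; }
        public String stringValue() { return value.toString(); }
    }

    static class Document {
        private final Map<String, Fieldable> fields = new HashMap<String, Fieldable>();

        void add(String name, Fieldable field) { fields.put(name, field); }

        Fieldable getFieldable(String name) { return fields.get(name); }

        // Blind cast: throws ClassCastException when the stored value is a NumericField.
        Field getField(String name) { return (Field) getFieldable(name); }

        // Hypothetical compat variant: check first, then wrap non-Field values as a plain Field.
        Field getFieldCompat(String name) {
            Fieldable f = getFieldable(name);
            if (f == null) {
                return null;
            }
            if (f instanceof Field) {
                return (Field) f;
            }
            return new Field(f.stringValue());
        }
    }

    public static void main(String[] args) {
        Document doc = new Document();
        doc.add("price", new NumericField(42));
        System.out.println(doc.getFieldCompat("price").stringValue());
        try {
            doc.getField("price");
        } catch (ClassCastException e) {
            System.out.println("getField failed with ClassCastException");
        }
    }
}
```

The compat variant keeps old callers working at the cost of handing back a string-valued Field, so the numeric type is lost again on that path.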


[jira] [Updated] (SOLR-2399) Solr Admin Interface, reworked


 [ 
https://issues.apache.org/jira/browse/SOLR-2399?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Stefan Matheis (steffkes) updated SOLR-2399:


Attachment: SOLR-2399.patch

 Solr Admin Interface, reworked
 --

 Key: SOLR-2399
 URL: https://issues.apache.org/jira/browse/SOLR-2399
 Project: Solr
  Issue Type: Improvement
  Components: web gui
Reporter: Stefan Matheis (steffkes)
Priority: Minor
 Fix For: 4.0

 Attachments: SOLR-2399.patch


 *The idea was to create a new, fresh (and hopefully clean) Solr Admin 
 Interface.* [Based on this 
 [ML-Thread|http://www.lucidimagination.com/search/document/ae35e236d29d225e/solr_admin_interface_reworked_go_on_go_away]]
 I've quickly created a Github-Repository (Just for me, to keep track of the 
 changes)
 » https://github.com/steffkes/solr-admin 
 Quick Tour: [Dashboard|http://files.mathe.is/solr-admin/01_dashboard.png], 
 [Query-Form|http://files.mathe.is/solr-admin/02_query.png], 
 [Plugins|http://files.mathe.is/solr-admin/05_plugins.png], 
 [Logging|http://files.mathe.is/solr-admin/07_logging.png], 
 [Analysis|http://files.mathe.is/solr-admin/04_analysis.png], 
 [Schema-Browser|http://files.mathe.is/solr-admin/06_schema-browser.png], 
 [Dataimport|http://files.mathe.is/solr-admin/08_dataimport.png], 
 [Core-Admin|http://files.mathe.is/solr-admin/09_coreadmin.png]
 Newly created Wiki-Page: http://wiki.apache.org/solr/ReworkedSolrAdminGUI


[jira] [Issue Comment Edited] (LUCENE-3065) NumericField should be stored in binary format in index (matching Solr's format)

[
https://issues.apache.org/jira/browse/LUCENE-3065?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13028454#comment-13028454
]

Uwe Schindler edited comment on LUCENE-3065 at 5/3/11 10:01 PM:

There is still a problem - first the good news:

- If the user calls Document.get(field), the returned string is as before, so
there is no break at all. The reason is the implementation of
NumericField.stringValue(): it returns what the user is used to from 3.0.
- If a user calls getFieldable(field) all is fine, too. The only change is that
it could return NumericField now. If the user simply calls stringValue() all is
identical to 3.0.

Problems start with:

- If the user calls Document.getField(name) it returns Field (internally it
casts the getFieldable() result to Field). But NumericField does not subclass
Field, so this throws ClassCastException.

How to handle this?

Not sure how to proceed. Otherwise the patch looks fine. I think simply ignoring
LazyField loading is fine, as numeric fields are a maximum of 8 bytes. Otherwise
we would need LazyNumericField :(

was (Author: thetaphi):
There is still a problem - first the good news:

Problems start with:

- If user calls Document.getField(name) it returns Field (internally it casts
the getFieldable() result to Field). But NumericField does not subclass Field
- ClassCastException.

How to handle this?

Not sure how to proceed. Else the patch looks fine. I think simply ignoring
LazyField loading is fine, as numeric fields are a maximum of 8 bytes. Else
we would need LazyNumericField :(


[jira] [Commented] (LUCENE-3066) NullPointerException when calling sizeInBytes and setHasVectors concurrently.

[
https://issues.apache.org/jira/browse/LUCENE-3066?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13028480#comment-13028480
]

Michael McCandless commented on LUCENE-3066:

SegmentInfo is not actually thread safe; access to it inside Lucene is supposed
to be guarded by IndexWriter's monitor lock.

That said, this issue looks a lot like LUCENE-3051 -- is that where/how you hit
a problem here? Or something else...?
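
As a rough illustration of what "guarded by a monitor lock" means here, the sketch below uses a hypothetical Info class whose cached list can be cleared by one method while another is reading it. It is not SegmentInfo, and the shared writerLock object merely stands in for the IndexWriter instance; the point is only that every access happens inside synchronized blocks on the same object.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.atomic.AtomicInteger;

final class MonitorGuardSketch {
    /** Not thread safe on its own: clearCache() can null the list between the check and the loop in size(). */
    static final class Info {
        private List<Long> cachedSizes;

        long size() {
            if (cachedSizes == null) {
                cachedSizes = new ArrayList<Long>();
                cachedSizes.add(1L);
                cachedSizes.add(2L);
            }
            long sum = 0;
            // Without external locking, cachedSizes may have become null again by now.
            for (long s : cachedSizes) {
                sum += s;
            }
            return sum;
        }

        void clearCache() {
            cachedSizes = null;
        }
    }

    /** Runs a reader and a mutator thread, both holding the same monitor; returns the number of failures seen. */
    static int runGuarded(final int iterations) {
        final Info info = new Info();
        final Object writerLock = new Object(); // stand-in for the owning writer's monitor
        final AtomicInteger failures = new AtomicInteger();

        Thread reader = new Thread(() -> {
            for (int i = 0; i < iterations; i++) {
                try {
                    synchronized (writerLock) {
                        if (info.size() != 3L) {
                            failures.incrementAndGet();
                        }
                    }
                } catch (RuntimeException e) {
                    failures.incrementAndGet();
                }
            }
        });
        Thread mutator = new Thread(() -> {
            for (int i = 0; i < iterations; i++) {
                synchronized (writerLock) {
                    info.clearCache();
                }
            }
        });

        reader.start();
        mutator.start();
        try {
            reader.join();
            mutator.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return failures.get();
    }

    public static void main(String[] args) {
        System.out.println("failures with lock held: " + runGuarded(100000));
    }
}
```

Calling size() and clearCache() from two threads without the synchronized blocks is the unsupported usage; whether it fails in a given run depends on the thread interleaving.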

NullPointerException when calling sizeInBytes and setHasVectors concurrently.
-


[jira] [Commented] (LUCENE-3065) NumericField should be stored in binary format in index (matching Solr's format)


[ 
https://issues.apache.org/jira/browse/LUCENE-3065?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13028483#comment-13028483
 ] 

Michael McCandless commented on LUCENE-3065:


Ugh!  Field/Fieldable/AbstractField strikes again... hmm, not sure what to do.

 NumericField should be stored in binary format in index (matching Solr's 
 format)
 

 Key: LUCENE-3065
 URL: https://issues.apache.org/jira/browse/LUCENE-3065
 Project: Lucene - Java
  Issue Type: Bug
  Components: Index
Reporter: Michael McCandless
Priority: Minor
 Fix For: 3.2, 4.0

 Attachments: LUCENE-3065.patch


 (Spinoff of LUCENE-3001)
 Today when writing stored fields we don't record that the field was a 
 NumericField, and so at IndexReader time you get back an ordinary Field and 
 your number has turned into a string.  See 
 https://issues.apache.org/jira/browse/LUCENE-1701?focusedCommentId=12721972&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-12721972
 We have spare bits already in stored fields, so, we should use one to record 
 that the field is numeric, and then encode the numeric field in Solr's 
 more-compact binary format.
 A nice side-effect is we fix the long standing issue that you don't get a 
 NumericField back when loading your document.


[jira] [Closed] (LUCENE-3066) NullPointerException when calling sizeInBytes and setHasVectors concurrently.

[
https://issues.apache.org/jira/browse/LUCENE-3066?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]

Adrian Nistor closed LUCENE-3066.
-

Resolution: Not A Problem

NullPointerException when calling sizeInBytes and setHasVectors concurrently.
-


[jira] [Commented] (LUCENE-3066) NullPointerException when calling sizeInBytes and setHasVectors concurrently.

[
https://issues.apache.org/jira/browse/LUCENE-3066?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13028496#comment-13028496
]

Adrian Nistor commented on LUCENE-3066:
---

Hi Michael,

Thanks for the super fast reply!

bq. SegmentInfo is not actually thread safe; access to it inside Lucene is
supposed to be guarded by IndexWriter's monitor lock.

Ah, very sorry, I did not realize this.

bq. That said, this issue looks a lot like LUCENE-3051 - is that where/how you
hit a problem here? Or something else...?

No, totally unrelated. I am testing a tool for testing concurrent code. I
assumed that SegmentInfo is supposed to be thread safe and thus a good
candidate for testing.

Thanks again for your reply and very sorry for the trouble!

Thanks!

Adrian

NullPointerException when calling sizeInBytes and setHasVectors concurrently.
-


[jira] [Commented] (LUCENE-3065) NumericField should be stored in binary format in index (matching Solr's format)

2011-05-03 Thread Chris Male (JIRA)


[ 
https://issues.apache.org/jira/browse/LUCENE-3065?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13028526#comment-13028526
 ] 

Chris Male commented on LUCENE-3065:


The Field/Fieldable/AbstractField problem is what I've been addressing in 
LUCENE-2310.  There I took the step of making NumericField extend Field, with a 
series of unsupported fields.  This seemed easiest to do particularly with 
FieldType in mind.  I then deprecated all the Fieldable methods in Document.


[jira] [Commented] (LUCENE-3065) NumericField should be stored in binary format in index (matching Solr's format)