[jira] [Commented] (SOLR-4449) Enable backup requests for the internal solr load balancer

2013-02-25 Thread philip hoy (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-4449?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13585772#comment-13585772
 ] 

philip hoy commented on SOLR-4449:
--

Hi Raintang, my apologies: I think I have been sloppy in my language, and that 
may have misled you. When I referred to shards in my previous comments, I 
really meant replicas. This JIRA covers how the replicas are load balanced, 
not how the shard/slice requests are managed; I would think that should be 
dealt with in a separate issue. 

 Enable backup requests for the internal solr load balancer
 --

 Key: SOLR-4449
 URL: https://issues.apache.org/jira/browse/SOLR-4449
 Project: Solr
  Issue Type: New Feature
  Components: SolrCloud
Reporter: philip hoy
Priority: Minor
 Attachments: patch-4449.txt, SOLR-4449.patch


 Add the ability to configure the built-in Solr load balancer so that it 
 submits a backup request to the next server in the list if the initial 
 request takes too long. Employing such an algorithm could improve latency 
 at the 9Xth percentiles, albeit at the expense of increased overall load 
 due to the additional requests. 
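A minimal sketch of such a backup-request policy in plain Java (hypothetical names, not the patch's actual code): fire the request at the first server, and if no response arrives within a delay, also fire it at the next server and take whichever answers first.

```java
import java.util.concurrent.*;

// Sketch of a "backup request" policy. The "servers" here are simulated
// with sleeps; in the real load balancer these would be HTTP requests.
public class BackupRequestSketch {

    // Simulated server call: sleeps latencyMs, then returns its name.
    static Callable<String> server(String name, long latencyMs) {
        return () -> { Thread.sleep(latencyMs); return name; };
    }

    static String queryWithBackup(ExecutorService pool,
                                  Callable<String> primary,
                                  Callable<String> backup,
                                  long backupDelayMs) throws Exception {
        CompletionService<String> cs = new ExecutorCompletionService<>(pool);
        cs.submit(primary);
        // Wait briefly for the primary; if it is too slow, submit the backup.
        Future<String> first = cs.poll(backupDelayMs, TimeUnit.MILLISECONDS);
        if (first != null) return first.get();
        cs.submit(backup);
        return cs.take().get();  // whichever of the two completes first
    }

    public static void main(String[] args) throws Exception {
        ExecutorService pool = Executors.newFixedThreadPool(2);
        // Primary is slow (200ms), backup is fast (10ms), backup fires at 50ms:
        String winner = queryWithBackup(pool, server("slow", 200),
                                        server("fast", 10), 50);
        System.out.println(winner);  // prints "fast"
        pool.shutdownNow();
    }
}
```

This is exactly the tradeoff described above: tail latency improves because a slow replica no longer gates the response, at the cost of occasional duplicate work.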

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



Sorting issue

2013-02-25 Thread Chaturvedi, Puneet
Hi,
I have an issue with sorting in Solr. I want to sort the Solr results by a 
field whose indexed property is set to true and which is not multivalued. I 
am setting the sorting parameters using the addSortField method, yet the 
results come back unsorted. Can you please guide me as to what needs to be 
done to solve this issue?

Thanks & Regards,
Puneet Chaturvedi


Line length in Lucene/Solr code

2013-02-25 Thread Toke Eskildsen
According to https://wiki.apache.org/solr/HowToContribute, Sun's code
style conventions should be used when writing contributions for Lucene
and Solr. Said conventions state that lines of code should be 80
characters or less, since longer lines are not handled well by many
terminals and tools:
http://www.oracle.com/technetwork/java/javase/documentation/codeconventions-136091.html#313

A quick random inspection of the Lucene/Solr code base tells me that
this recommendation is not followed: Out of 20 source files, only a
single one adhered to the 80 characters/line limit and that was
StorageField, which is an interface.

I am all for a larger limit as I find that it makes Java code a lot more
readable. With current tools, Java code needs to be formatted using line
breaks and indents (as opposed to fully dynamic tool-specific re-flow of
the code). That formatting is dependent on a specific maximum line width
to be consistent.


With that in mind, I suggest that the code style recommendation is
expanded with the notion that a maximum of x characters/line should be
used, where x is something more than 80. Judging by a quick search, 120
chars seems to be a common choice.
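Whatever limit is chosen, it can also be enforced mechanically. A hypothetical Checkstyle fragment along these lines would do it (module placement varies between Checkstyle versions; in some versions LineLength lives under TreeWalker rather than directly under Checker):

```xml
<!-- Sketch: enforce a 120-character line limit via Checkstyle. -->
<module name="Checker">
  <module name="LineLength">
    <property name="max" value="120"/>
  </module>
</module>
```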

Regards,
Toke Eskildsen





RE: Line length in Lucene/Solr code

2013-02-25 Thread Uwe Schindler
+1 to raise the default of 80 to a minimum of 120. I really hate short lines 
(and I find that the longer lines are much more readable) :-)

-
Uwe Schindler
H.-H.-Meier-Allee 63, D-28213 Bremen
http://www.thetaphi.de
eMail: u...@thetaphi.de






Re: Line length in Lucene/Solr code

2013-02-25 Thread Michael McCandless
+1

Mike McCandless

http://blog.mikemccandless.com





Re: Line length in Lucene/Solr code

2013-02-25 Thread Shai Erera
+1

Shai






Re: Line length in Lucene/Solr code

2013-02-25 Thread Christian Moen
+1

Christian Moen
http://www.atilika.com




Re: Line length in Lucene/Solr code

2013-02-25 Thread Simon Willnauer
+1




[jira] [Updated] (LUCENE-4771) Query-time join collectors could maybe be more efficient

2013-02-25 Thread Martijn van Groningen (JIRA)

 [ 
https://issues.apache.org/jira/browse/LUCENE-4771?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Martijn van Groningen updated LUCENE-4771:
--

Attachment: LUCENE-4771-prototype.patch

Updated patch with the current trunk codebase.

 Query-time join collectors could maybe be more efficient
 

 Key: LUCENE-4771
 URL: https://issues.apache.org/jira/browse/LUCENE-4771
 Project: Lucene - Core
  Issue Type: Improvement
  Components: modules/join
Reporter: Robert Muir
 Attachments: LUCENE-4771_prototype.patch, 
 LUCENE-4771-prototype.patch, LUCENE-4771_prototype_without_bug.patch


 I was looking at these collectors on LUCENE-4765 and I noticed:
 * The single-valued (SV) collector pulls FieldCache.getTerms and adds the 
 bytes to a BytesRefHash per collect.
 * The multi-valued (MV) collector pulls FieldCache.getDocTermsOrds, but 
 doesn't use the ords; it just looks up each value and adds the bytes per 
 collect.
 I think instead it's worth investigating whether SV should use getTermsIndex, 
 and whether both collectors should just collect up their per-segment ords in 
 something like a BitSet[maxOrd]. 
 When asked for the terms at the end in getCollectorTerms(), they could merge 
 these into one BytesRefHash.
 Of course, if you are going to turn around and execute the query against the 
 same searcher anyway (is this the typical case?), this could be even more 
 efficient: no need to hash or instantiate all the terms in memory; we could 
 postpone the lookups to SeekingTermSetTermsEnum.accept()/nextSeekTerm(), I 
 think... somehow :)
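A toy illustration of the collect-ords-then-merge idea (hypothetical names and plain collections standing in for the join module's classes and for BytesRefHash):

```java
import java.util.*;

// Sketch: each segment's collector marks seen ordinals in a BitSet sized to
// that segment's maxOrd; at the end, each collected ord is resolved to its
// term exactly once and merged into a single deduplicated set.
public class OrdCollectSketch {

    // perSegmentOrds: one BitSet of collected ords per segment.
    // perSegmentTermDicts: each segment's term dictionary (ord -> term).
    static Set<String> mergeCollectedOrds(List<BitSet> perSegmentOrds,
                                          List<String[]> perSegmentTermDicts) {
        Set<String> merged = new HashSet<>();   // stand-in for BytesRefHash
        for (int seg = 0; seg < perSegmentOrds.size(); seg++) {
            BitSet ords = perSegmentOrds.get(seg);
            String[] dict = perSegmentTermDicts.get(seg);
            // Resolve each collected ord only once, at the end.
            for (int ord = ords.nextSetBit(0); ord >= 0;
                 ord = ords.nextSetBit(ord + 1)) {
                merged.add(dict[ord]);
            }
        }
        return merged;
    }

    public static void main(String[] args) {
        BitSet seg0 = new BitSet(); seg0.set(0); seg0.set(2);
        BitSet seg1 = new BitSet(); seg1.set(1);
        List<String[]> dicts = Arrays.asList(
            new String[] {"apple", "banana", "cherry"},
            new String[] {"apple", "durian"});
        Set<String> merged =
            mergeCollectedOrds(Arrays.asList(seg0, seg1), dicts);
        System.out.println(new TreeSet<>(merged));  // [apple, cherry, durian]
    }
}
```

The point of the proposal is that the hot per-collect path touches only a bitset, and the per-term byte lookups are deferred to a single pass at the end.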




Race condition in Solr, plea for help and/or advice

2013-02-25 Thread Erick Erickson
OK, in working on SOLR-4196 I'm exercising opening/closing cores as never
before. I have a little stress program that does about the worst thing
possible, essentially opens and closes a core for every request. It has a
bunch of query and update threads running simultaneously that pick a random
core and do a query or update. I've got a bunch of code that, I think,
prevents any attempt to open _or_ close a core while it is being either
opened or closed by another thread (but I'm verifying).

It runs fine for a couple of hours, then hits a race condition. I was able
to get a stack trace (see below).

CloserThread.run(CoreContainer.java:1920) (second thread below) is, indeed,
new code. The stress-test program is updating cores (which may not be
loaded) like crazy and doing queries on other random cores. It's perfectly
possible to be updating a core that's in the process of being closed, I was
counting on the ref counting to make this OK... The cores are transient in
a limited cache, so they come and go. It looks like I'm trying to close a
core at the same time an update has come in, but I'm not sure whether this
is something that should be prevented from the new code or is an underlying
problem.

So a couple of questions:

1. SOLR-4196 has a whole series of improvements that even let us get here.
Running the stress test program against current trunk barfs before having
time to hit this condition, so the current state is an improvement. What do
you think about me checking 4196 in and opening a separate JIRA for this
issue?

2. Any suggestions on what direction to go next? If it's something easy, I
can just fold it into this patch.

3. Am I just going about things bass-ackwards? Not an unusual state of
affairs, unfortunately.

NOTE: The current patch for SOLR-4196 isn't the one running with this code;
there are a couple more things I want to change. Mostly I'm asking whether
someone familiar with the code where the race is encountered has a quick fix.

Thanks,
Erick


Found one Java-level deadlock:
=
commitScheduler-122579-thread-1:
  waiting to lock monitor 7f87c3076d38 (object 78b379a28, a
org.apache.solr.update.DefaultSolrCoreState),
  which is held by Thread-15
Thread-15:
  waiting for ownable synchronizer 765e84638, (a
java.util.concurrent.locks.ReentrantLock$NonfairSync),
  which is held by commitScheduler-122579-thread-1

Java stack information for the threads listed above:
===
commitScheduler-122579-thread-1:
at
org.apache.solr.update.DefaultSolrCoreState.getIndexWriter(DefaultSolrCoreState.java:82)
- waiting to lock 78b379a28 (a
org.apache.solr.update.DefaultSolrCoreState)
at org.apache.solr.core.SolrCore.openNewSearcher(SolrCore.java:1354)
at
org.apache.solr.update.DirectUpdateHandler2.commit(DirectUpdateHandler2.java:573)
- locked 76aa46f58 (a java.lang.Object)
at org.apache.solr.update.CommitTracker.run(CommitTracker.java:216)
at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:439)
at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:303)
at java.util.concurrent.FutureTask.run(FutureTask.java:138)
at
java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$301(ScheduledThreadPoolExecutor.java:98)
at
java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:206)
at
java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
at
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
at java.lang.Thread.run(Thread.java:680)
 Thread-15:
at sun.misc.Unsafe.park(Native Method)
- parking to wait for  765e84638 (a
java.util.concurrent.locks.ReentrantLock$NonfairSync)
at java.util.concurrent.locks.LockSupport.park(LockSupport.java:156)
at
java.util.concurrent.locks.AbstractQueuedSynchronizer.parkAndCheckInterrupt(AbstractQueuedSynchronizer.java:811)
at
java.util.concurrent.locks.AbstractQueuedSynchronizer.acquireQueued(AbstractQueuedSynchronizer.java:842)
at
java.util.concurrent.locks.AbstractQueuedSynchronizer.acquire(AbstractQueuedSynchronizer.java:1178)
at
java.util.concurrent.locks.ReentrantLock$NonfairSync.lock(ReentrantLock.java:186)
at java.util.concurrent.locks.ReentrantLock.lock(ReentrantLock.java:262)
at
org.apache.solr.update.DirectUpdateHandler2.closeWriter(DirectUpdateHandler2.java:680)
at
org.apache.solr.update.DefaultSolrCoreState.closeIndexWriter(DefaultSolrCoreState.java:68)
- locked 78b379a28 (a org.apache.solr.update.DefaultSolrCoreState)
at
org.apache.solr.update.DefaultSolrCoreState.close(DefaultSolrCoreState.java:289)
- locked 78b379a28 (a org.apache.solr.update.DefaultSolrCoreState)
at
org.apache.solr.update.SolrCoreState.decrefSolrCoreState(SolrCoreState.java:68)
- locked 78b379a28 (a org.apache.solr.update.DefaultSolrCoreState)
at org.apache.solr.core.SolrCore.close(SolrCore.java:975)
at org.apache.solr.core.CloserThread.run(CoreContainer.java:1920)

Found 1 deadlock.
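The trace shows a classic cross-acquisition cycle: the commit thread holds the DirectUpdateHandler2 ReentrantLock and waits on the DefaultSolrCoreState monitor, while the closer holds the monitor and waits on the ReentrantLock. A minimal sketch of the standard cure (hypothetical names, not the Solr code itself): make both paths acquire the two locks in one agreed order, so the circular wait cannot form.

```java
import java.util.concurrent.locks.ReentrantLock;

// Sketch: coreState stands in for the SolrCoreState monitor, writerLock for
// the update handler's ReentrantLock. Both paths take coreState first, then
// writerLock, so neither can hold one while waiting on the other.
public class LockOrderSketch {
    static final Object coreState = new Object();
    static final ReentrantLock writerLock = new ReentrantLock();
    static int work = 0;

    static void commitPath() {
        synchronized (coreState) {
            writerLock.lock();
            try { work++; } finally { writerLock.unlock(); }
        }
    }

    static void closePath() {
        synchronized (coreState) {
            writerLock.lock();
            try { work++; } finally { writerLock.unlock(); }
        }
    }

    static int runBoth() throws InterruptedException {
        work = 0;
        Thread t1 = new Thread(() -> { for (int i = 0; i < 1000; i++) commitPath(); });
        Thread t2 = new Thread(() -> { for (int i = 0; i < 1000; i++) closePath(); });
        t1.start(); t2.start();
        t1.join(); t2.join();
        return work;  // 2000: both threads finished, no deadlock
    }

    public static void main(String[] args) throws Exception {
        System.out.println(runBoth());
    }
}
```

Whether a consistent ordering is feasible here depends on the actual call graph between SolrCore.close() and the commit scheduler, which is what the question above is really about.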


[jira] [Created] (LUCENE-4795) Add FacetsCollector based on SortedSetDocValues

2013-02-25 Thread Michael McCandless (JIRA)
Michael McCandless created LUCENE-4795:
--

 Summary: Add FacetsCollector based on SortedSetDocValues
 Key: LUCENE-4795
 URL: https://issues.apache.org/jira/browse/LUCENE-4795
 Project: Lucene - Core
  Issue Type: Improvement
  Components: modules/facet
Reporter: Michael McCandless


Recently (LUCENE-4765) we added multi-valued DocValues field
(SortedSetDocValuesField), and this can be used for faceting in Solr
(SOLR-4490).  I think we should also add support in the facet module?

It'd be an option with different tradeoffs.  Eg, it wouldn't require
the taxonomy index, since the main index handles label/ord resolving.

There are at least two possible approaches:

  * On every reopen, build the seg -> global ord map, and then on
every collect, get the seg ord, map it to the global ord space,
and increment counts.  This adds cost during reopen in proportion
to number of unique terms ...

  * On every collect, increment counts based on the seg ords, and then
do a merge in the end just like distributed faceting does.

The first approach is much easier so I built a quick prototype using
that.  The prototype does the counting, but it does NOT do the top K
facets gathering in the end, and it doesn't know parent/child ord
relationships, so there's tons more to do before this is real.  I also
was unsure how to properly integrate it since the existing classes
seem to expect that you use a taxonomy index to resolve ords.
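A toy illustration of the first approach (hypothetical names; plain arrays standing in for the prototype's actual classes):

```java
import java.util.*;

// Sketch: at reopen time, merge the per-segment term dictionaries into a
// global ord space and build a segOrd -> globalOrd map per segment; at
// collect time, counting a hit is then a single array increment.
public class GlobalOrdCountSketch {

    // segDicts[s] is segment s's sorted term dictionary (seg ord -> term);
    // hits[s] is the list of seg ords seen while collecting segment s.
    static Map<String, Integer> count(String[][] segDicts, int[][] hits) {
        // Reopen-time work, proportional to the number of unique terms.
        TreeSet<String> all = new TreeSet<>();
        for (String[] d : segDicts) all.addAll(Arrays.asList(d));
        List<String> global = new ArrayList<>(all);
        int[][] segToGlobal = new int[segDicts.length][];
        for (int s = 0; s < segDicts.length; s++) {
            segToGlobal[s] = new int[segDicts[s].length];
            for (int o = 0; o < segDicts[s].length; o++)
                segToGlobal[s][o] = Collections.binarySearch(global, segDicts[s][o]);
        }
        // Collect-time work: map each seg ord to its global ord, increment.
        int[] counts = new int[global.size()];
        for (int s = 0; s < hits.length; s++)
            for (int ord : hits[s]) counts[segToGlobal[s][ord]]++;
        Map<String, Integer> result = new TreeMap<>();
        for (int g = 0; g < global.size(); g++)
            result.put(global.get(g), counts[g]);
        return result;
    }

    public static void main(String[] args) {
        System.out.println(count(
            new String[][] { {"apple", "cherry"}, {"apple", "banana"} },
            new int[][] { {0, 1}, {0, 1, 1} }));
        // prints {apple=2, banana=2, cherry=1}
    }
}
```

The second approach would skip the reopen-time map and instead merge per-segment counts at the end, like distributed faceting.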

I ran a quick performance test.  base = trunk except I disabled the
compute top-K in FacetsAccumulator to make the comparison fair; comp
= using the prototype collector in the patch:

{noformat}
Task                 QPS base  StdDev  QPS comp  StdDev    Pct diff
OrHighLow               18.79  (2.5%)     14.36  (3.3%)  -23.6% ( -28% - -18%)
HighTerm                21.58  (2.4%)     16.53  (3.7%)  -23.4% ( -28% - -17%)
OrHighMed               18.20  (2.5%)     13.99  (3.3%)  -23.2% ( -28% - -17%)
Prefix3                 14.37  (1.5%)     11.62  (3.5%)  -19.1% ( -23% - -14%)
LowTerm                130.80  (1.6%)    106.95  (2.4%)  -18.2% ( -21% - -14%)
OrHighHigh               9.60  (2.6%)      7.88  (3.5%)  -17.9% ( -23% - -12%)
AndHighHigh             24.61  (0.7%)     20.74  (1.9%)  -15.7% ( -18% - -13%)
Fuzzy1                  49.40  (2.5%)     43.48  (1.9%)  -12.0% ( -15% -  -7%)
MedSloppyPhrase         27.06  (1.6%)     23.95  (2.3%)  -11.5% ( -15% -  -7%)
MedTerm                 51.43  (2.0%)     46.21  (2.7%)  -10.2% ( -14% -  -5%)
IntNRQ                   4.02  (1.6%)      3.63  (4.0%)   -9.7% ( -15% -  -4%)
Wildcard                29.14  (1.5%)     26.46  (2.5%)   -9.2% ( -13% -  -5%)
HighSloppyPhrase         0.92  (4.5%)      0.87  (5.8%)   -5.4% ( -15% -   5%)
MedSpanNear             29.51  (2.5%)     27.94  (2.2%)   -5.3% (  -9% -   0%)
HighSpanNear             3.55  (2.4%)      3.38  (2.0%)   -4.9% (  -9% -   0%)
AndHighMed             108.34  (0.9%)    104.55  (1.1%)   -3.5% (  -5% -  -1%)
LowSloppyPhrase         20.50  (2.0%)     20.09  (4.2%)   -2.0% (  -8% -   4%)
LowPhrase               21.60  (6.0%)     21.26  (5.1%)   -1.6% ( -11% -  10%)
Fuzzy2                  53.16  (3.9%)     52.40  (2.7%)   -1.4% (  -7% -   5%)
LowSpanNear              8.42  (3.2%)      8.45  (3.0%)    0.3% (  -5% -   6%)
Respell                 45.17  (4.3%)     45.38  (4.4%)    0.5% (  -7% -   9%)
MedPhrase              113.93  (5.8%)    115.02  (4.9%)    1.0% (  -9% -  12%)
AndHighLow             596.42  (2.5%)    617.12  (2.8%)    3.5% (  -1% -   8%)
HighPhrase              17.30 (10.5%)     18.36  (9.1%)    6.2% ( -12% -  28%)
{noformat}

I'm impressed that this approach is only ~24% slower in the worst
case!  I think this means it's a good option to make available?  Yes
it has downsides (NRT reopen more costly, small added RAM usage,
slightly slower faceting), but it's also simpler (no taxo index to
manage).





[jira] [Updated] (LUCENE-4795) Add FacetsCollector based on SortedSetDocValues

2013-02-25 Thread Michael McCandless (JIRA)

 [ 
https://issues.apache.org/jira/browse/LUCENE-4795?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Michael McCandless updated LUCENE-4795:
---

Attachment: LUCENE-4795.patch

 Add FacetsCollector based on SortedSetDocValues
 ---

 Key: LUCENE-4795
 URL: https://issues.apache.org/jira/browse/LUCENE-4795
 Project: Lucene - Core
  Issue Type: Improvement
  Components: modules/facet
Reporter: Michael McCandless
 Attachments: LUCENE-4795.patch



[jira] [Updated] (LUCENE-4795) Add FacetsCollector based on SortedSetDocValues

2013-02-25 Thread Robert Muir (JIRA)

 [ 
https://issues.apache.org/jira/browse/LUCENE-4795?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Robert Muir updated LUCENE-4795:


Attachment: pleaseBenchmarkMe.patch

Thanks for benchmarking this approach, Mike! 

I'm happy with the results, though I still added a TODO that we should 
investigate the cost of the special packed-ints compression we do.

Can you benchmark the attached change, just out of curiosity?

 Add FacetsCollector based on SortedSetDocValues
 ---

 Key: LUCENE-4795
 URL: https://issues.apache.org/jira/browse/LUCENE-4795
 Project: Lucene - Core
  Issue Type: Improvement
  Components: modules/facet
Reporter: Michael McCandless
 Attachments: LUCENE-4795.patch, pleaseBenchmarkMe.patch


 Recently (LUCENE-4765) we added multi-valued DocValues field
 (SortedSetDocValuesField), and this can be used for faceting in Solr
 (SOLR-4490).  I think we should also add support in the facet module?
 It'd be an option with different tradeoffs.  Eg, it wouldn't require
 the taxonomy index, since the main index handles label/ord resolving.
 There are at least two possible approaches:
   * On every reopen, build the seg - global ord map, and then on
 every collect, get the seg ord, map it to the global ord space,
 and increment counts.  This adds cost during reopen in proportion
 to number of unique terms ...
   * On every collect, increment counts based on the seg ords, and then
 do a merge in the end just like distributed faceting does.
 The first approach is much easier so I built a quick prototype using
 that.  The prototype does the counting, but it does NOT do the top K
 facets gathering in the end, and it doesn't know parent/child ord
 relationships, so there's tons more to do before this is real.  I also
 was unsure how to properly integrate it since the existing classes
 seem to expect that you use a taxonomy index to resolve ords.
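The two approaches above can be sketched with plain Java arrays standing in for Lucene's per-segment ords and the segment-to-global ordinal map (the class and data below are illustrative toys, not the patch's actual code):

```java
import java.util.Arrays;

public class OrdMappingSketch {
    // Toy stand-ins for Lucene's per-segment term ords: two segments,
    // SEG_TO_GLOBAL[seg][segOrd] gives the global ord (in Lucene this
    // map would be built from the term dictionaries on reopen).
    static final int NUM_GLOBAL_ORDS = 4;
    static final int[][] SEG_TO_GLOBAL = { {0, 2, 3}, {1, 2} };
    // Seg ords seen per collected doc, per segment.
    static final int[][] HITS = { {0, 1, 1, 2}, {0, 1, 1} };

    // Approach 1: map every collected seg ord to the global space.
    static int[] countViaGlobalMap() {
        int[] counts = new int[NUM_GLOBAL_ORDS];
        for (int seg = 0; seg < SEG_TO_GLOBAL.length; seg++) {
            for (int segOrd : HITS[seg]) {
                counts[SEG_TO_GLOBAL[seg][segOrd]]++;
            }
        }
        return counts;
    }

    // Approach 2: count on seg ords, merge into the global space at the end.
    static int[] countViaMerge() {
        int[] counts = new int[NUM_GLOBAL_ORDS];
        for (int seg = 0; seg < SEG_TO_GLOBAL.length; seg++) {
            int[] segCounts = new int[SEG_TO_GLOBAL[seg].length];
            for (int segOrd : HITS[seg]) {
                segCounts[segOrd]++;
            }
            for (int segOrd = 0; segOrd < segCounts.length; segOrd++) {
                counts[SEG_TO_GLOBAL[seg][segOrd]] += segCounts[segOrd];
            }
        }
        return counts;
    }

    public static void main(String[] args) {
        System.out.println(Arrays.toString(countViaGlobalMap())); // [1, 1, 4, 1]
        System.out.println(Arrays.toString(countViaMerge()));     // [1, 1, 4, 1]
    }
}
```

Both approaches produce identical counts; the difference is only where the translation cost is paid (per collect vs. at merge time).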
 I ran a quick performance test.  base = trunk except I disabled the
 compute top-K in FacetsAccumulator to make the comparison fair; comp
 = using the prototype collector in the patch:
 {noformat}
 Task                  QPS base  StdDev    QPS comp  StdDev      Pct diff
 OrHighLow                18.79  (2.5%)      14.36  (3.3%)  -23.6% ( -28% - -18%)
 HighTerm                 21.58  (2.4%)      16.53  (3.7%)  -23.4% ( -28% - -17%)
 OrHighMed                18.20  (2.5%)      13.99  (3.3%)  -23.2% ( -28% - -17%)
 Prefix3                  14.37  (1.5%)      11.62  (3.5%)  -19.1% ( -23% - -14%)
 LowTerm                 130.80  (1.6%)     106.95  (2.4%)  -18.2% ( -21% - -14%)
 OrHighHigh                9.60  (2.6%)       7.88  (3.5%)  -17.9% ( -23% - -12%)
 AndHighHigh              24.61  (0.7%)      20.74  (1.9%)  -15.7% ( -18% - -13%)
 Fuzzy1                   49.40  (2.5%)      43.48  (1.9%)  -12.0% ( -15% -  -7%)
 MedSloppyPhrase          27.06  (1.6%)      23.95  (2.3%)  -11.5% ( -15% -  -7%)
 MedTerm                  51.43  (2.0%)      46.21  (2.7%)  -10.2% ( -14% -  -5%)
 IntNRQ                    4.02  (1.6%)       3.63  (4.0%)   -9.7% ( -15% -  -4%)
 Wildcard                 29.14  (1.5%)      26.46  (2.5%)   -9.2% ( -13% -  -5%)
 HighSloppyPhrase          0.92  (4.5%)       0.87  (5.8%)   -5.4% ( -15% -   5%)
 MedSpanNear              29.51  (2.5%)      27.94  (2.2%)   -5.3% (  -9% -   0%)
 HighSpanNear              3.55  (2.4%)       3.38  (2.0%)   -4.9% (  -9% -   0%)
 AndHighMed              108.34  (0.9%)     104.55  (1.1%)   -3.5% (  -5% -  -1%)
 LowSloppyPhrase          20.50  (2.0%)      20.09  (4.2%)   -2.0% (  -8% -   4%)
 LowPhrase                21.60  (6.0%)      21.26  (5.1%)   -1.6% ( -11% -  10%)
 Fuzzy2                   53.16  (3.9%)      52.40  (2.7%)   -1.4% (  -7% -   5%)
 LowSpanNear               8.42  (3.2%)       8.45  (3.0%)    0.3% (  -5% -   6%)
 Respell                  45.17  (4.3%)      45.38  (4.4%)    0.5% (  -7% -   9%)
 MedPhrase               113.93  (5.8%)     115.02  (4.9%)    1.0% (  -9% -  12%)
 AndHighLow              596.42  (2.5%)     617.12  (2.8%)    3.5% (  -1% -   8%)
 HighPhrase               17.30 (10.5%)      18.36  (9.1%)    6.2% ( -12% -  28%)
 {noformat}
 I'm impressed that this approach is only ~24% slower in the worst
 case!  I think this means it's a good option to make available?  Yes
 it has downsides (NRT reopen more costly, small added 

[jira] [Commented] (LUCENE-4795) Add FacetsCollector based on SortedSetDocValues

2013-02-25 Thread Adrien Grand (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-4795?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13585878#comment-13585878
 ] 

Adrien Grand commented on LUCENE-4795:
--

Not having to manage a taxonomy index is very appealing to me!

What about collecting based on segment ords and bulk translating these ords to 
the global ords in setNextReader and when the collection ends? This way 
ordinalMap.get would be called less often (once per value per segment instead 
of once per value per doc per segment) and in a sequential way so I assume it 
would be faster while remaining easy to implement?
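A toy sketch of the saving Adrien describes, with a counter standing in for ordinalMap.get (the class and data are hypothetical): over 7 collected hits spanning 3 unique segment ords, per-collect translation pays one lookup per hit, while bulk translation at the segment boundary pays one sequential lookup per unique seg ord.

```java
public class BulkTranslateSketch {
    static final int[] SEG_TO_GLOBAL = {0, 2, 3};            // 3 unique terms in segment
    static final int[] DOC_SEG_ORDS = {0, 1, 1, 2, 1, 0, 2}; // 7 collected hits
    static final int NUM_GLOBAL_ORDS = 4;
    static int lookups;                                       // counts map consultations

    static int toGlobal(int segOrd) {
        lookups++;
        return SEG_TO_GLOBAL[segOrd];
    }

    // Per-collect translation: one ordinal-map lookup per hit.
    static int perCollectLookups() {
        lookups = 0;
        int[] counts = new int[NUM_GLOBAL_ORDS];
        for (int segOrd : DOC_SEG_ORDS) {
            counts[toGlobal(segOrd)]++;
        }
        return lookups;
    }

    // Bulk translation: count on seg ords first, then translate once
    // per unique seg ord, in sequential order, at the segment boundary.
    static int bulkLookups() {
        lookups = 0;
        int[] segCounts = new int[SEG_TO_GLOBAL.length];
        for (int segOrd : DOC_SEG_ORDS) {
            segCounts[segOrd]++;
        }
        int[] counts = new int[NUM_GLOBAL_ORDS];
        for (int segOrd = 0; segOrd < segCounts.length; segOrd++) {
            counts[toGlobal(segOrd)] += segCounts[segOrd];
        }
        return lookups;
    }

    public static void main(String[] args) {
        System.out.println(perCollectLookups() + " vs " + bulkLookups()); // 7 vs 3
    }
}
```

The win grows with the hits-per-unique-ord ratio, and the sequential access pattern is friendlier to the packed-ints encoding of the map.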


Re: Line length in Lucene/Solr code

2013-02-25 Thread Jan Høydahl
+1
--
Jan Høydahl, search solution architect
Cominvent AS - www.cominvent.com
Solr Training - www.solrtraining.com

25. feb. 2013 kl. 13:37 skrev Simon Willnauer simon.willna...@gmail.com:

 +1
 
 On Mon, Feb 25, 2013 at 12:05 PM, Christian Moen c...@atilika.com wrote:
 +1
 
 Christian Moen
 http://www.atilika.com
 
 On Feb 25, 2013, at 8:01 PM, Michael McCandless luc...@mikemccandless.com 
 wrote:
 
 +1
 
 Mike McCandless
 
 http://blog.mikemccandless.com
 
 
 On Mon, Feb 25, 2013 at 5:59 AM, Uwe Schindler u...@thetaphi.de wrote:
 +1 to raise the default of 80 to a minimum of 120. I really hate short 
 lines (and I find that the longer lines are much more readable) :-)
 
 -
 Uwe Schindler
 H.-H.-Meier-Allee 63, D-28213 Bremen
 http://www.thetaphi.de
 eMail: u...@thetaphi.de
 
 -Original Message-
 From: Toke Eskildsen [mailto:t...@statsbiblioteket.dk]
 Sent: Monday, February 25, 2013 11:39 AM
 To: dev@lucene.apache.org
 Subject: Line length in Lucene/Solr code
 
 According to https://wiki.apache.org/solr/HowToContribute, Sun's code 
 style
 conventions should be used when writing contributions for Lucene and Solr.
 Said conventions state that lines in code should be 80 characters or less,
 since they're not handled well by many terminals and tools:
 http://www.oracle.com/technetwork/java/javase/documentation/codecon
 ventions-136091.html#313
 
 A quick random inspection of the Lucene/Solr code base tells me that this
 recommendation is not followed: Out of 20 source files, only a single one
 adhered to the 80 characters/line limit and that was StorageField, which 
 is an
 interface.
 
 I am all for a larger limit as I find that it makes Java code a lot more 
 readable.
 With current tools, Java code needs to be formatted using line breaks and
 indents (as opposed to fully dynamic tool-specific re-flow of the code). 
 That
 formatting is dependent on a specific maximum line width to be consistent.
 
 
 With that in mind, I suggest that the code style recommendation is 
 expanded
 with the notion that a maximum of x characters/line should be used, where 
 x
 is something more than 80. Judging by a quick search, 120 chars seems to 
 be
 a common choice.
 
 Regards,
 Toke Eskildsen
 
 



[jira] [Commented] (LUCENE-4795) Add FacetsCollector based on SortedSetDocValues

2013-02-25 Thread Michael McCandless (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-4795?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13585882#comment-13585882
 ] 

Michael McCandless commented on LUCENE-4795:


bq. What about collecting based on segment ords and bulk translating these ords 
to the global ords in setNextReader and when the collection ends? 

That sounds great!  I'll try that.




[jira] [Updated] (SOLR-839) XML Query Parser support

2013-02-25 Thread Erik Hatcher (JIRA)

 [ 
https://issues.apache.org/jira/browse/SOLR-839?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Erik Hatcher updated SOLR-839:
--

Attachment: SOLR-839-object-parser.patch

This patch depends on the object parsing approach in SOLR-4351.  

This is a different approach from using Lucene's XML Query Parser.  The 
XMLQueryParser is neat and all, but the builders aren't going to work well with 
Solr's schema.

I tinkered with a SolrQueryBuilder, and that mostly works, but nested XML 
queries weren't working, so I revamped using the object parser.

 XML Query Parser support
 

 Key: SOLR-839
 URL: https://issues.apache.org/jira/browse/SOLR-839
 Project: Solr
  Issue Type: New Feature
  Components: query parsers
Affects Versions: 1.3
Reporter: Erik Hatcher
Assignee: Erik Hatcher
 Fix For: 5.0

 Attachments: lucene-xml-query-parser-2.4-dev.jar, 
 SOLR-839-object-parser.patch, SOLR-839.patch


 Lucene contrib includes a query parser that is able to create the 
 full-spectrum of Lucene queries, using an XML data structure.
 This patch adds xml query parser support to Solr.




[jira] [Commented] (LUCENE-4795) Add FacetsCollector based on SortedSetDocValues

2013-02-25 Thread Shai Erera (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-4795?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13585892#comment-13585892
 ] 

Shai Erera commented on LUCENE-4795:


Nice Mike.

If you want to integrate that with the current classes, all you need to do is 
to implement a partial TaxonomyReader, which resolves ordinals to CPs using the 
global ord map? Or actually make that TR the entity responsible for managing 
the global ordinal map, so that TR.doOpenIfChanged opens the new segments 
and updates the global map?

Since this taxonomy, at least currently, doesn't support hierarchical facets, 
you'll need to hack something as a ParallelTaxoArray, but that should be easy 
.. I think.

Is the only benefit in this approach that you don't need to manage a sidecar 
taxonomy index?


Re: Sorting issue

2013-02-25 Thread Erick Erickson
You have to give us some more idea what you _are_ seeing. Just because a
field isn't multiValued doesn't mean you can sort on it, it must also not
be tokenized, so text fields that have an analysis chain are generally not
good candidates for sorting.

What do your solr logs show? What response do you get? Can you sort just by
using the browser URL?

You might want to review:
http://wiki.apache.org/solr/UsingMailingLists

Best
Erick
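
To illustrate Erick's point about tokenized fields, here is a hypothetical schema.xml sketch (field and type names are examples, not from Puneet's actual schema): a sortable field is typically a single-valued, indexed, non-tokenized type such as string, often fed by a copyField from the analyzed field you search on.

```xml
<!-- Hypothetical schema.xml fragment: names are illustrative only. -->
<!-- Good sort candidate: single-valued, indexed, NOT tokenized. -->
<fieldType name="string" class="solr.StrField" sortMissingLast="true"/>
<field name="title_sort" type="string" indexed="true" stored="false"/>

<!-- Poor sort candidate: an analyzed text field is tokenized. -->
<field name="title" type="text_general" indexed="true" stored="true"/>

<!-- Feed the sortable twin from the analyzed field at index time. -->
<copyField source="title" dest="title_sort"/>
```

As Erick suggests, you can test the sort outside SolrJ by putting it directly in the browser URL, for example (host and core name assumed): http://localhost:8983/solr/select?q=*:*&sort=title_sort%20asc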


On Mon, Feb 25, 2013 at 5:28 AM, Chaturvedi, Puneet 
pchaturv...@columnit.com wrote:

  Hi,

 I have an issue with sorting in solr. I want to sort the solr results on
 the basis of a field which has “indexed” property set to “true” and is not
 multivalued. I am setting the sorting parameters using the “addSortField”
 method. Still the solr results are not sorted. Can you please
 guide me as to what needs to be done to solve this issue?

 ** **

 Thanks & Regards,

 Puneet Chaturvedi 



RE: Line length in Lucene/Solr code

2013-02-25 Thread Uwe Schindler
Hi,

One interesting detail: the old 80-column terminal width of IBM PCs, which found 
its way into the Java line-length convention, has in the meantime given way to 
another common width: most terminal applications already default to 132 columns - 
so I would make this number (around 130) the most common standard! Interestingly, 
the avg line length of Lucene code is already smaller!

Uwe

-
Uwe Schindler
H.-H.-Meier-Allee 63, D-28213 Bremen
http://www.thetaphi.de
eMail: u...@thetaphi.de




[jira] [Commented] (SOLR-839) XML Query Parser support

2013-02-25 Thread Erik Hatcher (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-839?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13585897#comment-13585897
 ] 

Erik Hatcher commented on SOLR-839:
---

With the latest patch, these queries work (borrowed from SOLR-4351's tests):

{code}
  <term f="id">11</term>

  <field f="text">Now Cow</field>

  <prefix f="text">brow</prefix>

  <frange l="20" u="24">mul(foo_i,2)</frange>

  <join from="qqq_s" to="www_s">id:10</join>

  <join from="qqq_s" to="www_s"><term f="id">10</term></join>

  <lucene>text:Cow -id:1</lucene>
{code}

The object parser path worked easily, but it's not as powerful as it needs to 
be.  There needs to be a way to make BooleanQuery's (without having to use the 
lucene query parser) and then, like the XMLQueryParser, do stuff with span 
queries and so on.

Maybe it's not worthwhile to have both JSON and XML query parsing, as they both 
should probably use the same infrastructure. But I would hate to see a JSON 
form of XSLT used here. Ideally, the query tree would be defined server-side 
and lean/clean parameters would be passed in to fill in the blanks, possibly 
with some logic based on the values of the parameters (for example, specifying 
in_stock=true would add a filter for inStock:true).

The XMLQParser in the last patch has xsl capability as well, so that you could 
define id.xsl to be:

{code}
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:template match="/Document">
    <term f="id"><xsl:value-of select="id"/></term>
  </xsl:template>
</xsl:stylesheet>
{code}

Then using defType=xml&xsl=id&id=SOLR1000 a term query will be generated.  
(This is too simple an example, as there would be other leaner/cleaner ways 
to do this exact one.)





[jira] [Commented] (LUCENE-4795) Add FacetsCollector based on SortedSetDocValues

2013-02-25 Thread Michael McCandless (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-4795?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13585899#comment-13585899
 ] 

Michael McCandless commented on LUCENE-4795:


Fixed small bug (wasn't counting ord 0); here's the same test as
before, just running on Term & Or queries:

{noformat}
Task            QPS base  StdDev    QPS comp  StdDev      Pct diff
MedTerm            52.70  (3.1%)      40.87  (2.0%)  -22.5% ( -26% - -17%)
OrHighMed          25.54  (3.6%)      20.18  (2.4%)  -21.0% ( -26% - -15%)
HighTerm            9.22  (4.1%)       7.33  (2.4%)  -20.4% ( -25% - -14%)
OrHighHigh         12.92  (3.6%)      10.41  (2.8%)  -19.4% ( -24% - -13%)
OrHighLow          13.12  (3.8%)      10.61  (2.8%)  -19.2% ( -24% - -13%)
LowTerm           145.94  (1.9%)     125.51  (1.6%)  -14.0% ( -17% - -10%)
{noformat}

Then I applied Rob's patch (base = trunk, comp = Rob's + my patch):

{noformat}
Task            QPS base  StdDev    QPS comp  StdDev      Pct diff
MedTerm            52.97  (2.2%)      42.34  (1.6%)  -20.1% ( -23% - -16%)
OrHighMed          25.66  (2.2%)      20.73  (1.7%)  -19.2% ( -22% - -15%)
OrHighHigh         12.99  (2.4%)      10.69  (1.8%)  -17.7% ( -21% - -13%)
OrHighLow          13.19  (2.3%)      10.94  (2.0%)  -17.0% ( -20% - -12%)
HighTerm            9.30  (2.6%)       7.76  (1.8%)  -16.6% ( -20% - -12%)
LowTerm           146.48  (1.3%)     129.04  (0.9%)  -11.9% ( -13% -  -9%)
{noformat}

So a wee bit faster but not much... (good!  The awesome predictive
compression from MonotonicALB doesn't hurt much).

Then I made a new collector that resolves ords after each segment from
Adrien's idea (SortedSetDocValuesCollectorMergeBySeg) -- base = same
as above, comp = new collector w/o Rob's patch:

{noformat}
Task            QPS base  StdDev    QPS comp  StdDev      Pct diff
HighTerm            9.29  (3.1%)       7.14  (1.9%)  -23.2% ( -27% - -18%)
OrHighMed          25.51  (2.7%)      19.60  (2.2%)  -23.2% ( -27% - -18%)
OrHighLow          13.08  (2.8%)      10.20  (2.3%)  -22.0% ( -26% - -17%)
OrHighHigh         12.89  (2.9%)      10.21  (2.6%)  -20.8% ( -25% - -15%)
MedTerm            53.00  (2.7%)      43.34  (1.5%)  -18.2% ( -21% - -14%)
LowTerm           145.97  (1.6%)     133.05  (0.9%)   -8.9% ( -11% -  -6%)
{noformat}

Strangely it's not really faster ... maybe I have a bug.
Unfortunately, until we get the top K working, we can't do the
end-to-end comparison to make sure we're getting the right facet
values ...



Re: Line length in Lucene/Solr code

2013-02-25 Thread Erick Erickson
Changed to:
lines can be greater than 80 chars long, 132 is a common limit. Try to be
reasonable for ''very'' long lines.






[jira] [Updated] (LUCENE-4795) Add FacetsCollector based on SortedSetDocValues

2013-02-25 Thread Michael McCandless (JIRA)

 [ 
https://issues.apache.org/jira/browse/LUCENE-4795?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Michael McCandless updated LUCENE-4795:
---

Attachment: LUCENE-4795.patch

 Add FacetsCollector based on SortedSetDocValues
 ---

 Key: LUCENE-4795
 URL: https://issues.apache.org/jira/browse/LUCENE-4795
 Project: Lucene - Core
  Issue Type: Improvement
  Components: modules/facet
Reporter: Michael McCandless
 Attachments: LUCENE-4795.patch, LUCENE-4795.patch, 
 pleaseBenchmarkMe.patch


 Recently (LUCENE-4765) we added multi-valued DocValues field
 (SortedSetDocValuesField), and this can be used for faceting in Solr
 (SOLR-4490).  I think we should also add support in the facet module?
 It'd be an option with different tradeoffs.  Eg, it wouldn't require
 the taxonomy index, since the main index handles label/ord resolving.
 There are at least two possible approaches:
   * On every reopen, build the seg -> global ord map, and then on
 every collect, get the seg ord, map it to the global ord space,
 and increment counts.  This adds cost during reopen in proportion
 to number of unique terms ...
   * On every collect, increment counts based on the seg ords, and then
 do a merge in the end just like distributed faceting does.
 The first approach is much easier so I built a quick prototype using
 that.  The prototype does the counting, but it does NOT do the top K
 facets gathering in the end, and it doesn't know parent/child ord
 relationships, so there's tons more to do before this is real.  I also
 was unsure how to properly integrate it since the existing classes
 seem to expect that you use a taxonomy index to resolve ords.
 I ran a quick performance test.  base = trunk except I disabled the
 compute top-K in FacetsAccumulator to make the comparison fair; comp
 = using the prototype collector in the patch:
 {noformat}
 TaskQPS base  StdDevQPS comp  StdDev  
   Pct diff
OrHighLow   18.79  (2.5%)   14.36  (3.3%)  
 -23.6% ( -28% -  -18%)
 HighTerm   21.58  (2.4%)   16.53  (3.7%)  
 -23.4% ( -28% -  -17%)
OrHighMed   18.20  (2.5%)   13.99  (3.3%)  
 -23.2% ( -28% -  -17%)
  Prefix3   14.37  (1.5%)   11.62  (3.5%)  
 -19.1% ( -23% -  -14%)
  LowTerm  130.80  (1.6%)  106.95  (2.4%)  
 -18.2% ( -21% -  -14%)
   OrHighHigh9.60  (2.6%)7.88  (3.5%)  
 -17.9% ( -23% -  -12%)
  AndHighHigh   24.61  (0.7%)   20.74  (1.9%)  
 -15.7% ( -18% -  -13%)
   Fuzzy1   49.40  (2.5%)   43.48  (1.9%)  
 -12.0% ( -15% -   -7%)
  MedSloppyPhrase   27.06  (1.6%)   23.95  (2.3%)  
 -11.5% ( -15% -   -7%)
  MedTerm   51.43  (2.0%)   46.21  (2.7%)  
 -10.2% ( -14% -   -5%)
   IntNRQ4.02  (1.6%)3.63  (4.0%)   
 -9.7% ( -15% -   -4%)
 Wildcard   29.14  (1.5%)   26.46  (2.5%)   
 -9.2% ( -13% -   -5%)
 HighSloppyPhrase0.92  (4.5%)0.87  (5.8%)   
 -5.4% ( -15% -5%)
  MedSpanNear   29.51  (2.5%)   27.94  (2.2%)   
 -5.3% (  -9% -0%)
 HighSpanNear3.55  (2.4%)3.38  (2.0%)   
 -4.9% (  -9% -0%)
   AndHighMed  108.34  (0.9%)  104.55  (1.1%)   
 -3.5% (  -5% -   -1%)
  LowSloppyPhrase   20.50  (2.0%)   20.09  (4.2%)   
 -2.0% (  -8% -4%)
LowPhrase   21.60  (6.0%)   21.26  (5.1%)   
 -1.6% ( -11% -   10%)
   Fuzzy2   53.16  (3.9%)   52.40  (2.7%)   
 -1.4% (  -7% -5%)
  LowSpanNear8.42  (3.2%)8.45  (3.0%)
 0.3% (  -5% -6%)
  Respell   45.17  (4.3%)   45.38  (4.4%)
 0.5% (  -7% -9%)
MedPhrase  113.93  (5.8%)  115.02  (4.9%)
 1.0% (  -9% -   12%)
   AndHighLow  596.42  (2.5%)  617.12  (2.8%)
 3.5% (  -1% -8%)
   HighPhrase   17.30 (10.5%)   18.36  (9.1%)
 6.2% ( -12% -   28%)
 {noformat}
 I'm impressed that this approach is only ~24% slower in the worst
 case!  I think this means it's a good option to make available?  Yes
 it has downsides (NRT reopen more costly, small added RAM usage,
 slightly slower faceting), but it's also simpler (no taxo index to
 manage).
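
The first approach described above (build the seg-to-global ord map at reopen time, then count in the global ordinal space at collect time) can be sketched in plain Java without any Lucene dependency. Everything below (class name, data layout) is a hypothetical stand-in for what SortedSetDocValues actually provides, not code from the patch:

```java
import java.util.Arrays;
import java.util.TreeSet;

/**
 * Plain-Java sketch of the first approach: at "reopen", build a map from
 * each segment's ordinals to the global ordinal space; at "collect",
 * increment counts on global ordinals. Sorted per-segment term arrays and
 * per-doc ordinal lists stand in for what SortedSetDocValues provides.
 */
public class GlobalOrdFacetSketch {

  /** Merge all segment term dictionaries into one sorted, deduped array. */
  static String[] buildGlobalTerms(String[][] segmentTerms) {
    TreeSet<String> merged = new TreeSet<>();
    for (String[] seg : segmentTerms) {
      merged.addAll(Arrays.asList(seg));
    }
    return merged.toArray(new String[0]);
  }

  /** For one segment, map each segment ordinal to its global ordinal. */
  static int[] buildSegToGlobalMap(String[] segTerms, String[] globalTerms) {
    int[] map = new int[segTerms.length];
    for (int segOrd = 0; segOrd < segTerms.length; segOrd++) {
      // Both dictionaries are sorted, so binary search finds the global ord.
      map[segOrd] = Arrays.binarySearch(globalTerms, segTerms[segOrd]);
    }
    return map;
  }

  public static void main(String[] args) {
    // Two "segments", each with its own sorted term dictionary.
    String[][] segTerms = {
        {"author/bob", "author/carol"},
        {"author/alice", "author/carol"},
    };
    // Per-segment, per-document ordinal lists (doc -> its segment ords).
    int[][][] segDocOrds = {
        {{0}, {0, 1}},  // segment 0: doc0 -> bob; doc1 -> bob, carol
        {{1}, {0}},     // segment 1: doc0 -> carol; doc1 -> alice
    };

    String[] globalTerms = buildGlobalTerms(segTerms);  // the "reopen" cost
    int[] counts = new int[globalTerms.length];

    for (int seg = 0; seg < segTerms.length; seg++) {
      int[] toGlobal = buildSegToGlobalMap(segTerms[seg], globalTerms);
      for (int[] docOrds : segDocOrds[seg]) {
        for (int segOrd : docOrds) {
          counts[toGlobal[segOrd]]++;  // the cheap per-collect step
        }
      }
    }

    for (int ord = 0; ord < globalTerms.length; ord++) {
      System.out.println(globalTerms[ord] + "=" + counts[ord]);
    }
    // Prints: author/alice=1, author/bob=2, author/carol=2
  }
}
```

The reopen cost in proportion to the number of unique terms is visible here: buildGlobalTerms and buildSegToGlobalMap touch every term, while the collect step is a single array increment per ordinal.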


[jira] [Commented] (LUCENE-4795) Add FacetsCollector based on SortedSetDocValues

2013-02-25 Thread Robert Muir (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-4795?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13585901#comment-13585901
 ] 

Robert Muir commented on LUCENE-4795:
-

Thanks for benchmarking: I think we should keep the monotonic compression! 
It will use significantly less RAM for this thing.

 Add FacetsCollector based on SortedSetDocValues
 ---

 Key: LUCENE-4795
 URL: https://issues.apache.org/jira/browse/LUCENE-4795
 Project: Lucene - Core
  Issue Type: Improvement
  Components: modules/facet
Reporter: Michael McCandless
 Attachments: LUCENE-4795.patch, LUCENE-4795.patch, 
 pleaseBenchmarkMe.patch


 Recently (LUCENE-4765) we added multi-valued DocValues field
 (SortedSetDocValuesField), and this can be used for faceting in Solr
 (SOLR-4490).  I think we should also add support in the facet module?
 It'd be an option with different tradeoffs.  Eg, it wouldn't require
 the taxonomy index, since the main index handles label/ord resolving.
 There are at least two possible approaches:
   * On every reopen, build the seg -> global ord map, and then on
 every collect, get the seg ord, map it to the global ord space,
 and increment counts.  This adds cost during reopen in proportion
 to number of unique terms ...
   * On every collect, increment counts based on the seg ords, and then
 do a merge in the end just like distributed faceting does.
 The first approach is much easier so I built a quick prototype using
 that.  The prototype does the counting, but it does NOT do the top K
 facets gathering in the end, and it doesn't know parent/child ord
 relationships, so there's tons more to do before this is real.  I also
 was unsure how to properly integrate it since the existing classes
 seem to expect that you use a taxonomy index to resolve ords.
 I ran a quick performance test.  base = trunk except I disabled the
 compute top-K in FacetsAccumulator to make the comparison fair; comp
 = using the prototype collector in the patch:
 {noformat}
 TaskQPS base  StdDevQPS comp  StdDev  
   Pct diff
OrHighLow   18.79  (2.5%)   14.36  (3.3%)  
 -23.6% ( -28% -  -18%)
 HighTerm   21.58  (2.4%)   16.53  (3.7%)  
 -23.4% ( -28% -  -17%)
OrHighMed   18.20  (2.5%)   13.99  (3.3%)  
 -23.2% ( -28% -  -17%)
  Prefix3   14.37  (1.5%)   11.62  (3.5%)  
 -19.1% ( -23% -  -14%)
  LowTerm  130.80  (1.6%)  106.95  (2.4%)  
 -18.2% ( -21% -  -14%)
   OrHighHigh9.60  (2.6%)7.88  (3.5%)  
 -17.9% ( -23% -  -12%)
  AndHighHigh   24.61  (0.7%)   20.74  (1.9%)  
 -15.7% ( -18% -  -13%)
   Fuzzy1   49.40  (2.5%)   43.48  (1.9%)  
 -12.0% ( -15% -   -7%)
  MedSloppyPhrase   27.06  (1.6%)   23.95  (2.3%)  
 -11.5% ( -15% -   -7%)
  MedTerm   51.43  (2.0%)   46.21  (2.7%)  
 -10.2% ( -14% -   -5%)
   IntNRQ4.02  (1.6%)3.63  (4.0%)   
 -9.7% ( -15% -   -4%)
 Wildcard   29.14  (1.5%)   26.46  (2.5%)   
 -9.2% ( -13% -   -5%)
 HighSloppyPhrase0.92  (4.5%)0.87  (5.8%)   
 -5.4% ( -15% -5%)
  MedSpanNear   29.51  (2.5%)   27.94  (2.2%)   
 -5.3% (  -9% -0%)
 HighSpanNear3.55  (2.4%)3.38  (2.0%)   
 -4.9% (  -9% -0%)
   AndHighMed  108.34  (0.9%)  104.55  (1.1%)   
 -3.5% (  -5% -   -1%)
  LowSloppyPhrase   20.50  (2.0%)   20.09  (4.2%)   
 -2.0% (  -8% -4%)
LowPhrase   21.60  (6.0%)   21.26  (5.1%)   
 -1.6% ( -11% -   10%)
   Fuzzy2   53.16  (3.9%)   52.40  (2.7%)   
 -1.4% (  -7% -5%)
  LowSpanNear8.42  (3.2%)8.45  (3.0%)
 0.3% (  -5% -6%)
  Respell   45.17  (4.3%)   45.38  (4.4%)
 0.5% (  -7% -9%)
MedPhrase  113.93  (5.8%)  115.02  (4.9%)
 1.0% (  -9% -   12%)
   AndHighLow  596.42  (2.5%)  617.12  (2.8%)
 3.5% (  -1% -8%)
   HighPhrase   17.30 (10.5%)   18.36  (9.1%)
 6.2% ( -12% -   28%)
 {noformat}
 I'm impressed that this approach is only ~24% slower in the worst
 case!  I think this means it's a good option to make available?  Yes
 it has downsides (NRT reopen more costly, small added RAM usage,
 slightly slower faceting), but it's also simpler (no taxo index to
 

[jira] [Created] (SOLR-4500) How can we integrate LDAP authentication with the Solr instance

2013-02-25 Thread Srividhya (JIRA)
Srividhya created SOLR-4500:
---

 Summary: How can we integrate LDAP authentication with the Solr 
instance
 Key: SOLR-4500
 URL: https://issues.apache.org/jira/browse/SOLR-4500
 Project: Solr
  Issue Type: Task
Affects Versions: 4.1
Reporter: Srividhya







Re: Line length in Lucene/Solr code

2013-02-25 Thread Erick Erickson
I'd actually never bothered to look at the line limitation; that's from
back when I started programming. Mostly I was just so happy that someone
had short-circuited the endless whether braces should be on the same line
or not discussion. <G>

P.S. the ''very'' is really an italic.

P.P.S. Why programmers are different from the rest: not _only_ have I been
in the very "where should the braces go" discussions at various points in
my life, but there's a Wiki article that's far too long:

http://en.wikipedia.org/wiki/One_True_Brace_Style#K.26R_style


On Mon, Feb 25, 2013 at 9:55 AM, Erick Erickson erickerick...@gmail.comwrote:

 Changed to:
 lines can be greater than 80 chars long, 132 is a common limit. Try to be
 reasonable for ''very'' long lines.


 On Mon, Feb 25, 2013 at 9:51 AM, Uwe Schindler u...@thetaphi.de wrote:

 Hi,

 One interesting detail: the old-style 80-column terminal width of IBM PCs,
 which underlies the Java line-length rule, has in the meantime given way to
 another common line length: most terminal applications already have a default
 width of e.g. 132 - so I would make this number (around 130) the most
 common standard! Interestingly, the avg line length of Lucene code is
 already smaller!

 Uwe

 -
 Uwe Schindler
 H.-H.-Meier-Allee 63, D-28213 Bremen
 http://www.thetaphi.de
 eMail: u...@thetaphi.de

  -Original Message-
  From: Toke Eskildsen [mailto:t...@statsbiblioteket.dk]
  Sent: Monday, February 25, 2013 11:39 AM
  To: dev@lucene.apache.org
  Subject: Line length in Lucene/Solr code
 
  According to https://wiki.apache.org/solr/HowToContribute, Sun's code
 style
  conventions should be used when writing contributions for Lucene and
 Solr.
  Said conventions state that lines in code should be 80 characters or
 less,
  since they're not handled well by many terminals and tools:
  http://www.oracle.com/technetwork/java/javase/documentation/codecon
  ventions-136091.html#313
 
  A quick random inspection of the Lucene/Solr code base tells me that
 this
  recommendation is not followed: Out of 20 source files, only a single
 one
  adhered to the 80 characters/line limit and that was StorageField,
 which is an
  interface.
 
  I am all for a larger limit as I find that it makes Java code a lot
 more readable.
  With current tools, Java code needs to be formatted using line breaks
 and
  indents (as opposed to fully dynamic tool-specific re-flow of the
 code). That
  formatting is dependent on a specific maximum line width to be
 consistent.
 
 
  With that in mind, I suggest that the code style recommendation is
 expanded
  with the notion that a maximum of x characters/line should be used,
 where x
  is something more than 80. Judging by a quick search, 120 chars seems
 to be
  a common choice.
 
  Regards,
  Toke Eskildsen
 
 







[jira] [Commented] (LUCENE-4795) Add FacetsCollector based on SortedSetDocValues

2013-02-25 Thread Michael McCandless (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-4795?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13585903#comment-13585903
 ] 

Michael McCandless commented on LUCENE-4795:


bq. If you want to integrate that with the current classes, all you need to do 
is to implement a partial TaxonomyReader, which resolves ordinals to CPs using 
the global ord map? Or actually make that TR the entity that's responsible to 
manage the global ordinal map, so that TR.doOpenIfChanged opens the new segments 
and updates the global map?

That sounds great!

bq. Since this taxonomy, at least currently, doesn't support hierarchical 
facets, you'll need to hack something as a ParallelTaxoArray, but that should 
be easy .. I think.

OK.

I think it could be hierarchical w/o so much work, ie on reopen as it
walks the terms it should be able to easily build up the parent/child
arrays since the terms are in sorted order.  Hmm, except, with SSDV
you cannot have a term/ord that had no docs indexed.  So the
ancestor ords would not exist... hmm.  Better start
non-hierarchical.

I guess if we are non-hierarchical then we don't really need to
integrate at indexing time?  Ie, app can just add the facet values
using SortedSetDVF.

bq. Is the only benefit in this approach that you don't need to manage a 
sidecar taxonomy index?

I think so?


 Add FacetsCollector based on SortedSetDocValues
 ---

 Key: LUCENE-4795
 URL: https://issues.apache.org/jira/browse/LUCENE-4795
 Project: Lucene - Core
  Issue Type: Improvement
  Components: modules/facet
Reporter: Michael McCandless
 Attachments: LUCENE-4795.patch, LUCENE-4795.patch, 
 pleaseBenchmarkMe.patch


 Recently (LUCENE-4765) we added multi-valued DocValues field
 (SortedSetDocValuesField), and this can be used for faceting in Solr
 (SOLR-4490).  I think we should also add support in the facet module?
 It'd be an option with different tradeoffs.  Eg, it wouldn't require
 the taxonomy index, since the main index handles label/ord resolving.
 There are at least two possible approaches:
   * On every reopen, build the seg -> global ord map, and then on
 every collect, get the seg ord, map it to the global ord space,
 and increment counts.  This adds cost during reopen in proportion
 to number of unique terms ...
   * On every collect, increment counts based on the seg ords, and then
 do a merge in the end just like distributed faceting does.
 The first approach is much easier so I built a quick prototype using
 that.  The prototype does the counting, but it does NOT do the top K
 facets gathering in the end, and it doesn't know parent/child ord
 relationships, so there's tons more to do before this is real.  I also
 was unsure how to properly integrate it since the existing classes
 seem to expect that you use a taxonomy index to resolve ords.
 I ran a quick performance test.  base = trunk except I disabled the
 compute top-K in FacetsAccumulator to make the comparison fair; comp
 = using the prototype collector in the patch:
 {noformat}
 TaskQPS base  StdDevQPS comp  StdDev  
   Pct diff
OrHighLow   18.79  (2.5%)   14.36  (3.3%)  
 -23.6% ( -28% -  -18%)
 HighTerm   21.58  (2.4%)   16.53  (3.7%)  
 -23.4% ( -28% -  -17%)
OrHighMed   18.20  (2.5%)   13.99  (3.3%)  
 -23.2% ( -28% -  -17%)
  Prefix3   14.37  (1.5%)   11.62  (3.5%)  
 -19.1% ( -23% -  -14%)
  LowTerm  130.80  (1.6%)  106.95  (2.4%)  
 -18.2% ( -21% -  -14%)
   OrHighHigh9.60  (2.6%)7.88  (3.5%)  
 -17.9% ( -23% -  -12%)
  AndHighHigh   24.61  (0.7%)   20.74  (1.9%)  
 -15.7% ( -18% -  -13%)
   Fuzzy1   49.40  (2.5%)   43.48  (1.9%)  
 -12.0% ( -15% -   -7%)
  MedSloppyPhrase   27.06  (1.6%)   23.95  (2.3%)  
 -11.5% ( -15% -   -7%)
  MedTerm   51.43  (2.0%)   46.21  (2.7%)  
 -10.2% ( -14% -   -5%)
   IntNRQ4.02  (1.6%)3.63  (4.0%)   
 -9.7% ( -15% -   -4%)
 Wildcard   29.14  (1.5%)   26.46  (2.5%)   
 -9.2% ( -13% -   -5%)
 HighSloppyPhrase0.92  (4.5%)0.87  (5.8%)   
 -5.4% ( -15% -5%)
  MedSpanNear   29.51  (2.5%)   27.94  (2.2%)   
 -5.3% (  -9% -0%)
 HighSpanNear3.55  (2.4%)3.38  (2.0%)   
 -4.9% (  -9% -0%)
   AndHighMed  108.34  (0.9%)  104.55  (1.1%)   
 -3.5% (  -5% -   -1%)
  LowSloppyPhrase   20.50  

[jira] [Updated] (SOLR-4078) Allow custom naming of nodes so that a new host:port combination can take over for a previous shard.

2013-02-25 Thread Mark Miller (JIRA)

 [ 
https://issues.apache.org/jira/browse/SOLR-4078?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mark Miller updated SOLR-4078:
--

Attachment: SOLR-4078.patch

New patch - just about ready.

 Allow custom naming of nodes so that a new host:port combination can take 
 over for a previous shard.
 

 Key: SOLR-4078
 URL: https://issues.apache.org/jira/browse/SOLR-4078
 Project: Solr
  Issue Type: New Feature
  Components: SolrCloud
Reporter: Mark Miller
Assignee: Mark Miller
 Fix For: 4.2, 5.0

 Attachments: SOLR-4078.patch, SOLR-4078.patch


 Currently we auto assign a unique node name based on the host address and 
 core name - we should let the user optionally override this so that a new 
 host address + core name combo can take over the duties of a previous 
 registered node.
 Especially useful for ec2 if you are not using elastic ips.




[jira] [Updated] (SOLR-4332) Adding documents to SolrCloud collection broken when a node doesn't have a core for the collection

2013-02-25 Thread Mark Miller (JIRA)

 [ 
https://issues.apache.org/jira/browse/SOLR-4332?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mark Miller updated SOLR-4332:
--

  Assignee: Mark Miller
  Priority: Major  (was: Critical)
Issue Type: New Feature  (was: Bug)

 Adding documents to SolrCloud collection broken when a node doesn't have a 
 core for the collection
 --

 Key: SOLR-4332
 URL: https://issues.apache.org/jira/browse/SOLR-4332
 Project: Solr
  Issue Type: New Feature
  Components: SolrCloud
Affects Versions: 4.1
Reporter: Eric Falcao
Assignee: Mark Miller

 In SOLR-4321, it's documented that creating a collection via API results in 
 some nodes having more than one core, while other nodes have zero cores.
 Not sure if this is desired behavior, but when a node doesn't know about a 
 core, it throws a 404 on select/update.
 Reproduction:
 -Create a 2 node SolrCloud cluster
 -Create a new collection with numShards=1. 50% of your cluster will have a 
 core for that collection.
 -Do an update or select against the node that doesn't have the core. 404
 Like I said, not sure if this is desired behavior, but I would expect a 
 cluster of nodes to be able to forward requests appropriately to nodes that 
 have a core for the collection.




[jira] [Resolved] (SOLR-4332) Adding documents to SolrCloud collection broken when a node doesn't have a core for the collection

2013-02-25 Thread Mark Miller (JIRA)

 [ 
https://issues.apache.org/jira/browse/SOLR-4332?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mark Miller resolved SOLR-4332.
---

Resolution: Duplicate

 Adding documents to SolrCloud collection broken when a node doesn't have a 
 core for the collection
 --

 Key: SOLR-4332
 URL: https://issues.apache.org/jira/browse/SOLR-4332
 Project: Solr
  Issue Type: New Feature
  Components: SolrCloud
Affects Versions: 4.1
Reporter: Eric Falcao
Assignee: Mark Miller

 In SOLR-4321, it's documented that creating a collection via API results in 
 some nodes having more than one core, while other nodes have zero cores.
 Not sure if this is desired behavior, but when a node doesn't know about a 
 core, it throws a 404 on select/update.
 Reproduction:
 -Create a 2 node SolrCloud cluster
 -Create a new collection with numShards=1. 50% of your cluster will have a 
 core for that collection.
 -Do an update or select against the node that doesn't have the core. 404
 Like I said, not sure if this is desired behavior, but I would expect a 
 cluster of nodes to be able to forward requests appropriately to nodes that 
 have a core for the collection.




[jira] [Updated] (SOLR-4210) if couldn't find the collection locally when searching, we should look on other nodes. one of TODOs part in SolrDispatchFilter

2013-02-25 Thread Mark Miller (JIRA)

 [ 
https://issues.apache.org/jira/browse/SOLR-4210?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mark Miller updated SOLR-4210:
--

Fix Version/s: (was: 4.0)
   (was: 4.0-BETA)
   5.0
   4.2
 Assignee: Mark Miller
 Priority: Major  (was: Critical)

  if couldn't find the collection locally when searching, we should look on 
 other nodes. one of TODOs part in SolrDispatchFilter
 ---

 Key: SOLR-4210
 URL: https://issues.apache.org/jira/browse/SOLR-4210
 Project: Solr
  Issue Type: Improvement
  Components: SolrCloud
Affects Versions: 4.0-BETA, 4.0
Reporter: Po Rui
Assignee: Mark Miller
 Fix For: 4.2, 5.0

 Attachments: SOLR-4210.patch


 Only the local collection or core is checked when searching; other nodes are 
 not consulted. E.g. a cluster has 4 nodes: nodes 1, 2 and 3 host 
 collection1, and nodes 2, 3 and 4 host collection2. Sending a query for 
 collection1 to node 4 will fail. 
 This part of searching is incomplete; it is a TODO in 
 SolrDispatchFilter, line 220.




[jira] [Commented] (SOLR-4210) if couldn't find the collection locally when searching, we should look on other nodes. one of TODOs part in SolrDispatchFilter

2013-02-25 Thread Mark Miller (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-4210?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13585913#comment-13585913
 ] 

Mark Miller commented on SOLR-4210:
---

Thanks Po! Never saw this issue - I was just about to tackle this myself. I'll 
add some testing to your patch.

  if couldn't find the collection locally when searching, we should look on 
 other nodes. one of TODOs part in SolrDispatchFilter
 ---

 Key: SOLR-4210
 URL: https://issues.apache.org/jira/browse/SOLR-4210
 Project: Solr
  Issue Type: Improvement
  Components: SolrCloud
Affects Versions: 4.0-BETA, 4.0
Reporter: Po Rui
Assignee: Mark Miller
 Fix For: 4.2, 5.0

 Attachments: SOLR-4210.patch


 Only the local collection or core is checked when searching; other nodes are 
 not consulted. E.g. a cluster has 4 nodes: nodes 1, 2 and 3 host 
 collection1, and nodes 2, 3 and 4 host collection2. Sending a query for 
 collection1 to node 4 will fail. 
 This part of searching is incomplete; it is a TODO in 
 SolrDispatchFilter, line 220.




[jira] [Created] (SOLR-4501) MoreLikeThisComponent is misusing the mlt.count parameter

2013-02-25 Thread Kiril A. (JIRA)
Kiril A. created SOLR-4501:
--

 Summary: MoreLikeThisComponent is misusing the mlt.count parameter
 Key: SOLR-4501
 URL: https://issues.apache.org/jira/browse/SOLR-4501
 Project: Solr
  Issue Type: Bug
  Components: MoreLikeThis
Affects Versions: 4.1
Reporter: Kiril A.


There is probably a bug on line 144 of MoreLikeThisComponent.java, in the 
process() method.

There is a call:
{code}
NamedList<DocList> sim = getMoreLikeThese(rb, rb.req.getSearcher(), 
rb.getResults().docList, mltcount);
{code}
The last argument (mltcount) is the number of similar documents to return for 
each result. However, the signature of the called method getMoreLikeThese is:

{code}
NamedList<DocList> getMoreLikeThese(ResponseBuilder rb, SolrIndexSearcher 
searcher, DocList docs, int flags) 
{code}

The last argument is the flags, which should contain values like 
SolrIndexSearcher.GET_SCORES etc.

Please, could some developers confirm if this is a bug?
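
If the suspicion is right, the mix-up is easy to make because both parameters are plain ints, so the compiler cannot catch it. A tiny stand-alone illustration (the flag constants and helper below are hypothetical stand-ins, not the real SolrIndexSearcher API):

```java
/**
 * Stand-alone illustration of the suspected bug: a count passed where an
 * int flags bitmask is expected compiles fine but is silently
 * reinterpreted bit by bit. Constants and helper are hypothetical.
 */
public class FlagsVsCount {
  static final int GET_SCORES = 0x01;  // hypothetical flag bits
  static final int GET_DOCSET = 0x02;

  static String describeFlags(int flags) {
    StringBuilder sb = new StringBuilder();
    if ((flags & GET_SCORES) != 0) sb.append("SCORES ");
    if ((flags & GET_DOCSET) != 0) sb.append("DOCSET ");
    return sb.toString().trim();
  }

  public static void main(String[] args) {
    int mltCount = 5;  // the caller means "return 5 similar docs"
    // As a bitmask, 5 = 0b101: GET_SCORES is accidentally enabled and the
    // intended count is lost entirely.
    System.out.println(describeFlags(mltCount));  // prints: SCORES
  }
}
```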





Re: Line length in Lucene/Solr code

2013-02-25 Thread Shawn Heisey
 I'd actually never bothered to look at the line limitation, that's from
 back when I started programming. Mostly I was just s happy that
 someone
 had short-circuited the endless whether braces should be on the same line
 or not discussion. G

 P.S. the ''very'' is really an italic.

 P.P.S. Why programmers are different than the rest. Not _only_ have I been
 in the very where should the braces go discussions at various points in
 my life, but there's a Wiki article that's far too long

For brace style, I believe that lucene currently uses 1TBS. Where I work,
we are expected to use Allman. Before starting here, I used 1TBS in my own
code. Allman is easiest to follow, but uses up a lot of vertical real
estate. I have no real opinion on whether brace style should change.  A
slightly different topic is whitespace on otherwise blank lines. There is
no consistency in Lucene here. I have no strong opinion one way or the
other, but I will note that the Eclipse format settings created by 'ant
eclipse' add the whitespace.

Getting back to the subject of this thread, I am torn. I use two programs
to edit Solr code -- vi and eclipse.  For vi (in PuTTY windows) 80 would
be best. For eclipse (in windows 7), something like 100 would be better. I
do not maximize program windows, because I like to see what's going on in
background windows. My eclipse window is large, but does not use the whole
1600x1050 area.

120 seems large, but it would work.

Thanks,
Shawn






[jira] [Commented] (LUCENE-4783) Inconsistent results, changing based on recent previous searches (caching?)

2013-02-25 Thread William Johnson (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-4783?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13585935#comment-13585935
 ] 

William Johnson commented on LUCENE-4783:
-

This is odd, as of this morning, I can't make it happen any more.

This problem has been on-again off-again, and currently it's off.  If it starts 
happening again, I'll see if I can find more specifics as to what in particular 
is happening at the time it happens, but for now I suppose I should close the 
issue as something I can't repeat.

 Inconsistent results, changing based on recent previous searches (caching?)
 ---

 Key: LUCENE-4783
 URL: https://issues.apache.org/jira/browse/LUCENE-4783
 Project: Lucene - Core
  Issue Type: Bug
Affects Versions: 4.1
 Environment: Ubuntu Linux & Java application running under Tomcat
Reporter: William Johnson

 We have several repeatable cases where Lucene is returning different 
 candidates for the same search, on the same (static) index, depending on what 
 other searches have been run before hand.
 It appears as though Lucene is failing to find matches in some cases if they 
 have not been cached by a previous search.
 In specific (although it is happening with more than just fuzzy searches), a 
 fuzzy search on a misspelled street name returns no result.  If you then 
 search on the correctly spelled street name, and THEN return to the original 
 fuzzy query on the original incorrect spelling, you now receive the result.




[jira] [Resolved] (LUCENE-4783) Inconsistent results, changing based on recent previous searches (caching?)

2013-02-25 Thread William Johnson (JIRA)

 [ 
https://issues.apache.org/jira/browse/LUCENE-4783?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

William Johnson resolved LUCENE-4783.
-

Resolution: Cannot Reproduce

 Inconsistent results, changing based on recent previous searches (caching?)
 ---

 Key: LUCENE-4783
 URL: https://issues.apache.org/jira/browse/LUCENE-4783
 Project: Lucene - Core
  Issue Type: Bug
Affects Versions: 4.1
 Environment: Ubuntu Linux & Java application running under Tomcat
Reporter: William Johnson

 We have several repeatable cases where Lucene is returning different 
 candidates for the same search, on the same (static) index, depending on what 
 other searches have been run before hand.
 It appears as though Lucene is failing to find matches in some cases if they 
 have not been cached by a previous search.
 In specific (although it is happening with more than just fuzzy searches), a 
 fuzzy search on a misspelled street name returns no result.  If you then 
 search on the correctly spelled street name, and THEN return to the original 
 fuzzy query on the original incorrect spelling, you now receive the result.




[jira] [Updated] (SOLR-4457) Queries ending in question mark interpreted as wildcard

2013-02-25 Thread JIRA

 [ 
https://issues.apache.org/jira/browse/SOLR-4457?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jan Høydahl updated SOLR-4457:
--

Attachment: SOLR-4457.patch

First quick patch subclassing edismax, which can be used as a plugin for 
existing versions. Will proceed with proper code as an option inside edismax 
itself.

 Queries ending in question mark interpreted as wildcard
 ---

 Key: SOLR-4457
 URL: https://issues.apache.org/jira/browse/SOLR-4457
 Project: Solr
  Issue Type: Improvement
  Components: query parsers
Reporter: Jan Høydahl
 Attachments: SOLR-4457.patch


 For many search applications, queries ending in a question mark such as {{foo 
 bar?}} would *not* mean a search for a four-letter word starting with 
 {{bar}}. Neither will it mean a literal search for a question mark.
 The query parsers should have an option to discard trailing question mark 
 before passing to analysis.
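
As a rough sketch of the described option (the helper name and the decision to keep a backslash-escaped '?' intact are assumptions for illustration, not what the attached patch does):

```java
import java.util.regex.Pattern;

/**
 * Minimal sketch of the proposed option: discard a trailing question mark
 * from the raw query string before it reaches the wildcard-aware parser.
 * Helper name and escape handling are assumptions, not the actual patch.
 */
public class TrailingQuestionMark {
  // One or more '?' at end of input (ignoring trailing whitespace),
  // not preceded by a backslash escape.
  private static final Pattern TRAILING_QMARK =
      Pattern.compile("(?<!\\\\)\\?+\\s*$");

  static String stripTrailingQuestionMark(String q) {
    return TRAILING_QMARK.matcher(q).replaceAll("");
  }

  public static void main(String[] args) {
    System.out.println(stripTrailingQuestionMark("foo bar?"));  // foo bar
    System.out.println(stripTrailingQuestionMark("bar\\?"));    // bar\? (escaped, kept)
    System.out.println(stripTrailingQuestionMark("wh?t"));      // wh?t (not trailing)
  }
}
```

In a real edismax option this would be gated on a request parameter and applied before the query string is tokenized, so that {{foo bar?}} behaves like {{foo bar}} rather than a single-character wildcard.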




[jira] [Commented] (SOLR-4210) if couldn't find the collection locally when searching, we should look on other nodes. one of TODOs part in SolrDispatchFilter

2013-02-25 Thread Mark Miller (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-4210?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13585960#comment-13585960
 ] 

Mark Miller commented on SOLR-4210:
---

Along with tests, I think we want to update this to work with binary (javabin, 
etc.) as well as other URLs beyond the 'select' handler.

  if couldn't find the collection locally when searching, we should look on 
 other nodes. one of TODOs part in SolrDispatchFilter
 ---

 Key: SOLR-4210
 URL: https://issues.apache.org/jira/browse/SOLR-4210
 Project: Solr
  Issue Type: Improvement
  Components: SolrCloud
Affects Versions: 4.0-BETA, 4.0
Reporter: Po Rui
Assignee: Mark Miller
 Fix For: 4.2, 5.0

 Attachments: SOLR-4210.patch


 Searching only checks the local collection or core; it doesn't look on 
 other nodes. E.g. a cluster has 4 nodes: nodes 1, 2, and 3 contribute to 
 collection1, and nodes 2, 3, and 4 contribute to collection2. Sending a 
 query for collection1 to node 4 will fail. 
 This is an incomplete part of search handling; it is a TODO in 
 SolrDispatchFilter, line 220.




[jira] [Created] (SOLR-4502) ShardHandlerFactory not initialized in CoreContainer when creating a Core manually.

2013-02-25 Thread Michael Aspetsberger (JIRA)
Michael Aspetsberger created SOLR-4502:
--

 Summary: ShardHandlerFactory not initialized in CoreContainer when 
creating a Core manually.
 Key: SOLR-4502
 URL: https://issues.apache.org/jira/browse/SOLR-4502
 Project: Solr
  Issue Type: Bug
Affects Versions: 4.1
Reporter: Michael Aspetsberger


We are using an embedded solr server for our unit testing purposes. In our 
scenario, we create a {{CoreContainer}} using only the solr-home path, and then 
create the cores manually using a {{CoreDescriptor}}.

While the creation appears to work fine, it hits an NPE when it handles the 
search:

{quote}
Caused by: java.lang.NullPointerException
at 
org.apache.solr.handler.component.SearchHandler.handleRequestBody(SearchHandler.java:181)
at 
org.apache.solr.handler.RequestHandlerBase.handleRequest(RequestHandlerBase.java:135)
at org.apache.solr.core.SolrCore.execute(SolrCore.java:1816)
at 
org.apache.solr.client.solrj.embedded.EmbeddedSolrServer.request(EmbeddedSolrServer.java:150)
{quote}

According to 
http://mail-archives.apache.org/mod_mbox/lucene-solr-user/201301.mbox/%3CE8A9BF60-5577-45F9-8BEA-B85616C6539D%40gmail.com%3E
 , this is due to a missing {{CoreContainer#load}}.




[jira] [Commented] (SOLR-4471) Replication occurs even when a slave is already up to date.

2013-02-25 Thread JIRA

[ 
https://issues.apache.org/jira/browse/SOLR-4471?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13585978#comment-13585978
 ] 

Raúl Grande commented on SOLR-4471:
---

I have problems with the 4.2-SNAPSHOT version. My slaves don't replicate even 
when the master's version is higher than theirs. See image here: 
http://oi50.tinypic.com/o8uzad.jpg

Why do the logs say "Slave in sync with master" when it clearly isn't?

 Replication occurs even when a slave is already up to date.
 ---

 Key: SOLR-4471
 URL: https://issues.apache.org/jira/browse/SOLR-4471
 Project: Solr
  Issue Type: Bug
  Components: replication (java)
Affects Versions: 4.1
Reporter: Andre Charton
Assignee: Mark Miller
  Labels: master, replication, slave, version
 Fix For: 4.2, 5.0

 Attachments: SOLR-4471.patch, SOLR-4471.patch, SOLR-4471.patch, 
 SOLR-4471_TestRefactor.diff, SOLR-4471_Tests.patch


 Scenario: master/slave replication, the master delta index runs every 10 
 minutes, the slave poll interval is 10 sec.
 There was an issue, SOLR-4413 (slave reads index from wrong directory), which 
 made the slave do a full copy of the index from the master every time; that 
 is fixed after applying the patch from SOLR-4413 (see script below).
 Now on replication the slave downloads only the updated files, but the slave 
 creates a new segment file and also a new index version (the generation is 
 identical with the master). On the next poll the slave downloads the full 
 index again, because the new version on the slave forces a full copy.
 The problem is the new index version on the slave after the first replication.
 {noformat:apply patch SOLR-4413 script, please copy the patch into the 
 patches directory before usage.}
 mkdir work
 cd work
 svn co http://svn.apache.org/repos/asf/lucene/dev/branches/lucene_solr_4_1/
 cd lucene_solr_4_1
 patch -p0 < ../../patches/SOLR-4413.patch
 cd solr
 ant dist
 {noformat}




[jira] [Created] (SOLR-4503) Add REST API methods to get schema information: fields, dynamic fields, and field types

2013-02-25 Thread Steve Rowe (JIRA)
Steve Rowe created SOLR-4503:


 Summary: Add REST API methods to get schema information: fields, 
dynamic fields, and field types
 Key: SOLR-4503
 URL: https://issues.apache.org/jira/browse/SOLR-4503
 Project: Solr
  Issue Type: Sub-task
  Components: Schema and Analysis
Affects Versions: 4.1
Reporter: Steve Rowe
Assignee: Steve Rowe


Add REST methods that provide properties for fields, dynamic fields, and field 
types, using paths:

/solr/(corename)/schema/fields
/solr/(corename)/schema/fields/fieldname

/solr/(corename)/schema/dynamicfields
/solr/(corename)/schema/dynamicfields/pattern

/solr/(corename)/schema/fieldtypes
/solr/(corename)/schema/fieldtypes/typename 
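
The path scheme above could be wrapped in a small client-side helper like the 
following sketch. {{SchemaUrls}} is hypothetical (not Solr or SolrJ API), and 
the base URL and core name in the usage are assumptions for illustration.

```java
// Sketch of building the proposed schema REST URLs for a given core.
// SchemaUrls is a hypothetical helper class, not part of Solr.
public class SchemaUrls {
    private final String baseUrl;   // e.g. http://localhost:8983/solr (assumed)
    private final String coreName;  // e.g. collection1 (assumed)

    public SchemaUrls(String baseUrl, String coreName) {
        this.baseUrl = baseUrl;
        this.coreName = coreName;
    }

    public String allFields()                 { return resource("fields", null); }
    public String field(String name)          { return resource("fields", name); }
    public String allDynamicFields()          { return resource("dynamicfields", null); }
    public String dynamicField(String pat)    { return resource("dynamicfields", pat); }
    public String allFieldTypes()             { return resource("fieldtypes", null); }
    public String fieldType(String name)      { return resource("fieldtypes", name); }

    // All six methods share the /solr/(corename)/schema/(kind)[/(item)] shape.
    private String resource(String kind, String item) {
        String url = baseUrl + "/" + coreName + "/schema/" + kind;
        return item == null ? url : url + "/" + item;
    }
}
```

For example, `new SchemaUrls("http://localhost:8983/solr", "collection1").field("price")` 
would produce `http://localhost:8983/solr/collection1/schema/fields/price`.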




[jira] [Updated] (SOLR-4503) Add REST API methods to get schema information: fields, dynamic fields, and field types

2013-02-25 Thread Steve Rowe (JIRA)

 [ 
https://issues.apache.org/jira/browse/SOLR-4503?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Steve Rowe updated SOLR-4503:
-

Attachment: SOLR-4503.patch

Patch implementing the idea.

No (functioning) tests yet.

I've started the example server using the default solr home and the multicore 
solr home, and requests to all methods are functional from curl.

 Add REST API methods to get schema information: fields, dynamic fields, and 
 field types
 ---

 Key: SOLR-4503
 URL: https://issues.apache.org/jira/browse/SOLR-4503
 Project: Solr
  Issue Type: Sub-task
  Components: Schema and Analysis
Affects Versions: 4.1
Reporter: Steve Rowe
Assignee: Steve Rowe
 Attachments: SOLR-4503.patch


 Add REST methods that provide properties for fields, dynamic fields, and 
 field types, using paths:
 /solr/(corename)/schema/fields
 /solr/(corename)/schema/fields/fieldname
 /solr/(corename)/schema/dynamicfields
 /solr/(corename)/schema/dynamicfields/pattern
 /solr/(corename)/schema/fieldtypes
 /solr/(corename)/schema/fieldtypes/typename 




[jira] [Updated] (SOLR-4210) if couldn't find the collection locally when searching, we should look on other nodes. one of TODOs part in SolrDispatchFilter

2013-02-25 Thread Mark Miller (JIRA)

 [ 
https://issues.apache.org/jira/browse/SOLR-4210?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mark Miller updated SOLR-4210:
--

Attachment: SOLR-4210.patch

Here is a first pass at something more generic - it attempts to forward for all 
appropriate requests. Also has a simple test.

  if couldn't find the collection locally when searching, we should look on 
 other nodes. one of TODOs part in SolrDispatchFilter
 ---

 Key: SOLR-4210
 URL: https://issues.apache.org/jira/browse/SOLR-4210
 Project: Solr
  Issue Type: Improvement
  Components: SolrCloud
Affects Versions: 4.0-BETA, 4.0
Reporter: Po Rui
Assignee: Mark Miller
 Fix For: 4.2, 5.0

 Attachments: SOLR-4210.patch, SOLR-4210.patch


 Searching only checks the local collection or core; it doesn't look on 
 other nodes. E.g. a cluster has 4 nodes: nodes 1, 2, and 3 contribute to 
 collection1, and nodes 2, 3, and 4 contribute to collection2. Sending a 
 query for collection1 to node 4 will fail. 
 This is an incomplete part of search handling; it is a TODO in 
 SolrDispatchFilter, line 220.




[jira] [Assigned] (SOLR-4502) ShardHandlerFactory not initialized in CoreContainer when creating a Core manually.

2013-02-25 Thread Mark Miller (JIRA)

 [ 
https://issues.apache.org/jira/browse/SOLR-4502?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mark Miller reassigned SOLR-4502:
-

Assignee: Mark Miller

 ShardHandlerFactory not initialized in CoreContainer when creating a Core 
 manually.
 ---

 Key: SOLR-4502
 URL: https://issues.apache.org/jira/browse/SOLR-4502
 Project: Solr
  Issue Type: Bug
Affects Versions: 4.1
Reporter: Michael Aspetsberger
Assignee: Mark Miller
  Labels: NPE

 We are using an embedded solr server for our unit testing purposes. In our 
 scenario, we create a {{CoreContainer}} using only the solr-home path, and 
 then create the cores manually using a {{CoreDescriptor}}.
 While the creation appears to work fine, it hits an NPE when it handles the 
 search:
 {quote}
 Caused by: java.lang.NullPointerException
   at 
 org.apache.solr.handler.component.SearchHandler.handleRequestBody(SearchHandler.java:181)
   at 
 org.apache.solr.handler.RequestHandlerBase.handleRequest(RequestHandlerBase.java:135)
   at org.apache.solr.core.SolrCore.execute(SolrCore.java:1816)
   at 
 org.apache.solr.client.solrj.embedded.EmbeddedSolrServer.request(EmbeddedSolrServer.java:150)
 {quote}
 According to 
 http://mail-archives.apache.org/mod_mbox/lucene-solr-user/201301.mbox/%3CE8A9BF60-5577-45F9-8BEA-B85616C6539D%40gmail.com%3E
  , this is due to a missing {{CoreContainer#load}}.




[jira] [Updated] (SOLR-4502) ShardHandlerFactory not initialized in CoreContainer when creating a Core manually.

2013-02-25 Thread Mark Miller (JIRA)

 [ 
https://issues.apache.org/jira/browse/SOLR-4502?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mark Miller updated SOLR-4502:
--

Fix Version/s: 5.0
   4.2

 ShardHandlerFactory not initialized in CoreContainer when creating a Core 
 manually.
 ---

 Key: SOLR-4502
 URL: https://issues.apache.org/jira/browse/SOLR-4502
 Project: Solr
  Issue Type: Bug
Affects Versions: 4.1
Reporter: Michael Aspetsberger
Assignee: Mark Miller
  Labels: NPE
 Fix For: 4.2, 5.0


 We are using an embedded solr server for our unit testing purposes. In our 
 scenario, we create a {{CoreContainer}} using only the solr-home path, and 
 then create the cores manually using a {{CoreDescriptor}}.
 While the creation appears to work fine, it hits an NPE when it handles the 
 search:
 {quote}
 Caused by: java.lang.NullPointerException
   at 
 org.apache.solr.handler.component.SearchHandler.handleRequestBody(SearchHandler.java:181)
   at 
 org.apache.solr.handler.RequestHandlerBase.handleRequest(RequestHandlerBase.java:135)
   at org.apache.solr.core.SolrCore.execute(SolrCore.java:1816)
   at 
 org.apache.solr.client.solrj.embedded.EmbeddedSolrServer.request(EmbeddedSolrServer.java:150)
 {quote}
 According to 
 http://mail-archives.apache.org/mod_mbox/lucene-solr-user/201301.mbox/%3CE8A9BF60-5577-45F9-8BEA-B85616C6539D%40gmail.com%3E
  , this is due to a missing {{CoreContainer#load}}.




[jira] [Commented] (SOLR-4471) Replication occurs even when a slave is already up to date.

2013-02-25 Thread Amit Nithian (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-4471?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13586019#comment-13586019
 ] 

Amit Nithian commented on SOLR-4471:


I'm curious about something. What does:
1) http://localhost:17045/solr/replication?command=indexversion yield
2) http://localhost:your slave port/solr/replication?command=details yield

I have noticed that what you see in the UI and what you see in the indexversion 
and details responses sometimes differ, and I wonder if that is the culprit 
here. Do #1 and #2 jive with the versions that you see in the UI?

 Replication occurs even when a slave is already up to date.
 ---

 Key: SOLR-4471
 URL: https://issues.apache.org/jira/browse/SOLR-4471
 Project: Solr
  Issue Type: Bug
  Components: replication (java)
Affects Versions: 4.1
Reporter: Andre Charton
Assignee: Mark Miller
  Labels: master, replication, slave, version
 Fix For: 4.2, 5.0

 Attachments: SOLR-4471.patch, SOLR-4471.patch, SOLR-4471.patch, 
 SOLR-4471_TestRefactor.diff, SOLR-4471_Tests.patch


 Scenario: master/slave replication, the master delta index runs every 10 
 minutes, the slave poll interval is 10 sec.
 There was an issue, SOLR-4413 (slave reads index from wrong directory), which 
 made the slave do a full copy of the index from the master every time; that 
 is fixed after applying the patch from SOLR-4413 (see script below).
 Now on replication the slave downloads only the updated files, but the slave 
 creates a new segment file and also a new index version (the generation is 
 identical with the master). On the next poll the slave downloads the full 
 index again, because the new version on the slave forces a full copy.
 The problem is the new index version on the slave after the first replication.
 {noformat:apply patch SOLR-4413 script, please copy the patch into the 
 patches directory before usage.}
 mkdir work
 cd work
 svn co http://svn.apache.org/repos/asf/lucene/dev/branches/lucene_solr_4_1/
 cd lucene_solr_4_1
 patch -p0 < ../../patches/SOLR-4413.patch
 cd solr
 ant dist
 {noformat}




Re: Line length in Lucene/Solr code

2013-02-25 Thread Chris Hostetter

: I am all for a larger limit as I find that it makes Java code a lot more
: readable. With current tools, Java code needs to be formatted using line

Aim for 80 chars, but don't shoehorn things if they are 
 more readable on a single long line 

   -Hoss'ss Law of Code Line Lengths



-Hoss




[jira] [Assigned] (SOLR-4373) In multicore, lib directives in solrconfig.xml cause conflict and clobber directives from earlier cores

2013-02-25 Thread Hoss Man (JIRA)

 [ 
https://issues.apache.org/jira/browse/SOLR-4373?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Hoss Man reassigned SOLR-4373:
--

Assignee: Mark Miller  (was: Hoss Man)

Mark: i really don't have anything to offer on this issue beyond the comments i 
already posted ... based on my testing it really _seems_ like this is caused by 
SOLR-4063, but that may be totally off base, since Alexandre couldn't make the 
problem go away using the workaround i thought i found (ie: forcing 
single-threaded core init).

since i couldn't reproduce a similar collision between multiple cores with a 
simple stopwords file loaded from the classpath, it also seems likely that this 
relates to SPI loading: but since i don't really understand at all how 
NamedSPILoader works in a multi-classloader application (and since my 
experiments in forcing synchronization on it didn't solve the problem for me) 
i'm at a dead end there as well



 In multicore, lib directives in solrconfig.xml cause conflict and clobber 
 directives from earlier cores
 ---

 Key: SOLR-4373
 URL: https://issues.apache.org/jira/browse/SOLR-4373
 Project: Solr
  Issue Type: Bug
  Components: multicore
Affects Versions: 4.1
Reporter: Alexandre Rafalovitch
Assignee: Mark Miller
Priority: Blocker
  Labels: lib, multicore
 Fix For: 4.2, 5.0, 4.1.1

 Attachments: multicore-bug.zip


 Having lib directives in solrconfig.xml seems to wipe out/override the 
 definitions in previous cores.
 The exception (for the earlier core) is:
   at 
 org.apache.solr.util.plugin.AbstractPluginLoader.load(AbstractPluginLoader.java:177)
   at org.apache.solr.schema.IndexSchema.readSchema(IndexSchema.java:369)
   at org.apache.solr.schema.IndexSchema.init(IndexSchema.java:113)
   at 
 org.apache.solr.core.CoreContainer.createFromLocal(CoreContainer.java:1000)
   at org.apache.solr.core.CoreContainer.create(CoreContainer.java:1033)
   at org.apache.solr.core.CoreContainer$3.call(CoreContainer.java:629)
   at org.apache.solr.core.CoreContainer$3.call(CoreContainer.java:624)
   at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:334)
   at java.util.concurrent.FutureTask.run(FutureTask.java:166)
   at 
 java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471)
   at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:334)
   at java.util.concurrent.FutureTask.run(FutureTask.java:166)
   at 
 java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1110)
   at 
 java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:603)
   at java.lang.Thread.run(Thread.java:722)
 Caused by: org.apache.solr.common.SolrException: Plugin init failure for 
 [schema.xml] analyzer/filter: Error loading class 
 'solr.ICUFoldingFilterFactory'
   at 
 org.apache.solr.util.plugin.AbstractPluginLoader.load(AbstractPluginLoader.java:177)
   at 
 org.apache.solr.schema.FieldTypePluginLoader.readAnalyzer(FieldTypePluginLoader.java:377)
   at 
 org.apache.solr.schema.FieldTypePluginLoader.create(FieldTypePluginLoader.java:95)
   at 
 org.apache.solr.schema.FieldTypePluginLoader.create(FieldTypePluginLoader.java:43)
   at 
 org.apache.solr.util.plugin.AbstractPluginLoader.load(AbstractPluginLoader.java:151)
 The full replication case is attached.
 If the SECOND core is turned off in solr.xml, the FIRST core loads just fine.




[jira] [Assigned] (SOLR-4373) In multicore, lib directives in solrconfig.xml cause conflict and clobber directives from earlier cores

2013-02-25 Thread Mark Miller (JIRA)

 [ 
https://issues.apache.org/jira/browse/SOLR-4373?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mark Miller reassigned SOLR-4373:
-

Assignee: (was: Mark Miller)

 In multicore, lib directives in solrconfig.xml cause conflict and clobber 
 directives from earlier cores
 ---

 Key: SOLR-4373
 URL: https://issues.apache.org/jira/browse/SOLR-4373
 Project: Solr
  Issue Type: Bug
  Components: multicore
Affects Versions: 4.1
Reporter: Alexandre Rafalovitch
Priority: Blocker
  Labels: lib, multicore
 Fix For: 4.2, 5.0, 4.1.1

 Attachments: multicore-bug.zip


 Having lib directives in solrconfig.xml seems to wipe out/override the 
 definitions in previous cores.
 The exception (for the earlier core) is:
   at 
 org.apache.solr.util.plugin.AbstractPluginLoader.load(AbstractPluginLoader.java:177)
   at org.apache.solr.schema.IndexSchema.readSchema(IndexSchema.java:369)
   at org.apache.solr.schema.IndexSchema.init(IndexSchema.java:113)
   at 
 org.apache.solr.core.CoreContainer.createFromLocal(CoreContainer.java:1000)
   at org.apache.solr.core.CoreContainer.create(CoreContainer.java:1033)
   at org.apache.solr.core.CoreContainer$3.call(CoreContainer.java:629)
   at org.apache.solr.core.CoreContainer$3.call(CoreContainer.java:624)
   at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:334)
   at java.util.concurrent.FutureTask.run(FutureTask.java:166)
   at 
 java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471)
   at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:334)
   at java.util.concurrent.FutureTask.run(FutureTask.java:166)
   at 
 java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1110)
   at 
 java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:603)
   at java.lang.Thread.run(Thread.java:722)
 Caused by: org.apache.solr.common.SolrException: Plugin init failure for 
 [schema.xml] analyzer/filter: Error loading class 
 'solr.ICUFoldingFilterFactory'
   at 
 org.apache.solr.util.plugin.AbstractPluginLoader.load(AbstractPluginLoader.java:177)
   at 
 org.apache.solr.schema.FieldTypePluginLoader.readAnalyzer(FieldTypePluginLoader.java:377)
   at 
 org.apache.solr.schema.FieldTypePluginLoader.create(FieldTypePluginLoader.java:95)
   at 
 org.apache.solr.schema.FieldTypePluginLoader.create(FieldTypePluginLoader.java:43)
   at 
 org.apache.solr.util.plugin.AbstractPluginLoader.load(AbstractPluginLoader.java:151)
 The full replication case is attached.
 If the SECOND core is turned off in solr.xml, the FIRST core loads just fine.




[jira] [Commented] (SOLR-4373) In multicore, lib directives in solrconfig.xml cause conflict and clobber directives from earlier cores

2013-02-25 Thread Mark Miller (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-4373?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13586097#comment-13586097
 ] 

Mark Miller commented on SOLR-4373:
---

Oh, I don't plan on looking into it. I just assigned it so that it wouldn't be 
forgotten, and you seemed to have some knowledge in this area. I pretty much 
never need more than a single lib dir.

 In multicore, lib directives in solrconfig.xml cause conflict and clobber 
 directives from earlier cores
 ---

 Key: SOLR-4373
 URL: https://issues.apache.org/jira/browse/SOLR-4373
 Project: Solr
  Issue Type: Bug
  Components: multicore
Affects Versions: 4.1
Reporter: Alexandre Rafalovitch
Priority: Blocker
  Labels: lib, multicore
 Fix For: 4.2, 5.0, 4.1.1

 Attachments: multicore-bug.zip


 Having lib directives in solrconfig.xml seems to wipe out/override the 
 definitions in previous cores.
 The exception (for the earlier core) is:
   at 
 org.apache.solr.util.plugin.AbstractPluginLoader.load(AbstractPluginLoader.java:177)
   at org.apache.solr.schema.IndexSchema.readSchema(IndexSchema.java:369)
   at org.apache.solr.schema.IndexSchema.init(IndexSchema.java:113)
   at 
 org.apache.solr.core.CoreContainer.createFromLocal(CoreContainer.java:1000)
   at org.apache.solr.core.CoreContainer.create(CoreContainer.java:1033)
   at org.apache.solr.core.CoreContainer$3.call(CoreContainer.java:629)
   at org.apache.solr.core.CoreContainer$3.call(CoreContainer.java:624)
   at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:334)
   at java.util.concurrent.FutureTask.run(FutureTask.java:166)
   at 
 java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471)
   at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:334)
   at java.util.concurrent.FutureTask.run(FutureTask.java:166)
   at 
 java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1110)
   at 
 java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:603)
   at java.lang.Thread.run(Thread.java:722)
 Caused by: org.apache.solr.common.SolrException: Plugin init failure for 
 [schema.xml] analyzer/filter: Error loading class 
 'solr.ICUFoldingFilterFactory'
   at 
 org.apache.solr.util.plugin.AbstractPluginLoader.load(AbstractPluginLoader.java:177)
   at 
 org.apache.solr.schema.FieldTypePluginLoader.readAnalyzer(FieldTypePluginLoader.java:377)
   at 
 org.apache.solr.schema.FieldTypePluginLoader.create(FieldTypePluginLoader.java:95)
   at 
 org.apache.solr.schema.FieldTypePluginLoader.create(FieldTypePluginLoader.java:43)
   at 
 org.apache.solr.util.plugin.AbstractPluginLoader.load(AbstractPluginLoader.java:151)
 The full replication case is attached.
 If the SECOND core is turned off in solr.xml, the FIRST core loads just fine.




[jira] [Commented] (SOLR-4503) Add REST API methods to get schema information: fields, dynamic fields, and field types

2013-02-25 Thread Steve Rowe (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-4503?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13586112#comment-13586112
 ] 

Steve Rowe commented on SOLR-4503:
--

The patch adds two dependencies: Restlet and the Restlet servlet extension.  
All REST methods are implemented as Restlet ServerResource subclasses, which 
delegate to new self-reporting methods on IndexField and FieldType, the 
implementation of which was inspired by/stolen from LukeRequestHandler.

SolrDispatchFilter figures out the core, creates a SolrRequest and a 
SolrResponse, sets them on SolrRequestInfo's thread local, then passes the 
request (via filter chaining or request forwarding) to the Restlet servlet 
defined to handle schema requests.  Based on the URL path, the Restlet 
servlet's router then sends the request to the appropriate ServerResource 
subclass, where the response is filled in.

There is no RequestHandler involved in servicing these requests.

I've turned off Restlet's content negotiation facilities in favor of using 
Solr's wt parameter to specify the ResponseWriter.

At present, both GET and HEAD requests work for all six requests.  (Restlet 
uses GET methods to service HEAD requests, so there was very little coding 
required to do this.)


 Add REST API methods to get schema information: fields, dynamic fields, and 
 field types
 ---

 Key: SOLR-4503
 URL: https://issues.apache.org/jira/browse/SOLR-4503
 Project: Solr
  Issue Type: Sub-task
  Components: Schema and Analysis
Affects Versions: 4.1
Reporter: Steve Rowe
Assignee: Steve Rowe
 Attachments: SOLR-4503.patch


 Add REST methods that provide properties for fields, dynamic fields, and 
 field types, using paths:
 /solr/(corename)/schema/fields
 /solr/(corename)/schema/fields/fieldname
 /solr/(corename)/schema/dynamicfields
 /solr/(corename)/schema/dynamicfields/pattern
 /solr/(corename)/schema/fieldtypes
 /solr/(corename)/schema/fieldtypes/typename 




[jira] [Commented] (SOLR-4373) In multicore, lib directives in solrconfig.xml cause conflict and clobber directives from earlier cores

2013-02-25 Thread Alexandre Rafalovitch (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-4373?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13586122#comment-13586122
 ] 

Alexandre Rafalovitch commented on SOLR-4373:
-

I have just rechecked, and I still have the problem with the coreLoadThreads=1 
setting under Solr 4.1. Every time I start Solr, a different subset of cores 
fails. I think reloading a core triggers this as well, though it is harder to 
check.

I have the whole set of examples structured around getting this to work (each 
example is a separate core). Is there something I can do to help 
troubleshoot this? I haven't tried working with the Solr source yet, but I am a 
Java developer and can dig around if there is some pointer to where 
library references are stored.

 In multicore, lib directives in solrconfig.xml cause conflict and clobber 
 directives from earlier cores
 ---

 Key: SOLR-4373
 URL: https://issues.apache.org/jira/browse/SOLR-4373
 Project: Solr
  Issue Type: Bug
  Components: multicore
Affects Versions: 4.1
Reporter: Alexandre Rafalovitch
Priority: Blocker
  Labels: lib, multicore
 Fix For: 4.2, 5.0, 4.1.1

 Attachments: multicore-bug.zip


 Having lib directives in solrconfig.xml seems to wipe out/override the 
 definitions in previous cores.
 The exception (for the earlier core) is:
   at 
 org.apache.solr.util.plugin.AbstractPluginLoader.load(AbstractPluginLoader.java:177)
   at org.apache.solr.schema.IndexSchema.readSchema(IndexSchema.java:369)
   at org.apache.solr.schema.IndexSchema.init(IndexSchema.java:113)
   at 
 org.apache.solr.core.CoreContainer.createFromLocal(CoreContainer.java:1000)
   at org.apache.solr.core.CoreContainer.create(CoreContainer.java:1033)
   at org.apache.solr.core.CoreContainer$3.call(CoreContainer.java:629)
   at org.apache.solr.core.CoreContainer$3.call(CoreContainer.java:624)
   at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:334)
   at java.util.concurrent.FutureTask.run(FutureTask.java:166)
   at 
 java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471)
   at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:334)
   at java.util.concurrent.FutureTask.run(FutureTask.java:166)
   at 
 java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1110)
   at 
 java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:603)
   at java.lang.Thread.run(Thread.java:722)
 Caused by: org.apache.solr.common.SolrException: Plugin init failure for 
 [schema.xml] analyzer/filter: Error loading class 
 'solr.ICUFoldingFilterFactory'
   at 
 org.apache.solr.util.plugin.AbstractPluginLoader.load(AbstractPluginLoader.java:177)
   at 
 org.apache.solr.schema.FieldTypePluginLoader.readAnalyzer(FieldTypePluginLoader.java:377)
   at 
 org.apache.solr.schema.FieldTypePluginLoader.create(FieldTypePluginLoader.java:95)
   at 
 org.apache.solr.schema.FieldTypePluginLoader.create(FieldTypePluginLoader.java:43)
   at 
 org.apache.solr.util.plugin.AbstractPluginLoader.load(AbstractPluginLoader.java:151)
 The full replication case is attached.
 If the SECOND core is turned off in solr.xml, the FIRST core loads just fine.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Commented] (SOLR-4373) In multicore, lib directives in solrconfig.xml cause conflict and clobber directives from earlier cores

2013-02-25 Thread Uwe Schindler (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-4373?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13586128#comment-13586128
 ] 

Uwe Schindler commented on SOLR-4373:
-

Hoss: The concurrency issue in NamedSPILoader can indeed be solved by making 
the reload method synchonized. The services field is volatile, so readers will 
in any case see the correct value. The thread safety problem is *inside* this 
method.

 In multicore, lib directives in solrconfig.xml cause conflict and clobber 
 directives from earlier cores
 ---

 Key: SOLR-4373
 URL: https://issues.apache.org/jira/browse/SOLR-4373
 Project: Solr
  Issue Type: Bug
  Components: multicore
Affects Versions: 4.1
Reporter: Alexandre Rafalovitch
Priority: Blocker
  Labels: lib, multicore
 Fix For: 4.2, 5.0, 4.1.1

 Attachments: multicore-bug.zip


 Having lib directives in solrconfig.xml seem to wipe out/override the 
 definitions in previous cores.
 The exception (for the earlier core) is:
   at 
 org.apache.solr.util.plugin.AbstractPluginLoader.load(AbstractPluginLoader.java:177)
   at org.apache.solr.schema.IndexSchema.readSchema(IndexSchema.java:369)
   at org.apache.solr.schema.IndexSchema.init(IndexSchema.java:113)
   at 
 org.apache.solr.core.CoreContainer.createFromLocal(CoreContainer.java:1000)
   at org.apache.solr.core.CoreContainer.create(CoreContainer.java:1033)
   at org.apache.solr.core.CoreContainer$3.call(CoreContainer.java:629)
   at org.apache.solr.core.CoreContainer$3.call(CoreContainer.java:624)
   at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:334)
   at java.util.concurrent.FutureTask.run(FutureTask.java:166)
   at 
 java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471)
   at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:334)
   at java.util.concurrent.FutureTask.run(FutureTask.java:166)
   at 
 java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1110)
   at 
 java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:603)
   at java.lang.Thread.run(Thread.java:722)
 Caused by: org.apache.solr.common.SolrException: Plugin init failure for 
 [schema.xml] analyzer/filter: Error loading class 
 'solr.ICUFoldingFilterFactory'
   at 
 org.apache.solr.util.plugin.AbstractPluginLoader.load(AbstractPluginLoader.java:177)
   at 
 org.apache.solr.schema.FieldTypePluginLoader.readAnalyzer(FieldTypePluginLoader.java:377)
   at 
 org.apache.solr.schema.FieldTypePluginLoader.create(FieldTypePluginLoader.java:95)
   at 
 org.apache.solr.schema.FieldTypePluginLoader.create(FieldTypePluginLoader.java:43)
   at 
 org.apache.solr.util.plugin.AbstractPluginLoader.load(AbstractPluginLoader.java:151)
 The full replication case is attached.
 If the SECOND core is turned off in solr.xml, the FIRST core loads just fine.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Commented] (SOLR-4481) SwitchQParserPlugin

2013-02-25 Thread Commit Tag Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-4481?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13586135#comment-13586135
 ] 

Commit Tag Bot commented on SOLR-4481:
--

[trunk commit] Chris M. Hostetter
http://svn.apache.org/viewvc?view=revisionrevision=1449809

SOLR-4481: SwitchQParserPlugin registered by default as 'switch'


 SwitchQParserPlugin
 ---

 Key: SOLR-4481
 URL: https://issues.apache.org/jira/browse/SOLR-4481
 Project: Solr
  Issue Type: New Feature
Reporter: Hoss Man
Assignee: Hoss Man
 Attachments: SOLR-4481.patch


 Inspired by a conversation i had with someone on IRC a while back about using 
 append fq params + local params to create custom request params, it 
 occurred to me that it would be handy to have a switch qparser that could 
 be configured with some set of fixed switch case localparams that it would 
 delegate too based on it's input string.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Commented] (LUCENE-4771) Query-time join collectors could maybe be more efficient

2013-02-25 Thread Robert Muir (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-4771?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13586137#comment-13586137
 ] 

Robert Muir commented on LUCENE-4771:
-

Thanks for updating Martijn! I plan to look at this later tonight and work on 
pulling out the BitsFilteredTermsEnum and making it more efficient. After that, 
I think we should revisit the intersection (I started with something 
ultra-simple here) to make sure its optimal too.

Somehow actually we should try to come up with a standard benchmark (luceneutil 
or similar) so that we can test the approach for the single-valued case there 
too. My intuition says I think it can be a win in both cases.

 Query-time join collectors could maybe be more efficient
 

 Key: LUCENE-4771
 URL: https://issues.apache.org/jira/browse/LUCENE-4771
 Project: Lucene - Core
  Issue Type: Improvement
  Components: modules/join
Reporter: Robert Muir
 Attachments: LUCENE-4771_prototype.patch, 
 LUCENE-4771-prototype.patch, LUCENE-4771_prototype_without_bug.patch


 I was looking @ these collectors on LUCENE-4765 and I noticed:
 * SingleValued collector (SV) pulls FieldCache.getTerms and adds the bytes to 
 a bytesrefhash per-collect.
 * MultiValued  collector (MV) pulls FieldCache.getDocTermsOrds, but doesnt 
 use the ords, just looks up each value and adds the bytes per-collect.
 I think instead its worth investigating if SV should use getTermsIndex, and 
 both collectors just collect-up their per-segment ords in something like a 
 BitSet[maxOrd]. 
 When asked for the terms at the end in getCollectorTerms(), they could merge 
 these into one BytesRefHash.
 Of course, if you are going to turn around and execute the query against the 
 same searcher anyway (is this the typical case?), this could even be more 
 efficient: No need to hash or instantiate all the terms in memory, we could 
 do postpone the lookups to SeekingTermSetTermsEnum.accept()/nextSeekTerm() i 
 think... somehow :)

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Created] (LUCENE-4796) NamedSPILoader.reload needs to be synchronized

2013-02-25 Thread Hoss Man (JIRA)
Hoss Man created LUCENE-4796:


 Summary: NamedSPILoader.reload needs to be synchronized
 Key: LUCENE-4796
 URL: https://issues.apache.org/jira/browse/LUCENE-4796
 Project: Lucene - Core
  Issue Type: Bug
Reporter: Hoss Man
Assignee: Hoss Man
 Fix For: 4.2, 5.0


Spun off of SOLR-4373: as discsused with uwe on IRC, NamedSPILoader.reload is 
not thread safe: it reads from this.services at the beginging of hte method, 
makes additions based on the method input, and then overwrites this.services at 
the end of the method.  if the method is called by two threads concurrently, 
the entries added by threadB could be lost if threadA enters the method before 
threadB and exists the method after threadB

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Commented] (LUCENE-4796) NamedSPILoader.reload needs to be synchronized

2013-02-25 Thread Uwe Schindler (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-4796?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13586144#comment-13586144
 ] 

Uwe Schindler commented on LUCENE-4796:
---

Hoss: I agree!

The concurrency issue in NamedSPILoader can indeed be solved by making the 
reload method synchonized. The services field is volatile, so readers will in 
any case see the correct value, otherwise all methods would need to be 
synchronized. By using a voltile, we only need to synchronize this single 
method.

 NamedSPILoader.reload needs to be synchronized
 --

 Key: LUCENE-4796
 URL: https://issues.apache.org/jira/browse/LUCENE-4796
 Project: Lucene - Core
  Issue Type: Bug
Reporter: Hoss Man
Assignee: Hoss Man
 Fix For: 4.2, 5.0


 Spun off of SOLR-4373: as discsused with uwe on IRC, NamedSPILoader.reload is 
 not thread safe: it reads from this.services at the beginging of hte method, 
 makes additions based on the method input, and then overwrites this.services 
 at the end of the method.  if the method is called by two threads 
 concurrently, the entries added by threadB could be lost if threadA enters 
 the method before threadB and exists the method after threadB

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Created] (LUCENE-4797) Fix remaining Lucene/Solr Javadocs issue

2013-02-25 Thread Uwe Schindler (JIRA)
Uwe Schindler created LUCENE-4797:
-

 Summary: Fix remaining Lucene/Solr Javadocs issue
 Key: LUCENE-4797
 URL: https://issues.apache.org/jira/browse/LUCENE-4797
 Project: Lucene - Core
  Issue Type: Bug
  Components: general/javadocs
Affects Versions: 4.1
Reporter: Uwe Schindler


Java 8 has a new feature (enabled by default): http://openjdk.java.net/jeps/172

It fails the build on:
- incorrect links (@see, @link,...)
- incorrect HTML entities
- invalid HTML in general

Thanks to our linter written in HTMLTidy and Python, most of the bugs are 
already solved in our source code, but the Oracle Linter finds some more 
problems, our linter does not:
- missing escapes 
- invalid entities

Unfortunately the versions of JDK8 released up to today have a bug, making 
optional closing tags (which are valid HTML4), like /p, mandatory. This will 
be fixed in b78.

Currently there is another bug in the Oracle javadocs tool (it fails to copy 
doc-files folders), but this is under investigation at the moment.

We should clean up our javadocs, so the pass the new JDK8 javadocs tool with 
build 78+. Maybe we can put our own linter out of service, once we rely on Java 
8 :-)

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Commented] (SOLR-4373) In multicore, lib directives in solrconfig.xml cause conflict and clobber directives from earlier cores

2013-02-25 Thread Alexandre Rafalovitch (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-4373?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13586175#comment-13586175
 ] 

Alexandre Rafalovitch commented on SOLR-4373:
-

Isn't that for SPIs only? Does that cover TokenFactories, etc? I think the 
libraries actually use SolrResourceLoader#reloadLuceneSPI instead.

I wonder if the problem is related to SolrResourceLoader#createClassLoader:
{code:java}
if ( null == parent ) {
  parent = Thread.currentThread().getContextClassLoader();
}
{code}

How does this work with multiple threads?

 In multicore, lib directives in solrconfig.xml cause conflict and clobber 
 directives from earlier cores
 ---

 Key: SOLR-4373
 URL: https://issues.apache.org/jira/browse/SOLR-4373
 Project: Solr
  Issue Type: Bug
  Components: multicore
Affects Versions: 4.1
Reporter: Alexandre Rafalovitch
Priority: Blocker
  Labels: lib, multicore
 Fix For: 4.2, 5.0, 4.1.1

 Attachments: multicore-bug.zip


 Having lib directives in solrconfig.xml seem to wipe out/override the 
 definitions in previous cores.
 The exception (for the earlier core) is:
   at 
 org.apache.solr.util.plugin.AbstractPluginLoader.load(AbstractPluginLoader.java:177)
   at org.apache.solr.schema.IndexSchema.readSchema(IndexSchema.java:369)
   at org.apache.solr.schema.IndexSchema.init(IndexSchema.java:113)
   at 
 org.apache.solr.core.CoreContainer.createFromLocal(CoreContainer.java:1000)
   at org.apache.solr.core.CoreContainer.create(CoreContainer.java:1033)
   at org.apache.solr.core.CoreContainer$3.call(CoreContainer.java:629)
   at org.apache.solr.core.CoreContainer$3.call(CoreContainer.java:624)
   at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:334)
   at java.util.concurrent.FutureTask.run(FutureTask.java:166)
   at 
 java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471)
   at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:334)
   at java.util.concurrent.FutureTask.run(FutureTask.java:166)
   at 
 java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1110)
   at 
 java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:603)
   at java.lang.Thread.run(Thread.java:722)
 Caused by: org.apache.solr.common.SolrException: Plugin init failure for 
 [schema.xml] analyzer/filter: Error loading class 
 'solr.ICUFoldingFilterFactory'
   at 
 org.apache.solr.util.plugin.AbstractPluginLoader.load(AbstractPluginLoader.java:177)
   at 
 org.apache.solr.schema.FieldTypePluginLoader.readAnalyzer(FieldTypePluginLoader.java:377)
   at 
 org.apache.solr.schema.FieldTypePluginLoader.create(FieldTypePluginLoader.java:95)
   at 
 org.apache.solr.schema.FieldTypePluginLoader.create(FieldTypePluginLoader.java:43)
   at 
 org.apache.solr.util.plugin.AbstractPluginLoader.load(AbstractPluginLoader.java:151)
 The full replication case is attached.
 If the SECOND core is turned off in solr.xml, the FIRST core loads just fine.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Commented] (SOLR-4373) In multicore, lib directives in solrconfig.xml cause conflict and clobber directives from earlier cores

2013-02-25 Thread Hoss Man (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-4373?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13586182#comment-13586182
 ] 

Hoss Man commented on SOLR-4373:


Alex:

I just noticed something in one of your earlier comments...

bq. I am unable to get the problem go away by using coreLoadThreads=1: cores 
adminPath=/admin/cores coreLoadThreads=1

...the coreLoadThreads option should be on the {{solr/}} element, not the 
{{cores/}} element, can you please test that again?

Other comments..

bq. Is there something I can do to help troubleshooting this? I haven't tried 
working with Solr source yet, but I am a Java developer and can dig around if 
there is some sort of information at where library references are stored.

primarily we build up a ClassLoader per SolrCore in SOlrResourceLoader, each of 
which hangs off of the parent classloader for the webapp -- but the use of SPI 
in lucene complicates things in ways i still don't fully understand.

bq. Isn't that for SPIs only? Does that cover TokenFactories

Yes, many of the various factories in Solr are handled using SPI now (take a 
look at SolrResourceLoader.reloadLuceneSPI())


 In multicore, lib directives in solrconfig.xml cause conflict and clobber 
 directives from earlier cores
 ---

 Key: SOLR-4373
 URL: https://issues.apache.org/jira/browse/SOLR-4373
 Project: Solr
  Issue Type: Bug
  Components: multicore
Affects Versions: 4.1
Reporter: Alexandre Rafalovitch
Priority: Blocker
  Labels: lib, multicore
 Fix For: 4.2, 5.0, 4.1.1

 Attachments: multicore-bug.zip


 Having lib directives in solrconfig.xml seem to wipe out/override the 
 definitions in previous cores.
 The exception (for the earlier core) is:
   at 
 org.apache.solr.util.plugin.AbstractPluginLoader.load(AbstractPluginLoader.java:177)
   at org.apache.solr.schema.IndexSchema.readSchema(IndexSchema.java:369)
   at org.apache.solr.schema.IndexSchema.init(IndexSchema.java:113)
   at 
 org.apache.solr.core.CoreContainer.createFromLocal(CoreContainer.java:1000)
   at org.apache.solr.core.CoreContainer.create(CoreContainer.java:1033)
   at org.apache.solr.core.CoreContainer$3.call(CoreContainer.java:629)
   at org.apache.solr.core.CoreContainer$3.call(CoreContainer.java:624)
   at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:334)
   at java.util.concurrent.FutureTask.run(FutureTask.java:166)
   at 
 java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471)
   at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:334)
   at java.util.concurrent.FutureTask.run(FutureTask.java:166)
   at 
 java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1110)
   at 
 java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:603)
   at java.lang.Thread.run(Thread.java:722)
 Caused by: org.apache.solr.common.SolrException: Plugin init failure for 
 [schema.xml] analyzer/filter: Error loading class 
 'solr.ICUFoldingFilterFactory'
   at 
 org.apache.solr.util.plugin.AbstractPluginLoader.load(AbstractPluginLoader.java:177)
   at 
 org.apache.solr.schema.FieldTypePluginLoader.readAnalyzer(FieldTypePluginLoader.java:377)
   at 
 org.apache.solr.schema.FieldTypePluginLoader.create(FieldTypePluginLoader.java:95)
   at 
 org.apache.solr.schema.FieldTypePluginLoader.create(FieldTypePluginLoader.java:43)
   at 
 org.apache.solr.util.plugin.AbstractPluginLoader.load(AbstractPluginLoader.java:151)
 The full replication case is attached.
 If the SECOND core is turned off in solr.xml, the FIRST core loads just fine.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Resolved] (SOLR-4481) SwitchQParserPlugin

2013-02-25 Thread Hoss Man (JIRA)

 [ 
https://issues.apache.org/jira/browse/SOLR-4481?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Hoss Man resolved SOLR-4481.


   Resolution: Fixed
Fix Version/s: 5.0
   4.2

I went ahead and committed a version using the following syntax...

{noformat}
{!switch case=XXX case.foo=YYY case.bar=ZZZ default=QQQ}foo
{noformat}

Committed revision 1449809.
Committed revision 1449823.


 SwitchQParserPlugin
 ---

 Key: SOLR-4481
 URL: https://issues.apache.org/jira/browse/SOLR-4481
 Project: Solr
  Issue Type: New Feature
Reporter: Hoss Man
Assignee: Hoss Man
 Fix For: 4.2, 5.0

 Attachments: SOLR-4481.patch


 Inspired by a conversation i had with someone on IRC a while back about using 
 append fq params + local params to create custom request params, it 
 occurred to me that it would be handy to have a switch qparser that could 
 be configured with some set of fixed switch case localparams that it would 
 delegate too based on it's input string.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Commented] (SOLR-4481) SwitchQParserPlugin

2013-02-25 Thread Commit Tag Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-4481?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13586198#comment-13586198
 ] 

Commit Tag Bot commented on SOLR-4481:
--

[branch_4x commit] Chris M. Hostetter
http://svn.apache.org/viewvc?view=revisionrevision=1449823

SOLR-4481: SwitchQParserPlugin registered by default as 'switch' (merge 
r1449809)


 SwitchQParserPlugin
 ---

 Key: SOLR-4481
 URL: https://issues.apache.org/jira/browse/SOLR-4481
 Project: Solr
  Issue Type: New Feature
Reporter: Hoss Man
Assignee: Hoss Man
 Fix For: 4.2, 5.0

 Attachments: SOLR-4481.patch


 Inspired by a conversation i had with someone on IRC a while back about using 
 append fq params + local params to create custom request params, it 
 occurred to me that it would be handy to have a switch qparser that could 
 be configured with some set of fixed switch case localparams that it would 
 delegate too based on it's input string.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Commented] (SOLR-4373) In multicore, lib directives in solrconfig.xml cause conflict and clobber directives from earlier cores

2013-02-25 Thread Alexandre Rafalovitch (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-4373?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13586217#comment-13586217
 ] 

Alexandre Rafalovitch commented on SOLR-4373:
-

Ok, adding the flag on solr element seems to have fixed the problem. My bad. 
Does it mean we know where the problem is?

However, on the other point, I don't see NamedSPILoader being called from 
SolrResourceLoader.reloadLuceneSPI(). Rather, I see AnalysisSPILoader. Not sure 
if the difference is significant.


 In multicore, lib directives in solrconfig.xml cause conflict and clobber 
 directives from earlier cores
 ---

 Key: SOLR-4373
 URL: https://issues.apache.org/jira/browse/SOLR-4373
 Project: Solr
  Issue Type: Bug
  Components: multicore
Affects Versions: 4.1
Reporter: Alexandre Rafalovitch
Priority: Blocker
  Labels: lib, multicore
 Fix For: 4.2, 5.0, 4.1.1

 Attachments: multicore-bug.zip


 Having lib directives in solrconfig.xml seem to wipe out/override the 
 definitions in previous cores.
 The exception (for the earlier core) is:
   at 
 org.apache.solr.util.plugin.AbstractPluginLoader.load(AbstractPluginLoader.java:177)
   at org.apache.solr.schema.IndexSchema.readSchema(IndexSchema.java:369)
   at org.apache.solr.schema.IndexSchema.init(IndexSchema.java:113)
   at 
 org.apache.solr.core.CoreContainer.createFromLocal(CoreContainer.java:1000)
   at org.apache.solr.core.CoreContainer.create(CoreContainer.java:1033)
   at org.apache.solr.core.CoreContainer$3.call(CoreContainer.java:629)
   at org.apache.solr.core.CoreContainer$3.call(CoreContainer.java:624)
   at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:334)
   at java.util.concurrent.FutureTask.run(FutureTask.java:166)
   at 
 java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471)
   at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:334)
   at java.util.concurrent.FutureTask.run(FutureTask.java:166)
   at 
 java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1110)
   at 
 java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:603)
   at java.lang.Thread.run(Thread.java:722)
 Caused by: org.apache.solr.common.SolrException: Plugin init failure for 
 [schema.xml] analyzer/filter: Error loading class 
 'solr.ICUFoldingFilterFactory'
   at 
 org.apache.solr.util.plugin.AbstractPluginLoader.load(AbstractPluginLoader.java:177)
   at 
 org.apache.solr.schema.FieldTypePluginLoader.readAnalyzer(FieldTypePluginLoader.java:377)
   at 
 org.apache.solr.schema.FieldTypePluginLoader.create(FieldTypePluginLoader.java:95)
   at 
 org.apache.solr.schema.FieldTypePluginLoader.create(FieldTypePluginLoader.java:43)
   at 
 org.apache.solr.util.plugin.AbstractPluginLoader.load(AbstractPluginLoader.java:151)
 The full replication case is attached.
 If the SECOND core is turned off in solr.xml, the FIRST core loads just fine.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Updated] (SOLR-4210) if couldn't find the collection locally when searching, we should look on other nodes. one of TODOs part in SolrDispatchFilter

2013-02-25 Thread Mark Miller (JIRA)

 [ 
https://issues.apache.org/jira/browse/SOLR-4210?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mark Miller updated SOLR-4210:
--

Attachment: SOLR-4210.patch

New patch: tightened up, more testing.

  if couldn't find the collection locally when searching, we should look on 
 other nodes. one of TODOs part in SolrDispatchFilter
 ---

 Key: SOLR-4210
 URL: https://issues.apache.org/jira/browse/SOLR-4210
 Project: Solr
  Issue Type: Improvement
  Components: SolrCloud
Affects Versions: 4.0-BETA, 4.0
Reporter: Po Rui
Assignee: Mark Miller
 Fix For: 4.2, 5.0

 Attachments: SOLR-4210.patch, SOLR-4210.patch, SOLR-4210.patch


 It only check the local collection or core  when searching, doesn't look on 
 other nodes. e.g. a cluster have 4 nodes. 1th 2th 3th contribute to 
 collection1. 2th 3th 4th contribute to collection2. now send query to 4th 
 to searching collection1 will failed. 
 It is an imperfect feature for searching. it is a TODO part in 
 SolrDispatchFilter-line 220.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Updated] (SOLR-4480) EDisMax parser blows up with query containing single plus or minus

2013-02-25 Thread JIRA

 [ 
https://issues.apache.org/jira/browse/SOLR-4480?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jan Høydahl updated SOLR-4480:
--

Attachment: SOLR-4480.patch

Updated patch, also catching exceptions in the case where the +/- comes after a 
leading whitespace.

What do people think about the solution? Plan to commit soon.

 EDisMax parser blows up with query containing single plus or minus
 --

 Key: SOLR-4480
 URL: https://issues.apache.org/jira/browse/SOLR-4480
 Project: Solr
  Issue Type: Bug
  Components: query parsers
Reporter: Fiona Tay
Priority: Critical
 Fix For: 4.2, 5.0

 Attachments: SOLR-4480.patch, SOLR-4480.patch


 We are running solr with sunspot and when we set up a query containing a 
 single plus, Solr blows up with the following error:
 SOLR Request (5.0ms)  [ path=#RSolr::Client:0x4c7464ac parameters={data: 
 fq=type%3A%28Attachment+OR+User+OR+GpdbDataSource+OR+HadoopInstance+OR+GnipInstance+OR+Workspace+OR+Workfile+OR+Tag+OR+Dataset+OR+HdfsEntry%29fq=type_name_s%3A%28Attachment+OR+User+OR+Instance+OR+Workspace+OR+Workfile+OR+Tag+OR+Dataset+OR+HdfsEntry%29fq=-%28security_type_name_sm%3A%28Dataset%29+AND+-instance_account_ids_im%3A%282+OR+1%29%29fq=-%28security_type_name_sm%3AChorusView+AND+-member_ids_im%3A1+AND+-public_b%3Atrue%29fq=-%28security_type_name_sm%3A%28Dataset%29+AND+-instance_account_ids_im%3A%282+OR+1%29%29fq=-%28security_type_name_sm%3AChorusView+AND+-member_ids_im%3A1+AND+-public_b%3Atrue%29q=%2Bfl=%2A+scoreqf=name_texts+first_name_texts+last_name_texts+file_name_textsdefType=edismaxhl=onhl.simple.pre=%40%40%40hl%40%40%40hl.simple.post=%40%40%40endhl%40%40%40start=0rows=3,
  method: post, params: {:wt=:ruby}, query: wt=ruby, headers: 
 {Content-Type=application/x-www-form-urlencoded; charset=UTF-8}, path: 
 select, uri: http://localhost:8982/solr/select?wt=ruby, open_timeout: , 
 read_timeout: } ]
 RSolr::Error::Http (RSolr::Error::Http - 400 Bad Request
 Error: org.apache.lucene.queryParser.ParseException: Cannot parse '': 
 Encountered EOF at line 1, column 0.
 Was expecting one of:
 NOT ...
 + ...
 - ...
 ( ...
 * ...
 QUOTED ...
 TERM ...
 PREFIXTERM ...
 WILDTERM ...

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Updated] (SOLR-4210) Requests to a Collection that does not exist on the receiving node should be proxied to a suitable node.

2013-02-25 Thread Mark Miller (JIRA)

 [ 
https://issues.apache.org/jira/browse/SOLR-4210?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mark Miller updated SOLR-4210:
--

Summary: Requests to a Collection that does not exist on the receiving node 
should be proxied to a suitable node.  (was:  if couldn't find the collection 
locally when searching, we should look on other nodes. one of TODOs part in 
SolrDispatchFilter)

 Requests to a Collection that does not exist on the receiving node should be 
 proxied to a suitable node.
 

 Key: SOLR-4210
 URL: https://issues.apache.org/jira/browse/SOLR-4210
 Project: Solr
  Issue Type: Improvement
  Components: SolrCloud
Affects Versions: 4.0-BETA, 4.0
Reporter: Po Rui
Assignee: Mark Miller
 Fix For: 4.2, 5.0

 Attachments: SOLR-4210.patch, SOLR-4210.patch, SOLR-4210.patch


 It only check the local collection or core  when searching, doesn't look on 
 other nodes. e.g. a cluster have 4 nodes. 1th 2th 3th contribute to 
 collection1. 2th 3th 4th contribute to collection2. now send query to 4th 
 to searching collection1 will failed. 
 It is an imperfect feature for searching. it is a TODO part in 
 SolrDispatchFilter-line 220.




[jira] [Commented] (SOLR-4373) In multicore, lib directives in solrconfig.xml cause conflict and clobber directives from earlier cores

2013-02-25 Thread Uwe Schindler (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-4373?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13586239#comment-13586239
 ] 

Uwe Schindler commented on SOLR-4373:
-

This is how it is called, just indirectly.

 In multicore, lib directives in solrconfig.xml cause conflict and clobber 
 directives from earlier cores
 ---

 Key: SOLR-4373
 URL: https://issues.apache.org/jira/browse/SOLR-4373
 Project: Solr
  Issue Type: Bug
  Components: multicore
Affects Versions: 4.1
Reporter: Alexandre Rafalovitch
Priority: Blocker
  Labels: lib, multicore
 Fix For: 4.2, 5.0, 4.1.1

 Attachments: multicore-bug.zip


 Having lib directives in solrconfig.xml seem to wipe out/override the 
 definitions in previous cores.
 The exception (for the earlier core) is:
   at 
 org.apache.solr.util.plugin.AbstractPluginLoader.load(AbstractPluginLoader.java:177)
   at org.apache.solr.schema.IndexSchema.readSchema(IndexSchema.java:369)
   at org.apache.solr.schema.IndexSchema.init(IndexSchema.java:113)
   at 
 org.apache.solr.core.CoreContainer.createFromLocal(CoreContainer.java:1000)
   at org.apache.solr.core.CoreContainer.create(CoreContainer.java:1033)
   at org.apache.solr.core.CoreContainer$3.call(CoreContainer.java:629)
   at org.apache.solr.core.CoreContainer$3.call(CoreContainer.java:624)
   at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:334)
   at java.util.concurrent.FutureTask.run(FutureTask.java:166)
   at 
 java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471)
   at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:334)
   at java.util.concurrent.FutureTask.run(FutureTask.java:166)
   at 
 java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1110)
   at 
 java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:603)
   at java.lang.Thread.run(Thread.java:722)
 Caused by: org.apache.solr.common.SolrException: Plugin init failure for 
 [schema.xml] analyzer/filter: Error loading class 
 'solr.ICUFoldingFilterFactory'
   at 
 org.apache.solr.util.plugin.AbstractPluginLoader.load(AbstractPluginLoader.java:177)
   at 
 org.apache.solr.schema.FieldTypePluginLoader.readAnalyzer(FieldTypePluginLoader.java:377)
   at 
 org.apache.solr.schema.FieldTypePluginLoader.create(FieldTypePluginLoader.java:95)
   at 
 org.apache.solr.schema.FieldTypePluginLoader.create(FieldTypePluginLoader.java:43)
   at 
 org.apache.solr.util.plugin.AbstractPluginLoader.load(AbstractPluginLoader.java:151)
 The full replication case is attached.
 If the SECOND core is turned off in solr.xml, the FIRST core loads just fine.




[jira] [Updated] (SOLR-3843) Add lucene-codecs to Solr libs?

2013-02-25 Thread Robert Muir (JIRA)

 [ 
https://issues.apache.org/jira/browse/SOLR-3843?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Robert Muir updated SOLR-3843:
--

Attachment: SOLR-3843.patch

Here's the start of a patch (I haven't tested the build with it or looked at 
Maven and so on).

This adds the codecs jar and enables SchemaCodecFactory by default: so the 
format for postings lists and docvalues can be customized easily in the 
fieldtype.

I didn't want to turn this factory on by default because of SOLR-4417, but Mark 
fixed that.

 Add lucene-codecs to Solr libs?
 ---

 Key: SOLR-3843
 URL: https://issues.apache.org/jira/browse/SOLR-3843
 Project: Solr
  Issue Type: Wish
Affects Versions: 4.0
Reporter: Adrien Grand
Priority: Critical
 Fix For: 4.2, 5.0

 Attachments: SOLR-3843.patch


 Solr gives the ability to its users to select the postings format to use on a 
 per-field basis but only Lucene40PostingsFormat is available by default 
 (unless users add lucene-codecs to the Solr lib directory). Maybe we should 
 add lucene-codecs to Solr libs (I mean in the WAR file) so that people can 
 try our non-default postings formats with minimum effort?




[jira] [Commented] (SOLR-4492) Please add support for Collection API CREATE method to evenly distribute leader roles among instances

2013-02-25 Thread Tim Vaillancourt (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-4492?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13586253#comment-13586253
 ] 

Tim Vaillancourt commented on SOLR-4492:


I like the logic mentioned. As an Ops guy I'm basically looking for a "least 
leaders" algorithm, i.e. make the winner the node with the fewest leader 
roles, otherwise random.
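A hypothetical sketch of that pick, outside of Solr's actual Collection API code (the class and parameter names here are illustrative, not Solr's):

```java
import java.util.*;

// "Least leaders" pick: choose the node currently holding the fewest
// leader roles; break ties randomly. This only illustrates the algorithm
// discussed above, not actual Solr code.
public class LeastLeaderPicker {
    public static String pick(Map<String, Integer> leaderCounts, Random rnd) {
        int min = Collections.min(leaderCounts.values());
        List<String> candidates = new ArrayList<>();
        for (Map.Entry<String, Integer> e : leaderCounts.entrySet()) {
            if (e.getValue() == min) {
                candidates.add(e.getKey());
            }
        }
        // random tie-break among all nodes at the minimum
        return candidates.get(rnd.nextInt(candidates.size()));
    }
}
```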

 Please add support for Collection API CREATE method to evenly distribute 
 leader roles among instances
 -

 Key: SOLR-4492
 URL: https://issues.apache.org/jira/browse/SOLR-4492
 Project: Solr
  Issue Type: New Feature
  Components: SolrCloud
Reporter: Tim Vaillancourt
Priority: Minor

 Currently in SolrCloud 4.1, a CREATE call to the Collection API will cause 
 the server receiving the CREATE call to become the leader of all shards.
 I would like to ask for the ability for the CREATE call to evenly distribute 
 the leader role across all instances, ie: if I create 3 shards over 3 SOLR 
 4.1 instances, each instance/node would only be the leader of 1 shard.
 This would be logically consistent with the way replicas are randomly 
 distributed by this same call across instances/nodes.
 Currently, this CREATE call will cause the server receiving the call to 
 become the leader of 3 shards.
 curl -v 
 'http://HOST:8983/solr/admin/collections?action=CREATEname=testnumShards=3replicationFactor=2maxShardsPerNode=2'
 PS: Thank you SOLR developers for your contributions!
 Tim Vaillancourt




[jira] [Commented] (LUCENE-4796) NamedSPILoader.reload needs to be synchronized

2013-02-25 Thread Uwe Schindler (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-4796?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13586257#comment-13586257
 ] 

Uwe Schindler commented on LUCENE-4796:
---

We have to fix the same issue in AnalysisSPILoader which is (unfortunately) a 
different class with some code duplication (in analysis/common module).

 NamedSPILoader.reload needs to be synchronized
 --

 Key: LUCENE-4796
 URL: https://issues.apache.org/jira/browse/LUCENE-4796
 Project: Lucene - Core
  Issue Type: Bug
Reporter: Hoss Man
Assignee: Hoss Man
 Fix For: 4.2, 5.0


 Spun off of SOLR-4373: as discussed with Uwe on IRC, NamedSPILoader.reload is 
 not thread safe: it reads from this.services at the beginning of the method, 
 makes additions based on the method input, and then overwrites this.services 
 at the end of the method. If the method is called by two threads 
 concurrently, the entries added by threadB could be lost if threadA enters 
 the method before threadB and exits the method after threadB.




[jira] [Commented] (LUCENE-4771) Query-time join collectors could maybe be more efficient

2013-02-25 Thread Martijn van Groningen (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-4771?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13586258#comment-13586258
 ] 

Martijn van Groningen commented on LUCENE-4771:
---

I also think that this approach will improve single-valued joining too. Just 
collecting the ordinals and fetching the actual terms on the fly without 
hashing should be much faster.

Just wondering how to make a standard benchmark. Usually when I test joining I 
generate random docs: simple docs with random `from` values, docs with 
matching `to` values, and varying `from`-to-`to` doc ratios. Maybe we can use 
the Stack Overflow dataset (join questions and answers) as a test dataset with 
relational-like data. Not sure if this is possible licence-wise. 

 Query-time join collectors could maybe be more efficient
 

 Key: LUCENE-4771
 URL: https://issues.apache.org/jira/browse/LUCENE-4771
 Project: Lucene - Core
  Issue Type: Improvement
  Components: modules/join
Reporter: Robert Muir
 Attachments: LUCENE-4771_prototype.patch, 
 LUCENE-4771-prototype.patch, LUCENE-4771_prototype_without_bug.patch


 I was looking @ these collectors on LUCENE-4765 and I noticed:
 * The SingleValued collector (SV) pulls FieldCache.getTerms and adds the bytes 
 to a BytesRefHash per collect.
 * The MultiValued collector (MV) pulls FieldCache.getDocTermOrds, but doesn't 
 use the ords; it just looks up each value and adds the bytes per collect.
 I think instead it's worth investigating whether SV should use getTermsIndex, 
 and both collectors just collect up their per-segment ords in something like a 
 BitSet[maxOrd]. 
 When asked for the terms at the end in getCollectorTerms(), they could merge 
 these into one BytesRefHash.
 Of course, if you are going to turn around and execute the query against the 
 same searcher anyway (is this the typical case?), this could be even more 
 efficient: no need to hash or instantiate all the terms in memory; we could 
 postpone the lookups to SeekingTermSetTermsEnum.accept()/nextSeekTerm(), I 
 think... somehow :)
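A rough, self-contained sketch of the ord-collecting idea, using plain-Java stand-ins rather than Lucene classes: a `String[]` per-segment term dictionary in place of getTermsIndex, `java.util.BitSet` for the collected per-segment ords, and a `LinkedHashSet` approximating the merged BytesRefHash:

```java
import java.util.*;

// During collect() only per-segment ordinal bits would be set (cheap);
// term values are looked up once per distinct ord at the end, when the
// per-segment ord sets are merged into one set of terms.
public class OrdCollectSketch {
    static Set<String> collectTerms(List<String[]> segmentTermDicts,
                                    List<BitSet> collectedOrds) {
        Set<String> merged = new LinkedHashSet<>();
        for (int seg = 0; seg < segmentTermDicts.size(); seg++) {
            String[] dict = segmentTermDicts.get(seg);
            BitSet ords = collectedOrds.get(seg);
            // look up each collected ord exactly once per segment
            for (int ord = ords.nextSetBit(0); ord >= 0;
                 ord = ords.nextSetBit(ord + 1)) {
                merged.add(dict[ord]);
            }
        }
        return merged;
    }
}
```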




[jira] [Updated] (LUCENE-4796) NamedSPILoader.reload needs to be synchronized

2013-02-25 Thread Uwe Schindler (JIRA)

 [ 
https://issues.apache.org/jira/browse/LUCENE-4796?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Uwe Schindler updated LUCENE-4796:
--

Attachment: LUCENE-4796.patch

 NamedSPILoader.reload needs to be synchronized
 --

 Key: LUCENE-4796
 URL: https://issues.apache.org/jira/browse/LUCENE-4796
 Project: Lucene - Core
  Issue Type: Bug
Reporter: Hoss Man
Assignee: Hoss Man
 Fix For: 4.2, 5.0

 Attachments: LUCENE-4796.patch


 Spun off of SOLR-4373: as discussed with Uwe on IRC, NamedSPILoader.reload is 
 not thread safe: it reads from this.services at the beginning of the method, 
 makes additions based on the method input, and then overwrites this.services 
 at the end of the method. If the method is called by two threads 
 concurrently, the entries added by threadB could be lost if threadA enters 
 the method before threadB and exits the method after threadB.




[jira] [Commented] (SOLR-4503) Add REST API methods to get schema information: fields, dynamic fields, and field types

2013-02-25 Thread Steve Rowe (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-4503?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13586264#comment-13586264
 ] 

Steve Rowe commented on SOLR-4503:
--

I'm interested in what other people think of using Restlet in Solr - this 
issue, in part, is about exploring *how* to do that.

Restlet brings some baggage:

* Non-RequestHandler-based Restlet actions aren't available (as currently 
written anyway) via EmbeddedSolrServer, which only knows how to deal with 
requests that have RequestHandlers.
* Restlet's artifacts aren't deployed to Maven Central - instead, they host 
their own Maven repository.  I was worried that having dependencies drawn from 
3rd-party Maven repositories would cause trouble, so I deployed to the ASF 
staging repository a fake Solr release including the two Restlet dependencies 
in the Solr core POM, and the quality checks performed as part of closing the 
staging repository didn't flag this as a problem, so I think using Restlet 
will not block Lucene or Solr from deploying to Maven Central.

Restlet should make some things easier, though, e.g. the PUT and DELETE methods 
are usable.

 Add REST API methods to get schema information: fields, dynamic fields, and 
 field types
 ---

 Key: SOLR-4503
 URL: https://issues.apache.org/jira/browse/SOLR-4503
 Project: Solr
  Issue Type: Sub-task
  Components: Schema and Analysis
Affects Versions: 4.1
Reporter: Steve Rowe
Assignee: Steve Rowe
 Attachments: SOLR-4503.patch


 Add REST methods that provide properties for fields, dynamic fields, and 
 field types, using paths:
 /solr/(corename)/schema/fields
 /solr/(corename)/schema/fields/fieldname
 /solr/(corename)/schema/dynamicfields
 /solr/(corename)/schema/dynamicfields/pattern
 /solr/(corename)/schema/fieldtypes
 /solr/(corename)/schema/fieldtypes/typename 




[jira] [Commented] (SOLR-3843) Add lucene-codecs to Solr libs?

2013-02-25 Thread Robert Muir (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-3843?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13586265#comment-13586265
 ] 

Robert Muir commented on SOLR-3843:
---

Smoke testing passes with this patch, but I am not sure if anything should or 
needs to be changed in Maven.



 Add lucene-codecs to Solr libs?
 ---

 Key: SOLR-3843
 URL: https://issues.apache.org/jira/browse/SOLR-3843
 Project: Solr
  Issue Type: Wish
Affects Versions: 4.0
Reporter: Adrien Grand
Priority: Critical
 Fix For: 4.2, 5.0

 Attachments: SOLR-3843.patch


 Solr gives the ability to its users to select the postings format to use on a 
 per-field basis but only Lucene40PostingsFormat is available by default 
 (unless users add lucene-codecs to the Solr lib directory). Maybe we should 
 add lucene-codecs to Solr libs (I mean in the WAR file) so that people can 
 try our non-default postings formats with minimum effort?




[jira] [Commented] (LUCENE-4796) NamedSPILoader.reload needs to be synchronized

2013-02-25 Thread Hoss Man (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-4796?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13586301#comment-13586301
 ] 

Hoss Man commented on LUCENE-4796:
--

bq. We have to fix the same issue in AnalysisSPILoader which is (unfortunately) 
a different class with some code duplication

Ah... that could totally explain why my naive attempt at fixing SOLR-4373 a 
while back didn't seem to work -- I was only aware of NamedSPILoader but did 
ad hoc testing using analyzer factories.

Uwe: looking at your patch, one thing that jumps out at me is that 
AnalysisSPILoader seems to have another existing bug that may also cause some 
similar problems, regardless of thread safety...

{code}
  public synchronized void reload(ClassLoader classloader) {
    final SPIClassIterator<S> loader = SPIClassIterator.get(clazz, classloader);
    final LinkedHashMap<String,Class<? extends S>> services = new LinkedHashMap<String,Class<? extends S>>();
{code}

...shouldn't that LinkedHashMap be initialized with a copy of this.services 
(just like in NamedSPILoader.reload) so successive calls to reload(...) don't 
forget services that have already been added?

(if you only call reload on child classloaders, then I imagine this wouldn't 
cause any problems, but with independent sibling classloaders it seems like 
call stacks along the lines of..

{noformat}
analysisloader = new AnalysisSPILoader(Foo.class, parentClassLoader);
analysisloader.reload(childAClassLoader); 
analysisloader.reload(childBClassLoader);
{noformat}

...would cause the loader to forget about any services it found in 
childAClassLoader)
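A minimal, framework-free sketch of the fix being discussed, i.e. a synchronized reload that starts from a copy of the existing map (`ServiceRegistry` is a stand-in for NamedSPILoader/AnalysisSPILoader, not the actual Lucene code):

```java
import java.util.*;

// Copy-on-write reload: begin from what we already know, merge in the
// newly discovered services, then publish the new immutable map. Because
// the method is synchronized and starts from a copy, successive reloads
// from sibling classloaders (or concurrent reloads) never lose entries.
public class ServiceRegistry {
    private volatile Map<String, Class<?>> services = Collections.emptyMap();

    public synchronized void reload(Map<String, Class<?>> discovered) {
        Map<String, Class<?>> merged = new LinkedHashMap<>(services);
        merged.putAll(discovered);
        services = Collections.unmodifiableMap(merged);
    }

    public Map<String, Class<?>> services() {
        return services;
    }
}
```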

 NamedSPILoader.reload needs to be synchronized
 --

 Key: LUCENE-4796
 URL: https://issues.apache.org/jira/browse/LUCENE-4796
 Project: Lucene - Core
  Issue Type: Bug
Reporter: Hoss Man
Assignee: Hoss Man
 Fix For: 4.2, 5.0

 Attachments: LUCENE-4796.patch


 Spun off of SOLR-4373: as discussed with Uwe on IRC, NamedSPILoader.reload is 
 not thread safe: it reads from this.services at the beginning of the method, 
 makes additions based on the method input, and then overwrites this.services 
 at the end of the method. If the method is called by two threads 
 concurrently, the entries added by threadB could be lost if threadA enters 
 the method before threadB and exits the method after threadB.




RE: Line length in Lucene/Solr code

2013-02-25 Thread David Smiley (@MITRE.org)
If 120 is the new maximum, is it also the generally recommended
reflow/line-break for javadocs?  Or should that be 100, or stay at 80?  I
suggest 100.

~ David



-
 Author: http://www.packtpub.com/apache-solr-3-enterprise-search-server/book



[jira] [Updated] (LUCENE-4796) NamedSPILoader.reload needs to be synchronized

2013-02-25 Thread Uwe Schindler (JIRA)

 [ 
https://issues.apache.org/jira/browse/LUCENE-4796?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Uwe Schindler updated LUCENE-4796:
--

Attachment: LUCENE-4796.patch

Thanks Hoss,
this is indeed another bug. Too stupid! A copy-paste error from the early 
days. In my opinion, the code duplication is horrible, but the analysis 
factories unfortunately don't implement NamedSPI, so they have no name.

 NamedSPILoader.reload needs to be synchronized
 --

 Key: LUCENE-4796
 URL: https://issues.apache.org/jira/browse/LUCENE-4796
 Project: Lucene - Core
  Issue Type: Bug
Reporter: Hoss Man
Assignee: Hoss Man
 Fix For: 4.2, 5.0

 Attachments: LUCENE-4796.patch, LUCENE-4796.patch


 Spun off of SOLR-4373: as discussed with Uwe on IRC, NamedSPILoader.reload is 
 not thread safe: it reads from this.services at the beginning of the method, 
 makes additions based on the method input, and then overwrites this.services 
 at the end of the method. If the method is called by two threads 
 concurrently, the entries added by threadB could be lost if threadA enters 
 the method before threadB and exits the method after threadB.




[jira] [Assigned] (LUCENE-4796) NamedSPILoader.reload needs to be synchronized

2013-02-25 Thread Uwe Schindler (JIRA)

 [ 
https://issues.apache.org/jira/browse/LUCENE-4796?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Uwe Schindler reassigned LUCENE-4796:
-

Assignee: Uwe Schindler  (was: Hoss Man)

 NamedSPILoader.reload needs to be synchronized
 --

 Key: LUCENE-4796
 URL: https://issues.apache.org/jira/browse/LUCENE-4796
 Project: Lucene - Core
  Issue Type: Bug
Reporter: Hoss Man
Assignee: Uwe Schindler
 Fix For: 4.2, 5.0

 Attachments: LUCENE-4796.patch, LUCENE-4796.patch


 Spun off of SOLR-4373: as discussed with Uwe on IRC, NamedSPILoader.reload is 
 not thread safe: it reads from this.services at the beginning of the method, 
 makes additions based on the method input, and then overwrites this.services 
 at the end of the method. If the method is called by two threads 
 concurrently, the entries added by threadB could be lost if threadA enters 
 the method before threadB and exits the method after threadB.




[jira] [Commented] (SOLR-4480) EDisMax parser blows up with query containing single plus or minus

2013-02-25 Thread Yonik Seeley (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-4480?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13586345#comment-13586345
 ] 

Yonik Seeley commented on SOLR-4480:


bq. What should be the logical behavior of a single + or -? If eDisMax 
discovers it as one of many words, it is treated as whitespace

It shouldn't be, and a quick test shows that it is treated as a literal.
Are you forgetting to URL-encode the + when trying it from a browser, or 
perhaps the analysis of the default field is removing the character because 
it's not alphanumeric?

Try this:
http://localhost:8983/solr/select?debug=query&defType=edismax&df=foo_s&q=hello+%2b+there

 EDisMax parser blows up with query containing single plus or minus
 --

 Key: SOLR-4480
 URL: https://issues.apache.org/jira/browse/SOLR-4480
 Project: Solr
  Issue Type: Bug
  Components: query parsers
Reporter: Fiona Tay
Priority: Critical
 Fix For: 4.2, 5.0

 Attachments: SOLR-4480.patch, SOLR-4480.patch


 We are running solr with sunspot and when we set up a query containing a 
 single plus, Solr blows up with the following error:
 SOLR Request (5.0ms)  [ path=#RSolr::Client:0x4c7464ac parameters={data: 
 fq=type%3A%28Attachment+OR+User+OR+GpdbDataSource+OR+HadoopInstance+OR+GnipInstance+OR+Workspace+OR+Workfile+OR+Tag+OR+Dataset+OR+HdfsEntry%29fq=type_name_s%3A%28Attachment+OR+User+OR+Instance+OR+Workspace+OR+Workfile+OR+Tag+OR+Dataset+OR+HdfsEntry%29fq=-%28security_type_name_sm%3A%28Dataset%29+AND+-instance_account_ids_im%3A%282+OR+1%29%29fq=-%28security_type_name_sm%3AChorusView+AND+-member_ids_im%3A1+AND+-public_b%3Atrue%29fq=-%28security_type_name_sm%3A%28Dataset%29+AND+-instance_account_ids_im%3A%282+OR+1%29%29fq=-%28security_type_name_sm%3AChorusView+AND+-member_ids_im%3A1+AND+-public_b%3Atrue%29q=%2Bfl=%2A+scoreqf=name_texts+first_name_texts+last_name_texts+file_name_textsdefType=edismaxhl=onhl.simple.pre=%40%40%40hl%40%40%40hl.simple.post=%40%40%40endhl%40%40%40start=0rows=3,
  method: post, params: {:wt=:ruby}, query: wt=ruby, headers: 
 {Content-Type=application/x-www-form-urlencoded; charset=UTF-8}, path: 
 select, uri: http://localhost:8982/solr/select?wt=ruby, open_timeout: , 
 read_timeout: } ]
 RSolr::Error::Http (RSolr::Error::Http - 400 Bad Request
 Error: org.apache.lucene.queryParser.ParseException: Cannot parse '': 
 Encountered EOF at line 1, column 0.
 Was expecting one of:
 NOT ...
 + ...
 - ...
 ( ...
 * ...
 QUOTED ...
 TERM ...
 PREFIXTERM ...
 WILDTERM ...




[jira] [Commented] (SOLR-4490) add support for multivalued docvalues

2013-02-25 Thread Adrien Grand (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-4490?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13586349#comment-13586349
 ] 

Adrien Grand commented on SOLR-4490:


+1 

 add support  for multivalued docvalues
 --

 Key: SOLR-4490
 URL: https://issues.apache.org/jira/browse/SOLR-4490
 Project: Solr
  Issue Type: New Feature
Reporter: Robert Muir
 Attachments: SOLR-4490.patch, SOLR-4490.patch


 exposing LUCENE-4765 essentially. 
 I think we don't need any new options, it just means doing the right thing 
 when someone has docValues=true and multivalued=true.




[jira] [Commented] (SOLR-4480) EDisMax parser blows up with query containing single plus or minus

2013-02-25 Thread JIRA

[ 
https://issues.apache.org/jira/browse/SOLR-4480?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13586368#comment-13586368
 ] 

Jan Høydahl commented on SOLR-4480:
---

So let's take the String field example. A single %2B crashes the Lucene query 
parser, and since we just pass it straight through it crashes eDisMax too.

For the Lucene parser, it crashes for all query strings *ending* in a single +:
http://localhost:8983/solr/select?debug=query&q=foo%20%2B
but not for queries where there is whitespace after the +:
http://localhost:8983/solr/select?debug=query&q=%2B%20foo

eDismax is a bit different. It does not crash on an ending +, but it swallows it:
http://localhost:8983/solr/select?debug=query&defType=edismax&df=foo_s&q=%2B%20hello%20%2B

This is due to lines 700-703 being too quick to guess that the + or - means 
MUST or NOT:
{code}
  if (ch=='+' || ch=='-') {
clause.must = ch;
pos++;
  }
{code}

I'm OK with saying that a single + or - should mean literal matching (given 
that the field type supports it), and thus we translate '+' -> '\+'. But then 
we should do the same for a + or - at the end of a query string.
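One way to sketch that suggestion in plain Java (a hypothetical helper, not the actual ExtendedDismaxQParser code): escape a '+' or '-' that stands alone before whitespace or ends the query, so it passes through as a literal instead of a dangling MUST/NOT operator:

```java
// Escape a lone '+'/'-' (standalone before whitespace, or at end of input)
// so downstream parsing treats it as a literal character. Operators attached
// to a following term, like "+foo", are left untouched.
public class LoneOperatorEscaper {
    public static String escapeLoneOps(String q) {
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < q.length(); i++) {
            char ch = q.charAt(i);
            boolean atEnd = (i == q.length() - 1);
            boolean beforeSpace = !atEnd && Character.isWhitespace(q.charAt(i + 1));
            if ((ch == '+' || ch == '-') && (atEnd || beforeSpace)) {
                sb.append('\\');
            }
            sb.append(ch);
        }
        return sb.toString();
    }
}
```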

 EDisMax parser blows up with query containing single plus or minus
 --

 Key: SOLR-4480
 URL: https://issues.apache.org/jira/browse/SOLR-4480
 Project: Solr
  Issue Type: Bug
  Components: query parsers
Reporter: Fiona Tay
Priority: Critical
 Fix For: 4.2, 5.0

 Attachments: SOLR-4480.patch, SOLR-4480.patch


 We are running solr with sunspot and when we set up a query containing a 
 single plus, Solr blows up with the following error:
 SOLR Request (5.0ms)  [ path=#RSolr::Client:0x4c7464ac parameters={data: 
 fq=type%3A%28Attachment+OR+User+OR+GpdbDataSource+OR+HadoopInstance+OR+GnipInstance+OR+Workspace+OR+Workfile+OR+Tag+OR+Dataset+OR+HdfsEntry%29fq=type_name_s%3A%28Attachment+OR+User+OR+Instance+OR+Workspace+OR+Workfile+OR+Tag+OR+Dataset+OR+HdfsEntry%29fq=-%28security_type_name_sm%3A%28Dataset%29+AND+-instance_account_ids_im%3A%282+OR+1%29%29fq=-%28security_type_name_sm%3AChorusView+AND+-member_ids_im%3A1+AND+-public_b%3Atrue%29fq=-%28security_type_name_sm%3A%28Dataset%29+AND+-instance_account_ids_im%3A%282+OR+1%29%29fq=-%28security_type_name_sm%3AChorusView+AND+-member_ids_im%3A1+AND+-public_b%3Atrue%29q=%2Bfl=%2A+scoreqf=name_texts+first_name_texts+last_name_texts+file_name_textsdefType=edismaxhl=onhl.simple.pre=%40%40%40hl%40%40%40hl.simple.post=%40%40%40endhl%40%40%40start=0rows=3,
  method: post, params: {:wt=:ruby}, query: wt=ruby, headers: 
 {Content-Type=application/x-www-form-urlencoded; charset=UTF-8}, path: 
 select, uri: http://localhost:8982/solr/select?wt=ruby, open_timeout: , 
 read_timeout: } ]
 RSolr::Error::Http (RSolr::Error::Http - 400 Bad Request
 Error: org.apache.lucene.queryParser.ParseException: Cannot parse '': 
 Encountered EOF at line 1, column 0.
 Was expecting one of:
 NOT ...
 + ...
 - ...
 ( ...
 * ...
 QUOTED ...
 TERM ...
 PREFIXTERM ...
 WILDTERM ...




[jira] [Comment Edited] (SOLR-4480) EDisMax parser blows up with query containing single plus or minus

2013-02-25 Thread JIRA

[ 
https://issues.apache.org/jira/browse/SOLR-4480?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13586368#comment-13586368
 ] 

Jan Høydahl edited comment on SOLR-4480 at 2/25/13 10:22 PM:
-

So let's take the String field example. A single %2B crashes the Lucene query 
parser, and since we just pass it straight through it crashes eDisMax too.

For the Lucene parser, it crashes for all query strings *ending* in a single +:
http://localhost:8983/solr/select?debug=query&q=foo%20%2B
but not for queries where there is whitespace after the +:
http://localhost:8983/solr/select?debug=query&q=%2B%20foo

eDismax is a bit different. It does not crash on an ending +, but it swallows it:
http://localhost:8983/solr/select?debug=query&defType=edismax&df=foo_s&q=%2B%20hello%20%2B

This is probably due to lines 700-703 being too quick to guess that the + or 
- means MUST or NOT:
{code}
  if (ch=='+' || ch=='-') {
clause.must = ch;
pos++;
  }
{code}

I'm OK with saying that a single plus or minus should mean literal matching 
(given that the field type supports it), and thus we add escaping. But then we 
should do the same at the end of a query string.

  was (Author: janhoy):
So let's take the String field example. A single %2B crashes the Lucene 
query parser, and since we just pass it straight through it crashes eDisMax too.

For the Lucene parser, it crashes for all query strings *ending* in a single +:
http://localhost:8983/solr/select?debug=query&q=foo%20%2B
but not for queries where there is whitespace after the +:
http://localhost:8983/solr/select?debug=query&q=%2B%20foo

eDismax is a bit different. It does not crash on an ending +, but it swallows it:
http://localhost:8983/solr/select?debug=query&defType=edismax&df=foo_s&q=%2B%20hello%20%2B

This is due to lines 700-703 being too quick to guess that the + or - means 
MUST or NOT:
{code}
  if (ch=='+' || ch=='-') {
clause.must = ch;
pos++;
  }
{code}

I'm OK with saying that a single + or - should mean literal matching (given 
that the field type supports it), and thus we translate '+' -> '\+'. But then 
we should do the same for a + or - at the end of a query string.
  
 EDisMax parser blows up with query containing single plus or minus
 --

 Key: SOLR-4480
 URL: https://issues.apache.org/jira/browse/SOLR-4480
 Project: Solr
  Issue Type: Bug
  Components: query parsers
Reporter: Fiona Tay
Priority: Critical
 Fix For: 4.2, 5.0

 Attachments: SOLR-4480.patch, SOLR-4480.patch


 We are running solr with sunspot and when we set up a query containing a 
 single plus, Solr blows up with the following error:
 SOLR Request (5.0ms)  [ path=#RSolr::Client:0x4c7464ac parameters={data: 
 fq=type%3A%28Attachment+OR+User+OR+GpdbDataSource+OR+HadoopInstance+OR+GnipInstance+OR+Workspace+OR+Workfile+OR+Tag+OR+Dataset+OR+HdfsEntry%29&fq=type_name_s%3A%28Attachment+OR+User+OR+Instance+OR+Workspace+OR+Workfile+OR+Tag+OR+Dataset+OR+HdfsEntry%29&fq=-%28security_type_name_sm%3A%28Dataset%29+AND+-instance_account_ids_im%3A%282+OR+1%29%29&fq=-%28security_type_name_sm%3AChorusView+AND+-member_ids_im%3A1+AND+-public_b%3Atrue%29&fq=-%28security_type_name_sm%3A%28Dataset%29+AND+-instance_account_ids_im%3A%282+OR+1%29%29&fq=-%28security_type_name_sm%3AChorusView+AND+-member_ids_im%3A1+AND+-public_b%3Atrue%29&q=%2B&fl=%2A+score&qf=name_texts+first_name_texts+last_name_texts+file_name_texts&defType=edismax&hl=on&hl.simple.pre=%40%40%40hl%40%40%40&hl.simple.post=%40%40%40endhl%40%40%40&start=0&rows=3,
  method: post, params: {:wt=:ruby}, query: wt=ruby, headers: 
 {Content-Type=application/x-www-form-urlencoded; charset=UTF-8}, path: 
 select, uri: http://localhost:8982/solr/select?wt=ruby, open_timeout: , 
 read_timeout: } ]
 RSolr::Error::Http (RSolr::Error::Http - 400 Bad Request
 Error: org.apache.lucene.queryParser.ParseException: Cannot parse '': 
 Encountered EOF at line 1, column 0.
 Was expecting one of:
 <NOT> ...
 "+" ...
 "-" ...
 "(" ...
 "*" ...
 <QUOTED> ...
 <TERM> ...
 <PREFIXTERM> ...
 <WILDTERM> ...

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Updated] (SOLR-3843) Add lucene-codecs to Solr libs?

2013-02-25 Thread Steve Rowe (JIRA)

 [ 
https://issues.apache.org/jira/browse/SOLR-3843?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Steve Rowe updated SOLR-3843:
-

Attachment: SOLR-3843.patch

bq. Smoketesting passes with this patch. But I am not sure if anything 
should/needs to be changed in Maven.

The attached patch is Robert's with the addition of a dependency from the Solr 
webapp module on the lucene-codecs jar.  With this change, when the war is 
built by Maven, the lucene-codecs jar is put in the same place as when the war 
is built by the Ant build: under WEB-INF/lib/.

 Add lucene-codecs to Solr libs?
 ---

 Key: SOLR-3843
 URL: https://issues.apache.org/jira/browse/SOLR-3843
 Project: Solr
  Issue Type: Wish
Affects Versions: 4.0
Reporter: Adrien Grand
Priority: Critical
 Fix For: 4.2, 5.0

 Attachments: SOLR-3843.patch, SOLR-3843.patch


 Solr gives the ability to its users to select the postings format to use on a 
 per-field basis but only Lucene40PostingsFormat is available by default 
 (unless users add lucene-codecs to the Solr lib directory). Maybe we should 
 add lucene-codecs to Solr libs (I mean in the WAR file) so that people can 
 try our non-default postings formats with minimum effort?

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Commented] (SOLR-4480) EDisMax parser blows up with query containing single plus or minus

2013-02-25 Thread Yonik Seeley (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-4480?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13586380#comment-13586380
 ] 

Yonik Seeley commented on SOLR-4480:


bq. I'm ok with saying that a single plus or minus should mean literal matching 
(given that field type supports it), and thus we add escaping. But then we 
should do the same at the end of a query string.

Correct.

 EDisMax parser blows up with query containing single plus or minus
 --

 Key: SOLR-4480
 URL: https://issues.apache.org/jira/browse/SOLR-4480
 Project: Solr
  Issue Type: Bug
  Components: query parsers
Reporter: Fiona Tay
Priority: Critical
 Fix For: 4.2, 5.0

 Attachments: SOLR-4480.patch, SOLR-4480.patch


 We are running solr with sunspot and when we set up a query containing a 
 single plus, Solr blows up with the following error:
 SOLR Request (5.0ms)  [ path=#RSolr::Client:0x4c7464ac parameters={data: 
 fq=type%3A%28Attachment+OR+User+OR+GpdbDataSource+OR+HadoopInstance+OR+GnipInstance+OR+Workspace+OR+Workfile+OR+Tag+OR+Dataset+OR+HdfsEntry%29&fq=type_name_s%3A%28Attachment+OR+User+OR+Instance+OR+Workspace+OR+Workfile+OR+Tag+OR+Dataset+OR+HdfsEntry%29&fq=-%28security_type_name_sm%3A%28Dataset%29+AND+-instance_account_ids_im%3A%282+OR+1%29%29&fq=-%28security_type_name_sm%3AChorusView+AND+-member_ids_im%3A1+AND+-public_b%3Atrue%29&fq=-%28security_type_name_sm%3A%28Dataset%29+AND+-instance_account_ids_im%3A%282+OR+1%29%29&fq=-%28security_type_name_sm%3AChorusView+AND+-member_ids_im%3A1+AND+-public_b%3Atrue%29&q=%2B&fl=%2A+score&qf=name_texts+first_name_texts+last_name_texts+file_name_texts&defType=edismax&hl=on&hl.simple.pre=%40%40%40hl%40%40%40&hl.simple.post=%40%40%40endhl%40%40%40&start=0&rows=3,
  method: post, params: {:wt=:ruby}, query: wt=ruby, headers: 
 {Content-Type=application/x-www-form-urlencoded; charset=UTF-8}, path: 
 select, uri: http://localhost:8982/solr/select?wt=ruby, open_timeout: , 
 read_timeout: } ]
 RSolr::Error::Http (RSolr::Error::Http - 400 Bad Request
 Error: org.apache.lucene.queryParser.ParseException: Cannot parse '': 
 Encountered EOF at line 1, column 0.
 Was expecting one of:
 <NOT> ...
 "+" ...
 "-" ...
 "(" ...
 "*" ...
 <QUOTED> ...
 <TERM> ...
 <PREFIXTERM> ...
 <WILDTERM> ...

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Commented] (SOLR-3843) Add lucene-codecs to Solr libs?

2013-02-25 Thread Robert Muir (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-3843?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13586388#comment-13586388
 ] 

Robert Muir commented on SOLR-3843:
---

Thanks Steve: I was actually (and still am, I think) uncertain who should have 
the dependency.

If you think about it, it's no different from the analysis module cases: but I 
don't see the webapp depending on them here.

At the moment, I understand the reasoning behind the hard dependency on 
analysis-common.jar (because bogusly the factory stuff is there; imo it should 
not be).

But somewhere in Maven, something in Solr depends on the other analysis modules 
it bundles (e.g. analyzers-phonetic), yet you could remove this jar and Solr 
would work fine (as long as you didn't use these particular phonetic analyzers).

So I feel like these analysis components (except common, see above), along with 
codecs.jar, should be depended on in the same place. I guess theoretically they 
are optional dependencies, but I don't think we should do that (unless we test 
every possibility with/without optional X,Y,Z, so I think it's a bad idea). But 
they are the same in this sense.

 Add lucene-codecs to Solr libs?
 ---

 Key: SOLR-3843
 URL: https://issues.apache.org/jira/browse/SOLR-3843
 Project: Solr
  Issue Type: Wish
Affects Versions: 4.0
Reporter: Adrien Grand
Priority: Critical
 Fix For: 4.2, 5.0

 Attachments: SOLR-3843.patch, SOLR-3843.patch


 Solr gives the ability to its users to select the postings format to use on a 
 per-field basis but only Lucene40PostingsFormat is available by default 
 (unless users add lucene-codecs to the Solr lib directory). Maybe we should 
 add lucene-codecs to Solr libs (I mean in the WAR file) so that people can 
 try our non-default postings formats with minimum effort?

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Commented] (SOLR-3843) Add lucene-codecs to Solr libs?

2013-02-25 Thread Steve Rowe (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-3843?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13586412#comment-13586412
 ] 

Steve Rowe commented on SOLR-3843:
--

In the Maven build, it's the solr core module that depends on these analysis 
modules.  Here's the output from {{mvn dependency:tree}} in 
{{maven-build/solr/webapp/}}:

{noformat}
[INFO] --- maven-dependency-plugin:2.4:tree (default-cli) @ solr ---
[INFO] org.apache.solr:solr:war:5.0-SNAPSHOT
[INFO] +- org.apache.solr:solr-core:jar:5.0-SNAPSHOT:compile
[INFO] |  +- org.apache.lucene:lucene-core:jar:5.0-SNAPSHOT:compile
[INFO] |  +- org.apache.lucene:lucene-analyzers-common:jar:5.0-SNAPSHOT:compile
[INFO] |  +- 
org.apache.lucene:lucene-analyzers-kuromoji:jar:5.0-SNAPSHOT:compile
[INFO] |  +- 
org.apache.lucene:lucene-analyzers-morfologik:jar:5.0-SNAPSHOT:compile
[INFO] |  |  \- org.carrot2:morfologik-polish:jar:1.5.5:compile
[INFO] |  | \- org.carrot2:morfologik-stemming:jar:1.5.5:compile
[INFO] |  |\- org.carrot2:morfologik-fsa:jar:1.5.5:compile
[INFO] |  +- 
org.apache.lucene:lucene-analyzers-phonetic:jar:5.0-SNAPSHOT:compile
[INFO] |  +- org.apache.lucene:lucene-highlighter:jar:5.0-SNAPSHOT:compile
[INFO] |  +- org.apache.lucene:lucene-memory:jar:5.0-SNAPSHOT:compile
[INFO] |  +- org.apache.lucene:lucene-misc:jar:5.0-SNAPSHOT:compile
[INFO] |  +- org.apache.lucene:lucene-queryparser:jar:5.0-SNAPSHOT:compile
[INFO] |  +- org.apache.lucene:lucene-spatial:jar:5.0-SNAPSHOT:compile
[INFO] |  |  \- com.spatial4j:spatial4j:jar:0.3:compile
[INFO] |  +- org.apache.lucene:lucene-suggest:jar:5.0-SNAPSHOT:compile
[INFO] |  +- org.apache.lucene:lucene-grouping:jar:5.0-SNAPSHOT:compile
[INFO] |  +- org.apache.lucene:lucene-queries:jar:5.0-SNAPSHOT:compile
[INFO] |  +- commons-codec:commons-codec:jar:1.7:compile
[INFO] |  +- commons-cli:commons-cli:jar:1.2:compile
[INFO] |  +- commons-fileupload:commons-fileupload:jar:1.2.1:compile
[INFO] |  +- commons-io:commons-io:jar:2.1:compile
[INFO] |  +- commons-lang:commons-lang:jar:2.6:compile
[INFO] |  +- com.google.guava:guava:jar:13.0.1:compile
[INFO] |  +- org.codehaus.woodstox:wstx-asl:jar:3.2.7:runtime
[INFO] |  +- org.apache.httpcomponents:httpclient:jar:4.2.3:compile
[INFO] |  |  \- org.apache.httpcomponents:httpcore:jar:4.2.2:compile
[INFO] |  \- org.apache.httpcomponents:httpmime:jar:4.2.3:compile
[INFO] +- org.apache.solr:solr-solrj:jar:5.0-SNAPSHOT:compile
[INFO] |  \- org.apache.zookeeper:zookeeper:jar:3.4.5:compile
[INFO] +- org.apache.lucene:lucene-codecs:jar:5.0-SNAPSHOT:compile
[INFO] +- org.eclipse.jetty.orbit:javax.servlet:jar:3.0.0.v201112011016:provided
[INFO] +- org.slf4j:slf4j-jdk14:jar:1.6.4:runtime (scope not updated to compile)
[INFO] +- org.slf4j:jcl-over-slf4j:jar:1.6.4:compile
[INFO] +- org.slf4j:slf4j-api:jar:1.6.4:compile
[INFO] \- junit:junit:jar:4.10:test
{noformat}

This parallels the Ant build: these analyzer jars are included in the 
solr.lucene.libs path, which is included in solr.base.classpath.

I put the lucene-codecs dependency on the solr webapp module rather than the 
solr core module because *all non-test compilation succeeds without 
lucene-codecs*.  (The lucene-test-framework pulls lucene-codecs into all Solr 
test classpaths.)  And this issue is about packaging of the war: adding the 
dependency to the webapp module fixes exactly the problem.

 Add lucene-codecs to Solr libs?
 ---

 Key: SOLR-3843
 URL: https://issues.apache.org/jira/browse/SOLR-3843
 Project: Solr
  Issue Type: Wish
Affects Versions: 4.0
Reporter: Adrien Grand
Priority: Critical
 Fix For: 4.2, 5.0

 Attachments: SOLR-3843.patch, SOLR-3843.patch


 Solr gives the ability to its users to select the postings format to use on a 
 per-field basis but only Lucene40PostingsFormat is available by default 
 (unless users add lucene-codecs to the Solr lib directory). Maybe we should 
 add lucene-codecs to Solr libs (I mean in the WAR file) so that people can 
 try our non-default postings formats with minimum effort?

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Commented] (SOLR-3843) Add lucene-codecs to Solr libs?

2013-02-25 Thread Robert Muir (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-3843?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13586415#comment-13586415
 ] 

Robert Muir commented on SOLR-3843:
---

{quote}
I put the lucene-codecs dependency on the solr webapp module rather than the 
solr core module because all non-test compilation succeeds without 
lucene-codecs. (The lucene-test-framework pulls lucene-codecs into all Solr 
test classpaths.) And this issue is about packaging of the war: adding the 
dependency to the webapp module fixes exactly the problem.
{quote}

But it would also succeed without analyzers-phonetic. How are they any 
different?

 Add lucene-codecs to Solr libs?
 ---

 Key: SOLR-3843
 URL: https://issues.apache.org/jira/browse/SOLR-3843
 Project: Solr
  Issue Type: Wish
Affects Versions: 4.0
Reporter: Adrien Grand
Priority: Critical
 Fix For: 4.2, 5.0

 Attachments: SOLR-3843.patch, SOLR-3843.patch


 Solr gives the ability to its users to select the postings format to use on a 
 per-field basis but only Lucene40PostingsFormat is available by default 
 (unless users add lucene-codecs to the Solr lib directory). Maybe we should 
 add lucene-codecs to Solr libs (I mean in the WAR file) so that people can 
 try our non-default postings formats with minimum effort?

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Commented] (SOLR-3843) Add lucene-codecs to Solr libs?

2013-02-25 Thread Steve Rowe (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-3843?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13586424#comment-13586424
 ] 

Steve Rowe commented on SOLR-3843:
--

bq. But it would also succeed without analyzers-phonetic. How are they any 
different?

They're not. :)

I think the Ant build should change here: the solr compilation classpath 
shouldn't have things on it that aren't required for compilation.  (This goes 
for the analysis module dependencies in the Maven build too, of course.)

Is there a place where (optional) runtime dependencies are added to the stuff 
that goes into the war?  I haven't looked at this in a while.

 Add lucene-codecs to Solr libs?
 ---

 Key: SOLR-3843
 URL: https://issues.apache.org/jira/browse/SOLR-3843
 Project: Solr
  Issue Type: Wish
Affects Versions: 4.0
Reporter: Adrien Grand
Priority: Critical
 Fix For: 4.2, 5.0

 Attachments: SOLR-3843.patch, SOLR-3843.patch


 Solr gives the ability to its users to select the postings format to use on a 
 per-field basis but only Lucene40PostingsFormat is available by default 
 (unless users add lucene-codecs to the Solr lib directory). Maybe we should 
 add lucene-codecs to Solr libs (I mean in the WAR file) so that people can 
 try our non-default postings formats with minimum effort?

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Commented] (SOLR-3843) Add lucene-codecs to Solr libs?

2013-02-25 Thread Robert Muir (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-3843?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13586439#comment-13586439
 ] 

Robert Muir commented on SOLR-3843:
---

I don't think the ant build makes any distinction here.

But yeah, there is probably a bigger issue / better way to go about it, 
something like:
* solr core etc should only have the minimal dependencies
* tests using the solr example should somehow be in webapp/test or something.
* webapp depends on these modules like phonetic and codecs.
* the fact that lucene-test-framework brings in codecs anyway is an impl detail

I guess for now I was just looking at us doing things consistently. Even if we 
are consistently wrong :)

 Add lucene-codecs to Solr libs?
 ---

 Key: SOLR-3843
 URL: https://issues.apache.org/jira/browse/SOLR-3843
 Project: Solr
  Issue Type: Wish
Affects Versions: 4.0
Reporter: Adrien Grand
Priority: Critical
 Fix For: 4.2, 5.0

 Attachments: SOLR-3843.patch, SOLR-3843.patch


 Solr gives the ability to its users to select the postings format to use on a 
 per-field basis but only Lucene40PostingsFormat is available by default 
 (unless users add lucene-codecs to the Solr lib directory). Maybe we should 
 add lucene-codecs to Solr libs (I mean in the WAR file) so that people can 
 try our non-default postings formats with minimum effort?

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Updated] (SOLR-3843) Add lucene-codecs to Solr libs?

2013-02-25 Thread Steve Rowe (JIRA)

 [ 
https://issues.apache.org/jira/browse/SOLR-3843?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Steve Rowe updated SOLR-3843:
-

Attachment: SOLR-3843.patch

bq. I guess for now I was just looking at us doing things consistently. Even if 
we are consistently wrong :)

Right, makes sense - in this case the consistent thing to do is to make the 
solr-core module, rather than the webapp module, depend in lucene-codecs jar in 
the Maven build.  The attached patch does this.

 Add lucene-codecs to Solr libs?
 ---

 Key: SOLR-3843
 URL: https://issues.apache.org/jira/browse/SOLR-3843
 Project: Solr
  Issue Type: Wish
Affects Versions: 4.0
Reporter: Adrien Grand
Priority: Critical
 Fix For: 4.2, 5.0

 Attachments: SOLR-3843.patch, SOLR-3843.patch, SOLR-3843.patch


 Solr gives the ability to its users to select the postings format to use on a 
 per-field basis but only Lucene40PostingsFormat is available by default 
 (unless users add lucene-codecs to the Solr lib directory). Maybe we should 
 add lucene-codecs to Solr libs (I mean in the WAR file) so that people can 
 try our non-default postings formats with minimum effort?

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Comment Edited] (SOLR-3843) Add lucene-codecs to Solr libs?

2013-02-25 Thread Steve Rowe (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-3843?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13586458#comment-13586458
 ] 

Steve Rowe edited comment on SOLR-3843 at 2/25/13 11:26 PM:


bq. I guess for now I was just looking at us doing things consistently. Even if 
we are consistently wrong :)

Right, makes sense - in this case the consistent thing to do is to make the 
solr-core module, rather than the webapp module, depend on the lucene-codecs 
jar in the Maven build.  The attached patch does this.

  was (Author: steve_rowe):
bq. I guess for now I was just looking at us doing things consistently. 
Even if we are consistently wrong :)

Right, makes sense - in this case the consistent thing to do is to make the 
solr-core module, rather than the webapp module, depend in lucene-codecs jar in 
the Maven build.  The attached patch does this.
  
 Add lucene-codecs to Solr libs?
 ---

 Key: SOLR-3843
 URL: https://issues.apache.org/jira/browse/SOLR-3843
 Project: Solr
  Issue Type: Wish
Affects Versions: 4.0
Reporter: Adrien Grand
Priority: Critical
 Fix For: 4.2, 5.0

 Attachments: SOLR-3843.patch, SOLR-3843.patch, SOLR-3843.patch


 Solr gives the ability to its users to select the postings format to use on a 
 per-field basis but only Lucene40PostingsFormat is available by default 
 (unless users add lucene-codecs to the Solr lib directory). Maybe we should 
 add lucene-codecs to Solr libs (I mean in the WAR file) so that people can 
 try our non-default postings formats with minimum effort?

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Commented] (LUCENE-4796) NamedSPILoader.reload needs to be synchronized

2013-02-25 Thread Hoss Man (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-4796?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13586481#comment-13586481
 ] 

Hoss Man commented on LUCENE-4796:
--

+1 ... looks good to me.

 NamedSPILoader.reload needs to be synchronized
 --

 Key: LUCENE-4796
 URL: https://issues.apache.org/jira/browse/LUCENE-4796
 Project: Lucene - Core
  Issue Type: Bug
Reporter: Hoss Man
Assignee: Uwe Schindler
 Fix For: 4.2, 5.0

 Attachments: LUCENE-4796.patch, LUCENE-4796.patch


 Spun off of SOLR-4373: as discussed with Uwe on IRC, NamedSPILoader.reload is 
 not thread safe: it reads from this.services at the beginning of the method, 
 makes additions based on the method input, and then overwrites this.services 
 at the end of the method.  If the method is called by two threads 
 concurrently, the entries added by threadB could be lost if threadA enters 
 the method before threadB and exits the method after threadB.
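The read-copy-publish race described above, and the synchronized fix, can be sketched outside Lucene. This is a hypothetical minimal class, not the real NamedSPILoader: reload() copies the current map, adds entries, and publishes the copy; synchronizing makes that sequence atomic so one reload cannot publish over another's additions.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Minimal sketch (not the actual Lucene class) of the race and its fix.
class NamedLoader<S> {
    private volatile Map<String, S> services = new LinkedHashMap<>();

    // Without `synchronized`, two concurrent reloads could each copy the
    // same snapshot, and the second publish would drop the first's entries.
    synchronized void reload(Map<String, S> discovered) {
        Map<String, S> copy = new LinkedHashMap<>(this.services);
        copy.putAll(discovered);
        this.services = copy;  // single volatile publish of the new snapshot
    }

    S lookup(String name) {
        return services.get(name);  // lock-free read of the current snapshot
    }
}
```

Lookups stay lock-free because they only read the volatile reference to an immutable snapshot.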

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Updated] (SOLR-4373) In multicore, lib directives in solrconfig.xml cause conflict and clobber directives from earlier cores

2013-02-25 Thread Hoss Man (JIRA)

 [ 
https://issues.apache.org/jira/browse/SOLR-4373?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Hoss Man updated SOLR-4373:
---

Description: 
Having lib directives in the solrconfig.xml files of multiple cores can cause 
problems when using multi-threaded core initialization -- which is the default 
starting with Solr 4.1.

The problem manifests itself as init errors in the logs related to not being 
able to find classes located in plugin jars, even though earlier log messages 
indicated that those jars had been added to the classpath.

One work around is to set {{coreLoadThreads=1}} in your solr.xml file -- 
forcing single threaded core initialization.  For example...

{code}
<?xml version="1.0" encoding="utf-8" ?>
<solr coreLoadThreads="1">
  <cores adminPath="/admin/cores">
    <core name="core1" instanceDir="core1" />
    <core name="core2" instanceDir="core2" />
  </cores>
</solr>
{code}

(Similar problems may occur if multiple cores are initialized concurrently 
using the /admin/cores handler)

  was:
Having lib directives in solrconfig.xml seem to wipe out/override the 
definitions in previous cores.

The exception (for the earlier core) is:
    at org.apache.solr.util.plugin.AbstractPluginLoader.load(AbstractPluginLoader.java:177)
    at org.apache.solr.schema.IndexSchema.readSchema(IndexSchema.java:369)
    at org.apache.solr.schema.IndexSchema.<init>(IndexSchema.java:113)
    at org.apache.solr.core.CoreContainer.createFromLocal(CoreContainer.java:1000)
    at org.apache.solr.core.CoreContainer.create(CoreContainer.java:1033)
    at org.apache.solr.core.CoreContainer$3.call(CoreContainer.java:629)
    at org.apache.solr.core.CoreContainer$3.call(CoreContainer.java:624)
    at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:334)
    at java.util.concurrent.FutureTask.run(FutureTask.java:166)
    at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471)
    at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:334)
    at java.util.concurrent.FutureTask.run(FutureTask.java:166)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1110)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:603)
    at java.lang.Thread.run(Thread.java:722)
Caused by: org.apache.solr.common.SolrException: Plugin init failure for 
[schema.xml] analyzer/filter: Error loading class 'solr.ICUFoldingFilterFactory'
    at org.apache.solr.util.plugin.AbstractPluginLoader.load(AbstractPluginLoader.java:177)
    at org.apache.solr.schema.FieldTypePluginLoader.readAnalyzer(FieldTypePluginLoader.java:377)
    at org.apache.solr.schema.FieldTypePluginLoader.create(FieldTypePluginLoader.java:95)
    at org.apache.solr.schema.FieldTypePluginLoader.create(FieldTypePluginLoader.java:43)
    at org.apache.solr.util.plugin.AbstractPluginLoader.load(AbstractPluginLoader.java:151)

The full replication case is attached.

If the SECOND core is turned off in solr.xml, the FIRST core loads just fine.



bq. My bad. Does it mean we know where the problem is?

it means a few things...

1) it confirms the problem you were seeing related to multi-threaded core 
reloading
2) it means we're seeing consistent results, which narrows down the possible 
causes (before it looked like there may be two causes of a single symptom: one 
related to multi-threaded loading that I could reproduce, and one unknown that 
you could reproduce; now it seems more likely that they are the same)
3) it means we have a config-based workaround for users who encounter this 
problem w/o requiring a code patch.

Can you try out the patch Uwe posted to LUCENE-4796?  His comments there helped 
me realize why my earlier attempts at fixing this bug didn't work for me, and 
with his most recent patch I can't reproduce this problem.  Would be good to 
have your feedback.

 In multicore, lib directives in solrconfig.xml cause conflict and clobber 
 directives from earlier cores
 ---

 Key: SOLR-4373
 URL: https://issues.apache.org/jira/browse/SOLR-4373
 Project: Solr
  Issue Type: Bug
  Components: multicore
Affects Versions: 4.1
Reporter: Alexandre Rafalovitch
Priority: Blocker
  Labels: lib, multicore
 Fix For: 4.2, 5.0, 4.1.1

 Attachments: multicore-bug.zip


 Having lib directives in the solrconfig.xml files of multiple cores can cause 
 problems when using multi-threaded core initialization -- which is the 
 default starting with Solr 4.1.
 The problem manifests itself as init errors in the logs related to not being 
 able to find classes located in plugin jars, even though earlier log messages 
 indicated that those jars had 

[jira] [Resolved] (LUCENE-4748) Add DrillSideways helper class to Lucene facets module

2013-02-25 Thread Michael McCandless (JIRA)

 [ 
https://issues.apache.org/jira/browse/LUCENE-4748?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Michael McCandless resolved LUCENE-4748.


Resolution: Fixed

 Add DrillSideways helper class to Lucene facets module
 --

 Key: LUCENE-4748
 URL: https://issues.apache.org/jira/browse/LUCENE-4748
 Project: Lucene - Core
  Issue Type: Improvement
  Components: modules/facet
Reporter: Michael McCandless
Assignee: Michael McCandless
 Fix For: 4.2, 5.0

 Attachments: DrillSideways-alternative.tar.gz, LUCENE-4748.patch, 
 LUCENE-4748.patch, LUCENE-4748.patch, LUCENE-4748.patch, LUCENE-4748.patch, 
 LUCENE-4748.patch, LUCENE-4748.patch, LUCENE-4748.patch, LUCENE-4748.patch, 
 LUCENE-4748.patch, LUCENE-4748.patch


 This came out of a discussion on the java-user list with subject
 Faceted search in OR: http://markmail.org/thread/jmnq6z2x7ayzci5k
 The basic idea is to count near misses during collection, ie
 documents that matched the main query and also all except one of the
 drill down filters.
 Drill sideways makes for a very nice faceted search UI because you
 don't lose the facet counts after drilling in.  Eg maybe you do a
 search for cameras, and you see facets for the manufacturer, so you
 drill into Nikon.
 With drill sideways, even after drilling down, you'll still get the
 counts for all the other brands, where each count tells you how many
 hits you'd get if you changed to a different manufacturer.
 This becomes more fun if you add further drill-downs, eg maybe I next drill
 down into Resolution=10 megapixels, and then I can see how many 10
 megapixel cameras all other manufacturers, and what other resolutions
 Nikon cameras offer.
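The near-miss counting rule described above can be illustrated with a toy sketch (this is not the Lucene facets API; the class, names, and in-memory "documents" are invented for illustration, and the main query is assumed to match all docs): for each drill-down dimension d, a document contributes to d's sideways counts if it passes every drill-down filter except possibly d's own.

```java
import java.util.*;

// Toy illustration (not the Lucene API) of drill-sideways counting:
// a doc that matches all drill-down filters except possibly dimension
// `dim` still contributes to `dim`'s facet counts.
class DrillSidewaysSketch {
    interface Filter { boolean accept(Map<String, String> doc); }

    static Map<String, Map<String, Integer>> count(
            List<Map<String, String>> docs,
            Map<String, Filter> drillDowns) {
        Map<String, Map<String, Integer>> counts = new HashMap<>();
        for (Map<String, String> doc : docs) {
            for (String dim : drillDowns.keySet()) {
                // Does the doc pass every drill-down except possibly `dim`?
                boolean nearMiss = true;
                for (Map.Entry<String, Filter> e : drillDowns.entrySet()) {
                    if (!e.getKey().equals(dim) && !e.getValue().accept(doc)) {
                        nearMiss = false;
                        break;
                    }
                }
                if (nearMiss && doc.containsKey(dim)) {
                    counts.computeIfAbsent(dim, k -> new HashMap<>())
                          .merge(doc.get(dim), 1, Integer::sum);
                }
            }
        }
        return counts;
    }
}
```

With a drill-down on brand=Nikon, the brand dimension still reports counts for Canon etc. (the sideways counts), while other dimensions only count docs that pass the Nikon filter.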

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Commented] (LUCENE-4748) Add DrillSideways helper class to Lucene facets module

2013-02-25 Thread Commit Tag Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-4748?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13586496#comment-13586496
 ] 

Commit Tag Bot commented on LUCENE-4748:


[trunk commit] Michael McCandless
http://svn.apache.org/viewvc?view=revisionrevision=1449972

LUCENE-4748: add DrillSideways utility class to facets module


 Add DrillSideways helper class to Lucene facets module
 --

 Key: LUCENE-4748
 URL: https://issues.apache.org/jira/browse/LUCENE-4748
 Project: Lucene - Core
  Issue Type: Improvement
  Components: modules/facet
Reporter: Michael McCandless
Assignee: Michael McCandless
 Fix For: 4.2, 5.0

 Attachments: DrillSideways-alternative.tar.gz, LUCENE-4748.patch, 
 LUCENE-4748.patch, LUCENE-4748.patch, LUCENE-4748.patch, LUCENE-4748.patch, 
 LUCENE-4748.patch, LUCENE-4748.patch, LUCENE-4748.patch, LUCENE-4748.patch, 
 LUCENE-4748.patch, LUCENE-4748.patch


 This came out of a discussion on the java-user list with subject
 Faceted search in OR: http://markmail.org/thread/jmnq6z2x7ayzci5k
 The basic idea is to count near misses during collection, ie
 documents that matched the main query and also all except one of the
 drill down filters.
 Drill sideways makes for a very nice faceted search UI because you
 don't lose the facet counts after drilling in.  Eg maybe you do a
 search for cameras, and you see facets for the manufacturer, so you
 drill into Nikon.
 With drill sideways, even after drilling down, you'll still get the
 counts for all the other brands, where each count tells you how many
 hits you'd get if you changed to a different manufacturer.
 This becomes more fun if you add further drill-downs, eg maybe I next drill
 down into Resolution=10 megapixels, and then I can see how many 10
 megapixel cameras all other manufacturers offer, and what other resolutions
 Nikon cameras offer.
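The near-miss bookkeeping described above can be sketched in plain Java, independent of the actual Lucene DrillSideways API (the Doc model and sidewaysCounts method below are hypothetical names for illustration only): a hit that satisfies the main query and all drill-down filters counts everywhere, while a hit that fails exactly one filter still counts "sideways" for the dimension it failed.

```java
import java.util.*;

/**
 * Illustrative sketch of drill-sideways counting (not the Lucene API).
 * A doc matching all drill-down filters counts for every dimension; a
 * doc failing exactly one filter is a "near miss" and counts only for
 * the dimension it failed, which keeps the sibling facet counts alive
 * after drilling down.
 */
public class DrillSidewaysSketch {
    // Hypothetical document: field -> value (e.g. manufacturer -> Nikon)
    record Doc(Map<String, String> fields) {}

    /** Per-dimension counts from full matches plus single-miss docs. */
    static Map<String, Integer> sidewaysCounts(List<Doc> hits,
                                               Map<String, String> drillDowns) {
        Map<String, Integer> counts = new HashMap<>();
        for (Doc doc : hits) {
            String failedDim = null;
            int failures = 0;
            for (Map.Entry<String, String> f : drillDowns.entrySet()) {
                if (!f.getValue().equals(doc.fields().get(f.getKey()))) {
                    failures++;
                    failedDim = f.getKey();
                }
            }
            if (failures == 0) {
                // full match: contributes to every drill-down dimension
                for (String dim : drillDowns.keySet()) {
                    counts.merge(dim, 1, Integer::sum);
                }
            } else if (failures == 1) {
                // near miss: contributes sideways to the one failed dimension
                counts.merge(failedDim, 1, Integer::sum);
            }
        }
        return counts;
    }

    public static void main(String[] args) {
        List<Doc> hits = List.of(
            new Doc(Map.of("manufacturer", "Nikon", "resolution", "10MP")),
            new Doc(Map.of("manufacturer", "Canon", "resolution", "10MP")),
            new Doc(Map.of("manufacturer", "Nikon", "resolution", "12MP")));
        Map<String, String> drill =
            Map.of("manufacturer", "Nikon", "resolution", "10MP");
        // TreeMap only to make the print order deterministic
        System.out.println(new TreeMap<>(sidewaysCounts(hits, drill)));
        // prints {manufacturer=2, resolution=2}
    }
}
```

With the sample data, the Canon 10MP camera is a near miss on manufacturer and the Nikon 12MP camera is a near miss on resolution, so both dimensions keep counts beyond the single full match; the real Lucene implementation does this during collection rather than as a post-pass.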




[jira] [Created] (SOLR-4504) CurrencyField treats docs w/o value the same as having a value of 0.0

2013-02-25 Thread Hoss Man (JIRA)
Hoss Man created SOLR-4504:
--

 Summary: CurrencyField treats docs w/o value the same as having a 
value of 0.0
 Key: SOLR-4504
 URL: https://issues.apache.org/jira/browse/SOLR-4504
 Project: Solr
  Issue Type: Bug
Reporter: Hoss Man


As noted by Gerald Blank on the mailing list, CurrencyField queries treat 
documents w/o any value the same as documents with a value of 0.0f.

Observe that using the example Solr schema, with any number of docs indexed, 
this query matches all docs even though no docs have any values at all for the 
specified field...

{noformat}
http://localhost:8983/solr/select?q=hoss_c:[*%20TO%20*]
{noformat}




[jira] [Commented] (LUCENE-4748) Add DrillSideways helper class to Lucene facets module

2013-02-25 Thread Commit Tag Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-4748?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13586508#comment-13586508
 ] 

Commit Tag Bot commented on LUCENE-4748:


[branch_4x commit] Michael McCandless
http://svn.apache.org/viewvc?view=revision&revision=1449973

LUCENE-4748: add DrillSideways utility class to facets module


 Add DrillSideways helper class to Lucene facets module
 --

 Key: LUCENE-4748
 URL: https://issues.apache.org/jira/browse/LUCENE-4748
 Project: Lucene - Core
  Issue Type: Improvement
  Components: modules/facet
Reporter: Michael McCandless
Assignee: Michael McCandless
 Fix For: 4.2, 5.0

 Attachments: DrillSideways-alternative.tar.gz, LUCENE-4748.patch, 
 LUCENE-4748.patch, LUCENE-4748.patch, LUCENE-4748.patch, LUCENE-4748.patch, 
 LUCENE-4748.patch, LUCENE-4748.patch, LUCENE-4748.patch, LUCENE-4748.patch, 
 LUCENE-4748.patch, LUCENE-4748.patch


 This came out of a discussion on the java-user list with subject
 "Faceted search in OR": http://markmail.org/thread/jmnq6z2x7ayzci5k
 The basic idea is to count near misses during collection, ie
 documents that matched the main query and also all except one of the
 drill down filters.
 Drill sideways makes for a very nice faceted search UI because you
 don't lose the facet counts after drilling in.  Eg maybe you do a
 search for cameras, and you see facets for the manufacturer, so you
 drill into Nikon.
 With drill sideways, even after drilling down, you'll still get the
 counts for all the other brands, where each count tells you how many
 hits you'd get if you changed to a different manufacturer.
 This becomes more fun if you add further drill-downs, eg maybe I next drill
 down into Resolution=10 megapixels, and then I can see how many 10
 megapixel cameras all other manufacturers offer, and what other resolutions
 Nikon cameras offer.




[jira] [Commented] (SOLR-4414) MoreLikeThis on a shard finds no interesting terms if the document queried is not in that shard

2013-02-25 Thread Colin Bartolome (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-4414?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13586558#comment-13586558
 ] 

Colin Bartolome commented on SOLR-4414:
---

By the way, I'm guessing the interesting terms that the query does return, when 
it returns any, are based on the documents contained in that shard only, 
instead of the documents contained in the whole collection. I suppose I can 
live with that, for the time being, but the trick is to query the right shard 
to begin with!

 MoreLikeThis on a shard finds no interesting terms if the document queried is 
 not in that shard
 ---

 Key: SOLR-4414
 URL: https://issues.apache.org/jira/browse/SOLR-4414
 Project: Solr
  Issue Type: Bug
  Components: MoreLikeThis, SolrCloud
Affects Versions: 4.1
Reporter: Colin Bartolome

 Running a MoreLikeThis query in a cloud works only when the document being 
 queried exists in whatever shard serves the request. If the document is not 
 present in the shard, no interesting terms are found and, consequently, no 
 matches are found.
 h5. Steps to reproduce
 * Edit example/solr/collection1/conf/solrconfig.xml and add this line, with 
 the rest of the request handlers:
 {code:xml}
 <requestHandler name="/mlt" class="solr.MoreLikeThisHandler" />
 {code}
 * Follow the [simplest SolrCloud 
 example|http://wiki.apache.org/solr/SolrCloud#Example_A:_Simple_two_shard_cluster]
  to get two shards running.
 * Hit this URL: 
 [http://localhost:8983/solr/collection1/mlt?mlt.fl=includes&q=id:3007WFP&mlt.match.include=false&mlt.interestingTerms=list&mlt.mindf=1&mlt.mintf=1]
 * Compare that output to that of this URL: 
 [http://localhost:7574/solr/collection1/mlt?mlt.fl=includes&q=id:3007WFP&mlt.match.include=false&mlt.interestingTerms=list&mlt.mindf=1&mlt.mintf=1]
 The former URL will return a result and list some interesting terms. The 
 latter URL will return no results and list no interesting terms. It will also 
 show this odd XML element:
 {code:xml}
 <null name="response"/>
 {code}




Distributed MLT doesn't seem to work (SOLR-4414?)

2013-02-25 Thread Shawn Heisey
I can't get distributed MLT (committed in Solr 4.1, using 4.2-SNAPSHOT) 
to work at all.  I think whatever is causing SOLR-4414 is probably 
causing my issue as well.  If there's anything specific that's required 
to troubleshoot this issue, let me know how to get it and I'll provide it.


Thanks,
Shawn



