date:20130312


 [ 
https://issues.apache.org/jira/browse/SOLR-4561?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sudheer Prem updated SOLR-4561:
---

Description: 
When child entities are created and the child entity is provided with a 
parametrized query as below, 
{code:xml} 
<entity name="x" query="select * from x">
  <entity name="y" query="select * from y where xid=${x.id}" 
  processor="CachedSqlEntityProcessor">
  </entity>
</entity>
{code} 

the entity processor always returns the result of the first query even though 
the parameter has changed. This happens because the 
EntityProcessorBase.getNext() method doesn't reset the query and rowIterator 
after calling the DIHCacheSupport.getCacheData() method.

This can be fixed by changing the else block in getNext() method of 
EntityProcessorBase from

{code} 
else  {
  return cacheSupport.getCacheData(context, query, rowIterator);
  
}
{code} 

to the code mentioned below:

{code} 
else  {
  Map<String,Object> cacheData = cacheSupport.getCacheData(context, query, 
rowIterator);
  query = null;
  rowIterator = null;
  return cacheData;
}
{code}   

Update: But then, the caching doesn't seem to be working...
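
To make the reported behaviour easier to follow, here is a minimal, self-contained sketch of the stale-state problem. It is a toy model, not the DataImportHandler code: ToyProcessor, TABLE_Y, childRows and the resetAfterLookup flag are hypothetical names invented for illustration, and the only assumption carried over from the report is that the query is re-resolved only while rowIterator is null.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.Iterator;
import java.util.List;
import java.util.Map;

/** Toy model only: none of these names are Solr classes. */
final class StaleQuerySketch {
    /** Stand-in for table y: child rows keyed by xid. */
    static final Map<Integer, List<String>> TABLE_Y = new HashMap<>();
    static {
        TABLE_Y.put(1, Arrays.asList("y-of-x1"));
        TABLE_Y.put(2, Arrays.asList("y-of-x2"));
    }

    static final class ToyProcessor {
        private final boolean resetAfterLookup;
        private String query;                 // resolved query text
        private Iterator<String> rowIterator; // open result set, if any
        private final Map<String, List<String>> cache = new HashMap<>();

        ToyProcessor(boolean resetAfterLookup) {
            this.resetAfterLookup = resetAfterLookup;
        }

        /** Returns the child rows for one parent row. */
        List<String> childRows(int xid) {
            if (rowIterator == null) {
                // Assumption: the query is only (re)resolved while no iterator is open.
                query = "select * from y where xid=" + xid;
                rowIterator = TABLE_Y.get(xid).iterator();
            }
            List<String> rows = cache.get(query);
            if (rows == null) {
                // Cache miss: drain the iterator and remember the rows per query.
                rows = new ArrayList<>();
                while (rowIterator.hasNext()) {
                    rows.add(rowIterator.next());
                }
                cache.put(query, rows);
            }
            if (resetAfterLookup) {
                // Mirrors the proposed change: forget the state so that the
                // next parent row resolves ${x.id} again.
                query = null;
                rowIterator = null;
            }
            return rows;
        }
    }

    /** Looks up the children for x.id=1 and then for x.id=2. */
    static List<List<String>> run(boolean resetAfterLookup) {
        ToyProcessor p = new ToyProcessor(resetAfterLookup);
        List<List<String>> out = new ArrayList<>();
        out.add(p.childRows(1));
        out.add(p.childRows(2));
        return out;
    }

    public static void main(String[] args) {
        System.out.println("without reset: " + run(false));
        System.out.println("with reset:    " + run(true));
    }
}
```

Without the reset, the second lookup reuses the query resolved for x.id=1 and returns its rows again; with the reset, the second lookup resolves its own query. The sketch says nothing about how the real cache behaves after the change.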



 CachedSqlEntityProcessor with parametarized query is broken
 ---

 Key: SOLR-4561
 URL: https://issues.apache.org/jira/browse/SOLR-4561
 Project: Solr
  Issue Type: Bug
  Components: contrib - DataImportHandler
Affects Versions: 4.1
Reporter: Sudheer Prem
   Original Estimate: 1m
  Remaining Estimate: 1m


--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org

[jira] [Commented] (LUCENE-4795) Add FacetsCollector based on SortedSetDocValues

2013-03-12 Thread Shai Erera (JIRA)

[
https://issues.apache.org/jira/browse/LUCENE-4795?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13599779#comment-13599779
]

Shai Erera commented on LUCENE-4795:

bq. Well the taxonomy index doesn't give you global ordinals. it gives you
global termIDs, which are unique integers: but they aren't ordinals

That's right. I am not familiar with how Solr utilizes that, but I agree with
your statement. The term ordinal was derived from the fact that the taxonomy
does preserve order between parent and children, e.g. for Date, Date/2010 and
Date/2011: Date will always have a lower ordinal than its children, but
there is no meaningful order between siblings.

bq. Its also unclear to me how the taxonomy index would really integrate in a
distributed system like solr or elasticsearch.

Why? We work with the taxonomy index in two modes in a distributed environment:

# Every shard maintains its own taxonomy index and facets are merged by their
label. That's basically what Solr/ES/SortedSet would do right?
# In a specific project we run, where every document goes through a MapReduce
analysis (no NRT!), we maintain a truly global taxonomy index, where ordinal=17
means the same category in all shards. The taxonomy index itself is replicated
to all shards. There are tradeoffs of course, but you cannot do that with
SortedSet right? The advantage is that you can do the merge by the ordinal
(integer ID), rather than the label.
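
For illustration only, the two merge modes above could be sketched roughly as follows. This is plain Java, not facet module code; mergeByLabel, mergeByOrdinal and the count layouts (a label-to-count map per shard, or an int array indexed by global ordinal) are assumptions made for the example.

```java
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

/** Hypothetical sketch of the two distributed merge modes; not Lucene code. */
final class FacetMergeSketch {
    /** Mode 1: every shard has its own ordinals, so counts are merged on the label. */
    static Map<String, Integer> mergeByLabel(List<Map<String, Integer>> shardCounts) {
        Map<String, Integer> merged = new TreeMap<>();
        for (Map<String, Integer> shard : shardCounts) {
            for (Map.Entry<String, Integer> e : shard.entrySet()) {
                merged.merge(e.getKey(), e.getValue(), Integer::sum);
            }
        }
        return merged;
    }

    /** Mode 2: an ordinal means the same category on all shards, so int arrays are summed. */
    static int[] mergeByOrdinal(List<int[]> shardCounts, int taxonomySize) {
        int[] merged = new int[taxonomySize];
        for (int[] shard : shardCounts) {
            int limit = Math.min(shard.length, taxonomySize);
            for (int ord = 0; ord < limit; ord++) {
                merged[ord] += shard[ord];
            }
        }
        return merged;
    }
}
```

The second mode avoids string handling during the merge, at the price of keeping one taxonomy consistent across all shards.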

bq. I personally don't think its the end of the world if Mike's patch doesnt
support all the features of the faceting module initially or even ever.

+1, I don't criticize that approach negatively. I personally don't understand
why the sidecar taxonomy index freaks the hell out of people, but I don't mind
if there are multiple facet implementations. I can share with you that we used
to have a few implementations too, before we converged to one (and then
contributed it to Lucene).

You didn't answer my question though, and perhaps it doesn't belong in this
issue, but is there a way to utilize the ordinal given to a DV value somehow?
Or is it internal to the SortedSet DV?

Mike, should you also check in SortedSetDocValuesAccumulator that FR.getDepth()
== 1? I don't think that you support counting up to depth N, right?

Add FacetsCollector based on SortedSetDocValues
---

Key: LUCENE-4795
URL: https://issues.apache.org/jira/browse/LUCENE-4795
Project: Lucene - Core
Issue Type: Improvement
Components: modules/facet
Reporter: Michael McCandless
Assignee: Michael McCandless
Attachments: LUCENE-4795.patch, LUCENE-4795.patch, LUCENE-4795.patch,
LUCENE-4795.patch, pleaseBenchmarkMe.patch

Recently (LUCENE-4765) we added multi-valued DocValues field
(SortedSetDocValuesField), and this can be used for faceting in Solr
(SOLR-4490). I think we should also add support in the facet module?
It'd be an option with different tradeoffs. Eg, it wouldn't require
the taxonomy index, since the main index handles label/ord resolving.
There are at least two possible approaches:
* On every reopen, build the seg -> global ord map, and then on
every collect, get the seg ord, map it to the global ord space,
and increment counts. This adds cost during reopen in proportion
to number of unique terms ...
* On every collect, increment counts based on the seg ords, and then
do a merge in the end just like distributed faceting does.
The first approach is much easier so I built a quick prototype using
that. The prototype does the counting, but it does NOT do the top K
facets gathering in the end, and it doesn't know parent/child ord
relationships, so there's tons more to do before this is real. I also
was unsure how to properly integrate it since the existing classes
seem to expect that you use a taxonomy index to resolve ords.
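
As a rough sketch of the first approach (not the code in the patch), the per-segment ord maps can be built once per reopen and then used on every collect. All names here are hypothetical, and each segment's terms are assumed to be available as a sorted String array whose index is the segment ord.

```java
import java.util.Arrays;
import java.util.TreeSet;

/** Hypothetical sketch of seg -> global ord counting; not the prototype from the patch. */
final class GlobalOrdCountSketch {
    /** Builds the sorted global term list; a term's index is its global ord. */
    static String[] globalTerms(String[][] segTerms) {
        TreeSet<String> all = new TreeSet<>();
        for (String[] seg : segTerms) {
            all.addAll(Arrays.asList(seg));
        }
        return all.toArray(new String[0]);
    }

    /** Done once per reopen: for each segment, maps segment ord to global ord. */
    static int[][] buildOrdMaps(String[][] segTerms, String[] global) {
        int[][] maps = new int[segTerms.length][];
        for (int s = 0; s < segTerms.length; s++) {
            maps[s] = new int[segTerms[s].length];
            for (int ord = 0; ord < segTerms[s].length; ord++) {
                maps[s][ord] = Arrays.binarySearch(global, segTerms[s][ord]);
            }
        }
        return maps;
    }

    /** Done on every collect: each hit is {segment, segOrd}; bump the global count. */
    static int[] count(int[][] ordMaps, int[][] hits, int globalSize) {
        int[] counts = new int[globalSize];
        for (int[] hit : hits) {
            counts[ordMaps[hit[0]][hit[1]]]++;
        }
        return counts;
    }
}
```

Building the maps touches every unique term, which is where the reopen cost mentioned above comes from; the collect step is then a single array lookup per ord.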
I ran a quick performance test. base = trunk except I disabled the
compute top-K in FacetsAccumulator to make the comparison fair; comp
= using the prototype collector in the patch:
{noformat}
Task         QPS base  StdDev  QPS comp  StdDev               Pct diff
OrHighLow       18.79  (2.5%)     14.36  (3.3%)  -23.6% ( -28% - -18%)
HighTerm        21.58  (2.4%)     16.53  (3.7%)  -23.4% ( -28% - -17%)
OrHighMed       18.20  (2.5%)     13.99  (3.3%)  -23.2% ( -28% - -17%)
Prefix3         14.37  (1.5%)     11.62  (3.5%)  -19.1% ( -23% - -14%)
LowTerm        130.80  (1.6%)    106.95  (2.4%)  -18.2% ( -21% - -14%)
OrHighHigh       9.60  (2.6%)      7.88  (3.5%)  -17.9% ( -23% - -12%)
AndHighHigh

[jira] [Updated] (SOLR-3755) shard splitting

2013-03-12 Thread Anshum Gupta (JIRA)


 [ 
https://issues.apache.org/jira/browse/SOLR-3755?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Anshum Gupta updated SOLR-3755:
---

Attachment: SOLR-3755-combinedWithReplication.patch

Added replica creation to the earlier 'combined' patch that Shalin had put up. 
This is yet to be tested as we're yet to fix the 2nd core creation issue.

 shard splitting
 ---

 Key: SOLR-3755
 URL: https://issues.apache.org/jira/browse/SOLR-3755
 Project: Solr
  Issue Type: New Feature
  Components: SolrCloud
Reporter: Yonik Seeley
 Attachments: SOLR-3755-combined.patch, 
 SOLR-3755-combinedWithReplication.patch, SOLR-3755-CoreAdmin.patch, 
 SOLR-3755.patch, SOLR-3755.patch, SOLR-3755-testSplitter.patch, 
 SOLR-3755-testSplitter.patch


 We can currently easily add replicas to handle increases in query volume, but 
 we should also add a way to add additional shards dynamically by splitting 
 existing shards.


Re: JVM hanging when using G1GC on JDK8 b78 or b79 (Linux 32 bit)

2013-03-12 Thread Simon Willnauer

Uwe you rock! Beside my morning entertainment this was an awesome job!

simon

On Tue, Mar 12, 2013 at 8:31 AM, Tommaso Teofili
tommaso.teof...@gmail.com wrote:
 thanks Uwe!


 2013/3/12 Robert Muir rcm...@gmail.com

 Uwe: Thanks for working with them to get all these issues fixed.

 On Mon, Mar 11, 2013 at 7:34 PM, Uwe Schindler u...@thetaphi.de wrote:
  Hi,
 
  FYI, Oracle has a fix for the G1GC hang in UIMA waiting for review:
 
  Issue: http://bugs.sun.com/bugdatabase/view_bug.do?bug_id=8009536
  Webrev:
  http://mail.openjdk.java.net/pipermail/hotspot-gc-dev/2013-March/006215.html
  Patch: http://cr.openjdk.java.net/~johnc/8009536/webrev.0/
 
  Thanks to John Cuthbertson and Bengt Rutisson @ Oracle for fixing so
  fast! We just have to wait for a new JDK8 build with that fix included (and
  some more for the other Lucene-related bugs).
  Uwe
 
  -
  Uwe Schindler
  H.-H.-Meier-Allee 63, D-28213 Bremen
  http://www.thetaphi.de
  eMail: u...@thetaphi.de
 
 
  -Original Message-
  From: Mark Miller [mailto:markrmil...@gmail.com]
  Sent: Wednesday, March 06, 2013 7:52 PM
  To: dev@lucene.apache.org
  Subject: Re: JVM hanging when using G1GC on JDK8 b78 or b79 (Linux 32
  bit)
 
  Awesome work Uwe! Nice job getting this some attention.
 
  - mark
 
  On Mar 6, 2013, at 10:41 AM, Uwe Schindler u...@thetaphi.de wrote:
 
   It seems that there is already an explanation from the Oracle
   engineer:
  
   -Original Message-
   From: John Cuthbertson [mailto:john.cuthbert...@oracle.com]
   Sent: Wednesday, March 06, 2013 7:04 PM
   To: Thomas Schatzl
   Cc: Uwe Schindler; hotspot-gc-...@openjdk.java.net; 'David Holmes';
   'Dawid Weiss'; hotspot-...@openjdk.java.net
   Subject: Re: JVM hanging when using G1GC on JDK8 b78 or b79 (Linux
   32
   bit)
  
   Hi Everyone,
  
   All:
   I've looked at the bug report (haven't tried to reproduce it yet) and
   Bengt's analysis is correct. The concurrent mark thread is entering
   the synchronization protocol in a marking step call. That code is
   waiting for some non-existent workers to terminate before proceeding.
   Normally we shouldn't be entering that code but I think we overflowed
   the global marking stack (I updated the CR at ~1am my time with that
   conjecture). I think I missed a set_phase() call to tell the parallel
   terminator that we only have one thread and it's picking up the
   number of workers that executed the remark parallel task.
  
   Thomas: you were on the right track with your comment about the
   marking stack size.
  
   David:
   Thanks for helping out here. The stack trace you mentioned was for
   one of the refinement threads - a concurrent GC thread. When a
   concurrent GC thread joins the suspendible thread set, it means
   that it will observe and participate in safepoint operations, i.e.
   the thread will notice that it should reach a safepoint and the
   safepoint synchronizer code will wait for it to block.
   When we wish a concurrent GC thread to not observe safepoints, that
   thread leaves the suspendible thread set. I think the name could be
   a bit better and Tony, before he left, had a change that used a scoped
   object to join and leave the STS that hasn't been integrated yet.
   IIRC Tony wasn't happy with the name he chose for that also.
  
   Uwe:
   Thanks for bringing this up and my apologies for not replying sooner.
   I will have a fix fairly soon. If I'm correct about it being caused
   by overflowing the marking stack you can work around the issue by
   increasing the MarkStackSize. You could try increasing it to 2M or 4M
   entries (which is the current max size).
  
   Cheers,
  
   JohnC
  
   -
   Uwe Schindler
   H.-H.-Meier-Allee 63, D-28213 Bremen
   http://www.thetaphi.de
   eMail: u...@thetaphi.de
  
  
   -Original Message-
   From: Uwe Schindler [mailto:u...@thetaphi.de]
   Sent: Wednesday, March 06, 2013 1:35 PM
   To: dev@lucene.apache.org
   Subject: FW: JVM hanging when using G1GC on JDK8 b78 or b79 (Linux
   32
   bit)
  
   They already understood the G1GC problem with JDK 8 b78/b79 and are
   working on a fix. This was really fast:
   http://mail.openjdk.java.net/pipermail/hotspot-gc-dev/2013-March/006128.html
  
   Uwe
  
   -
   Uwe Schindler
   H.-H.-Meier-Allee 63, D-28213 Bremen
   http://www.thetaphi.de
   eMail: u...@thetaphi.de
  
  
  
  

[jira] [Commented] (LUCENE-4713) SPI: Allow fallback to default ClassLoader if Thread#getContextClassLoader fails


[ 
https://issues.apache.org/jira/browse/LUCENE-4713?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13599837#comment-13599837
 ] 

Uwe Schindler commented on LUCENE-4713:
---

Use Codec.reloadCodecs(antClassLoader) in your application initialization 
code. The same method exists for PostingsFormats. Unfortunately there is no 
method to automatically reload all SPIs in Lucene.


 SPI: Allow fallback to default ClassLoader if Thread#getContextClassLoader 
 fails
 

 Key: LUCENE-4713
 URL: https://issues.apache.org/jira/browse/LUCENE-4713
 Project: Lucene - Core
  Issue Type: Improvement
Affects Versions: 4.0, 4.1, 4.2
Reporter: Christian Kohlschütter
Priority: Minor
  Labels: ClassLoader, Thread
 Attachments: LUCENE-4713.patch, LuceneContextClassLoader.patch


 NOTE: This issue has been renamed from:
 Replace calls to Thread#getContextClassLoader with the ClassLoader of the 
 current class
 because the revised patch provides a clean fallback path.
 I am not sure whether it is a design decision or if we can indeed consider 
 this a bug:
 In core and analysis-common some classes provide on-demand class loading 
 using SPI. In NamedSPILoader, SPIClassIterator, ClasspathResourceLoader and 
 AnalysisSPILoader there are constructors that use the Thread's context 
 ClassLoader by default whenever no particular other ClassLoader was specified.
 Unfortunately this does not work as expected when the Thread's ClassLoader 
 can't see the required classes that are instantiated downstream with the help 
 of Class.forName (e.g., Codecs, Analyzers, etc.).
 That's what happened to us here. We currently experiment with running Lucene 
 2.9 and 4.x in one JVM, both being separated by custom ClassLoaders, each 
 seeing only the corresponding Lucene version and the upstream classpath.
 While NamedSPILoader and company get successfully loaded by our custom 
 ClassLoader, their instantiation fails because our Thread's 
 Context-ClassLoader cannot find the additionally required classes.
 We could probably work-around this by using Thread#setContextClassLoader at 
 construction time (and quickly reverting back afterwards), but I have the 
 impression this might just hide the actual problem and cause further trouble 
 when lazy-loading classes later on, and potentially from another Thread.
 Removing the call to Thread#getContextClassLoader would also align with the 
 behavior of AttributeSource.DEFAULT_ATTRIBUTE_FACTORY, which in fact uses 
 Attribute#getClass().getClassLoader() instead.
 A simple patch is attached. All tests pass.


[jira] [Commented] (LUCENE-4713) SPI: Allow fallback to default ClassLoader if Thread#getContextClassLoader fails

[
https://issues.apache.org/jira/browse/LUCENE-4713?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13599842#comment-13599842
]

Uwe Schindler commented on LUCENE-4713:
---

From my perspective, Christian Kohlschütter's suggestion is nice to have. We
should at least enforce that the classloader that loaded the lucene-core.jar
file is also scanned, regardless of what the context class loader is - this
would somehow emulate what the JDK does with its own extensions like XML
parsers. In any case, we would need to decide what to do first (the Lucene
class loader or the context one).

I will provide a patch.

SPI: Allow fallback to default ClassLoader if Thread#getContextClassLoader
fails

Key: LUCENE-4713
URL: https://issues.apache.org/jira/browse/LUCENE-4713
Project: Lucene - Core
Issue Type: Improvement
Affects Versions: 4.0, 4.1, 4.2
Reporter: Christian Kohlschütter
Priority: Minor
Labels: ClassLoader, Thread
Attachments: LUCENE-4713.patch, LuceneContextClassLoader.patch


[jira] [Updated] (LUCENE-4713) SPI: Allow fallback to default ClassLoader if Thread#getContextClassLoader fails

[
https://issues.apache.org/jira/browse/LUCENE-4713?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]

Uwe Schindler updated LUCENE-4713:
--

Attachment: LUCENE-4713.patch

This is the easiest patch possible. Still lacks some documentation (to actually
document that the Lucene class loader is scanned), but ensures that at least
all SPIs shipped with Lucene are visible.

If a user has additional SPIs outside Lucene core, then its his turn to make
them correctly available.

The Lucene classloader is scanned before the core one, because the classes
shipped with lucene should take precedence. On the other hand, this makes it
impossible to override Lucene's default codec unless you place the jar file
next to lucene-core.jar in same classloader.

SPI: Allow fallback to default ClassLoader if Thread#getContextClassLoader
fails


[jira] [Updated] (LUCENE-4713) SPI: Allow fallback to default ClassLoader if Thread#getContextClassLoader fails


 [ 
https://issues.apache.org/jira/browse/LUCENE-4713?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Uwe Schindler updated LUCENE-4713:
--

Attachment: LUCENE-4713.patch

 SPI: Allow fallback to default ClassLoader if Thread#getContextClassLoader 
 fails
 

 Key: LUCENE-4713
 URL: https://issues.apache.org/jira/browse/LUCENE-4713
 Project: Lucene - Core
  Issue Type: Improvement
Affects Versions: 4.0, 4.1, 4.2
Reporter: Christian Kohlschütter
Priority: Minor
  Labels: ClassLoader, Thread
 Fix For: 4.3

 Attachments: LUCENE-4713.patch, LUCENE-4713.patch, 
 LuceneContextClassLoader.patch



[jira] [Comment Edited] (LUCENE-4713) SPI: Allow fallback to default ClassLoader if Thread#getContextClassLoader fails

[
https://issues.apache.org/jira/browse/LUCENE-4713?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13599858#comment-13599858
]

Uwe Schindler edited comment on LUCENE-4713 at 3/12/13 9:30 AM:

This is the easiest patch possible. Still lacks some documentation (to actually
document that the Lucene class loader is scanned), but ensures that at least
all SPIs shipped with Lucene are visible.

If a user has additional SPIs outside Lucene core, then it's his turn to make
them correctly available.

The Lucene classloader is scanned before the context one, because the classes
shipped with Lucene should take precedence. On the other hand, this makes it
impossible to override Lucene's default codec unless you place the jar file
next to lucene-core.jar in same classloader.
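
The scan order described in this comment can be illustrated with the JDK's own java.util.ServiceLoader. This is only a sketch under assumptions, not the attached patch and not Lucene's NamedSPILoader: loadersToScan, loadAll and the anchor parameter (a class whose defining loader plays the role of the "Lucene classloader") are hypothetical names.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.ServiceLoader;

/** Hypothetical sketch of "own loader first, then context loader"; not Lucene code. */
final class SpiLoaderOrderSketch {
    /** Returns the loaders to scan, in precedence order, without duplicates. */
    static List<ClassLoader> loadersToScan(Class<?> anchor, ClassLoader context) {
        List<ClassLoader> loaders = new ArrayList<>();
        ClassLoader own = anchor.getClassLoader(); // null means the bootstrap loader
        if (own != null) {
            loaders.add(own);
        }
        if (context != null && context != own) {
            loaders.add(context);
        }
        return loaders;
    }

    /** Loads providers from each loader; the first loader that offers a class wins. */
    static <T> List<T> loadAll(Class<T> service, Class<?> anchor) {
        Map<String, T> byName = new LinkedHashMap<>();
        ClassLoader context = Thread.currentThread().getContextClassLoader();
        for (ClassLoader loader : loadersToScan(anchor, context)) {
            for (T provider : ServiceLoader.load(service, loader)) {
                byName.putIfAbsent(provider.getClass().getName(), provider);
            }
        }
        return new ArrayList<>(byName.values());
    }
}
```

Because the first loader wins on a name clash, a provider visible only through the context loader cannot replace one shipped next to the anchor class, which is the trade-off noted above.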

was (Author: thetaphi):
This is the easiest patch possible. Still lacks some documentation (to
actually document that the Lucene class loader is scanned), but ensures that at
least all SPIs shipped with Lucene are visible.

If a user has additional SPIs outside Lucene core, then its his turn to make
them correctly available.

SPI: Allow fallback to default ClassLoader if Thread#getContextClassLoader
fails

Attachments: LUCENE-4713.patch, LUCENE-4713.patch,
LuceneContextClassLoader.patch


[jira] [Updated] (LUCENE-4713) SPI: Allow fallback to default ClassLoader if Thread#getContextClassLoader fails


 [ 
https://issues.apache.org/jira/browse/LUCENE-4713?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Uwe Schindler updated LUCENE-4713:
--

Attachment: (was: LUCENE-4713.patch)

 SPI: Allow fallback to default ClassLoader if Thread#getContextClassLoader 
 fails
 

 Key: LUCENE-4713
 URL: https://issues.apache.org/jira/browse/LUCENE-4713
 Project: Lucene - Core
  Issue Type: Improvement
Affects Versions: 4.0, 4.1, 4.2
Reporter: Christian Kohlschütter
Priority: Minor
  Labels: ClassLoader, Thread
 Fix For: 4.3

 Attachments: LUCENE-4713.patch, LUCENE-4713.patch, 
 LuceneContextClassLoader.patch



[jira] [Updated] (LUCENE-4713) SPI: Allow fallback to default ClassLoader if Thread#getContextClassLoader fails


 [ 
https://issues.apache.org/jira/browse/LUCENE-4713?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Uwe Schindler updated LUCENE-4713:
--

Fix Version/s: 4.3

 SPI: Allow fallback to default ClassLoader if Thread#getContextClassLoader 
 fails
 

 Key: LUCENE-4713
 URL: https://issues.apache.org/jira/browse/LUCENE-4713
 Project: Lucene - Core
  Issue Type: Improvement
Affects Versions: 4.0, 4.1, 4.2
Reporter: Christian Kohlschütter
Priority: Minor
  Labels: ClassLoader, Thread
 Fix For: 4.3

 Attachments: LUCENE-4713.patch, LUCENE-4713.patch, 
 LuceneContextClassLoader.patch



[jira] [Commented] (SOLR-4561) CachedSqlEntityProcessor with parametarized query is broken

2013-03-12 Thread Ahmet Arslan (JIRA)


[ 
https://issues.apache.org/jira/browse/SOLR-4561?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13599868#comment-13599868
 ] 

Ahmet Arslan commented on SOLR-4561:


I ran into this bug too.

 CachedSqlEntityProcessor with parametarized query is broken
 ---

 Key: SOLR-4561
 URL: https://issues.apache.org/jira/browse/SOLR-4561
 Project: Solr
  Issue Type: Bug
  Components: contrib - DataImportHandler
Affects Versions: 4.1
Reporter: Sudheer Prem
   Original Estimate: 1m
  Remaining Estimate: 1m

 When child entities are created and the child entity is given a 
 parameterized query as below, 
 {code:xml} 
 <entity name="x" query="select * from x">
   <entity name="y" query="select * from y where xid=${x.id}" 
           processor="CachedSqlEntityProcessor">
   </entity>
 </entity>
 {code} 
 the entity processor always returns the result of the first query even though 
 the parameter has changed. This happens because the 
 EntityProcessorBase.getNext() method doesn't reset the query and rowIterator 
 after calling the DIHCacheSupport.getCacheData() method.
 This can be fixed by changing the else block in the getNext() method of 
 EntityProcessorBase from
 {code} 
 else  {
   return cacheSupport.getCacheData(context, query, rowIterator);
 }
 {code} 
 to the code below:
 {code} 
 else  {
   Map<String,Object> cacheData = cacheSupport.getCacheData(context, query, rowIterator);
   query = null;
   rowIterator = null;
   return cacheData;
 }
 {code}   
 Update: But then, the caching doesn't seem to be working...
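
The symptom described above (state resolved once and never reset) can be reproduced with a toy model. This is only a sketch under simplifying assumptions: `StaleQueryDemo` is a hypothetical class, it does not use any DataImportHandler code, and it models only the `query` field, not the cache or the rowIterator.

```java
// Hypothetical toy model of the reported behaviour, not DIH code.
final class StaleQueryDemo {
  private final boolean resetAfterUse;
  private String query; // resolved lazily and kept between calls

  StaleQueryDemo(boolean resetAfterUse) {
    this.resetAfterUse = resetAfterUse;
  }

  /** Returns the query that would be used for the given parent row id. */
  String next(int parentId) {
    if (query == null) {
      // The ${x.id} placeholder is only resolved when no query is held yet.
      query = "select * from y where xid=" + parentId;
    }
    String used = query;
    if (resetAfterUse) {
      query = null; // mirrors the "query = null" line of the proposed change
    }
    return used;
  }
}
```

Without the reset, the second parent row still gets the query built for the first one; with the reset, the query is resolved again for every parent row. The toy model says nothing about whether caching still works after the change, which is the open point in the reporter's update.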


[jira] [Commented] (SOLR-4561) CachedSqlEntityProcessor with parametarized query is broken

2013-03-12 Thread Ahmet Arslan (JIRA)


[ 
https://issues.apache.org/jira/browse/SOLR-4561?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13599872#comment-13599872
 ] 

Ahmet Arslan commented on SOLR-4561:


It seems that it was reported before by James in SOLR-3857


[jira] [Commented] (LUCENE-4713) SPI: Allow fallback to default ClassLoader if Thread#getContextClassLoader fails


[ 
https://issues.apache.org/jira/browse/LUCENE-4713?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13599885#comment-13599885
 ] 

Christian Kohlschütter commented on LUCENE-4713:


Thanks, Uwe! Looks good and works well in our setup.

Regarding overriding Lucene's default codec implementations:
We anyways have to place any other modified, non-SPI Lucene classes in the same 
ClassLoader, so I really appreciate that this patch enforces this.


 SPI: Allow fallback to default ClassLoader if Thread#getContextClassLoader 
 fails
 

 Key: LUCENE-4713
 URL: https://issues.apache.org/jira/browse/LUCENE-4713
 Project: Lucene - Core
  Issue Type: Improvement
Affects Versions: 4.0, 4.1, 4.2
Reporter: Christian Kohlschütter
Priority: Minor
  Labels: ClassLoader, Thread
 Fix For: 4.3

 Attachments: LUCENE-4713.patch, LUCENE-4713.patch, 
 LuceneContextClassLoader.patch


 NOTE: This issue has been renamed from:
 Replace calls to Thread#getContextClassLoader with the ClassLoader of the 
 current class
 because the revised patch provides a clean fallback path.
 I am not sure whether it is a design decision or if we can indeed consider 
 this a bug:
 In core and analysis-common some classes provide on-demand class loading 
 using SPI. In NamedSPILoader, SPIClassIterator, ClasspathResourceLoader and 
 AnalysisSPILoader there are constructors that use the Thread's context 
 ClassLoader by default whenever no particular other ClassLoader was specified.
 Unfortunately this does not work as expected when the Thread's ClassLoader 
 can't see the required classes that are instantiated downstream with the help 
 of Class.forName (e.g., Codecs, Analyzers, etc.).
 That's what happened to us here. We currently experiment with running Lucene 
 2.9 and 4.x in one JVM, both being separated by custom ClassLoaders, each 
 seeing only the corresponding Lucene version and the upstream classpath.
 While NamedSPILoader and company get successfully loaded by our custom 
 ClassLoader, their instantiation fails because our Thread's 
 Context-ClassLoader cannot find the additionally required classes.
 We could probably work around this by using Thread#setContextClassLoader at 
 construction time (and quickly reverting back afterwards), but I have the 
 impression this might just hide the actual problem and cause further trouble 
 when lazy-loading classes later on, and potentially from another Thread.
 Removing the call to Thread#getContextClassLoader would also align with the 
 behavior of AttributeSource.DEFAULT_ATTRIBUTE_FACTORY, which in fact uses 
 Attribute#getClass().getClassLoader() instead.
 A simple patch is attached. All tests pass.


[jira] [Updated] (LUCENE-4713) SPI: Allow fallback to default ClassLoader if Thread#getContextClassLoader fails

[
https://issues.apache.org/jira/browse/LUCENE-4713?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]

Uwe Schindler updated LUCENE-4713:
--

Attachment: LUCENE-4713.patch

Hi Christian,
another patch, with some optimization. The clazz's classloader is only scanned
if it's not a parent or the same. If the Lucene clazz's classloader is a parent
of the context one, it does not need to be scanned.
This also works around the problems with hiding classes. To override the
Lucene core codecs, e.g. Tomcat's classloader (J2EE) will use parent-last
semantics, and in that case the precedence goes to the webapp.
Only if the Lucene classloader is not at all related to the context one is it
scanned.
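
The rule described in this comment can be sketched as follows. The attached patch itself is not reproduced in this thread, so this is an assumption-laden illustration: the class name `LoaderRelation` and the method `needsExtraScan` are hypothetical, and only the name `isParentClassLoader` is taken from the discussion.

```java
// Hypothetical sketch of the "scan only if unrelated" rule, not the patch itself.
final class LoaderRelation {
  private LoaderRelation() {}

  /** True if parent is the same loader as child or one of its ancestors. */
  static boolean isParentClassLoader(ClassLoader parent, ClassLoader child) {
    ClassLoader cl = child;
    while (cl != null) {
      if (cl == parent) {
        return true;
      }
      cl = cl.getParent();
    }
    return false; // also the result when child is null
  }

  /**
   * Whether the class's own loader needs a separate scan in addition to the
   * context loader: only if it is neither the context loader nor one of its parents.
   */
  static boolean needsExtraScan(ClassLoader clazzLoader, ClassLoader contextLoader) {
    return clazzLoader != null && !isParentClassLoader(clazzLoader, contextLoader);
  }
}
```

If the class's loader is already in the context loader's parent chain, a scan through the context loader can reach its resources by delegation, so a second scan would be redundant.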

SPI: Allow fallback to default ClassLoader if Thread#getContextClassLoader
fails

Attachments: LUCENE-4713.patch, LUCENE-4713.patch, LUCENE-4713.patch,
LuceneContextClassLoader.patch


[jira] [Comment Edited] (LUCENE-4713) SPI: Allow fallback to default ClassLoader if Thread#getContextClassLoader fails

[
https://issues.apache.org/jira/browse/LUCENE-4713?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13599891#comment-13599891
]

Uwe Schindler edited comment on LUCENE-4713 at 3/12/13 10:22 AM:
-

Can you try this, too? Unfortunately it's hard to write a good test case without
some fake classes in separate compilation units, which complicates the Lucene
build :-)

was (Author: thetaphi):
Hi Christian,
another patch, with some optimization. The clazz's classloader is only scanned
if it's not a parent or the same. If the Lucene clazz's classloader is a parent
of the context one, it does not need to be scanned.
This also works around the problems with hiding classes. To override the
Lucene core codecs, e.g. Tomcat's classloader (J2EE) will use parent-last
semantics, and in that case the precedence goes to the webapp.
Only if the Lucene classloader is not at all related to the context one is it
scanned.

SPI: Allow fallback to default ClassLoader if Thread#getContextClassLoader
fails

Attachments: LUCENE-4713.patch, LUCENE-4713.patch, LUCENE-4713.patch,
LuceneContextClassLoader.patch


[jira] [Commented] (LUCENE-4713) SPI: Allow fallback to default ClassLoader if Thread#getContextClassLoader fails


[ 
https://issues.apache.org/jira/browse/LUCENE-4713?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13599912#comment-13599912
 ] 

Uwe Schindler commented on LUCENE-4713:
---

bq. Regarding overriding Lucene's default codec implementations: We anyways 
have to place any other modified, non-SPI Lucene classes in the same 
ClassLoader, so I really appreciate that this patch enforces this.

Overriding default Lucene Codecs doesn't necessarily require using the same class 
name. Codecs are identified by their name as written into the index files 
(e.g., Lucene42). If you implement another subclass of Codec with the same 
name but a different class name, it is also taken into account. But in any case, 
the class file must be listed before the lucene-core.jar one in the classpath. (By the 
way, this is used in Lucene 4.x to allow a READ/WRITE variant of the Lucene3x codec 
for testing only: the test-framework.jar simply exposes another class, 
extending the original READONLY Lucene3x codec to support WRITE, but making it 
available also with the Lucene3x name to the loader.)
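
The point about names can be illustrated with a small lookup table keyed by the service's logical name rather than its class name. This is a sketch, not Lucene's loader: `NamedRegistry` and `Named` are hypothetical, and the "first one discovered wins" rule is an assumption based on the remark that the overriding class must come first in the classpath.

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical sketch of a name-keyed service lookup, not Lucene's loader.
final class NamedRegistry {
  private NamedRegistry() {}

  /** A service that carries a logical name independent of its class name. */
  interface Named {
    String name();
  }

  /** Builds a lookup table; for each name the first service discovered is kept. */
  static <S extends Named> Map<String, S> byName(List<S> inDiscoveryOrder) {
    Map<String, S> services = new LinkedHashMap<>();
    for (S service : inDiscoveryOrder) {
      services.putIfAbsent(service.name(), service); // later duplicates are ignored
    }
    return services;
  }
}
```

Under that assumption, two different classes that both report the name "Lucene3x" resolve to whichever one was discovered first.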

 SPI: Allow fallback to default ClassLoader if Thread#getContextClassLoader 
 fails
 

 Key: LUCENE-4713
 URL: https://issues.apache.org/jira/browse/LUCENE-4713
 Project: Lucene - Core
  Issue Type: Improvement
Affects Versions: 4.0, 4.1, 4.2
Reporter: Christian Kohlschütter
Priority: Minor
  Labels: ClassLoader, Thread
 Fix For: 4.3

 Attachments: LUCENE-4713.patch, LUCENE-4713.patch, LUCENE-4713.patch, 
 LuceneContextClassLoader.patch



[jira] [Updated] (SOLR-4412) LanguageIdentifier lcmap for language field


 [ 
https://issues.apache.org/jira/browse/SOLR-4412?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jan Høydahl updated SOLR-4412:
--

Attachment: SOLR-4412.patch

First patch (git diff format)

 LanguageIdentifier lcmap for language field
 ---

 Key: SOLR-4412
 URL: https://issues.apache.org/jira/browse/SOLR-4412
 Project: Solr
  Issue Type: Bug
  Components: contrib - LangId
Affects Versions: 4.1
Reporter: Jan Høydahl
 Fix For: 4.3

 Attachments: SOLR-4412.patch


 For some languages, the detector will detect sub-languages, such as 
 LangDetect detecting zh-tw or zh-cn for Chinese. Tika detector only detects 
 zh. Today you can use {{lcmap}} to map these two into one code, e.g. 
 {{langid.map.lcmap=zh-cn:zh zh-tw:zh}}. But the {{langField}} output is not 
 changed.
 We need an option for {{langField}} as well.
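
The mapping semantics of the example value above can be sketched as follows. This is not the LangId implementation: `LangCodeMap` and its methods are hypothetical, and the only thing taken from the issue is the `from:to` format with space-separated pairs.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical sketch of an lcmap-style mapping, not Solr code.
final class LangCodeMap {
  private LangCodeMap() {}

  /** Parses a spec such as "zh-cn:zh zh-tw:zh" into a lookup table. */
  static Map<String, String> parse(String spec) {
    Map<String, String> map = new HashMap<>();
    for (String pair : spec.trim().split("\\s+")) {
      String[] fromTo = pair.split(":", 2);
      if (fromTo.length == 2) {
        map.put(fromTo[0], fromTo[1]);
      }
    }
    return map;
  }

  /** Returns the mapped code, or the detected code unchanged if no mapping exists. */
  static String normalize(Map<String, String> map, String detected) {
    return map.getOrDefault(detected, detected);
  }
}
```

The issue asks for the same normalization to be applied to the value written to {{langField}}, so that a document detected as zh-tw would get zh there as well.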


[jira] [Assigned] (SOLR-4412) LanguageIdentifier lcmap for language field


 [ 
https://issues.apache.org/jira/browse/SOLR-4412?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jan Høydahl reassigned SOLR-4412:
-

Assignee: Jan Høydahl

 LanguageIdentifier lcmap for language field
 ---

 Key: SOLR-4412
 URL: https://issues.apache.org/jira/browse/SOLR-4412
 Project: Solr
  Issue Type: Bug
  Components: contrib - LangId
Affects Versions: 4.1
Reporter: Jan Høydahl
Assignee: Jan Høydahl
 Fix For: 4.3

 Attachments: SOLR-4412.patch



[jira] [Commented] (LUCENE-4713) SPI: Allow fallback to default ClassLoader if Thread#getContextClassLoader fails

[
https://issues.apache.org/jira/browse/LUCENE-4713?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13599920#comment-13599920
]

Christian Kohlschütter commented on LUCENE-4713:

Works for me, too. Those corner cases...

One thing that I stumbled upon was that Thread#getContextClassLoader may
actually return null.
We currently throw an IllegalArgumentException in this case, which can be
considered a bug in itself.

If we decide that the fix for this bug is to check for null and use the class's
default ClassLoader instead, we would actually call #reload twice (because
isParentClassLoader will return false if child==null).

See the attached patch for a proposed fix.
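
The null check discussed here might be sketched as below. The attached patch is not shown in this thread, so `LoaderFallback` and `effectiveLoader` are hypothetical names; the context loader is passed in as a parameter only to keep the example easy to test.

```java
// Hypothetical sketch of the null fallback, not the attached patch.
final class LoaderFallback {
  private LoaderFallback() {}

  /**
   * Picks the loader to use for SPI lookups: the context ClassLoader if one
   * is set, otherwise the loader that defined the given class.
   */
  static ClassLoader effectiveLoader(ClassLoader contextLoader, Class<?> clazz) {
    return contextLoader != null ? contextLoader : clazz.getClassLoader();
  }
}
```

A caller would pass `Thread.currentThread().getContextClassLoader()` as the first argument. Deciding the loader once up front is one way to avoid a second #reload call for the null case.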

SPI: Allow fallback to default ClassLoader if Thread#getContextClassLoader
fails

Attachments: LUCENE-4713.patch, LUCENE-4713.patch, LUCENE-4713.patch,
LUCENE-4713.patch, LuceneContextClassLoader.patch


[jira] [Assigned] (LUCENE-4713) SPI: Allow fallback to default ClassLoader if Thread#getContextClassLoader fails


 [ 
https://issues.apache.org/jira/browse/LUCENE-4713?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Uwe Schindler reassigned LUCENE-4713:
-

Assignee: Uwe Schindler

 SPI: Allow fallback to default ClassLoader if Thread#getContextClassLoader 
 fails
 

 Key: LUCENE-4713
 URL: https://issues.apache.org/jira/browse/LUCENE-4713
 Project: Lucene - Core
  Issue Type: Improvement
Affects Versions: 4.0, 4.1, 4.2
Reporter: Christian Kohlschütter
Assignee: Uwe Schindler
Priority: Minor
  Labels: ClassLoader, Thread
 Fix For: 4.3

 Attachments: LUCENE-4713.patch, LUCENE-4713.patch, LUCENE-4713.patch, 
 LUCENE-4713.patch, LuceneContextClassLoader.patch



[jira] [Updated] (LUCENE-4713) SPI: Allow fallback to default ClassLoader if Thread#getContextClassLoader fails


 [ 
https://issues.apache.org/jira/browse/LUCENE-4713?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Christian Kohlschütter updated LUCENE-4713:
---

Attachment: LUCENE-4713.patch


Re: initVM segmentation fault

2013-03-12 Thread Andi Vajda


On Mar 12, 2013, at 2:51, Jeune Asuncion je...@bright.com wrote:

 Dear *Pylucene*,
 
 I am trying to run pylucene on my Fedora box but I get a segmentation fault
 when I do so. I was able to trace the cause of this error to initVM().
 
 In the Python interpreter when I execute the lines of code below I get the
 segmentation fault:
 
 >>> import lucene
 >>> lucene.initVM()
 Segmentation fault
 
 I thought this was because jcc isn't installed because I have pylucene
 installed on another box and it returns a jcc object. However, I have jcc
 installed as well on the box where lucene.initVM() isn't working:
 
 >>> import jcc
 >>> jcc.initVM()
 <jcc.JCCEnv object at 0x7f7162e12138>
 
 Would like to get some pointers as to why this is happening.

Did you build PyLucene and JCC on this box ?

Andi..


 
 Thanks,
 
 Jeune

[jira] [Updated] (LUCENE-4713) SPI: Allow fallback to default ClassLoader if Thread#getContextClassLoader fails


 [ 
https://issues.apache.org/jira/browse/LUCENE-4713?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Christian Kohlschütter updated LUCENE-4713:
---

Attachment: LUCENE-4713.patch


[jira] [Updated] (LUCENE-4713) SPI: Allow fallback to default ClassLoader if Thread#getContextClassLoader fails


 [ 
https://issues.apache.org/jira/browse/LUCENE-4713?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Christian Kohlschütter updated LUCENE-4713:
---

Attachment: (was: LUCENE-4713.patch)


[jira] [Commented] (LUCENE-4713) SPI: Allow fallback to default ClassLoader if Thread#getContextClassLoader fails


[ 
https://issues.apache.org/jira/browse/LUCENE-4713?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13599927#comment-13599927
 ] 

Uwe Schindler commented on LUCENE-4713:
---

There is another problem: The abstract clazz's classloader may be null, too 
(although this never happens in recent JDKs): The bootstrap class loader may be 
null. But we don't have the problem here, as Lucene classes are never ever 
loaded through the boot class loader (but e.g. String.class.getClassLoader() 
may return null).

I don't like hooking into reload() as well; I will think of another, more elegant 
solution. But to mention: If the context class loader is null (which cannot 
happen unless you explicitly set it to null), Java's own classloading for SPIs 
would be broken, too (see the implementation of java.util.ServiceLoader).

 SPI: Allow fallback to default ClassLoader if Thread#getContextClassLoader 
 fails
 

 Key: LUCENE-4713
 URL: https://issues.apache.org/jira/browse/LUCENE-4713
 Project: Lucene - Core
  Issue Type: Improvement
Affects Versions: 4.0, 4.1, 4.2
Reporter: Christian Kohlschütter
Assignee: Uwe Schindler
Priority: Minor
  Labels: ClassLoader, Thread
 Fix For: 4.3

 Attachments: LUCENE-4713.patch, LUCENE-4713.patch, LUCENE-4713.patch, 
 LUCENE-4713.patch, LuceneContextClassLoader.patch


 NOTE: This issue has been renamed from:
 Replace calls to Thread#getContextClassLoader with the ClassLoader of the 
 current class
 because the revised patch provides a clean fallback path.
 I am not sure whether it is a design decision or if we can indeed consider 
 this a bug:
 In core and analysis-common some classes provide on-demand class loading 
 using SPI. In NamedSPILoader, SPIClassIterator, ClasspathResourceLoader and 
 AnalysisSPILoader there are constructors that use the Thread's context 
 ClassLoader by default whenever no particular other ClassLoader was specified.
 Unfortunately this does not work as expected when the Thread's ClassLoader 
 can't see the required classes that are instantiated downstream with the help 
 of Class.forName (e.g., Codecs, Analyzers, etc.).
 That's what happened to us here. We currently experiment with running Lucene 
 2.9 and 4.x in one JVM, both being separated by custom ClassLoaders, each 
 seeing only the corresponding Lucene version and the upstream classpath.
 While NamedSPILoader and company get successfully loaded by our custom 
 ClassLoader, their instantiation fails because our Thread's 
 Context-ClassLoader cannot find the additionally required classes.
 We could probably work-around this by using Thread#setContextClassLoader at 
 construction time (and quickly reverting back afterwards), but I have the 
 impression this might just hide the actual problem and cause further trouble 
 when lazy-loading classes later on, and potentially from another Thread.
 Removing the call to Thread#getContextClassLoader would also align with the 
 behavior of AttributeSource.DEFAULT_ATTRIBUTE_FACTORY, which in fact uses 
 Attribute#getClass().getClassLoader() instead.
 A simple patch is attached. All tests pass.
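
 For illustration only, the Thread#setContextClassLoader workaround mentioned 
 above could look roughly like the sketch below. ContextLoaderSwap and callWith 
 are hypothetical names, not part of Lucene or of the attached patch. The sketch 
 only shows that the previous loader is restored in a finally block; it does 
 nothing about the lazy-loading concern raised above.

```java
import java.util.function.Supplier;

/** Hypothetical helper, not part of Lucene: runs an action under a temporary context loader. */
final class ContextLoaderSwap {
  private ContextLoaderSwap() {}

  static <T> T callWith(ClassLoader loader, Supplier<T> action) {
    Thread thread = Thread.currentThread();
    ClassLoader previous = thread.getContextClassLoader();
    thread.setContextClassLoader(loader);
    try {
      // Anything constructed here sees 'loader' as the context class loader.
      return action.get();
    } finally {
      // Always revert, even if the action throws.
      thread.setContextClassLoader(previous);
    }
  }
}
```

 Classes that are loaded lazily after callWith returns, or from another Thread, 
 would still see the original context loader, which is the weakness described 
 above.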

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org

[jira] [Updated] (LUCENE-4713) SPI: Allow fallback to default ClassLoader if Thread#getContextClassLoader fails


 [ 
https://issues.apache.org/jira/browse/LUCENE-4713?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Christian Kohlschütter updated LUCENE-4713:
---

Attachment: LUCENE-4713.patch

This patch keeps #reload untouched.


 SPI: Allow fallback to default ClassLoader if Thread#getContextClassLoader 
 fails
 

 Key: LUCENE-4713
 URL: https://issues.apache.org/jira/browse/LUCENE-4713
 Project: Lucene - Core
  Issue Type: Improvement
Affects Versions: 4.0, 4.1, 4.2
Reporter: Christian Kohlschütter
Assignee: Uwe Schindler
Priority: Minor
  Labels: ClassLoader, Thread
 Fix For: 4.3

 Attachments: LUCENE-4713.patch, LUCENE-4713.patch, LUCENE-4713.patch, 
 LUCENE-4713.patch, LUCENE-4713.patch, LuceneContextClassLoader.patch



[jira] [Updated] (LUCENE-4713) SPI: Allow fallback to default ClassLoader if Thread#getContextClassLoader fails

[
https://issues.apache.org/jira/browse/LUCENE-4713?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]

Uwe Schindler updated LUCENE-4713:
--

Attachment: LUCENE-4713.patch

Here is the patch that mimics what the original java.util.ServiceLoader does:
If the classloader (e.g. the context classloader) is null, it uses the system
classloader. The exception on null classloader was removed.

The patch then also adds some null checks, so the fallback case is only used
if both possible loaders are != null.
If all class loaders are null, the system loader is used, which should never
happen, as Lucene is not part of rt.jar.
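
As a rough sketch of the fallback rules described above (not the code of the
attached patch): LoaderFallback and its method names are hypothetical, and the
"not already reachable through the parent chain" test is an assumption about
when a second scan would be worthwhile.

```java
/** Hypothetical helper; names are illustrative and not taken from the patch. */
final class LoaderFallback {
  private LoaderFallback() {}

  /** Like java.util.ServiceLoader: a null loader means the system class loader. */
  static ClassLoader orSystem(ClassLoader loader) {
    return (loader != null) ? loader : ClassLoader.getSystemClassLoader();
  }

  /** True if 'ancestor' is 'loader' itself or appears in its parent chain. */
  static boolean isSameOrAncestor(ClassLoader ancestor, ClassLoader loader) {
    for (ClassLoader cl = loader; cl != null; cl = cl.getParent()) {
      if (cl == ancestor) {
        return true;
      }
    }
    return false;
  }

  /** Fallback only when both loaders are non-null and the class's loader is not already covered. */
  static boolean needsFallbackScan(ClassLoader clazzLoader, ClassLoader contextLoader) {
    return clazzLoader != null
        && contextLoader != null
        && !isSameOrAncestor(clazzLoader, contextLoader);
  }
}
```

With such a check, the loader of the abstract class would only be scanned in
addition when the context loader cannot already see it.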

I think this is ready. Unfortunately we had some overlap, Christian :-)

SPI: Allow fallback to default ClassLoader if Thread#getContextClassLoader
fails

Key: LUCENE-4713
URL: https://issues.apache.org/jira/browse/LUCENE-4713
Project: Lucene - Core
Issue Type: Improvement
Affects Versions: 4.0, 4.1, 4.2
Reporter: Christian Kohlschütter
Assignee: Uwe Schindler
Priority: Minor
Labels: ClassLoader, Thread
Fix For: 4.3

Attachments: LUCENE-4713.patch, LUCENE-4713.patch, LUCENE-4713.patch,
LUCENE-4713.patch, LUCENE-4713.patch, LUCENE-4713.patch,
LuceneContextClassLoader.patch


[jira] [Comment Edited] (LUCENE-4713) SPI: Allow fallback to default ClassLoader if Thread#getContextClassLoader fails


[ 
https://issues.apache.org/jira/browse/LUCENE-4713?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13599927#comment-13599927
 ] 

Uwe Schindler edited comment on LUCENE-4713 at 3/12/13 11:44 AM:
-

There is another problem: the class loader of the abstract clazz may be null, 
too (although this never happens in recent JDKs), because the bootstrap class 
loader may be represented as null. But we don't have that problem here, as 
Lucene classes are never loaded through the boot class loader (whereas e.g. 
String.class.getClassLoader() may return null).
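
A tiny sketch of the null case, assuming a JVM that represents the bootstrap 
loader as null (the Javadoc of Class#getClassLoader allows this, it does not 
require it). BootstrapLoaderCheck is a hypothetical name:

```java
/** Hypothetical helper: detects classes whose loader is reported as null. */
final class BootstrapLoaderCheck {
  private BootstrapLoaderCheck() {}

  /** True if the class reports a null loader, i.e. it came from the bootstrap loader. */
  static boolean hasNullLoader(Class<?> clazz) {
    return clazz.getClassLoader() == null;
  }
}
```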

(I don't like hooking into reload() as well; I will think of another, more elegant 
solution.) -But to mention: If the context class loader is null (which cannot 
happen unless you explicitly set it to null), Java's own classloading for SPIs 
would be broken, too (see the implementation of java.util.ServiceLoader).- 
(EDIT: Java's ServiceLoader uses the system class loader if the context loader is null.)

  was (Author: thetaphi):
There is another problem: The abstract clazz' classloader may be null, too 
(although this never happens in recent JDKs): The bootstrap class loader may be 
null. But we don't have the problem here, as Lucene classes are never ever 
loaded through the boot class loader (but e.g. String.class.getClassLoader() 
may return null).

(I don't like hooking into reload() as well; I will think of another, more elegant 
solution.) But to mention: If the context class loader is null (which cannot 
happen unless you explicitly set it to null), Java's own classloading for SPIs 
would be broken, too (see the implementation of java.util.ServiceLoader).
  
 SPI: Allow fallback to default ClassLoader if Thread#getContextClassLoader 
 fails
 

 Key: LUCENE-4713
 URL: https://issues.apache.org/jira/browse/LUCENE-4713
 Project: Lucene - Core
  Issue Type: Improvement
Affects Versions: 4.0, 4.1, 4.2
Reporter: Christian Kohlschütter
Assignee: Uwe Schindler
Priority: Minor
  Labels: ClassLoader, Thread
 Fix For: 4.3

 Attachments: LUCENE-4713.patch, LUCENE-4713.patch, LUCENE-4713.patch, 
 LUCENE-4713.patch, LUCENE-4713.patch, LUCENE-4713.patch, 
 LuceneContextClassLoader.patch



[jira] [Commented] (LUCENE-4795) Add FacetsCollector based on SortedSetDocValues


[ 
https://issues.apache.org/jira/browse/LUCENE-4795?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13599945#comment-13599945
 ] 

Michael McCandless commented on LUCENE-4795:


bq. Mike, why do you need to initialize a FacetRequest like so: 
requests.add(new CountFacetRequest(new CategoryPath(a, sep), 10));?

Woops, that's just silly: I'll remove the sep there.

bq. Mike, should you also check in SortedSetDocValuesAccumulator that 
FR.getDepth() == 1? I don't think that you support counting up to depth N, 
right?

Right, it only supports flat (dim / label) today ... ok, I'll add that
check.


 Add FacetsCollector based on SortedSetDocValues
 ---

 Key: LUCENE-4795
 URL: https://issues.apache.org/jira/browse/LUCENE-4795
 Project: Lucene - Core
  Issue Type: Improvement
  Components: modules/facet
Reporter: Michael McCandless
Assignee: Michael McCandless
 Attachments: LUCENE-4795.patch, LUCENE-4795.patch, LUCENE-4795.patch, 
 LUCENE-4795.patch, pleaseBenchmarkMe.patch


 Recently (LUCENE-4765) we added multi-valued DocValues field
 (SortedSetDocValuesField), and this can be used for faceting in Solr
 (SOLR-4490).  I think we should also add support in the facet module?
 It'd be an option with different tradeoffs.  Eg, it wouldn't require
 the taxonomy index, since the main index handles label/ord resolving.
 There are at least two possible approaches:
   * On every reopen, build the seg -> global ord map, and then on
 every collect, get the seg ord, map it to the global ord space,
 and increment counts.  This adds cost during reopen in proportion
 to number of unique terms ...
   * On every collect, increment counts based on the seg ords, and then
 do a merge in the end just like distributed faceting does.
 The first approach is much easier so I built a quick prototype using
 that.  The prototype does the counting, but it does NOT do the top K
 facets gathering in the end, and it doesn't know parent/child ord
 relationships, so there's tons more to do before this is real.  I also
 was unsure how to properly integrate it since the existing classes
 seem to expect that you use a taxonomy index to resolve ords.
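
 The first approach can be sketched in plain Java without any Lucene classes. 
 This is not the prototype from the patch: GlobalOrdCounter, its fields and the 
 per-segment String[] term lists are assumptions made for illustration only.

```java
import java.util.Arrays;
import java.util.List;
import java.util.TreeSet;

/** Hypothetical sketch: maps per-segment ords to a global ord space and counts there. */
final class GlobalOrdCounter {
  final String[] globalTerms; // sorted union of all segment terms
  final int[][] segToGlobal;  // segToGlobal[seg][segOrd] == global ord

  /** Built once per reopen; the cost grows with the number of unique terms. */
  GlobalOrdCounter(List<String[]> segmentTerms) {
    TreeSet<String> union = new TreeSet<>();
    for (String[] terms : segmentTerms) {
      union.addAll(Arrays.asList(terms));
    }
    globalTerms = union.toArray(new String[0]);
    segToGlobal = new int[segmentTerms.size()][];
    for (int seg = 0; seg < segmentTerms.size(); seg++) {
      String[] terms = segmentTerms.get(seg);
      segToGlobal[seg] = new int[terms.length];
      for (int ord = 0; ord < terms.length; ord++) {
        // Every segment term is in the union, so the search always succeeds.
        segToGlobal[seg][ord] = Arrays.binarySearch(globalTerms, terms[ord]);
      }
    }
  }

  /** Called per hit: map each seg ord of the document to its global ord and count it. */
  void collect(int seg, int[] docSegOrds, int[] counts) {
    for (int segOrd : docSegOrds) {
      counts[segToGlobal[seg][segOrd]]++;
    }
  }
}
```

 The second approach would instead keep one counts array per segment and merge 
 them by label at the end.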
 I ran a quick performance test.  base = trunk except I disabled the
 compute top-K in FacetsAccumulator to make the comparison fair; comp
 = using the prototype collector in the patch:
 {noformat}
 Task              QPS base  StdDev  QPS comp  StdDev   Pct diff
 OrHighLow            18.79  (2.5%)     14.36  (3.3%)    -23.6% (-28% - -18%)
 HighTerm             21.58  (2.4%)     16.53  (3.7%)    -23.4% (-28% - -17%)
 OrHighMed            18.20  (2.5%)     13.99  (3.3%)    -23.2% (-28% - -17%)
 Prefix3              14.37  (1.5%)     11.62  (3.5%)    -19.1% (-23% - -14%)
 LowTerm             130.80  (1.6%)    106.95  (2.4%)    -18.2% (-21% - -14%)
 OrHighHigh            9.60  (2.6%)      7.88  (3.5%)    -17.9% (-23% - -12%)
 AndHighHigh          24.61  (0.7%)     20.74  (1.9%)    -15.7% (-18% - -13%)
 Fuzzy1               49.40  (2.5%)     43.48  (1.9%)    -12.0% (-15% - -7%)
 MedSloppyPhrase      27.06  (1.6%)     23.95  (2.3%)    -11.5% (-15% - -7%)
 MedTerm              51.43  (2.0%)     46.21  (2.7%)    -10.2% (-14% - -5%)
 IntNRQ                4.02  (1.6%)      3.63  (4.0%)     -9.7% (-15% - -4%)
 Wildcard             29.14  (1.5%)     26.46  (2.5%)     -9.2% (-13% - -5%)
 HighSloppyPhrase      0.92  (4.5%)      0.87  (5.8%)     -5.4% (-15% - 5%)
 MedSpanNear          29.51  (2.5%)     27.94  (2.2%)     -5.3% (-9% - 0%)
 HighSpanNear          3.55  (2.4%)      3.38  (2.0%)     -4.9% (-9% - 0%)
 AndHighMed          108.34  (0.9%)    104.55  (1.1%)     -3.5% (-5% - -1%)
 LowSloppyPhrase      20.50  (2.0%)     20.09  (4.2%)     -2.0% (-8% - 4%)
 LowPhrase            21.60  (6.0%)     21.26  (5.1%)     -1.6% (-11% - 10%)
 Fuzzy2               53.16  (3.9%)     52.40  (2.7%)     -1.4% (-7% - 5%)
 LowSpanNear           8.42  (3.2%)      8.45  (3.0%)      0.3% (-5% - 6%)
 Respell              45.17  (4.3%)     45.38  (4.4%)      0.5% (-7% - 9%)
 MedPhrase           113.93  (5.8%)    115.02  (4.9%)      1.0% (-9% - 12%)
 AndHighLow          596.42  (2.5%)    617.12  (2.8%)      3.5% (-1% - 8%)

[jira] [Updated] (LUCENE-4713) SPI: Allow fallback to default ClassLoader if Thread#getContextClassLoader fails


 [ 
https://issues.apache.org/jira/browse/LUCENE-4713?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Uwe Schindler updated LUCENE-4713:
--

Attachment: LUCENE-4713.patch

Sorry, again a new patch.

Now the case where the context class loader is null is handled correctly.

 SPI: Allow fallback to default ClassLoader if Thread#getContextClassLoader 
 fails
 

 Key: LUCENE-4713
 URL: https://issues.apache.org/jira/browse/LUCENE-4713
 Project: Lucene - Core
  Issue Type: Improvement
Affects Versions: 4.0, 4.1, 4.2
Reporter: Christian Kohlschütter
Assignee: Uwe Schindler
Priority: Minor
  Labels: ClassLoader, Thread
 Fix For: 4.3

 Attachments: LUCENE-4713.patch, LUCENE-4713.patch, LUCENE-4713.patch, 
 LUCENE-4713.patch, LUCENE-4713.patch, LUCENE-4713.patch, LUCENE-4713.patch, 
 LuceneContextClassLoader.patch



Re: Improving DirectSpellChecker

2013-03-12 Thread Robert Muir

On Tue, Mar 12, 2013 at 7:22 AM, Varun Thacker
varunthacker1...@gmail.com wrote:
 I was looking at the results from the spellchecker. So If I have a field
 where the terms get analyzed the results shown are the analyzed form as a
 suggestion. Example, for Battery the spell suggestion if one makes a mistake
 would be batteri.


I don't think you should use such a field for spellchecking, instead
just something very simple like standardtokenizer + lowercase for the
spellcheck field.


[jira] [Commented] (LUCENE-4713) SPI: Allow fallback to default ClassLoader if Thread#getContextClassLoader fails


[ 
https://issues.apache.org/jira/browse/LUCENE-4713?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13599965#comment-13599965
 ] 

Uwe Schindler commented on LUCENE-4713:
---

Just for reference: see line 336+ of 
http://www.docjar.com/html/api/java/util/ServiceLoader.java.html

 SPI: Allow fallback to default ClassLoader if Thread#getContextClassLoader 
 fails
 

 Key: LUCENE-4713
 URL: https://issues.apache.org/jira/browse/LUCENE-4713
 Project: Lucene - Core
  Issue Type: Improvement
Affects Versions: 4.0, 4.1, 4.2
Reporter: Christian Kohlschütter
Assignee: Uwe Schindler
Priority: Minor
  Labels: ClassLoader, Thread
 Fix For: 4.3

 Attachments: LUCENE-4713.patch, LUCENE-4713.patch, LUCENE-4713.patch, 
 LUCENE-4713.patch, LUCENE-4713.patch, LUCENE-4713.patch, LUCENE-4713.patch, 
 LuceneContextClassLoader.patch



[jira] [Created] (LUCENE-4823) Add a separate registration singleton for Lucene's SPI, so there is only one central instance to request rescanning of classpath (e.g. from Solr's ResourceLoader)

Uwe Schindler created LUCENE-4823:
-

 Summary: Add a separate registration singleton for Lucene's SPI, 
so there is only one central instance to request rescanning of classpath (e.g. 
from Solr's ResourceLoader)
 Key: LUCENE-4823
 URL: https://issues.apache.org/jira/browse/LUCENE-4823
 Project: Lucene - Core
  Issue Type: Bug
  Components: core/other
Reporter: Uwe Schindler
Assignee: Uwe Schindler
 Fix For: 5.0, 4.3


Currently there is no easy way to do a global rescan/reload of all of 
Lucene's SPIs in the right order. In Solr there is a long list of reload 
instructions in the ResourceLoader. If somebody adds a new SPI type, it has 
to be added there, too.

It would be good to have a central instance in oal.util that keeps track of all 
NamedSPILoaders and AnalysisSPILoaders (in the order they were instantiated), 
so you have one central entry point to trigger a reload.

This issue will introduce:
- A singleton that makes reloading possible. The singleton keeps weak refs to 
all loaders (of any kind) in the order they were created.
- A reloadable interface that both NamedSPILoader and AnalysisSPILoader should 
implement (unfortunately we need both classes, as they differ in their 
internals: one keeps classes, the other one instances).
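
A minimal sketch of what such a registry could look like. Everything here is 
an assumption for discussion: the names Reloadable and SpiReloadRegistry, the 
reload(ClassLoader) signature and the synchronization strategy are not taken 
from any patch.

```java
import java.lang.ref.WeakReference;
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;

/** Hypothetical interface that both kinds of SPI loaders would implement. */
interface Reloadable {
  void reload(ClassLoader loader);
}

/** Hypothetical singleton keeping weak refs to all loaders in creation order. */
final class SpiReloadRegistry {
  static final SpiReloadRegistry INSTANCE = new SpiReloadRegistry();

  private final List<WeakReference<Reloadable>> loaders = new ArrayList<>();

  private SpiReloadRegistry() {}

  /** Loaders register themselves on construction; the list preserves that order. */
  synchronized void register(Reloadable loader) {
    loaders.add(new WeakReference<>(loader));
  }

  /** Reloads all live loaders in registration order and returns how many were reloaded. */
  synchronized int reloadAll(ClassLoader classLoader) {
    int reloaded = 0;
    for (Iterator<WeakReference<Reloadable>> it = loaders.iterator(); it.hasNext();) {
      Reloadable loader = it.next().get();
      if (loader == null) {
        it.remove(); // the loader was garbage collected, drop the stale ref
      } else {
        loader.reload(classLoader);
        reloaded++;
      }
    }
    return reloaded;
  }
}
```

Solr's ResourceLoader could then call a single reloadAll(...) instead of 
maintaining its own list of reload instructions.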


[jira] [Commented] (LUCENE-4713) SPI: Allow fallback to default ClassLoader if Thread#getContextClassLoader fails


[ 
https://issues.apache.org/jira/browse/LUCENE-4713?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13599968#comment-13599968
 ] 

Christian Kohlschütter commented on LUCENE-4713:


Overlap and coverage, Uwe :)

One thing is still unclear to me. Given loader is null, in SPIClassIterator 
line 143 we call Class.forName with a null ClassLoader.

However (at least in the Oracle 1.7 JDK) 
Class#forName(String,boolean,ClassLoader) does not use 
ClassLoader#getSystemClassLoader but ClassLoader#getCallerClassLoader instead 
(which IMHO contradicts the JavaDocs description, where they claim to use the 
bootstrap classloader...)

Given that it is very unlikely that we're running into any problems with 
bootstrap resources, I would actually just check for loader==null in 
SPIClassIterator and assign loader=ClassLoader.getSystemClassLoader() in this 
case. This will use the System ClassLoader by default and only falls back to 
getCallerClassLoader if there is no System ClassLoader.

 SPI: Allow fallback to default ClassLoader if Thread#getContextClassLoader 
 fails
 

 Key: LUCENE-4713
 URL: https://issues.apache.org/jira/browse/LUCENE-4713
 Project: Lucene - Core
  Issue Type: Improvement
Affects Versions: 4.0, 4.1, 4.2
Reporter: Christian Kohlschütter
Assignee: Uwe Schindler
Priority: Minor
  Labels: ClassLoader, Thread
 Fix For: 4.3

 Attachments: LUCENE-4713.patch, LUCENE-4713.patch, LUCENE-4713.patch, 
 LUCENE-4713.patch, LUCENE-4713.patch, LUCENE-4713.patch, 
 LuceneContextClassLoader.patch



[jira] [Updated] (LUCENE-4713) SPI: Allow fallback to default ClassLoader if Thread#getContextClassLoader fails


 [ 
https://issues.apache.org/jira/browse/LUCENE-4713?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Uwe Schindler updated LUCENE-4713:
--

Attachment: (was: LUCENE-4713.patch)

 SPI: Allow fallback to default ClassLoader if Thread#getContextClassLoader 
 fails
 

 Key: LUCENE-4713
 URL: https://issues.apache.org/jira/browse/LUCENE-4713
 Project: Lucene - Core
  Issue Type: Improvement
Affects Versions: 4.0, 4.1, 4.2
Reporter: Christian Kohlschütter
Assignee: Uwe Schindler
Priority: Minor
  Labels: ClassLoader, Thread
 Fix For: 4.3

 Attachments: LUCENE-4713.patch, LUCENE-4713.patch, LUCENE-4713.patch, 
 LUCENE-4713.patch, LUCENE-4713.patch, LUCENE-4713.patch, 
 LuceneContextClassLoader.patch



[jira] [Comment Edited] (LUCENE-4713) SPI: Allow fallback to default ClassLoader if Thread#getContextClassLoader fails


[ 
https://issues.apache.org/jira/browse/LUCENE-4713?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13599968#comment-13599968
 ] 

Christian Kohlschütter edited comment on LUCENE-4713 at 3/12/13 12:13 PM:
--

Overlap and coverage, Uwe :)


  was (Author: c...@newsclub.de):
Overlap and coverage, Uwe :)

One thing is still unclear to me. Given loader is null, in SPIClassIterator 
line 143 we call Class.forName with a null ClassLoader.

However (at least in the Oracle 1.7 JDK) 
Class#forName(String,boolean,ClassLoader) does not use 
ClassLoader#getSystemClassLoader but ClassLoader#getCallerClassLoader instead 
(which IMHO contradicts the JavaDocs description, where they claim to use the 
bootstrap classloader...)

Given that it is very unlikely that we're running into any problems with 
bootstrap resources, I would actually just check for loader==null in 
SPIClassIterator and assign loader=ClassLoader.getSystemClassLoader() in this 
case. This will use the System ClassLoader by default and only falls back to 
getCallerClassLoader if there is no System ClassLoader.
  
 SPI: Allow fallback to default ClassLoader if Thread#getContextClassLoader 
 fails
 

 Key: LUCENE-4713
 URL: https://issues.apache.org/jira/browse/LUCENE-4713
 Project: Lucene - Core
  Issue Type: Improvement
Affects Versions: 4.0, 4.1, 4.2
Reporter: Christian Kohlschütter
Assignee: Uwe Schindler
Priority: Minor
  Labels: ClassLoader, Thread
 Fix For: 4.3

 Attachments: LUCENE-4713.patch, LUCENE-4713.patch, LUCENE-4713.patch, 
 LUCENE-4713.patch, LUCENE-4713.patch, LUCENE-4713.patch, 
 LuceneContextClassLoader.patch


 NOTE: This issue has been renamed from:
 Replace calls to Thread#getContextClassLoader with the ClassLoader of the 
 current class
 because the revised patch provides a clean fallback path.
 I am not sure whether it is a design decision or if we can indeed consider 
 this a bug:
 In core and analysis-common some classes provide on-demand class loading 
 using SPI. In NamedSPILoader, SPIClassIterator, ClasspathResourceLoader and 
 AnalysisSPILoader there are constructors that use the Thread's context 
 ClassLoader by default whenever no particular other ClassLoader was specified.
 Unfortunately this does not work as expected when the Thread's ClassLoader 
 can't see the required classes that are instantiated downstream with the help 
 of Class.forName (e.g., Codecs, Analyzers, etc.).
 That's what happened to us here. We are currently experimenting with running 
 Lucene 2.9 and 4.x in one JVM, both being separated by custom ClassLoaders, each 
 seeing only the corresponding Lucene version and the upstream classpath.
 While NamedSPILoader and company get successfully loaded by our custom 
 ClassLoader, their instantiation fails because our Thread's 
 context ClassLoader cannot find the additionally required classes.
 We could probably work around this by using Thread#setContextClassLoader at 
 construction time (and quickly reverting back afterwards), but I have the 
 impression this might just hide the actual problem and cause further trouble 
 when lazy-loading classes later on, and potentially from another Thread.
 Removing the call to Thread#getContextClassLoader would also align with the 
 behavior of AttributeSource.DEFAULT_ATTRIBUTE_FACTORY, which in fact uses 
 Attribute#getClass().getClassLoader() instead.
 A simple patch is attached. All tests pass.
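
For reference, the Thread#setContextClassLoader workaround mentioned in the description could look roughly like the sketch below. This is not the attached patch: the class and method names are made up for illustration, and it assumes Java 8+ for Supplier. Only Thread#getContextClassLoader and Thread#setContextClassLoader are real JDK calls.

```java
import java.util.function.Supplier;

/** Hypothetical helper for illustration; not part of Lucene. */
final class ContextClassLoaderSwap {
  private ContextClassLoaderSwap() {}

  /** Runs the action with the given context ClassLoader, then restores the previous one. */
  static <T> T callWith(ClassLoader loader, Supplier<T> action) {
    Thread current = Thread.currentThread();
    ClassLoader previous = current.getContextClassLoader();
    current.setContextClassLoader(loader);
    try {
      return action.get();
    } finally {
      // Always revert, even if the action throws.
      current.setContextClassLoader(previous);
    }
  }
}
```

As the description notes, this only covers classes loaded while the swap is active; anything loaded lazily later, or from another thread, would still see the original context ClassLoader.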

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org

[jira] [Comment Edited] (LUCENE-4713) SPI: Allow fallback to default ClassLoader if Thread#getContextClassLoader fails


[ 
https://issues.apache.org/jira/browse/LUCENE-4713?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13599968#comment-13599968
 ] 

Christian Kohlschütter edited comment on LUCENE-4713 at 3/12/13 12:13 PM:
--

Overlap and coverage, Uwe :)
Looks good to me!


  was (Author: c...@newsclub.de):
Overlap and coverage, Uwe :)

  

[jira] [Updated] (SOLR-4563) RSS DIH-example not working


 [ 
https://issues.apache.org/jira/browse/SOLR-4563?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jan Høydahl updated SOLR-4563:
--

Attachment: SOLR-4563.patch

Simple patch

 RSS DIH-example not working
 ---

 Key: SOLR-4563
 URL: https://issues.apache.org/jira/browse/SOLR-4563
 Project: Solr
  Issue Type: Bug
Affects Versions: 4.2
Reporter: Jan Høydahl
 Fix For: 4.3, 5.0

 Attachments: SOLR-4563.patch


 The XPath paths of /rss/item do not match the real-world RSS feed, which uses 
 /rss/channel/item.
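
To illustrate the mismatch, here is a small sketch that evaluates both paths against a tiny hand-written feed. It uses the JDK's javax.xml.xpath API rather than DIH's own XPath handling, and the class name and sample feed are assumptions made for this example.

```java
import java.io.StringReader;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathExpressionException;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

/** Hypothetical checker for illustration; not part of the DIH example. */
final class RssXPathCheck {
  // Tiny hand-written feed; not the feed used by the DIH example.
  static final String SAMPLE =
      "<rss version=\"2.0\"><channel><title>t</title>"
      + "<item><title>a</title></item><item><title>b</title></item>"
      + "</channel></rss>";

  private RssXPathCheck() {}

  /** Counts the nodes that the given XPath expression selects in the XML text. */
  static int count(String expression, String xml) {
    XPath xpath = XPathFactory.newInstance().newXPath();
    try {
      NodeList nodes = (NodeList) xpath.evaluate(
          expression, new InputSource(new StringReader(xml)), XPathConstants.NODESET);
      return nodes.getLength();
    } catch (XPathExpressionException e) {
      throw new IllegalArgumentException("Bad XPath: " + expression, e);
    }
  }
}
```

With this sample, /rss/item selects nothing because the item elements are nested inside channel, while /rss/channel/item selects both items.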


[jira] [Updated] (LUCENE-4823) Add a separate registration singleton for Lucene's SPI, so there is only one central instance to request rescanning of classpath (e.g. from Solr's ResourceLoader)

[
https://issues.apache.org/jira/browse/LUCENE-4823?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]

Uwe Schindler updated LUCENE-4823:
--

Description:
Currently there is no easy way to do a global rescan/reload of all of
Lucene's SPIs in the right order. In Solr there is a long list of reload
instructions in the ResourceLoader. If somebody adds a new SPI type, you have
to add it there.

It would be good to have a central instance in oal.util that keeps track of all
NamedSPILoaders and AnalysisSPILoaders (in the order they were instantiated),
so you have one central entry point to trigger a reload.

This issue will introduce:
- A singleton that makes reloading possible. The singleton keeps weak refs to
all loaders (of any kind) in the order they were created.
- NamedSPILoader and AnalysisSPILoader cleanup (unfortunately we need both
loaders, as they differ in their internals: one keeps classes, the other one
instances). Both should implement a reloadable interface.

was:
Currently there is no easy way to do a global rescan/reload of all of
Lucene's SPIs in the right order. In solr there is a long list of reload
instructions in the ResourceLoader. If somebody adds a new SPI type, you have
to add it there.

This issue will introduce:
- A singleton that makes reloading possible. The singleton keeps weak refs to
all loaders (of any kind) in the order they were created.
- NamedSPILoader and AnalysisSPILoader (unfortunately we need both instances,
as they differ in the internals (one keeps classes, the other one instances).
Both should implement a reloadable interface.

Add a separate registration singleton for Lucene's SPI, so there is only
one central instance to request rescanning of classpath (e.g. from Solr's
ResourceLoader)

Key: LUCENE-4823
URL: https://issues.apache.org/jira/browse/LUCENE-4823
Project: Lucene - Core
Issue Type: Bug
Components: core/other
Reporter: Uwe Schindler
Assignee: Uwe Schindler
Fix For: 5.0, 4.3

Currently there is no easy way to do a global rescan/reload of all of
Lucene's SPIs in the right order. In Solr there is a long list of reload
instructions in the ResourceLoader. If somebody adds a new SPI type, you have
to add it there.
It would be good to have a central instance in oal.util that keeps track of
all NamedSPILoaders and AnalysisSPILoaders (in the order they were
instantiated), so you have one central entry point to trigger a reload.
This issue will introduce:
- A singleton that makes reloading possible. The singleton keeps weak refs to
all loaders (of any kind) in the order they were created.
- NamedSPILoader and AnalysisSPILoader cleanup (unfortunately we need both
loaders, as they differ in their internals: one keeps classes, the other one
instances). Both should implement a reloadable interface.
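
A minimal sketch of what such a registration singleton might look like is given below. All names here (ReloadableSpi, SpiReloadRegistry, reloadAll) are hypothetical and chosen for illustration; the issue does not specify the actual API.

```java
import java.lang.ref.WeakReference;
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;

/** Hypothetical interface; the issue only says both loaders should be reloadable. */
interface ReloadableSpi {
  void reload(ClassLoader classLoader);
}

/** Hypothetical central registry; not Lucene's actual API. */
final class SpiReloadRegistry {
  private static final SpiReloadRegistry INSTANCE = new SpiReloadRegistry();

  // List order is creation order; weak refs let unused loaders be collected.
  private final List<WeakReference<ReloadableSpi>> loaders = new ArrayList<>();

  private SpiReloadRegistry() {}

  static SpiReloadRegistry getInstance() {
    return INSTANCE;
  }

  synchronized void register(ReloadableSpi loader) {
    loaders.add(new WeakReference<>(loader));
  }

  /** Reloads all live loaders in registration order; returns how many were reloaded. */
  synchronized int reloadAll(ClassLoader classLoader) {
    int reloaded = 0;
    for (Iterator<WeakReference<ReloadableSpi>> it = loaders.iterator(); it.hasNext();) {
      ReloadableSpi loader = it.next().get();
      if (loader == null) {
        it.remove(); // referent was garbage collected
      } else {
        loader.reload(classLoader);
        reloaded++;
      }
    }
    return reloaded;
  }
}
```

With something like this, a caller such as Solr's ResourceLoader would need only one reloadAll call instead of a hand-maintained list of reload instructions.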


[jira] [Updated] (LUCENE-4642) Add create(AttributeFactory) to TokenizerFactory and subclasses with ctors taking AttributeFactory, and remove Tokenizer's and subclasses' ctors taking AttributeSource


 [ 
https://issues.apache.org/jira/browse/LUCENE-4642?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Steve Rowe updated LUCENE-4642:
---

Summary: Add create(AttributeFactory) to TokenizerFactory and subclasses 
with ctors taking AttributeFactory, and remove Tokenizer's and subclasses' 
ctors taking AttributeSource  (was: TokenizerFactory should provide a create 
method with a given AttributeSource)

 Add create(AttributeFactory) to TokenizerFactory and subclasses with ctors 
 taking AttributeFactory, and remove Tokenizer's and subclasses' ctors taking 
 AttributeSource
 ---

 Key: LUCENE-4642
 URL: https://issues.apache.org/jira/browse/LUCENE-4642
 Project: Lucene - Core
  Issue Type: Improvement
  Components: modules/analysis
Affects Versions: 4.1
Reporter: Renaud Delbru
Assignee: Steve Rowe
  Labels: analysis, attribute, tokenizer
 Fix For: 4.3

 Attachments: LUCENE-4642.patch, LUCENE-4642.patch, LUCENE-4642.patch, 
 TrieTokenizerFactory.java.patch


 All tokenizer implementations have a constructor that takes a given 
 AttributeSource as parameter (LUCENE-1826). However, the TokenizerFactory 
 does not provide an API to create tokenizers with a given AttributeSource.
 Side note: There are still a lot of tokenizers that do not provide 
 constructors that take AttributeSource and AttributeFactory.


[jira] [Resolved] (SOLR-2595) Split and migrate indexes

2013-03-12 Thread Shalin Shekhar Mangar (JIRA)


 [ 
https://issues.apache.org/jira/browse/SOLR-2595?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Shalin Shekhar Mangar resolved SOLR-2595.
-

Resolution: Duplicate

 Split and migrate indexes
 -

 Key: SOLR-2595
 URL: https://issues.apache.org/jira/browse/SOLR-2595
 Project: Solr
  Issue Type: New Feature
  Components: multicore, replication (java), SolrCloud
Reporter: Shalin Shekhar Mangar
 Fix For: 4.3


 When a shard's index grows too large or a shard becomes too loaded, it 
 should be possible to split parts of a shard's index and migrate/merge to a 
 less loaded node.


[jira] [Resolved] (SOLR-2593) A new core admin action 'split' for splitting index

2013-03-12 Thread Shalin Shekhar Mangar (JIRA)


 [ 
https://issues.apache.org/jira/browse/SOLR-2593?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Shalin Shekhar Mangar resolved SOLR-2593.
-

Resolution: Duplicate

Committed as part of SOLR-3755 changes.

 A new core admin action 'split' for splitting index
 ---

 Key: SOLR-2593
 URL: https://issues.apache.org/jira/browse/SOLR-2593
 Project: Solr
  Issue Type: New Feature
Reporter: Noble Paul
 Fix For: 4.3


 If an index is too large/hot, it would be desirable to split it out to another 
 core.
 This core may eventually be replicated out to another host.
 There can be multiple strategies:
 * random split of x or x% 
 * fq=user:johndoe
 Example:
 action=split&split=20percent&newcore=my_new_index
 or
 action=split&fq=user:johndoe&newcore=john_doe_index


[jira] [Comment Edited] (LUCENE-4713) SPI: Allow fallback to default ClassLoader if Thread#getContextClassLoader fails


[ 
https://issues.apache.org/jira/browse/LUCENE-4713?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13599968#comment-13599968
 ] 

Christian Kohlschütter edited comment on LUCENE-4713 at 3/12/13 12:16 PM:
--

Overlap and coverage, Uwe :)
Looks good to me!

Nit: To be 100% sure that we're using the correct ClassLoader, you could 
check for loader==null in SPIClassIterator and assign 
ClassLoader.getSystemClassLoader() to it in this case.
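
The null check suggested in this nit could be sketched as follows. The helper class is hypothetical and is not the actual SPIClassIterator code; ClassLoader.getSystemClassLoader() is the only real API used.

```java
/** Hypothetical helper illustrating the suggested null check. */
final class LoaderFallback {
  private LoaderFallback() {}

  /** Returns the given loader, or the system ClassLoader when the loader is null. */
  static ClassLoader orSystem(ClassLoader loader) {
    // A null ClassLoader conventionally stands for the bootstrap loader.
    return (loader == null) ? ClassLoader.getSystemClassLoader() : loader;
  }
}
```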


  was (Author: c...@newsclub.de):
Overlap and coverage, Uwe :)
Looks good to me!

  

[jira] [Updated] (LUCENE-4642) Add create(AttributeFactory) to TokenizerFactory and subclasses with ctors taking AttributeFactory, and remove Tokenizer's and subclasses' ctors taking AttributeSource


 [ 
https://issues.apache.org/jira/browse/LUCENE-4642?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Steve Rowe updated LUCENE-4642:
---

Attachment: LUCENE-4642.patch

Patch, narrows one or two more create(AttributeFactory) return types, minor 
cosmetic mods, removed unused imports.

Committing shortly.

 Add create(AttributeFactory) to TokenizerFactory and subclasses with ctors 
 taking AttributeFactory, and remove Tokenizer's and subclasses' ctors taking 
 AttributeSource
 ---

 Key: LUCENE-4642
 URL: https://issues.apache.org/jira/browse/LUCENE-4642
 Project: Lucene - Core
  Issue Type: Improvement
  Components: modules/analysis
Affects Versions: 4.1
Reporter: Renaud Delbru
Assignee: Steve Rowe
  Labels: analysis, attribute, tokenizer
 Fix For: 4.3

 Attachments: LUCENE-4642.patch, LUCENE-4642.patch, LUCENE-4642.patch, 
 LUCENE-4642.patch, TrieTokenizerFactory.java.patch


 All tokenizer implementations have a constructor that takes a given 
 AttributeSource as parameter (LUCENE-1826).  These should be removed.
 TokenizerFactory does not provide an API to create tokenizers with a given 
 AttributeFactory, but quite a few tokenizers have constructors that take an 
 AttributeFactory.  TokenizerFactory should add a create(AttributeFactory) 
 method, as should subclasses for tokenizers with AttributeFactory accepting 
 ctors.


[jira] [Comment Edited] (SOLR-4561) CachedSqlEntityProcessor with parametarized query is broken


[ 
https://issues.apache.org/jira/browse/SOLR-4561?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13600078#comment-13600078
 ] 

Sudheer Prem edited comment on SOLR-4561 at 3/12/13 2:54 PM:
-

I have a scenario where table A contains 5 million rows and table B contains more 
than a million rows. The join condition matches only a couple of thousand 
records. I had been using this feature in an earlier version of Solr. Suddenly, 
due to this change, it took the wrong join (the one that matches the first 
condition) and populated that value into all documents.

After debugging, my thought for the fix is this:

This happens because, in the method SqlEntityProcessor.nextRow(), the 
query is initialized and loaded only if the rowIterator is null. Actually, 
the query should be initialized whenever it differs from the previous 
query. If the logic is changed in that way, I think this issue will be fixed.
To apply this logic, change the SqlEntityProcessor.nextRow() method from 

{code}
if (rowIterator == null) {
  String q = getQuery();
  initQuery(context.replaceTokens(q));
}
{code}

to the code mentioned below:

{code}
String q = context.replaceTokens(getQuery());
if(!q.equals(this.query)){
  initQuery(q);
}
{code}

Initial testing shows that it seems to work as expected.
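
The intent of the change can be seen in isolation with the stand-in below. It is not the DIH SqlEntityProcessor: QueryTracker and its methods are hypothetical, and the Function plays the role of context.replaceTokens (Java 8+ assumed).

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Function;

/** Stand-in for illustration only; this is not the DIH SqlEntityProcessor. */
final class QueryTracker {
  private final Function<String, String> tokenReplacer; // plays the role of context.replaceTokens
  private final List<String> executed = new ArrayList<>();
  private String query; // last resolved query

  QueryTracker(Function<String, String> tokenReplacer) {
    this.tokenReplacer = tokenReplacer;
  }

  /** Re-initializes only when the resolved query differs from the previous one. */
  boolean nextRow(String template) {
    String q = tokenReplacer.apply(template);
    if (!q.equals(query)) {
      initQuery(q);
      return true;
    }
    return false;
  }

  private void initQuery(String q) {
    query = q;
    executed.add(q);
  }

  List<String> executedQueries() {
    return executed;
  }
}
```

In this sketch the query is re-run when the parent row's id changes and is left alone when it stays the same, which is the behavior the proposed condition aims for.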


  was (Author: sudheerprem):
I have a scenario where table A contains 5 million rows and table B contains 
more than a million rows. The join condition matches only a couple of 
thousand records. I had been using this feature in an earlier version of Solr. 
Suddenly, due to this change, it took the wrong join (the one that matches the 
first condition) and populated that value into all documents.

After debugging, my thought for the fix is this:

This happens because, in the method SqlEntityProcessor.nextRow(), the 
query is initialized and loaded only if the rowIterator is null. Actually, 
the query should be initialized whenever it differs from the previous 
query. If the logic is changed in that way, I think this issue will be fixed.
To apply this logic, change the SqlEntityProcessor.nextRow() method from 

{code}
if (rowIterator == null) {
  String q = getQuery();
  initQuery(context.replaceTokens(q));
}
{code}

to the code mentioned below:

{code}
String q = context.replaceTokens(getQuery());
if(!q.equals(this.query)){
  initQuery(context.replaceTokens(q));
}
{code}

Initial testing shows that it seems to work as expected.

  
 CachedSqlEntityProcessor with parametarized query is broken
 ---

 Key: SOLR-4561
 URL: https://issues.apache.org/jira/browse/SOLR-4561
 Project: Solr
  Issue Type: Bug
  Components: contrib - DataImportHandler
Affects Versions: 4.1
Reporter: Sudheer Prem
   Original Estimate: 1m
  Remaining Estimate: 1m

 When child entities are created and the child entity is provided with a 
 parameterized query as below, 
 {code:xml} 
 <entity name="x" query="select * from x">
   <entity name="y" query="select * from y where xid=${x.id}" 
           processor="CachedSqlEntityProcessor">
   </entity>
 </entity>
 {code} 
 the entity processor always returns the result from the first query even though 
 the parameter has changed. This happens because the 
 EntityProcessorBase.getNext() method doesn't reset the query and rowIterator 
 after calling the DIHCacheSupport.getCacheData() method.
 This can be fixed by changing the else block in the getNext() method of 
 EntityProcessorBase from
 {code} 
 else  {
   return cacheSupport.getCacheData(context, query, rowIterator);
 }
 {code} 
 to the code mentioned below:
 {code} 
 else  {
   Map<String,Object> cacheData = cacheSupport.getCacheData(context, query, rowIterator);
   query = null;
   rowIterator = null;
   return cacheData;
 }
 {code}   
 Update: But then, the caching doesn't seem to be working...


[jira] [Commented] (SOLR-4557) Fix broken CoreContainerTest.testReload

2013-03-12 Thread Erick Erickson (JIRA)


[ 
https://issues.apache.org/jira/browse/SOLR-4557?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13600154#comment-13600154
 ] 

Erick Erickson commented on SOLR-4557:
--

trunk r: 1455606. Fixed the root cause of the tests failing; also took more 
care with the core reloads so they don't happen simultaneously with 
loads/unloads.

 Fix broken CoreContainerTest.testReload
 ---

 Key: SOLR-4557
 URL: https://issues.apache.org/jira/browse/SOLR-4557
 Project: Solr
  Issue Type: Test
Affects Versions: 4.2, 5.0
Reporter: Erick Erickson
Assignee: Erick Erickson
 Attachments: SOLR-4557.patch, SOLR-4557_posthshutdown_stack.txt


 I was chasing down a test failure, and it turns out that 
 CoreContainerTest.testReload has only succeeded by chance. The test fires up 
 4 threads that go out and reload the same core all at once. This caused me to 
 look at properly synchronizing reloading cores pursuant to SOLR-4196, on the 
 theory that we should serialize loading, unloading and reloading cores; we 
 shouldn't be doing _any_ of those operations from different threads on the 
 same core at the same time. It turns out that if you fire up multiple reloads 
 at once without serializing them, an error is thrown instead of proper 
 reloading occurring, and that's the only reason the test doesn't hang. The 
 stack trace of the exception is below for reference, but it doesn't occur with 
 the code I'll attach to this issue:
 [junit4:junit4]   2> at com.sun.jmx.interceptor.DefaultMBeanServerInterceptor.getMBean(DefaultMBeanServerInterceptor.java:1095)
 [junit4:junit4]   2> at com.sun.jmx.interceptor.DefaultMBeanServerInterceptor.exclusiveUnregisterMBean(DefaultMBeanServerInterceptor.java:427)
 [junit4:junit4]   2> at com.sun.jmx.interceptor.DefaultMBeanServerInterceptor.unregisterMBean(DefaultMBeanServerInterceptor.java:415)
 [junit4:junit4]   2> at com.sun.jmx.mbeanserver.JmxMBeanServer.unregisterMBean(JmxMBeanServer.java:536)
 [junit4:junit4]   2> at org.apache.solr.core.JmxMonitoredMap.put(JmxMonitoredMap.java:138)
 [junit4:junit4]   2> at org.apache.solr.core.JmxMonitoredMap.put(JmxMonitoredMap.java:51)
 [junit4:junit4]   2> at org.apache.solr.core.RequestHandlers.register(RequestHandlers.java:106)
 [junit4:junit4]   2> at org.apache.solr.core.RequestHandlers.initHandlersFromConfig(RequestHandlers.java:157)
 [junit4:junit4]   2> at org.apache.solr.core.SolrCore.<init>(SolrCore.java:757)
 [junit4:junit4]   2> at org.apache.solr.core.SolrCore.reload(SolrCore.java:408)
 [junit4:junit4]   2> at org.apache.solr.core.CoreContainer.reload(CoreContainer.java:1076)
 [junit4:junit4]   2> at org.apache.solr.core.TestCoreContainer$1TestThread.run(TestCoreContainer.java:90)
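
The idea of serializing load, unload and reload operations per core can be sketched as below. This is not the synchronization that was committed for this issue; CoreOperationSerializer and runExclusive are hypothetical names, and the sketch only shows one lock per core name (Java 8+ assumed).

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;

/** Hypothetical sketch; not the actual CoreContainer code. */
final class CoreOperationSerializer {
  // One lock object per core name, created on first use.
  private final ConcurrentMap<String, Object> locks = new ConcurrentHashMap<>();

  /** Runs a load/unload/reload for the named core while holding that core's lock. */
  void runExclusive(String coreName, Runnable operation) {
    Object lock = locks.computeIfAbsent(coreName, name -> new Object());
    synchronized (lock) {
      operation.run();
    }
  }
}
```

Operations on different cores can still run in parallel, while several threads reloading the same core, as in the test described above, would run one after another.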



[jira] [Commented] (LUCENE-4713) SPI: Allow fallback to default ClassLoader if Thread#getContextClassLoader fails


[ 
https://issues.apache.org/jira/browse/LUCENE-4713?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13599975#comment-13599975
 ] 

Uwe Schindler commented on LUCENE-4713:
---

I also opened LUCENE-4823 to make the reloading (which is done on Solr startup 
to load codecs from plugin folders) more centralized. This is not really 
related but might move the isParentClassLoader helper method into the new base 
class for all SPILoaders (and hide it).


[jira] [Commented] (SOLR-3857) DIH: SqlEntityProcessor with simple cache broken


[ 
https://issues.apache.org/jira/browse/SOLR-3857?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13600088#comment-13600088
 ] 

Sudheer Prem commented on SOLR-3857:


Updated SOLR-4561 with a valid fix.

 DIH: SqlEntityProcessor with simple cache broken
 --

 Key: SOLR-3857
 URL: https://issues.apache.org/jira/browse/SOLR-3857
 Project: Solr
  Issue Type: Bug
Affects Versions: 3.6.1, 4.0-BETA
Reporter: James Dyer

 The wiki describes a usage of CachedSqlEntityProcessor like this:
 {code:xml}
 entity name=y query=select * from y where xid=${x.id} 
 processor=CachedSqlEntityProcessor
 {code}
 This creates what the code refers to as a "simple cache".  Rather than building the 
 entire cache up-front, the cache is built on-the-go.  I think this has 
 limited use cases, but it would be nice to preserve the feature if possible.
 Unfortunately this was not covered by any (effective) unit tests, and 
 SOLR-2382 entirely broke the functionality for 3.6/4.0-alpha+.  At first 
 glance, the fix may not be entirely straightforward.
 This was found while writing tests for SOLR-3856.
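
 To make the "built on-the-go" idea concrete, here is a minimal sketch of a cache that is filled lazily, keyed by the resolved query string. It is not the DIH implementation; LazyQueryCache, its loader function and the row type (plain strings) are assumptions made for illustration only.

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.function.Function;

/** Toy cache that runs a query only the first time a given resolved query is seen. Not DIH code. */
final class LazyQueryCache {
    private final Map<String, List<String>> cache = new HashMap<>();
    private final Function<String, List<String>> loader; // hypothetical stand-in for the data source
    private int loads;

    LazyQueryCache(Function<String, List<String>> loader) {
        this.loader = loader;
    }

    List<String> get(String resolvedQuery) {
        // The loader is only invoked on a cache miss.
        return cache.computeIfAbsent(resolvedQuery, q -> {
            loads++;
            return loader.apply(q);
        });
    }

    int loads() {
        return loads;
    }

    public static void main(String[] args) {
        LazyQueryCache c = new LazyQueryCache(q -> Collections.singletonList("row for [" + q + "]"));
        c.get("select * from y where xid=1");
        c.get("select * from y where xid=1"); // served from the cache
        c.get("select * from y where xid=2");
        System.out.println("queries executed: " + c.loads());
    }
}
```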


[jira] [Updated] (SOLR-4562) core selector not working in Chrome


 [ 
https://issues.apache.org/jira/browse/SOLR-4562?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Maciej Lizewski updated SOLR-4562:
--

Attachment: Przechwytywanie.PNG

 core selector not working in Chrome
 ---

 Key: SOLR-4562
 URL: https://issues.apache.org/jira/browse/SOLR-4562
 Project: Solr
  Issue Type: Bug
Affects Versions: 4.2
Reporter: Maciej Lizewski
 Attachments: Przechwytywanie.PNG


 After a fresh installation of Solr 4.2 on Windows 7 64-bit, 
 I do not see any cores to select in the combobox in Google Chrome. Also, when 
 trying to build the URI by hand, I see an error that there is no such core. In 
 Firefox, the default 'collection1' core is visible without problems.
 My Chrome version: 26.0.1410.28 beta-m
 I cannot see any errors in the JS console...


[jira] [Created] (SOLR-4562) core selector not working in Chrome

Maciej Lizewski created SOLR-4562:
-

 Summary: core selector not working in Chrome
 Key: SOLR-4562
 URL: https://issues.apache.org/jira/browse/SOLR-4562
 Project: Solr
  Issue Type: Bug
Affects Versions: 4.2
Reporter: Maciej Lizewski


After a fresh installation of Solr 4.2 on Windows 7 64-bit, 
I do not see any cores to select in the combobox in Google Chrome. Also, when 
trying to build the URI by hand, I see an error that there is no such core. In 
Firefox, the default 'collection1' core is visible without problems.

My Chrome version: 26.0.1410.28 beta-m
I cannot see any errors in the JS console...


[jira] [Created] (SOLR-4564) Admin UI fails to load properly on Chrome

2013-03-12 Thread Aditya (JIRA)

Aditya created SOLR-4564:


 Summary: Admin UI fails to load properly on Chrome
 Key: SOLR-4564
 URL: https://issues.apache.org/jira/browse/SOLR-4564
 Project: Solr
  Issue Type: Bug
  Components: web gui
Affects Versions: 4.2
 Environment: Jboss 7.1.1 and Solr 4.2 
Reporter: Aditya


The Admin UI fails to load the collection list on Chrome; the dropdown is empty. 
Clicking on Logging or Threads produces JavaScript errors in the console: 

GET http://10.124.55.84/solr/undefined/admin/logging?wt=json&since=0 404 (Not Found) require.js:10157
GET http://10.124.55.84/solr/undefined/admin/threads?wt=json 404 (Not Found) require.js:10157

On IE9 the UI looks good, but the Schema browser is sluggish while 
searching fields: every keystroke causes a pause for the field look-up. We have 
around 290 fields (including dynamic fields) defined in the schema. 


[jira] [Commented] (LUCENE-4795) Add FacetsCollector based on SortedSetDocValues

2013-03-12 Thread Robert Muir (JIRA)


[ 
https://issues.apache.org/jira/browse/LUCENE-4795?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13600026#comment-13600026
 ] 

Robert Muir commented on LUCENE-4795:
-

{quote}
You didn't answer my question though, and perhaps it doesn't belong in this 
issue, but is there a way to utilize the ordinal given to a DV value somehow? 
Or is it internal to the SortedSet DV?
{quote}

Because I don't want to encourage crazy software designs to support fringe 
features. Want weighted faceting? use the tax index: pretty simple. 



 Add FacetsCollector based on SortedSetDocValues
 ---

 Key: LUCENE-4795
 URL: https://issues.apache.org/jira/browse/LUCENE-4795
 Project: Lucene - Core
  Issue Type: Improvement
  Components: modules/facet
Reporter: Michael McCandless
Assignee: Michael McCandless
 Attachments: LUCENE-4795.patch, LUCENE-4795.patch, LUCENE-4795.patch, 
 LUCENE-4795.patch, pleaseBenchmarkMe.patch


 Recently (LUCENE-4765) we added multi-valued DocValues field
 (SortedSetDocValuesField), and this can be used for faceting in Solr
 (SOLR-4490).  I think we should also add support in the facet module?
 It'd be an option with different tradeoffs.  Eg, it wouldn't require
 the taxonomy index, since the main index handles label/ord resolving.
 There are at least two possible approaches:
   * On every reopen, build the seg -> global ord map, and then on
 every collect, get the seg ord, map it to the global ord space,
 and increment counts.  This adds cost during reopen in proportion
 to number of unique terms ...
   * On every collect, increment counts based on the seg ords, and then
 do a merge in the end just like distributed faceting does.
 The first approach is much easier so I built a quick prototype using
 that.  The prototype does the counting, but it does NOT do the top K
 facets gathering in the end, and it doesn't know parent/child ord
 relationships, so there's tons more to do before this is real.  I also
 was unsure how to properly integrate it since the existing classes
 seem to expect that you use a taxonomy index to resolve ords.
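
 As a rough illustration of the first approach (map each segment ord to a global ord, then increment counts), here is a small sketch using plain arrays. It is not the prototype from the patch and uses no Lucene classes; GlobalOrdCounts, segToGlobal and segOrdHits are names invented for this example.

```java
import java.util.Arrays;

/** Toy version of "map seg ord to global ord, then count". Not the prototype collector from the patch. */
final class GlobalOrdCounts {
    private GlobalOrdCounts() {}

    /**
     * @param segToGlobal    segToGlobal[seg][segOrd] is the global ord for that segment ord
     *                       (assumed to be rebuilt on every reopen)
     * @param segOrdHits     segOrdHits[seg] lists the segment ords seen on collected documents
     * @param globalOrdCount number of unique values across all segments
     */
    static int[] count(int[][] segToGlobal, int[][] segOrdHits, int globalOrdCount) {
        int[] counts = new int[globalOrdCount];
        for (int seg = 0; seg < segOrdHits.length; seg++) {
            for (int segOrd : segOrdHits[seg]) {
                counts[segToGlobal[seg][segOrd]]++;
            }
        }
        return counts;
    }

    public static void main(String[] args) {
        int[][] segToGlobal = { {0, 2}, {1, 2} }; // two segments, three global values
        int[][] segOrdHits = { {0, 1, 1}, {0, 1} };
        System.out.println(Arrays.toString(count(segToGlobal, segOrdHits, 3)));
    }
}
```

 The second approach would instead keep one counts array per segment and merge them by label at the end.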
 I ran a quick performance test.  base = trunk except I disabled the
 compute top-K in FacetsAccumulator to make the comparison fair; comp
 = using the prototype collector in the patch:
 {noformat}
 Task              QPS base   StdDev  QPS comp   StdDev  Pct diff
 OrHighLow            18.79   (2.5%)     14.36   (3.3%)  -23.6% ( -28% -  -18%)
 HighTerm             21.58   (2.4%)     16.53   (3.7%)  -23.4% ( -28% -  -17%)
 OrHighMed            18.20   (2.5%)     13.99   (3.3%)  -23.2% ( -28% -  -17%)
 Prefix3              14.37   (1.5%)     11.62   (3.5%)  -19.1% ( -23% -  -14%)
 LowTerm             130.80   (1.6%)    106.95   (2.4%)  -18.2% ( -21% -  -14%)
 OrHighHigh            9.60   (2.6%)      7.88   (3.5%)  -17.9% ( -23% -  -12%)
 AndHighHigh          24.61   (0.7%)     20.74   (1.9%)  -15.7% ( -18% -  -13%)
 Fuzzy1               49.40   (2.5%)     43.48   (1.9%)  -12.0% ( -15% -   -7%)
 MedSloppyPhrase      27.06   (1.6%)     23.95   (2.3%)  -11.5% ( -15% -   -7%)
 MedTerm              51.43   (2.0%)     46.21   (2.7%)  -10.2% ( -14% -   -5%)
 IntNRQ                4.02   (1.6%)      3.63   (4.0%)   -9.7% ( -15% -   -4%)
 Wildcard             29.14   (1.5%)     26.46   (2.5%)   -9.2% ( -13% -   -5%)
 HighSloppyPhrase      0.92   (4.5%)      0.87   (5.8%)   -5.4% ( -15% -    5%)
 MedSpanNear          29.51   (2.5%)     27.94   (2.2%)   -5.3% (  -9% -    0%)
 HighSpanNear          3.55   (2.4%)      3.38   (2.0%)   -4.9% (  -9% -    0%)
 AndHighMed          108.34   (0.9%)    104.55   (1.1%)   -3.5% (  -5% -   -1%)
 LowSloppyPhrase      20.50   (2.0%)     20.09   (4.2%)   -2.0% (  -8% -    4%)
 LowPhrase            21.60   (6.0%)     21.26   (5.1%)   -1.6% ( -11% -   10%)
 Fuzzy2               53.16   (3.9%)     52.40   (2.7%)   -1.4% (  -7% -    5%)
 LowSpanNear           8.42   (3.2%)      8.45   (3.0%)    0.3% (  -5% -    6%)
 Respell              45.17   (4.3%)     45.38   (4.4%)    0.5% (  -7% -    9%)
 MedPhrase           113.93   (5.8%)    115.02   (4.9%)    1.0% (  -9% -   12%)
 AndHighLow          596.42   (2.5%)    617.12   (2.8%)    3.5% (  -1% -    8%)
 HighPhrase           17.30  (10.5%)     18.36   (9.1%)

[jira] [Updated] (LUCENE-4642) Add create(AttributeFactory) to TokenizerFactory and subclasses with ctors taking AttributeFactory, and remove Tokenizer's and subclasses' ctors taking AttributeSource

[
https://issues.apache.org/jira/browse/LUCENE-4642?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]

Steve Rowe updated LUCENE-4642:
---

Description:
All tokenizer implementations have a constructor that takes a given
AttributeSource as parameter (LUCENE-1826). These should be removed.

TokenizerFactory does not provide an API to create tokenizers with a given
AttributeFactory, but quite a few tokenizers have constructors that take an
AttributeFactory. TokenizerFactory should add a create(AttributeFactory)
method, as should subclasses for tokenizers with AttributeFactory accepting
ctors.

was:
All tokenizer implementations have a constructor that takes a given
AttributeSource as parameter (LUCENE-1826). However, the TokenizerFactory does
not provide an API to create tokenizers with a given AttributeSource.

Side note: There are still a lot of tokenizers that do not provide constructors
that take AttributeSource and AttributeFactory.

Add create(AttributeFactory) to TokenizerFactory and subclasses with ctors
taking AttributeFactory, and remove Tokenizer's and subclasses' ctors taking
AttributeSource
---

Key: LUCENE-4642
URL: https://issues.apache.org/jira/browse/LUCENE-4642
Project: Lucene - Core
Issue Type: Improvement
Components: modules/analysis
Affects Versions: 4.1
Reporter: Renaud Delbru
Assignee: Steve Rowe
Labels: analysis, attribute, tokenizer
Fix For: 4.3

Attachments: LUCENE-4642.patch, LUCENE-4642.patch, LUCENE-4642.patch,
TrieTokenizerFactory.java.patch

All tokenizer implementations have a constructor that takes a given
AttributeSource as parameter (LUCENE-1826). These should be removed.
TokenizerFactory does not provide an API to create tokenizers with a given
AttributeFactory, but quite a few tokenizers have constructors that take an
AttributeFactory. TokenizerFactory should add a create(AttributeFactory)
method, as should subclasses for tokenizers with AttributeFactory accepting
ctors.
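
 A rough sketch of the proposed API shape follows. The types here (AttrFactory, Tok, TokFactory, WhitespaceTokFactory) are simplified stand-ins invented for this example, not the Lucene classes, and the real signatures in the patch may differ. The idea shown is only that the factory gains a create overload taking an attribute factory, with the existing create delegating to it using a default.

```java
/** Stand-in types only; none of these are Lucene classes. */
final class FactorySketch {
    private FactorySketch() {}

    /** Stand-in for an attribute factory. */
    interface AttrFactory {
        String name();
    }

    static final AttrFactory DEFAULT = () -> "default";

    /** Stand-in for a tokenizer that remembers which factory it was built with. */
    static class Tok {
        final AttrFactory factory;
        final String input;

        Tok(AttrFactory factory, String input) {
            this.factory = factory;
            this.input = input;
        }
    }

    /** Stand-in for a tokenizer factory with the additional create overload. */
    abstract static class TokFactory {
        /** Existing entry point: delegates to the new overload with the default factory. */
        Tok create(String input) {
            return create(DEFAULT, input);
        }

        /** New entry point: subclasses pass the factory to the tokenizer constructor. */
        abstract Tok create(AttrFactory factory, String input);
    }

    static class WhitespaceTokFactory extends TokFactory {
        @Override
        Tok create(AttrFactory factory, String input) {
            return new Tok(factory, input);
        }
    }

    public static void main(String[] args) {
        TokFactory f = new WhitespaceTokFactory();
        AttrFactory custom = () -> "custom";
        System.out.println(f.create("a b").factory.name());
        System.out.println(f.create(custom, "a b").factory.name());
    }
}
```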


[JENKINS] Lucene-Solr-trunk-MacOSX (64bit/jdk1.7.0) - Build # 313 - Failure!

2013-03-12 Thread Policeman Jenkins Server

Build: http://jenkins.thetaphi.de/job/Lucene-Solr-trunk-MacOSX/313/
Java: 64bit/jdk1.7.0 -XX:+UseParallelGC

All tests passed

Build Log:
[...truncated 26442 lines...]
  [javadoc] Generating Javadoc
  [javadoc] Javadoc execution
  [javadoc] Loading source files for package org.apache.lucene...
  [javadoc] Loading source files for package org.apache.lucene.analysis...
  [javadoc] warning: [options] bootstrap class path not set in conjunction with 
-source 1.6
  [javadoc] Loading source files for package 
org.apache.lucene.analysis.tokenattributes...
  [javadoc] Loading source files for package org.apache.lucene.codecs...
  [javadoc] Loading source files for package 
org.apache.lucene.codecs.compressing...
  [javadoc] Loading source files for package 
org.apache.lucene.codecs.lucene40...
  [javadoc] Loading source files for package 
org.apache.lucene.codecs.lucene41...
  [javadoc] Loading source files for package 
org.apache.lucene.codecs.lucene42...
  [javadoc] Loading source files for package 
org.apache.lucene.codecs.perfield...
  [javadoc] Loading source files for package org.apache.lucene.document...
  [javadoc] Loading source files for package org.apache.lucene.index...
  [javadoc] Loading source files for package org.apache.lucene.search...
  [javadoc] Loading source files for package 
org.apache.lucene.search.payloads...
  [javadoc] Loading source files for package 
org.apache.lucene.search.similarities...
  [javadoc] Loading source files for package org.apache.lucene.search.spans...
  [javadoc] Loading source files for package org.apache.lucene.store...
  [javadoc] Loading source files for package org.apache.lucene.util...
  [javadoc] Loading source files for package org.apache.lucene.util.automaton...
  [javadoc] Loading source files for package org.apache.lucene.util.fst...
  [javadoc] Loading source files for package org.apache.lucene.util.mutable...
  [javadoc] Loading source files for package org.apache.lucene.util.packed...
  [javadoc] Constructing Javadoc information...
  [javadoc] Standard Doclet version 1.7.0_15
  [javadoc] Building tree for all the packages and classes...
  [javadoc] Generating 
/Users/jenkins/jenkins-slave/workspace/Lucene-Solr-trunk-MacOSX/lucene/build/docs/core/org/apache/lucene/search/package-summary.html...
  [javadoc] Copying file 
/Users/jenkins/jenkins-slave/workspace/Lucene-Solr-trunk-MacOSX/lucene/core/src/java/org/apache/lucene/search/doc-files/nrq-formula-1.png
 to directory 
/Users/jenkins/jenkins-slave/workspace/Lucene-Solr-trunk-MacOSX/lucene/build/docs/core/org/apache/lucene/search/doc-files...
  [javadoc] Copying file 
/Users/jenkins/jenkins-slave/workspace/Lucene-Solr-trunk-MacOSX/lucene/core/src/java/org/apache/lucene/search/doc-files/nrq-formula-2.png
 to directory 
/Users/jenkins/jenkins-slave/workspace/Lucene-Solr-trunk-MacOSX/lucene/build/docs/core/org/apache/lucene/search/doc-files...
  [javadoc] Building index for all the packages and classes...
  [javadoc] Building index for all classes...
  [javadoc] Generating 
/Users/jenkins/jenkins-slave/workspace/Lucene-Solr-trunk-MacOSX/lucene/build/docs/core/help-doc.html...
  [javadoc] 1 warning

[...truncated 33 lines...]
  [javadoc] Generating Javadoc
  [javadoc] Javadoc execution
  [javadoc] warning: [options] bootstrap class path not set in conjunction with 
-source 1.6
  [javadoc] Loading source files for package org.apache.lucene.analysis.ar...
  [javadoc] Loading source files for package org.apache.lucene.analysis.bg...
  [javadoc] Loading source files for package org.apache.lucene.analysis.br...
  [javadoc] Loading source files for package org.apache.lucene.analysis.ca...
  [javadoc] Loading source files for package 
org.apache.lucene.analysis.charfilter...
  [javadoc] Loading source files for package org.apache.lucene.analysis.cjk...
  [javadoc] Loading source files for package 
org.apache.lucene.analysis.commongrams...
  [javadoc] Loading source files for package 
org.apache.lucene.analysis.compound...
  [javadoc] Loading source files for package 
org.apache.lucene.analysis.compound.hyphenation...
  [javadoc] Loading source files for package org.apache.lucene.analysis.core...
  [javadoc] Loading source files for package org.apache.lucene.analysis.cz...
  [javadoc] Loading source files for package org.apache.lucene.analysis.da...
  [javadoc] Loading source files for package org.apache.lucene.analysis.de...
  [javadoc] Loading source files for package org.apache.lucene.analysis.el...
  [javadoc] Loading source files for package org.apache.lucene.analysis.en...
  [javadoc] Loading source files for package org.apache.lucene.analysis.es...
  [javadoc] Loading source files for package org.apache.lucene.analysis.eu...
  [javadoc] Loading source files for package org.apache.lucene.analysis.fa...
  [javadoc] Loading source files for package org.apache.lucene.analysis.fi...
  [javadoc] Loading source files for package org.apache.lucene.analysis.fr...
  [javadoc] Loading source files for package

[jira] [Commented] (LUCENE-4795) Add FacetsCollector based on SortedSetDocValues

2013-03-12 Thread Shai Erera (JIRA)


[ 
https://issues.apache.org/jira/browse/LUCENE-4795?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=1350#comment-1350
 ] 

Shai Erera commented on LUCENE-4795:


Thanks. Also (sorry that it comes in parts), I find this confusing: {{new 
SortedSetDocValuesField("myfacets", new BytesRef("a" + sep + "foo"))}}. The 
user needs to decide under which field all facets will be indexed. This could 
lead users to do {{new SSDVF("author", new BytesRef("shai"))}} and {{new 
SSDVF("date", new BytesRef("2010/March/13"))}}. We know, from past results, 
that this will result in worse search performance. Also, this doesn't take a CP, 
which is not consistent with e.g. the FacetRequest, where you need to pass a 
CP. So perhaps we should rather:

* Add a FacetField (extends SSDVF) which takes a CP (potentially 
FacetIndexingParams as well).
* It will call super(CLP.DEFAULT_FIELD, new BytesRef(cp.toString())) (we can 
optimize that later, e.g. have CP expose a BytesRef API too if we want).
* Potentially, allow (or not) to define the field type.

What do you think?

 Add FacetsCollector based on SortedSetDocValues
 ---

 Key: LUCENE-4795
 URL: https://issues.apache.org/jira/browse/LUCENE-4795
 Project: Lucene - Core
  Issue Type: Improvement
  Components: modules/facet
Reporter: Michael McCandless
Assignee: Michael McCandless
 Attachments: LUCENE-4795.patch, LUCENE-4795.patch, LUCENE-4795.patch, 
 LUCENE-4795.patch, pleaseBenchmarkMe.patch


 Recently (LUCENE-4765) we added multi-valued DocValues field
 (SortedSetDocValuesField), and this can be used for faceting in Solr
 (SOLR-4490).  I think we should also add support in the facet module?
 It'd be an option with different tradeoffs.  Eg, it wouldn't require
 the taxonomy index, since the main index handles label/ord resolving.
 There are at least two possible approaches:
   * On every reopen, build the seg -> global ord map, and then on
 every collect, get the seg ord, map it to the global ord space,
 and increment counts.  This adds cost during reopen in proportion
 to number of unique terms ...
   * On every collect, increment counts based on the seg ords, and then
 do a merge in the end just like distributed faceting does.
 The first approach is much easier so I built a quick prototype using
 that.  The prototype does the counting, but it does NOT do the top K
 facets gathering in the end, and it doesn't know parent/child ord
 relationships, so there's tons more to do before this is real.  I also
 was unsure how to properly integrate it since the existing classes
 seem to expect that you use a taxonomy index to resolve ords.
 I ran a quick performance test.  base = trunk except I disabled the
 compute top-K in FacetsAccumulator to make the comparison fair; comp
 = using the prototype collector in the patch:
 {noformat}
 Task              QPS base   StdDev  QPS comp   StdDev  Pct diff
 OrHighLow            18.79   (2.5%)     14.36   (3.3%)  -23.6% ( -28% -  -18%)
 HighTerm             21.58   (2.4%)     16.53   (3.7%)  -23.4% ( -28% -  -17%)
 OrHighMed            18.20   (2.5%)     13.99   (3.3%)  -23.2% ( -28% -  -17%)
 Prefix3              14.37   (1.5%)     11.62   (3.5%)  -19.1% ( -23% -  -14%)
 LowTerm             130.80   (1.6%)    106.95   (2.4%)  -18.2% ( -21% -  -14%)
 OrHighHigh            9.60   (2.6%)      7.88   (3.5%)  -17.9% ( -23% -  -12%)
 AndHighHigh          24.61   (0.7%)     20.74   (1.9%)  -15.7% ( -18% -  -13%)
 Fuzzy1               49.40   (2.5%)     43.48   (1.9%)  -12.0% ( -15% -   -7%)
 MedSloppyPhrase      27.06   (1.6%)     23.95   (2.3%)  -11.5% ( -15% -   -7%)
 MedTerm              51.43   (2.0%)     46.21   (2.7%)  -10.2% ( -14% -   -5%)
 IntNRQ                4.02   (1.6%)      3.63   (4.0%)   -9.7% ( -15% -   -4%)
 Wildcard             29.14   (1.5%)     26.46   (2.5%)   -9.2% ( -13% -   -5%)
 HighSloppyPhrase      0.92   (4.5%)      0.87   (5.8%)   -5.4% ( -15% -    5%)
 MedSpanNear          29.51   (2.5%)     27.94   (2.2%)   -5.3% (  -9% -    0%)
 HighSpanNear          3.55   (2.4%)      3.38   (2.0%)   -4.9% (  -9% -    0%)
 AndHighMed          108.34   (0.9%)    104.55   (1.1%)   -3.5% (  -5% -   -1%)
 LowSloppyPhrase      20.50   (2.0%)     20.09   (4.2%)   -2.0% (  -8% -    4%)
 LowPhrase            21.60   (6.0%)     21.26   (5.1%)   -1.6% ( -11% -   10%)
 Fuzzy2               53.16   (3.9%)

[jira] [Comment Edited] (SOLR-4561) CachedSqlEntityProcessor with parametarized query is broken


[ 
https://issues.apache.org/jira/browse/SOLR-4561?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13600078#comment-13600078
 ] 

Sudheer Prem edited comment on SOLR-4561 at 3/12/13 5:06 PM:
-

I have a scenario where table A contains 5 million rows and table B contains more 
than a million rows. The join condition matches only a couple of thousand 
records. I had been using this feature in an earlier version of Solr. After 
this change, the import takes the wrong join result (the one matching the first 
condition) and populates that value into all documents.

After debugging, my thoughts on the fix are as follows:

This happens because, in the method SqlEntityProcessor.nextRow(), the 
query is initialized and loaded only if the rowIterator is null. Instead, 
the query can be initialized whenever it differs from the previous query. 
If the logic is changed in that way, I think this issue will be fixed.
To apply this logic, change the SqlEntityProcessor.nextRow() method from 

{code}
if (rowIterator == null) {
  String q = getQuery();
  initQuery(context.replaceTokens(q));
}
{code}

to the code mentioned below:

{code}
String q = context.replaceTokens(getQuery());
if(!q.equals(this.query)){
  initQuery(q);
}
{code}

Initial testing suggests that this works as expected.
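
The idea behind the change can be shown in isolation with a small, self-contained sketch. This is not the DIH code; the class RequeryOnChange and its runQuery function are hypothetical stand-ins, and the real nextRow() also has to manage the row iterator. It only demonstrates "run the query again when the resolved SQL differs from the last one".

```java
import java.util.Collections;
import java.util.List;
import java.util.function.Function;

/** Toy model of "re-run the query when the resolved SQL changes". Not DIH code. */
final class RequeryOnChange {
    private final Function<String, List<String>> runQuery; // hypothetical stand-in for the data source
    private String lastQuery; // resolved query that produced the current rows
    private List<String> rows;
    private int executions;

    RequeryOnChange(Function<String, List<String>> runQuery) {
        this.runQuery = runQuery;
    }

    List<String> rowsFor(String resolvedQuery) {
        // Compare the resolved query text instead of checking whether rows exist yet.
        if (!resolvedQuery.equals(lastQuery)) {
            rows = runQuery.apply(resolvedQuery);
            lastQuery = resolvedQuery;
            executions++;
        }
        return rows;
    }

    int executions() {
        return executions;
    }

    public static void main(String[] args) {
        RequeryOnChange p = new RequeryOnChange(q -> Collections.singletonList("row for [" + q + "]"));
        System.out.println(p.rowsFor("select * from y where xid=1"));
        System.out.println(p.rowsFor("select * from y where xid=2"));
        System.out.println("queries executed: " + p.executions());
    }
}
```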


  was (Author: sudheerprem):
I have a scenario where table A contains 5 million rows and table B contains 
more than a million rows. The join condition matches only a couple of 
thousand records. I had been using this feature in an earlier version of Solr. 
After this change, the import takes the wrong join result (the one matching the 
first condition) and populates that value into all documents.

After debugging, my thoughts on the fix are as follows:

This happens because, in the method SqlEntityProcessor.nextRow(), the 
query is initialized and loaded only if the rowIterator is null. Instead, 
the query should be initialized whenever it differs from the previous 
query. If the logic is changed in that way, I think this issue will be fixed.
To apply this logic, change the SqlEntityProcessor.nextRow() method from 

{code}
if (rowIterator == null) {
  String q = getQuery();
  initQuery(context.replaceTokens(q));
}
{code}

to the code mentioned below:

{code}
String q = context.replaceTokens(getQuery());
if(!q.equals(this.query)){
  initQuery(q);
}
{code}

Initial testing suggests that this works as expected.

  
 CachedSqlEntityProcessor with parametarized query is broken
 ---

 Key: SOLR-4561
 URL: https://issues.apache.org/jira/browse/SOLR-4561
 Project: Solr
  Issue Type: Bug
  Components: contrib - DataImportHandler
Affects Versions: 4.1
Reporter: Sudheer Prem
   Original Estimate: 1m
  Remaining Estimate: 1m

 When child entities are created and the child entity is given a 
 parametrized query as below, 
 {code:xml} 
 <entity name="x" query="select * from x">
   <entity name="y" query="select * from y where xid=${x.id}" 
           processor="CachedSqlEntityProcessor">
   </entity>
 </entity>
 {code} 
 the entity processor always returns the result of the first query even though 
 the parameter has changed. This happens because the 
 EntityProcessorBase.getNext() method doesn't reset the query and rowIterator 
 after calling the DIHCacheSupport.getCacheData() method.
 This can be fixed by changing the else block in the getNext() method of 
 EntityProcessorBase from
 {code} 
 else  {
   return cacheSupport.getCacheData(context, query, rowIterator);
 }
 {code} 
 to the code below:
 {code} 
 else  {
   // Reset the query and rowIterator after reading from the cache.
   Map<String,Object> cacheData = cacheSupport.getCacheData(context, query, rowIterator);
   query = null;
   rowIterator = null;
   return cacheData;
 }
 {code}   
 Update: But then, the caching doesn't seem to be working...


[jira] [Commented] (SOLR-4561) CachedSqlEntityProcessor with parametarized query is broken


[ 
https://issues.apache.org/jira/browse/SOLR-4561?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13600078#comment-13600078
 ] 

Sudheer Prem commented on SOLR-4561:


I have a scenario where table A contains 5 million rows and table B contains more 
than a million rows. The join condition matches only a couple of thousand 
records. I had been using this feature in an earlier version of Solr. After 
this change, the import takes the wrong join result (the one matching the first 
condition) and populates that value into all documents.

After debugging, my thoughts on the fix are as follows:

This happens because, in the method SqlEntityProcessor.nextRow(), the 
query is initialized and loaded only if the rowIterator is null. Instead, 
the query should be initialized whenever it differs from the previous 
query. If the logic is changed in that way, I think this issue will be fixed.
To apply this logic, change the SqlEntityProcessor.nextRow() method from 

{code}
if (rowIterator == null) {
  String q = getQuery();
  initQuery(context.replaceTokens(q));
}
{code}

to the code mentioned below:

{code}
String q = context.replaceTokens(getQuery());
if(!q.equals(this.query)){
  initQuery(q);
}
{code}

Initial testing suggests that this works as expected.


 CachedSqlEntityProcessor with parametarized query is broken
 ---

 Key: SOLR-4561
 URL: https://issues.apache.org/jira/browse/SOLR-4561
 Project: Solr
  Issue Type: Bug
  Components: contrib - DataImportHandler
Affects Versions: 4.1
Reporter: Sudheer Prem
   Original Estimate: 1m
  Remaining Estimate: 1m

 When child entities are created and the child entity is given a 
 parametrized query as below, 
 {code:xml} 
 <entity name="x" query="select * from x">
   <entity name="y" query="select * from y where xid=${x.id}" 
           processor="CachedSqlEntityProcessor">
   </entity>
 </entity>
 {code} 
 the entity processor always returns the result of the first query even though 
 the parameter has changed. This happens because the 
 EntityProcessorBase.getNext() method doesn't reset the query and rowIterator 
 after calling the DIHCacheSupport.getCacheData() method.
 This can be fixed by changing the else block in the getNext() method of 
 EntityProcessorBase from
 {code} 
 else  {
   return cacheSupport.getCacheData(context, query, rowIterator);
 }
 {code} 
 to the code below:
 {code} 
 else  {
   // Reset the query and rowIterator after reading from the cache.
   Map<String,Object> cacheData = cacheSupport.getCacheData(context, query, rowIterator);
   query = null;
   rowIterator = null;
   return cacheData;
 }
 {code}   
 Update: But then, the caching doesn't seem to be working...


[jira] [Updated] (SOLR-4557) Fix broken CoreContainerTest.testReload

2013-03-12 Thread Erick Erickson (JIRA)


 [ 
https://issues.apache.org/jira/browse/SOLR-4557?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Erick Erickson updated SOLR-4557:
-

Attachment: SOLR-4557.patch

Fix for trunk corresponding to the checkin.

 Fix broken CoreContainerTest.testReload
 ---

 Key: SOLR-4557
 URL: https://issues.apache.org/jira/browse/SOLR-4557
 Project: Solr
  Issue Type: Test
Affects Versions: 4.2, 5.0
Reporter: Erick Erickson
Assignee: Erick Erickson
 Attachments: SOLR-4557.patch, SOLR-4557.patch, 
 SOLR-4557_posthshutdown_stack.txt


 I was chasing down a test failure, and it turns out that 
 CoreContainerTest.testReload has only succeeded by chance. The test fires up 
 4 threads that go out and reload the same core all at once. This caused me to 
 look at properly synchronizing reloading cores pursuant to SOLR-4196, on the 
 theory that we should serialize loading, unloading and reloading cores; we 
 shouldn't be doing _any_ of those operations from different threads on the 
 same core at the same time. It turns out that if you fire up multiple reloads 
 at once without serializing them, an error is thrown instead of proper 
 reloading occurring, and that's the only reason the test doesn't hang. The 
 stack trace of the exception is below for reference, but it doesn't occur with the 
 code I'll attach to this issue:
 [junit4:junit4]   2> at com.sun.jmx.interceptor.DefaultMBeanServerInterceptor.getMBean(DefaultMBeanServerInterceptor.java:1095)
 [junit4:junit4]   2> at com.sun.jmx.interceptor.DefaultMBeanServerInterceptor.exclusiveUnregisterMBean(DefaultMBeanServerInterceptor.java:427)
 [junit4:junit4]   2> at com.sun.jmx.interceptor.DefaultMBeanServerInterceptor.unregisterMBean(DefaultMBeanServerInterceptor.java:415)
 [junit4:junit4]   2> at com.sun.jmx.mbeanserver.JmxMBeanServer.unregisterMBean(JmxMBeanServer.java:536)
 [junit4:junit4]   2> at org.apache.solr.core.JmxMonitoredMap.put(JmxMonitoredMap.java:138)
 [junit4:junit4]   2> at org.apache.solr.core.JmxMonitoredMap.put(JmxMonitoredMap.java:51)
 [junit4:junit4]   2> at org.apache.solr.core.RequestHandlers.register(RequestHandlers.java:106)
 [junit4:junit4]   2> at org.apache.solr.core.RequestHandlers.initHandlersFromConfig(RequestHandlers.java:157)
 [junit4:junit4]   2> at org.apache.solr.core.SolrCore.<init>(SolrCore.java:757)
 [junit4:junit4]   2> at org.apache.solr.core.SolrCore.reload(SolrCore.java:408)
 [junit4:junit4]   2> at org.apache.solr.core.CoreContainer.reload(CoreContainer.java:1076)
 [junit4:junit4]   2> at org.apache.solr.core.TestCoreContainer$1TestThread.run(TestCoreContainer.java:90)
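
 One way to picture "serialize load, unload and reload per core" is a lock per core name, as in the sketch below. This is not the attached patch and not CoreContainer code; PerCoreSerializer and demoMaxConcurrency are invented for illustration. The demo starts several threads that all operate on the same core name and records how many operations ever ran at the same time.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;
import java.util.concurrent.atomic.AtomicInteger;

/** Toy per-core serializer: operations on the same core name never overlap. Not Solr code. */
final class PerCoreSerializer {
    private final ConcurrentMap<String, Object> locks = new ConcurrentHashMap<>();

    void runExclusive(String coreName, Runnable operation) {
        // One lock object per core name; different cores do not block each other.
        Object lock = locks.computeIfAbsent(coreName, k -> new Object());
        synchronized (lock) {
            operation.run();
        }
    }

    /** Starts threadCount threads "reloading" the same core and returns the highest concurrency observed. */
    static int demoMaxConcurrency(int threadCount) {
        PerCoreSerializer serializer = new PerCoreSerializer();
        AtomicInteger active = new AtomicInteger();
        AtomicInteger max = new AtomicInteger();
        List<Thread> threads = new ArrayList<>();
        for (int i = 0; i < threadCount; i++) {
            Thread t = new Thread(() -> serializer.runExclusive("collection1", () -> {
                int now = active.incrementAndGet();
                max.accumulateAndGet(now, Math::max);
                active.decrementAndGet();
            }));
            threads.add(t);
            t.start();
        }
        for (Thread t : threads) {
            try {
                t.join();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        }
        return max.get();
    }

    public static void main(String[] args) {
        System.out.println("max concurrent operations on one core: " + demoMaxConcurrency(4));
    }
}
```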


[jira] [Created] (SOLR-4563) RSS DIH-example not working

Jan Høydahl created SOLR-4563:
-

 Summary: RSS DIH-example not working
 Key: SOLR-4563
 URL: https://issues.apache.org/jira/browse/SOLR-4563
 Project: Solr
  Issue Type: Bug
Affects Versions: 4.2
Reporter: Jan Høydahl
 Fix For: 4.3, 5.0
 Attachments: SOLR-4563.patch

The xpath expressions in the example use /rss/item, which does not match the 
real-world RSS feed; the feed uses /rss/channel/item.
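
The mismatch is easy to reproduce outside of DIH. The sketch below uses the JDK's javax.xml.xpath API against a tiny hand-written feed (the FEED string is made up for this example); DIH has its own XPath handling, so this only illustrates why the shorter path selects nothing.

```java
import java.io.StringReader;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathExpressionException;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

/** Counts how many nodes an XPath expression selects in a minimal RSS 2.0 document. */
final class RssItemPaths {
    private RssItemPaths() {}

    // Hand-written sample feed: items live under <channel>, as in RSS 2.0.
    static final String FEED =
        "<rss version=\"2.0\"><channel><title>feed</title>"
        + "<item><title>a</title></item><item><title>b</title></item>"
        + "</channel></rss>";

    static int countMatches(String expression) {
        try {
            XPath xpath = XPathFactory.newInstance().newXPath();
            NodeList nodes = (NodeList) xpath.evaluate(
                expression, new InputSource(new StringReader(FEED)), XPathConstants.NODESET);
            return nodes.getLength();
        } catch (XPathExpressionException e) {
            throw new IllegalArgumentException("bad XPath: " + expression, e);
        }
    }

    public static void main(String[] args) {
        System.out.println("/rss/item matches: " + countMatches("/rss/item"));
        System.out.println("/rss/channel/item matches: " + countMatches("/rss/channel/item"));
    }
}
```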


[jira] [Commented] (LUCENE-4713) SPI: Allow fallback to default ClassLoader if Thread#getContextClassLoader fails


[ 
https://issues.apache.org/jira/browse/LUCENE-4713?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13599977#comment-13599977
 ] 

Uwe Schindler commented on LUCENE-4713:
---

bq. Nit: What you could do to be 100% safe that we're using the correct 
ClassLoader is to check for loader==null in SPIClassIterator and assign it to 
ClassLoader.getSystemClassLoader() in this case.

I want to stay as close as possible to Java's original. This is not a problem at all: 
Class.forName(name, ..., null) automatically loads using the bootstrap / system 
loader.
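
A quick way to check what a null loader can see is the small sketch below. The class NullLoaderLookup is made up for this example. Per the Class.forName Javadoc, a null loader means the class is requested through the bootstrap class loader, so JDK core classes resolve; the example also prints the result for an application class so the difference can be observed directly.

```java
/** Probes which classes Class.forName can resolve when the loader argument is null. */
final class NullLoaderLookup {
    private NullLoaderLookup() {}

    static boolean visibleWithNullLoader(String className) {
        try {
            // initialize=false: only resolve the class, do not run static initializers.
            Class.forName(className, false, null);
            return true;
        } catch (ClassNotFoundException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        System.out.println("java.lang.String: " + visibleWithNullLoader("java.lang.String"));
        System.out.println("this class: " + visibleWithNullLoader(NullLoaderLookup.class.getName()));
    }
}
```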

 SPI: Allow fallback to default ClassLoader if Thread#getContextClassLoader 
 fails
 

 Key: LUCENE-4713
 URL: https://issues.apache.org/jira/browse/LUCENE-4713
 Project: Lucene - Core
  Issue Type: Improvement
Affects Versions: 4.0, 4.1, 4.2
Reporter: Christian Kohlschütter
Assignee: Uwe Schindler
Priority: Minor
  Labels: ClassLoader, Thread
 Fix For: 4.3

 Attachments: LUCENE-4713.patch, LUCENE-4713.patch, LUCENE-4713.patch, 
 LUCENE-4713.patch, LUCENE-4713.patch, LUCENE-4713.patch, 
 LuceneContextClassLoader.patch


 NOTE: This issue has been renamed from:
 Replace calls to Thread#getContextClassLoader with the ClassLoader of the 
 current class
 because the revised patch provides a clean fallback path.
 I am not sure whether it is a design decision or if we can indeed consider 
 this a bug:
 In core and analysis-common some classes provide on-demand class loading 
 using SPI. In NamedSPILoader, SPIClassIterator, ClasspathResourceLoader and 
 AnalysisSPILoader there are constructors that use the Thread's context 
 ClassLoader by default whenever no particular other ClassLoader was specified.
 Unfortunately this does not work as expected when the Thread's ClassLoader 
 can't see the required classes that are instantiated downstream with the help 
 of Class.forName (e.g., Codecs, Analyzers, etc.).
 That's what happened to us here. We currently experiment with running Lucene 
 2.9 and 4.x in one JVM, both being separated by custom ClassLoaders, each 
 seeing only the corresponding Lucene version and the upstream classpath.
 While NamedSPILoader and company get successfully loaded by our custom 
 ClassLoader, their instantiation fails because our Thread's 
 Context-ClassLoader cannot find the additionally required classes.
 We could probably work-around this by using Thread#setContextClassLoader at 
 construction time (and quickly reverting back afterwards), but I have the 
 impression this might just hide the actual problem and cause further trouble 
 when lazy-loading classes later on, and potentially from another Thread.
 Removing the call to Thread#getContextClassLoader would also align with the 
 behavior of AttributeSource.DEFAULT_ATTRIBUTE_FACTORY, which in fact uses 
 Attribute#getClass().getClassLoader() instead.
 A simple patch is attached. All tests pass.


Re: [ANNOUNCE] Apache Solr 4.2 released

2013-03-12 Thread Marthi, Suneel

We currently have indexes generated with Solr 4.1. What is the upgrade
path to Solr 4.2?



On 3/11/13 8:37 PM, Robert Muir rm...@apache.org wrote:

March 2013, Apache Solr 4.2 available
The Lucene PMC is pleased to announce the release of Apache Solr 4.2

Solr is the popular, blazing fast, open source NoSQL search platform
from the Apache Lucene project. Its major features include powerful
full-text search, hit highlighting, faceted search, dynamic
clustering, database integration, rich document (e.g., Word, PDF)
handling, and geospatial search.  Solr is highly scalable, providing
fault tolerant distributed search and indexing, and powers the search
and navigation features of many of the world's largest internet sites.

Solr 4.2 is available for immediate download at:
   http://lucene.apache.org/solr/mirrors-solr-latest-redir.html

See the CHANGES.txt file included with the release for a full list of
details.

Solr 4.2 Release Highlights:

* A read side REST API for the schema. Always wanted to introspect the
schema over http? Now you can. Looks like the write side will be
coming next.

* DocValues have been integrated into Solr. DocValues can be loaded up
a lot faster than the field cache and can also use different
compression algorithms as well as in RAM or on Disk representations.
Faceting, sorting, and function queries all get to benefit. How about
the OS handling faceting and sorting caches off heap? No more tuning
60 gigabyte heaps? How about a snappy new per segment DocValues
faceting method? Improved numeric faceting? Sweet.

* Collection Aliasing. Got time based data? Want to re-index in a
temporary collection and then swap it into production? Done. Stay
tuned for Shard Aliasing.

* Collection API responses. The collections API was still very new in
4.0, and while it improved a fair bit in 4.1, responses were certainly
needed, but missed the cut off. Initially, we made the decision to
make the Collection API super fault tolerant, which made responses
tougher to do. No one wants to hunt through log files to see how
things turned out. Done in 4.2.

* Interact with any collection on any node. Until 4.2, you could only
interact with a node in your cluster if it hosted at least one replica
of the collection you wanted to query/update. No longer - query any
node, whether it has a piece of your intended collection or not and
get a proxied response.

* Allow custom shard names so that new host addresses can take over
for retired shards. Working on Amazon without Elastic IPs? This is for
you.

* Lucene 4.2 optimizations such as compressed term vectors.

Solr 4.2 also includes many other new features as well as numerous
optimizations and bugfixes.

Please report any feedback to the mailing lists
(http://lucene.apache.org/solr/discussion.html)

Note: The Apache Software Foundation uses an extensive mirroring
network for distributing releases.  It is possible that the mirror you
are using may not have replicated the release yet.  If that is the
case, please try another mirror.  This also goes for Maven access.

Happy searching,
Lucene/Solr developers



[jira] [Commented] (SOLR-4563) RSS DIH-example not working

2013-03-12 Thread Walter Underwood (JIRA)


[ 
https://issues.apache.org/jira/browse/SOLR-4563?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13600234#comment-13600234
 ] 

Walter Underwood commented on SOLR-4563:


Given the wild variety of things called RSS, it is probably a better idea to 
parse Atom.



[jira] [Commented] (SOLR-4562) core selector not working in Chrome

2013-03-12 Thread Mark Miller (JIRA)


[ 
https://issues.apache.org/jira/browse/SOLR-4562?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13600249#comment-13600249
 ] 

Mark Miller commented on SOLR-4562:
---

As a data point, I've seen this work in Chrome on Linux and OS X.

 core selector not working in Chrome
 ---

 Key: SOLR-4562
 URL: https://issues.apache.org/jira/browse/SOLR-4562
 Project: Solr
  Issue Type: Bug
Affects Versions: 4.2
Reporter: Maciej Lizewski
 Attachments: Przechwytywanie.PNG


 After a fresh installation of Solr 4.2 on Windows 7 64-bit,
 I do not see any cores to select in the combobox in Google Chrome. Also, when 
 trying to build the URI by hand, I get an error that there is no such core. In 
 Firefox, the default 'collection1' core is visible without problems.
 My Chrome version: 26.0.1410.28 beta-m
 I cannot see any errors in the JS console...


Re: Improving DirectSpellChecker

2013-03-12 Thread Robert Muir

On Tue, Mar 12, 2013 at 9:39 AM, Varun Thacker
varunthacker1...@gmail.com wrote:
 Actually, that is what I ended up doing, although I thought this approach
 could have its merits.

 Just for argument's sake: if we could have complex analyzers on a field,
 wouldn't that give better recall for spell suggestions, at the cost of some
 precision? Would that be a bad idea? Also, DirectSpellChecker is probably
 not where this belongs. Maybe in SpellChecker or a new spell checker? Or do
 you think something like this should sit outside Lucene?

I think the idea makes sense (basically it would be like
analyzing/fuzzysuggester, but for spellchecking?)
So it could maybe even use the same data structures, but with different logic.

This means someone could use it to do spellchecking (not just suggest)
on languages like Japanese too.

So this would be a really nice option to add, in my opinion.

But DirectSpellChecker is pretty simple and essentially limited by
what the term dictionary can do.
So you can't use fancy data structures like FST weights; that's why I was
confused about the email.

The overall approach is a good idea though.


[jira] [Commented] (SOLR-4564) Admin UI fails to load properly on Chrome


[ 
https://issues.apache.org/jira/browse/SOLR-4564?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13600262#comment-13600262
 ] 

Steve Rowe commented on SOLR-4564:
--

I think this is a duplicate of SOLR-4562 - Aditya, what version of Windows?

 Admin UI fails to load properly on Chrome
 -

 Key: SOLR-4564
 URL: https://issues.apache.org/jira/browse/SOLR-4564
 Project: Solr
  Issue Type: Bug
  Components: web gui
Affects Versions: 4.2
 Environment: Jboss 7.1.1 and Solr 4.2 
Reporter: Aditya

 Admin UI fails to load the collection list on Chrome. The dropdown is empty. 
 Clicking on Logging and Threads throws JavaScript errors in the console: 
 GET http://10.124.55.84/solr/undefined/admin/logging?wt=json&since=0 404 (Not 
 Found) {require.js:10157}
 GET http://10.124.55.84/solr/undefined/admin/threads?wt=json 404 (Not Found) 
 require.js:10157
 Checked on IE9 and the UI looks good, but the Schema browser is sluggish while 
 searching fields. Every keystroke creates a pause for field look-up. We have 
 around 290 fields (including dynamic) defined in the schema. 


[jira] [Commented] (SOLR-4562) core selector not working in Chrome


[ 
https://issues.apache.org/jira/browse/SOLR-4562?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13600263#comment-13600263
 ] 

Steve Rowe commented on SOLR-4562:
--

SOLR-4564 looks like it's the same issue.



[jira] [Commented] (SOLR-4562) core selector not working in Chrome

2013-03-12 Thread Stefan Matheis (steffkes) (JIRA)


[ 
https://issues.apache.org/jira/browse/SOLR-4562?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13600290#comment-13600290
 ] 

Stefan Matheis (steffkes) commented on SOLR-4562:
-

[~redguy666] Did you upgrade from an earlier version? If so, can you try to 
clear your browser cache? We had this 
[Thread|http://mail-archives.apache.org/mod_mbox/lucene-solr-user/201303.mbox/%3ccaeemfb210ehgc9v5cjgj6yrjrkdwg+9roqpevfk4jtaq4tk...@mail.gmail.com%3E]
 on the list two weeks ago, and that solved the problem.



[jira] [Commented] (SOLR-4564) Admin UI fails to load properly on Chrome

2013-03-12 Thread Stefan Matheis (steffkes) (JIRA)


[ 
https://issues.apache.org/jira/browse/SOLR-4564?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13600296#comment-13600296
 ] 

Stefan Matheis (steffkes) commented on SOLR-4564:
-

[~abakle] Did you upgrade from an earlier version? If so, can you try to clear 
your browser cache? We had this 
[Thread|http://mail-archives.apache.org/mod_mbox/lucene-solr-user/201303.mbox/%3ccaeemfb210ehgc9v5cjgj6yrjrkdwg+9roqpevfk4jtaq4tk...@mail.gmail.com%3E]
 on the list two weeks ago, and that solved the problem.

For the Schema Browser: would you mind opening a separate issue and 
including the output of {{/solr/collection1/admin/luke?numTerms=0&wt=json}} and 
{{/solr/collection1/admin/luke?show=schema&wt=json}} as attachments? That would 
simplify testing with a real-world configuration.



[jira] [Updated] (SOLR-4562) core selector not working in Chrome

2013-03-12 Thread Stefan Matheis (steffkes) (JIRA)


 [ 
https://issues.apache.org/jira/browse/SOLR-4562?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Stefan Matheis (steffkes) updated SOLR-4562:


Component/s: web gui



[jira] [Commented] (SOLR-4465) Configurable Collectors

2013-03-12 Thread Greg Bowyer (JIRA)


[ 
https://issues.apache.org/jira/browse/SOLR-4465?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13600324#comment-13600324
 ] 

Greg Bowyer commented on SOLR-4465:
---

Does the CollectorSpec serve the same purpose as, say, the GroupingSpecification, 
that is, to provide underlying collectors (and the search in general) with the 
right requirements information?

I ask because maybe it would be easier to make the CollectorSpec support a map 
of String -> Object or String -> CollectorProperty.

I am trying to think how we can do grouping with this.

... but I might have misinterpreted what it's for.
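
To make the map suggestion concrete, here is a hypothetical sketch of a spec object backed by a String -> Object map. It is not the CollectorSpec class from the SOLR-4465 patch; the class name, methods, and property keys are all assumptions for illustration:

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical stand-in, NOT the CollectorSpec from the SOLR-4465 patch.
class CollectorSpecSketch {
    private final Map<String, Object> properties = new HashMap<>();

    // Stores a property and returns this spec so calls can be chained.
    CollectorSpecSketch set(String key, Object value) {
        properties.put(key, value);
        return this;
    }

    // Returns the property cast to the requested type, or the fallback when absent.
    <T> T get(String key, Class<T> type, T fallback) {
        Object value = properties.get(key);
        return value == null ? fallback : type.cast(value);
    }

    public static void main(String[] args) {
        // The keys below are made up; a grouping collector could read them like this.
        CollectorSpecSketch spec = new CollectorSpecSketch()
            .set("group.field", "category")
            .set("group.limit", 5);
        System.out.println(spec.get("group.field", String.class, null));
        System.out.println(spec.get("group.limit", Integer.class, 1));
    }
}
```

The trade-off of an untyped map is flexibility for new collectors at the cost of compile-time checking, which a String -> CollectorProperty variant might partly recover.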

 Configurable Collectors
 ---

 Key: SOLR-4465
 URL: https://issues.apache.org/jira/browse/SOLR-4465
 Project: Solr
  Issue Type: New Feature
  Components: search
Affects Versions: 4.1
Reporter: Joel Bernstein
 Fix For: 4.3

 Attachments: SOLR-4465.patch, SOLR-4465.patch, SOLR-4465.patch, 
 SOLR-4465.patch, SOLR-4465.patch, SOLR-4465.patch, SOLR-4465.patch, 
 SOLR-4465.patch, SOLR-4465.patch, SOLR-4465.patch, SOLR-4465.patch


 This issue is to add configurable custom collectors to Solr. This expands the 
 design and work done in issue SOLR-1680 to include:
 1) CollectorFactory configuration in solrconfig.xml
 2) Http parameters to allow clients to dynamically select a CollectorFactory 
 and construct a custom Collector.
 3) Make aspects of QueryComponent pluggable so that the output from 
 distributed search can conform with custom collectors at the shard level.


[jira] [Created] (LUCENE-4824) Query time join returns different results based on the field type

Akos Kitta created LUCENE-4824:
--

 Summary: Query time join returns different results based on the 
field type
 Key: LUCENE-4824
 URL: https://issues.apache.org/jira/browse/LUCENE-4824
 Project: Lucene - Core
  Issue Type: Bug
  Components: modules/join
Affects Versions: 4.1
Reporter: Akos Kitta


I'm experiencing different query time joining behavior based on the type of the 
'toField' and 'fromField'. Basically I got correct results when both 'toField' 
and 'fromField' are StringField, but incorrect in case of LongField.


[jira] [Updated] (LUCENE-4824) Query time join returns different results based on the field type


 [ 
https://issues.apache.org/jira/browse/LUCENE-4824?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Akos Kitta updated LUCENE-4824:
---

Attachment: QueryTimeJoinTest.java

Attaching simple test case.

 Query time join returns different results based on the field type
 -

 Key: LUCENE-4824
 URL: https://issues.apache.org/jira/browse/LUCENE-4824
 Project: Lucene - Core
  Issue Type: Bug
  Components: modules/join
Affects Versions: 4.1
Reporter: Akos Kitta
  Labels: newbie
 Attachments: QueryTimeJoinTest.java


 I'm experiencing different query time joining behavior based on the type of 
 the 'toField' and 'fromField'. Basically I got correct results when both 
 'toField' and 'fromField' are StringField, but incorrect in case of LongField.


[jira] [Updated] (LUCENE-4824) Query time join returns different results based on the field type


 [ 
https://issues.apache.org/jira/browse/LUCENE-4824?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Akos Kitta updated LUCENE-4824:
---

Attachment: (was: QueryTimeJoinTest.java)



[jira] [Updated] (LUCENE-4824) Query time join returns different results based on the field type


 [ 
https://issues.apache.org/jira/browse/LUCENE-4824?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Akos Kitta updated LUCENE-4824:
---

Attachment: QueryTimeJoinTest.java



[jira] [Commented] (SOLR-4562) core selector not working in Chrome


[ 
https://issues.apache.org/jira/browse/SOLR-4562?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13600352#comment-13600352
 ] 

Maciej Lizewski commented on SOLR-4562:
---

You were right. After clearing the browser cache everything works OK.
Sorry for the duplicate issue; I searched for something similar but did not find 
that one.

The funny thing is that earlier I tried refreshing the page with SHIFT, which 
*should* reload all resources from the server... :)



[jira] [Resolved] (SOLR-4562) core selector not working in Chrome


 [ 
https://issues.apache.org/jira/browse/SOLR-4562?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Maciej Lizewski resolved SOLR-4562.
---

Resolution: Not A Problem



[jira] [Commented] (LUCENE-4795) Add FacetsCollector based on SortedSetDocValues


[ 
https://issues.apache.org/jira/browse/LUCENE-4795?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13600355#comment-13600355
 ] 

Michael McCandless commented on LUCENE-4795:


{quote}
So rather perhaps we should:
  * Add a FacetField (extends SSDVF) which takes a CP (potentially 
FacetIndexingParams as well).
  * It will call super(CLP.DEFAULT_FIELD, new BytesRef(cp.toString())) (we can 
optimize that later, e.g. have CP expose a BytesRef API too if we want).
  * Potentially, allow (or not) to define the field type.
{quote}

I agree it's awkward now.

But ... FacetField makes me nervous, just because it's too close to
FacetFields and users may think they can mix & match the two
approaches.  It's trappy ... maybe SortedSetDocValuesFacetField
instead?

But you'd need to provide it with this separator... hmm, or maybe we
can use the same sep as FIP.

Separately, I wonder whether facet module should escape the delimiter
when it appears in a cat path label, in general (and, here)?  This way
the app does not have to ensure it never appears in any label (which I
think is tricky for some apps to do, eg a search server like
ElasticSearch/Solr can't do this).
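
One way the escaping idea could look, as a minimal sketch only: the delimiter '/' and escape character '\' below are assumptions, and the class is hypothetical rather than part of the facet module. Labels containing the delimiter survive a join/split round trip because both special characters are escaped on the way in:

```java
import java.util.ArrayList;
import java.util.List;

class PathEscapeSketch {
    static final char DELIM = '/';   // assumed path delimiter
    static final char ESCAPE = '\\'; // assumed escape character

    // Joins labels into one path string, escaping DELIM and ESCAPE inside labels.
    static String join(List<String> labels) {
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < labels.size(); i++) {
            if (i > 0) {
                sb.append(DELIM);
            }
            for (char c : labels.get(i).toCharArray()) {
                if (c == DELIM || c == ESCAPE) {
                    sb.append(ESCAPE);
                }
                sb.append(c);
            }
        }
        return sb.toString();
    }

    // Splits a path produced by join() back into its original labels.
    static List<String> split(String path) {
        List<String> labels = new ArrayList<>();
        StringBuilder current = new StringBuilder();
        for (int i = 0; i < path.length(); i++) {
            char c = path.charAt(i);
            if (c == ESCAPE && i + 1 < path.length()) {
                current.append(path.charAt(++i)); // take the escaped character literally
            } else if (c == DELIM) {
                labels.add(current.toString());   // an unescaped delimiter ends a label
                current.setLength(0);
            } else {
                current.append(c);
            }
        }
        labels.add(current.toString());
        return labels;
    }
}
```

With this scheme the application no longer has to guarantee that the delimiter never appears in a label.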

bq. Any reason why you don't get a hold of the returned FRN? 

I wanted to keep it simple for starters ... but I'll fix to reuse the
rejected entry.


 Add FacetsCollector based on SortedSetDocValues
 ---

 Key: LUCENE-4795
 URL: https://issues.apache.org/jira/browse/LUCENE-4795
 Project: Lucene - Core
  Issue Type: Improvement
  Components: modules/facet
Reporter: Michael McCandless
Assignee: Michael McCandless
 Attachments: LUCENE-4795.patch, LUCENE-4795.patch, LUCENE-4795.patch, 
 LUCENE-4795.patch, pleaseBenchmarkMe.patch


 Recently (LUCENE-4765) we added multi-valued DocValues field
 (SortedSetDocValuesField), and this can be used for faceting in Solr
 (SOLR-4490).  I think we should also add support in the facet module?
 It'd be an option with different tradeoffs.  Eg, it wouldn't require
 the taxonomy index, since the main index handles label/ord resolving.
 There are at least two possible approaches:
   * On every reopen, build the seg -> global ord map, and then on
 every collect, get the seg ord, map it to the global ord space,
 and increment counts.  This adds cost during reopen in proportion
 to number of unique terms ...
   * On every collect, increment counts based on the seg ords, and then
 do a merge in the end just like distributed faceting does.
 The first approach is much easier so I built a quick prototype using
 that.  The prototype does the counting, but it does NOT do the top K
 facets gathering in the end, and it doesn't know parent/child ord
 relationships, so there's tons more to do before this is real.  I also
 was unsure how to properly integrate it since the existing classes
 seem to expect that you use a taxonomy index to resolve ords.
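
The first approach can be sketched in plain Java as follows. This is an illustration under simplifying assumptions, not the prototype from the patch: each segment is modeled as a sorted list of unique terms whose index is the segment ord, and all names are hypothetical.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.TreeSet;

class GlobalOrdSketch {
    final List<String> globalTerms; // sorted union of all segment terms; index = global ord
    final int[][] segToGlobal;      // segToGlobal[seg][segOrd] = global ord

    // "Reopen" step: the cost grows with the number of unique terms.
    GlobalOrdSketch(List<List<String>> segmentTerms) {
        TreeSet<String> union = new TreeSet<>();
        for (List<String> terms : segmentTerms) {
            union.addAll(terms);
        }
        globalTerms = new ArrayList<>(union);
        segToGlobal = new int[segmentTerms.size()][];
        for (int seg = 0; seg < segmentTerms.size(); seg++) {
            List<String> terms = segmentTerms.get(seg);
            segToGlobal[seg] = new int[terms.size()];
            for (int ord = 0; ord < terms.size(); ord++) {
                segToGlobal[seg][ord] = Collections.binarySearch(globalTerms, terms.get(ord));
            }
        }
    }

    // "Collect" step: each hit is {segment, segOrd}; counts are kept in global ord space.
    int[] count(int[][] hits) {
        int[] counts = new int[globalTerms.size()];
        for (int[] hit : hits) {
            counts[segToGlobal[hit[0]][hit[1]]]++;
        }
        return counts;
    }
}
```

The second approach would instead keep one counts array per segment and merge them by term at the end.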
 I ran a quick performance test.  base = trunk except I disabled the
 compute top-K in FacetsAccumulator to make the comparison fair; comp
 = using the prototype collector in the patch:
 {noformat}
              Task   QPS base   StdDev   QPS comp   StdDev   Pct diff
         OrHighLow      18.79   (2.5%)      14.36   (3.3%)   -23.6% ( -28% -  -18%)
          HighTerm      21.58   (2.4%)      16.53   (3.7%)   -23.4% ( -28% -  -17%)
         OrHighMed      18.20   (2.5%)      13.99   (3.3%)   -23.2% ( -28% -  -17%)
           Prefix3      14.37   (1.5%)      11.62   (3.5%)   -19.1% ( -23% -  -14%)
           LowTerm     130.80   (1.6%)     106.95   (2.4%)   -18.2% ( -21% -  -14%)
        OrHighHigh       9.60   (2.6%)       7.88   (3.5%)   -17.9% ( -23% -  -12%)
       AndHighHigh      24.61   (0.7%)      20.74   (1.9%)   -15.7% ( -18% -  -13%)
            Fuzzy1      49.40   (2.5%)      43.48   (1.9%)   -12.0% ( -15% -   -7%)
   MedSloppyPhrase      27.06   (1.6%)      23.95   (2.3%)   -11.5% ( -15% -   -7%)
           MedTerm      51.43   (2.0%)      46.21   (2.7%)   -10.2% ( -14% -   -5%)
            IntNRQ       4.02   (1.6%)       3.63   (4.0%)    -9.7% ( -15% -   -4%)
          Wildcard      29.14   (1.5%)      26.46   (2.5%)    -9.2% ( -13% -   -5%)
  HighSloppyPhrase       0.92   (4.5%)       0.87   (5.8%)    -5.4% ( -15% -    5%)
       MedSpanNear      29.51   (2.5%)      27.94   (2.2%)    -5.3% (  -9% -    0%)
      HighSpanNear       3.55   (2.4%)       3.38   (2.0%)    -4.9% (  -9% -    0%)
        AndHighMed     108.34   (0.9%)     104.55   (1.1%)

[jira] [Created] (LUCENE-4825) PostingsHighlighter support for positional queries

2013-03-12 Thread Luca Cavanna (JIRA)

Luca Cavanna created LUCENE-4825:


 Summary: PostingsHighlighter support for positional queries
 Key: LUCENE-4825
 URL: https://issues.apache.org/jira/browse/LUCENE-4825
 Project: Lucene - Core
  Issue Type: Improvement
  Components: modules/highlighter
Affects Versions: 4.2
Reporter: Luca Cavanna


I've been playing around with the brand new PostingsHighlighter. I'm really 
happy with the result in terms of quality of the snippets and performance.
On the other hand, I noticed it doesn't support positional queries. If you make 
a span query, for example, all the single terms will be highlighted, even 
though they haven't contributed to the match. That reminds me of the difference 
between the QueryTermScorer and the QueryScorer (using the standard 
Highlighter).

I've been trying to adapt what the QueryScorer does, especially the extraction 
of the query terms together with their positions (what 
WeightedSpanTermExtractor does). Next step would be to take that information 
into account within the formatter and highlight only the terms that actually 
contributed to the match. I'm not quite ready yet with a patch to contribute 
this back, but I certainly intend to do so. That's why I opened the issue and 
in the meantime I would like to hear what you guys think about it and  discuss 
how best we can fix it. I think it would be a big improvement for this new 
highlighter, which is already great!
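
To illustrate the difference described above, here is a minimal sketch that
works on plain token positions instead of a real index. It is not the
PostingsHighlighter or WeightedSpanTermExtractor code; the class and method
names (PositionalHighlightSketch, termPositions, nearPositions) and the
"a followed by b within maxDistance positions" rule are assumptions made only
for illustration.

```java
import java.util.Set;
import java.util.TreeSet;

public class PositionalHighlightSketch {

  // Term-only highlighting: every occurrence of either term is marked,
  // whether or not it took part in a positional match.
  static Set<Integer> termPositions(String[] tokens, String a, String b) {
    Set<Integer> hits = new TreeSet<>();
    for (int i = 0; i < tokens.length; i++) {
      if (tokens[i].equals(a) || tokens[i].equals(b)) {
        hits.add(i);
      }
    }
    return hits;
  }

  // Position-aware highlighting: only occurrences where `a` is followed by
  // `b` within maxDistance positions are marked (a simplified "near" rule).
  static Set<Integer> nearPositions(String[] tokens, String a, String b, int maxDistance) {
    Set<Integer> hits = new TreeSet<>();
    for (int i = 0; i < tokens.length; i++) {
      if (!tokens[i].equals(a)) {
        continue;
      }
      for (int j = i + 1; j < tokens.length && j - i <= maxDistance; j++) {
        if (tokens[j].equals(b)) {
          hits.add(i);
          hits.add(j);
        }
      }
    }
    return hits;
  }

  public static void main(String[] args) {
    String[] tokens = "quick brown fox jumps over the lazy fox and a quick dog".split(" ");
    System.out.println("term-only:  " + termPositions(tokens, "quick", "fox"));
    System.out.println("positional: " + nearPositions(tokens, "quick", "fox", 2));
  }
}
```

In this toy input the second "fox" and the second "quick" match as single
terms but are not part of any near match, which is the over-highlighting the
issue describes.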



--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org

[jira] [Created] (LUCENE-4826) PostingsHighlighter doesn't keep the top N best scoring passages

Michael McCandless created LUCENE-4826:
--

 Summary: PostingsHighlighter doesn't keep the top N best scoring 
passages
 Key: LUCENE-4826
 URL: https://issues.apache.org/jira/browse/LUCENE-4826
 Project: Lucene - Core
  Issue Type: Bug
  Components: modules/highlighter
Reporter: Michael McCandless
 Fix For: 5.0, 4.3
 Attachments: LUCENE-4826.patch

The comparator we pass to the PQ is just backwards ...
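
For readers unfamiliar with the pattern: a bounded priority queue keeps the
top N entries only if its head is the lowest-scoring entry, because the head
is what gets evicted. The sketch below is not the attached patch. It uses
java.util.PriorityQueue rather than Lucene's own queue, and ScoredPassage and
topN are hypothetical names.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;
import java.util.PriorityQueue;

public class TopNPassages {

  // Hypothetical stand-in for a scored passage; not Lucene's Passage class.
  static final class ScoredPassage {
    final String text;
    final float score;

    ScoredPassage(String text, float score) {
      this.text = text;
      this.score = score;
    }
  }

  // Returns the texts of the n best-scoring passages, best first.
  static List<String> topN(List<ScoredPassage> passages, int n) {
    List<String> out = new ArrayList<>();
    if (n <= 0) {
      return out;
    }
    // Min-heap on score: the head is the WORST passage kept so far, so it is
    // the one that gets evicted when a better candidate arrives. With the
    // comparator reversed, the head would be the best passage instead.
    Comparator<ScoredPassage> byScore = Comparator.comparingDouble((ScoredPassage p) -> p.score);
    PriorityQueue<ScoredPassage> pq = new PriorityQueue<>(byScore);
    for (ScoredPassage p : passages) {
      if (pq.size() < n) {
        pq.add(p);
      } else if (p.score > pq.peek().score) {
        pq.poll();
        pq.add(p);
      }
    }
    List<ScoredPassage> kept = new ArrayList<>(pq);
    kept.sort(byScore.reversed());
    for (ScoredPassage p : kept) {
      out.add(p.text);
    }
    return out;
  }

  public static void main(String[] args) {
    List<ScoredPassage> passages = new ArrayList<>();
    passages.add(new ScoredPassage("Great Sentence.", 3.0f));
    passages.add(new ScoredPassage("Crappy Sentence.", 1.0f));
    passages.add(new ScoredPassage("Good Sentence.", 2.0f));
    System.out.println(topN(passages, 2));
  }
}
```

With a reversed comparator in this sketch, "Good Sentence." would be compared
against the best passage at the head and rejected, leaving the worst passage
in the result.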


[jira] [Updated] (LUCENE-4826) PostingsHighlighter doesn't keep the top N best scoring passages


 [ 
https://issues.apache.org/jira/browse/LUCENE-4826?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Michael McCandless updated LUCENE-4826:
---

Attachment: LUCENE-4826.patch


[jira] [Commented] (LUCENE-4825) PostingsHighlighter support for positional queries

2013-03-12 Thread Robert Muir (JIRA)

[
https://issues.apache.org/jira/browse/LUCENE-4825?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13600404#comment-13600404
]

Robert Muir commented on LUCENE-4825:
-

I think it supports positional queries, just in a different way.

I don't really like the way the standard Highlighter does this myself. I would
prefer that this highlighter avoid the slow things those highlighters do
(because we already have other ones that do that). This one instead puts more
effort into summarizing the document with respect to the query terms (which is
faster and, for some cases, a better use of CPU time).

I think a good improvement would be to let the proximity of terms within
passages influence the scoring. It's not necessary to actually gather anything
about the query to do this, it wouldn't be confusing, and it would still
support all queries that support extractTerms().

On the other hand, we can always create variants of this highlighter that do as
you suggest, so that the user is left with more choices. But I would just
prefer that we don't try to force PostingsHighlighter to work like the other
highlighters, for the reasons I mentioned.
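
One way the proximity idea could look, as a rough sketch only: given the
positions of the term matches inside a passage, add a bonus whenever two
different terms occur close together. The formula (1 / gap between adjacent
matches of different terms) and the names ProximityBoostSketch and
proximityBonus are assumptions for illustration; nothing like this is
confirmed to exist in PostingsHighlighter.

```java
public class ProximityBoostSketch {

  // positions: match positions inside one passage, strictly increasing.
  // termIds:   which query term matched at each of those positions.
  // Returns a bonus that grows as matches of different terms get closer.
  static double proximityBonus(int[] positions, int[] termIds) {
    if (positions.length != termIds.length) {
      throw new IllegalArgumentException("positions and termIds must have the same length");
    }
    double bonus = 0.0;
    for (int k = 1; k < positions.length; k++) {
      // Repeats of the same term say nothing about proximity between terms.
      if (termIds[k] != termIds[k - 1]) {
        int gap = positions[k] - positions[k - 1];
        bonus += 1.0 / gap;
      }
    }
    return bonus;
  }

  public static void main(String[] args) {
    // Two different terms next to each other versus ten positions apart.
    System.out.println(proximityBonus(new int[] {3, 4}, new int[] {0, 1}));
    System.out.println(proximityBonus(new int[] {3, 13}, new int[] {0, 1}));
  }
}
```

Such a bonus needs only the match positions and term identities, not the
structure of the query, so it would apply to any query that exposes its terms.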


[jira] [Updated] (SOLR-4465) Configurable Collectors

2013-03-12 Thread Joel Bernstein (JIRA)

[
https://issues.apache.org/jira/browse/SOLR-4465?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]

Joel Bernstein updated SOLR-4465:
-

Attachment: SOLR-4465.patch

Added support for delegating collectors.

This design allows for a topdocs collector to be wrapped by delegating
collectors.

The topdocs collector collects the doclist and docset. The delegating
collectors are designed to collect aggregate data of some kind.

The delegating collectors have access to the ResponseBuilder and through that
can add Maps directly to the SolrQueryResponse.

Both the topdocs collector and the delegating collectors take part in the merge
of distributed results from shards.

This paves the way for pluggable distributed analytics to be included with
search results.

TODO: I believe Maps that are placed in the SolrQueryResponse are automatically
output, but some work needs to be done to get them read in by the SolrJ
QueryResponse class so they can be merged.
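
The wrapping design described above can be sketched roughly as follows. This
is not the code in the attached patch: SimpleCollector is a simplified,
hypothetical interface (Lucene's real Collector API has more methods), and
DocListCollector, CountingCollector and mergeCounts are illustrative names.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

public class DelegatingCollectorSketch {

  // Simplified, hypothetical collector interface.
  interface SimpleCollector {
    void collect(int doc, float score);
  }

  // Plays the role of the topdocs collector: it only gathers the doc list.
  static final class DocListCollector implements SimpleCollector {
    final List<Integer> docs = new ArrayList<>();

    @Override
    public void collect(int doc, float score) {
      docs.add(doc);
    }
  }

  // Delegating collector: gathers aggregate data, then forwards every hit
  // to the collector it wraps.
  static final class CountingCollector implements SimpleCollector {
    private final SimpleCollector delegate;
    private final Map<Integer, String> categoryByDoc;
    final Map<String, Integer> counts = new HashMap<>();

    CountingCollector(SimpleCollector delegate, Map<Integer, String> categoryByDoc) {
      this.delegate = delegate;
      this.categoryByDoc = categoryByDoc;
    }

    @Override
    public void collect(int doc, float score) {
      counts.merge(categoryByDoc.getOrDefault(doc, "unknown"), 1, Integer::sum);
      delegate.collect(doc, score);
    }
  }

  // Merge step for distributed search: sums the per-shard aggregate maps.
  static Map<String, Integer> mergeCounts(List<Map<String, Integer>> shardCounts) {
    Map<String, Integer> merged = new HashMap<>();
    for (Map<String, Integer> shard : shardCounts) {
      for (Map.Entry<String, Integer> e : shard.entrySet()) {
        merged.merge(e.getKey(), e.getValue(), Integer::sum);
      }
    }
    return merged;
  }
}
```

In this sketch the aggregate map on the delegating collector is what would be
added to the response, and mergeCounts stands in for the merge of results
coming back from the shards.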

Configurable Collectors
---

Key: SOLR-4465
URL: https://issues.apache.org/jira/browse/SOLR-4465
Project: Solr
Issue Type: New Feature
Components: search
Affects Versions: 4.1
Reporter: Joel Bernstein
Fix For: 4.3

Attachments: SOLR-4465.patch, SOLR-4465.patch, SOLR-4465.patch,
SOLR-4465.patch, SOLR-4465.patch, SOLR-4465.patch, SOLR-4465.patch,
SOLR-4465.patch, SOLR-4465.patch, SOLR-4465.patch, SOLR-4465.patch,
SOLR-4465.patch

This issue is to add configurable custom collectors to Solr. This expands the
design and work done in issue SOLR-1680 to include:
1) CollectorFactory configuration in solrconfig.xml
2) HTTP parameters to allow clients to dynamically select a CollectorFactory
and construct a custom Collector.
3) Make aspects of QueryComponent pluggable so that the output from
distributed search can conform with custom collectors at the shard level.


[jira] [Commented] (LUCENE-4826) PostingsHighlighter doesn't keep the top N best scoring passages

2013-03-12 Thread Robert Muir (JIRA)


[ 
https://issues.apache.org/jira/browse/LUCENE-4826?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13600411#comment-13600411
 ] 

Robert Muir commented on LUCENE-4826:
-

+1!

Here is a smaller test: in order to trick it into failing, you must have
something like
"Great Sentence. Crappy Sentence. Good Sentence."

otherwise they never make it into the PQ to demonstrate the bug...

{code}
  public void testPassageRanking() throws Exception {
    Directory dir = newDirectory();
    IndexWriterConfig iwc = newIndexWriterConfig(TEST_VERSION_CURRENT,
        new MockAnalyzer(random(), MockTokenizer.SIMPLE, true));
    iwc.setMergePolicy(newLogMergePolicy());
    RandomIndexWriter iw = new RandomIndexWriter(random(), dir, iwc);

    // PostingsHighlighter needs offsets stored in the postings.
    FieldType offsetsType = new FieldType(TextField.TYPE_STORED);
    offsetsType.setIndexOptions(IndexOptions.DOCS_AND_FREQS_AND_POSITIONS_AND_OFFSETS);
    Field body = new Field("body", "", offsetsType);
    Document doc = new Document();
    doc.add(body);

    body.setStringValue("This is a test.  Just highlighting from postings. This is also a much sillier test.  Feel free to test test test test test test test.");
    iw.addDocument(doc);

    IndexReader ir = iw.getReader();
    iw.close();

    IndexSearcher searcher = newSearcher(ir);
    PostingsHighlighter highlighter = new PostingsHighlighter();
    Query query = new TermQuery(new Term("body", "test"));
    TopDocs topDocs = searcher.search(query, null, 10, Sort.INDEXORDER);
    assertEquals(1, topDocs.totalHits);
    // Ask for the 2 best passages of the single matching document.
    String snippets[] = highlighter.highlight("body", query, searcher, topDocs, 2);
    assertEquals(1, snippets.length);
    assertEquals("This is a <b>test</b>.  ... Feel free to <b>test</b> <b>test</b> <b>test</b> <b>test</b> <b>test</b> <b>test</b> <b>test</b>.", snippets[0]);

    ir.close();
    dir.close();
  }
{code}


[jira] [Created] (SOLR-4565) Extend NorwegianMinimalStemFilter to handle nynorsk