Re: initVM segmentation fault
Hi Andi,

Thanks for getting back. I got it resolved already by just compiling from source the files from the download page. The JCC and pylucene on the box were installed separately from Fedora16 repositories and it seems like they are broken somewhat.

Jeune

On Mar 12, 2013 7:07 PM, Andi Vajda va...@apache.org wrote:

On Mar 12, 2013, at 2:51, Jeune Asuncion je...@bright.com wrote:

Dear *Pylucene*,

I am trying to run pylucene on my Fedora box but I get a segmentation fault when I do so. I was able to trace the cause of this error to initVM(). In the Python interpreter when I execute the lines of code below I get the segmentation fault:

    import lucene
    lucene.initVM()
    Segmentation fault

I thought this was because jcc isn't installed because I have pylucene installed on another box and it returns a jcc object. However, I have jcc installed as well on the box where lucene.initVM() isn't working:

    import jcc
    jcc.initVM()
    <jcc.JCCEnv object at 0x7f7162e12138>

Would like to get some pointers as to why this is happening.

Did you build PyLucene and JCC on this box ?

Andi..

Thanks,
Jeune
Re: initVM segmentation fault
On Mar 12, 2013, at 12:10, Jeune Asuncion je...@bright.com wrote:

Hi Andi,

Thanks for getting back. I got it resolved already by just compiling from source the files from the download page. The JCC and pylucene on the box were installed separately from Fedora16 repositories and it seems like they are broken somewhat.

Excellent !

Andi..
[jira] [Updated] (SOLR-4561) CachedSqlEntityProcessor with parametarized query is broken
[ https://issues.apache.org/jira/browse/SOLR-4561?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Sudheer Prem updated SOLR-4561:
---

Description:
When child entities are created and the child entity is provided with a parametrized query as below,

{code:xml}
<entity name="x" query="select * from x">
  <entity name="y" query="select * from y where xid=${x.id}" processor="CachedSqlEntityProcessor">
  </entity>
</entity>
{code}

the entity processor always returns the result of the first query, even though the parameter has changed. This happens because the EntityProcessorBase.getNext() method doesn't reset the query and rowIterator after calling the DIHCacheSupport.getCacheData() method.

This can be fixed by changing the else block in the getNext() method of EntityProcessorBase from

{code}
else {
  return cacheSupport.getCacheData(context, query, rowIterator);
}
{code}

to the code below:

{code}
else {
  Map<String,Object> cacheData = cacheSupport.getCacheData(context, query, rowIterator);
  query = null;
  rowIterator = null;
  return cacheData;
}
{code}

Update: But then, the caching doesn't seem to be working...

was: the description above, without the final "Update:" line.

CachedSqlEntityProcessor with parametarized query is broken
---

Key: SOLR-4561
URL: https://issues.apache.org/jira/browse/SOLR-4561
Project: Solr
Issue Type: Bug
Components: contrib - DataImportHandler
Affects Versions: 4.1
Reporter: Sudheer Prem
Original Estimate: 1m
Remaining Estimate: 1m

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org
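The failure mode reported in SOLR-4561 can be illustrated with a toy model. This is only a sketch and not Solr code: the class `StaleQuerySketch`, the `Processor` class, the `TABLE_Y` map and the `resetAfterRead` flag are all hypothetical names. It assumes, as the report states, that the resolved query is kept between calls unless it is explicitly reset, and it does not attempt to model the caching behaviour questioned in the "Update:" note.

```java
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Toy model (not Solr code) of a processor that keeps its resolved query between calls. */
class StaleQuerySketch {

    /** Hypothetical child table y: resolved query string -> matching rows. */
    static final Map<String, List<String>> TABLE_Y = Map.of(
            "select * from y where xid=1", List.of("y-a", "y-b"),
            "select * from y where xid=2", List.of("y-c"));

    static class Processor {
        private final boolean resetAfterRead;
        private final Map<String, List<String>> cache = new HashMap<>();
        private String query; // resolved query kept between calls

        Processor(boolean resetAfterRead) {
            this.resetAfterRead = resetAfterRead;
        }

        /** Returns the child rows for one parent row. */
        List<String> rowsFor(String parentId) {
            if (query == null) {
                // The parameter is only substituted when no resolved query is pending.
                query = "select * from y where xid=" + parentId;
            }
            List<String> rows =
                    cache.computeIfAbsent(query, q -> TABLE_Y.getOrDefault(q, List.of()));
            if (resetAfterRead) {
                // Mirrors "query = null; rowIterator = null;" in the proposed change.
                query = null;
            }
            return rows;
        }
    }

    public static void main(String[] args) {
        Processor buggy = new Processor(false);
        System.out.println(buggy.rowsFor("1") + " " + buggy.rowsFor("2"));

        Processor fixed = new Processor(true);
        System.out.println(fixed.rowsFor("1") + " " + fixed.rowsFor("2"));
    }
}
```

Without the reset, the second parent row receives the rows that belong to the first one; with the reset, each parent row resolves its own query.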
[jira] [Commented] (LUCENE-4795) Add FacetsCollector based on SortedSetDocValues
[ https://issues.apache.org/jira/browse/LUCENE-4795?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13599779#comment-13599779 ]

Shai Erera commented on LUCENE-4795:
---

bq. Well the taxonomy index doesn't give you global ordinals. it gives you global termIDs, which are unique integers: but they aren't ordinals

That's right. I am not familiar with how Solr utilizes that, but I agree with your statement. The term ordinal was derived from the fact that the taxonomy does preserve order between parent/children. I.e. Date < Date/2010 < Date/2011? No: Date, Date/2010, Date/2011. So Date will always have a lower ordinal than its children, but there is no meaningful order between siblings.

bq. Its also unclear to me how the taxonomy index would really integrate in a distributed system like solr or elasticsearch.

Why? We work with the taxonomy index in two modes in a distributed environment:

# Every shard maintains its own taxonomy index and facets are merged by their label. That's basically what Solr/ES/SortedSet would do, right?
# In a specific project we run, where every document goes through a MapReduce analysis (no NRT!), we maintain a truly global taxonomy index, where ordinal=17 means the same category in all shards. The taxonomy index itself is replicated to all shards. There are tradeoffs of course, but you cannot do that with SortedSet, right? The advantage is that you can do the merge by the ordinal (integer ID), rather than the label.

bq. I personally don't think its the end of the world if Mike's patch doesnt support all the features of the faceting module initially or even ever.

+1, I am not criticizing that approach. I personally don't understand why the sidecar taxonomy index freaks the hell out of people, but I don't mind if there are multiple facet implementations. I can share with you that we used to have a few implementations too, before we converged to one (and then contributed it to Lucene).

You didn't answer my question though, and perhaps it doesn't belong in this issue, but is there a way to utilize the ordinal given to a DV value somehow? Or is it internal to the SortedSet DV?

Mike, should you also check in SortedSetDocValuesAccumulator that FR.getDepth() == 1? I don't think that you support counting up to depth N, right?

Add FacetsCollector based on SortedSetDocValues
---

Key: LUCENE-4795
URL: https://issues.apache.org/jira/browse/LUCENE-4795
Project: Lucene - Core
Issue Type: Improvement
Components: modules/facet
Reporter: Michael McCandless
Assignee: Michael McCandless
Attachments: LUCENE-4795.patch, LUCENE-4795.patch, LUCENE-4795.patch, LUCENE-4795.patch, pleaseBenchmarkMe.patch

Recently (LUCENE-4765) we added multi-valued DocValues field (SortedSetDocValuesField), and this can be used for faceting in Solr (SOLR-4490). I think we should also add support in the facet module?

It'd be an option with different tradeoffs. Eg, it wouldn't require the taxonomy index, since the main index handles label/ord resolving.

There are at least two possible approaches:

* On every reopen, build the seg -> global ord map, and then on every collect, get the seg ord, map it to the global ord space, and increment counts. This adds cost during reopen in proportion to number of unique terms ...
* On every collect, increment counts based on the seg ords, and then do a merge in the end just like distributed faceting does.

The first approach is much easier so I built a quick prototype using that. The prototype does the counting, but it does NOT do the top K facets gathering in the end, and it doesn't know parent/child ord relationships, so there's tons more to do before this is real. I also was unsure how to properly integrate it since the existing classes seem to expect that you use a taxonomy index to resolve ords.

I ran a quick performance test. base = trunk except I disabled the compute top-K in FacetsAccumulator to make the comparison fair; comp = using the prototype collector in the patch:

{noformat}
Task          QPS base  StdDev   QPS comp  StdDev   Pct diff
OrHighLow        18.79  (2.5%)      14.36  (3.3%)   -23.6% ( -28% - -18%)
HighTerm         21.58  (2.4%)      16.53  (3.7%)   -23.4% ( -28% - -17%)
OrHighMed        18.20  (2.5%)      13.99  (3.3%)   -23.2% ( -28% - -17%)
Prefix3          14.37  (1.5%)      11.62  (3.5%)   -19.1% ( -23% - -14%)
LowTerm         130.80  (1.6%)     106.95  (2.4%)   -18.2% ( -21% - -14%)
OrHighHigh        9.60  (2.6%)       7.88  (3.5%)   -17.9% ( -23% - -12%)
AndHighHigh
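The first approach listed in the issue description (build a segment-to-global ordinal map on reopen, then map each segment ordinal on collect and increment a count) can be sketched in plain Java. This is only an illustration under stated assumptions, not the code in the attached patch: `GlobalOrdSketch`, `segToGlobal` and `count` are hypothetical names, each segment is modelled as a sorted array of unique labels, and labels are ordered with `String`'s natural ordering rather than Lucene's byte order.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;
import java.util.TreeSet;

/** Toy model of counting facets through a segment -> global ordinal map. */
class GlobalOrdSketch {
    final List<String> globalTerms; // global ordinal -> label
    final int[][] segToGlobal;      // [segment][segment ordinal] -> global ordinal

    /** "Reopen" step: cost grows with the number of unique terms across segments. */
    GlobalOrdSketch(String[][] segTerms) {
        TreeSet<String> merged = new TreeSet<>();
        for (String[] terms : segTerms) {
            merged.addAll(Arrays.asList(terms));
        }
        globalTerms = new ArrayList<>(merged);
        segToGlobal = new int[segTerms.length][];
        for (int seg = 0; seg < segTerms.length; seg++) {
            segToGlobal[seg] = new int[segTerms[seg].length];
            for (int ord = 0; ord < segTerms[seg].length; ord++) {
                // Every segment term is present in the merged list, so the index is >= 0.
                segToGlobal[seg][ord] = Collections.binarySearch(globalTerms, segTerms[seg][ord]);
            }
        }
    }

    /** "Collect" step: docOrds[segment][doc] holds the segment ordinals of one document. */
    int[] count(int[][][] docOrds) {
        int[] counts = new int[globalTerms.size()];
        for (int seg = 0; seg < docOrds.length; seg++) {
            for (int[] doc : docOrds[seg]) {
                for (int segOrd : doc) {
                    counts[segToGlobal[seg][segOrd]]++;
                }
            }
        }
        return counts;
    }
}
```

The second approach would instead keep one count array per segment, indexed by segment ordinal, and merge the arrays by label at the end.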
[jira] [Updated] (SOLR-3755) shard splitting
[ https://issues.apache.org/jira/browse/SOLR-3755?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Anshum Gupta updated SOLR-3755:
---

Attachment: SOLR-3755-combinedWithReplication.patch

Added replica creation to the earlier 'combined' patch that Shalin had put up. This is yet to be tested as we're yet to fix the 2nd core creation issue.

shard splitting
---

Key: SOLR-3755
URL: https://issues.apache.org/jira/browse/SOLR-3755
Project: Solr
Issue Type: New Feature
Components: SolrCloud
Reporter: Yonik Seeley
Attachments: SOLR-3755-combined.patch, SOLR-3755-combinedWithReplication.patch, SOLR-3755-CoreAdmin.patch, SOLR-3755.patch, SOLR-3755.patch, SOLR-3755-testSplitter.patch, SOLR-3755-testSplitter.patch

We can currently easily add replicas to handle increases in query volume, but we should also add a way to add additional shards dynamically by splitting existing shards.
Re: JVM hanging when using G1GC on JDK8 b78 or b79 (Linux 32 bit)
Uwe you rock! Beside my morning entertainment this was an awesome job!

simon

On Tue, Mar 12, 2013 at 8:31 AM, Tommaso Teofili tommaso.teof...@gmail.com wrote:

thanks Uwe!

2013/3/12 Robert Muir rcm...@gmail.com

Uwe: Thanks for working with them to get all these issues fixed.

On Mon, Mar 11, 2013 at 7:34 PM, Uwe Schindler u...@thetaphi.de wrote:

Hi,

FYI, Oracle has a fix for the G1GC hang in UIMA waiting for review:

Issue: http://bugs.sun.com/bugdatabase/view_bug.do?bug_id=8009536
Webrev: http://mail.openjdk.java.net/pipermail/hotspot-gc-dev/2013-March/006215.html
Patch: http://cr.openjdk.java.net/~johnc/8009536/webrev.0/

Thanks to John Cuthbertson and Bengt Rutisson @ Oracle for fixing so fast! We just have to wait for a new JDK8 build with that fix included (and some more for the other Lucene-related bugs).

Uwe

-
Uwe Schindler
H.-H.-Meier-Allee 63, D-28213 Bremen
http://www.thetaphi.de
eMail: u...@thetaphi.de

-Original Message-
From: Mark Miller [mailto:markrmil...@gmail.com]
Sent: Wednesday, March 06, 2013 7:52 PM
To: dev@lucene.apache.org
Subject: Re: JVM hanging when using G1GC on JDK8 b78 or b79 (Linux 32 bit)

Awesome work Uwe! Nice job getting this some attention.

- mark

On Mar 6, 2013, at 10:41 AM, Uwe Schindler u...@thetaphi.de wrote:

It seems that there is already an explanation from the Oracle engineer:

-Original Message-
From: John Cuthbertson [mailto:john.cuthbert...@oracle.com]
Sent: Wednesday, March 06, 2013 7:04 PM
To: Thomas Schatzl
Cc: Uwe Schindler; hotspot-gc-...@openjdk.java.net; 'David Holmes'; 'Dawid Weiss'; hotspot-...@openjdk.java.net
Subject: Re: JVM hanging when using G1GC on JDK8 b78 or b79 (Linux 32 bit)

Hi Everyone,

All: I've looked at the bug report (haven't tried to reproduce it yet) and Bengt's analysis is correct. The concurrent mark thread is entering the synchronization protocol in a marking step call. That code is waiting for some non-existent workers to terminate before proceeding. Normally we shouldn't be entering that code but I think we overflowed the global marking stack (I updated the CR at ~1am my time with that conjecture). I think I missed a set_phase() call to tell the parallel terminator that we only have one thread and it's picking up the number of workers that executed the remark parallel task.

Thomas: you were on the right track with your comment about the marking stack size.

David: Thanks for helping out here. The stack trace you mentioned was for one of the refinement threads - a concurrent GC thread. When a concurrent GC thread joins the suspendible thread set, it means that it will observe and participate in safepoint operations, i.e. the thread will notice that it should reach a safepoint and the safepoint synchronizer code will wait for it to block. When we wish a concurrent GC thread to not observe safepoints, that thread leaves the suspendible thread set. I think the name could be a bit better and Tony, before he left, had a change that used a scoped object to join and leave the STS that hasn't been integrated yet. IIRC Tony wasn't happy with the name he chose for that also.

Uwe: Thanks for bringing this up and my apologies for not replying sooner. I will have a fix fairly soon. If I'm correct about it being caused by overflowing the marking stack you can work around the issue by increasing the MarkStackSize. You could try increasing it to 2M or 4M entries (which is the current max size).

Cheers,
JohnC

-Original Message-
From: Uwe Schindler [mailto:u...@thetaphi.de]
Sent: Wednesday, March 06, 2013 1:35 PM
To: dev@lucene.apache.org
Subject: FW: JVM hanging when using G1GC on JDK8 b78 or b79 (Linux 32 bit)

They already understood the G1GC problem with JDK 8 b78/b79 and are working on a fix. This was really fast:
http://mail.openjdk.java.net/pipermail/hotspot-gc-dev/2013-March/006128.html

Uwe
[jira] [Commented] (LUCENE-4713) SPI: Allow fallback to default ClassLoader if Thread#getContextClassLoader fails
[ https://issues.apache.org/jira/browse/LUCENE-4713?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13599837#comment-13599837 ]

Uwe Schindler commented on LUCENE-4713:
---

Use Codecs.reloadCodecs(antClassLoader) in your application initialization code. The same method exists for PostingsFormats. Unfortunately there is no method to automatically reload all SPIs in Lucene.

SPI: Allow fallback to default ClassLoader if Thread#getContextClassLoader fails
---

Key: LUCENE-4713
URL: https://issues.apache.org/jira/browse/LUCENE-4713
Project: Lucene - Core
Issue Type: Improvement
Affects Versions: 4.0, 4.1, 4.2
Reporter: Christian Kohlschütter
Priority: Minor
Labels: ClassLoader, Thread
Attachments: LUCENE-4713.patch, LuceneContextClassLoader.patch

NOTE: This issue has been renamed from: "Replace calls to Thread#getContextClassLoader with the ClassLoader of the current class" because the revised patch provides a clean fallback path.

I am not sure whether it is a design decision or if we can indeed consider this a bug:

In core and analysis-common some classes provide on-demand class loading using SPI. In NamedSPILoader, SPIClassIterator, ClasspathResourceLoader and AnalysisSPILoader there are constructors that use the Thread's context ClassLoader by default whenever no particular other ClassLoader was specified.

Unfortunately this does not work as expected when the Thread's ClassLoader can't see the required classes that are instantiated downstream with the help of Class.forName (e.g., Codecs, Analyzers, etc.).

That's what happened to us here. We currently experiment with running Lucene 2.9 and 4.x in one JVM, both being separated by custom ClassLoaders, each seeing only the corresponding Lucene version and the upstream classpath.

While NamedSPILoader and company get successfully loaded by our custom ClassLoader, their instantiation fails because our Thread's Context-ClassLoader cannot find the additionally required classes.

We could probably work-around this by using Thread#setContextClassLoader at construction time (and quickly reverting back afterwards), but I have the impression this might just hide the actual problem and cause further trouble when lazy-loading classes later on, and potentially from another Thread.

Removing the call to Thread#getContextClassLoader would also align with the behavior of AttributeSource.DEFAULT_ATTRIBUTE_FACTORY, which in fact uses Attribute#getClass().getClassLoader() instead.

A simple patch is attached. All tests pass.
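The work-around mentioned in the issue description (set the context ClassLoader at construction time and revert it right afterwards) might look roughly like the sketch below. It uses only JDK calls; the class name `ContextClassLoaderSwap` and the helper `withContextClassLoader` are hypothetical and not part of Lucene. As the reporter notes, this only covers code that runs inside the action on the current thread, so classes loaded lazily later, or from another thread, are not helped.

```java
import java.util.function.Supplier;

/** Hypothetical helper: run an action with a temporary thread context ClassLoader. */
class ContextClassLoaderSwap {

    static <T> T withContextClassLoader(ClassLoader loader, Supplier<T> action) {
        Thread current = Thread.currentThread();
        ClassLoader previous = current.getContextClassLoader();
        current.setContextClassLoader(loader);
        try {
            // Anything that consults the context ClassLoader in here sees 'loader'.
            return action.get();
        } finally {
            // Always restore the previous loader, even if the action throws.
            current.setContextClassLoader(previous);
        }
    }
}
```

A caller would wrap the construction of the SPI-dependent object in the action, passing the ClassLoader that can see the matching Lucene version.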
[jira] [Commented] (LUCENE-4713) SPI: Allow fallback to default ClassLoader if Thread#getContextClassLoader fails
[ https://issues.apache.org/jira/browse/LUCENE-4713?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13599842#comment-13599842 ]

Uwe Schindler commented on LUCENE-4713:
---

From my perspective, Christian Kohlschütter's suggestion is nice to have. We should at least enforce that the classloader that loaded the lucene-core.jar file is also scanned, regardless of what the context class loader is - this would somehow emulate what the JDK does with its own extensions like XML parsers. In any case, we would need to decide which one to scan first (the Lucene class loader or the context one). I will provide a patch.

SPI: Allow fallback to default ClassLoader if Thread#getContextClassLoader fails
---

Key: LUCENE-4713
URL: https://issues.apache.org/jira/browse/LUCENE-4713
Project: Lucene - Core
Issue Type: Improvement
Affects Versions: 4.0, 4.1, 4.2
Reporter: Christian Kohlschütter
Priority: Minor
Labels: ClassLoader, Thread
Attachments: LUCENE-4713.patch, LuceneContextClassLoader.patch
[jira] [Updated] (LUCENE-4713) SPI: Allow fallback to default ClassLoader if Thread#getContextClassLoader fails
[ https://issues.apache.org/jira/browse/LUCENE-4713?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Uwe Schindler updated LUCENE-4713:
---

Attachment: LUCENE-4713.patch

This is the easiest patch possible. Still lacks some documentation (to actually document that the Lucene class loader is scanned), but ensures that at least all SPIs shipped with Lucene are visible. If a user has additional SPIs outside Lucene core, then it's his turn to make them correctly available.

The Lucene classloader is scanned before the core one, because the classes shipped with lucene should take precedence. On the other hand, this makes it impossible to override Lucene's default codec unless you place the jar file next to lucene-core.jar in same classloader.

SPI: Allow fallback to default ClassLoader if Thread#getContextClassLoader fails
---

Key: LUCENE-4713
URL: https://issues.apache.org/jira/browse/LUCENE-4713
Project: Lucene - Core
Issue Type: Improvement
Affects Versions: 4.0, 4.1, 4.2
Reporter: Christian Kohlschütter
Priority: Minor
Labels: ClassLoader, Thread
Attachments: LUCENE-4713.patch, LUCENE-4713.patch, LuceneContextClassLoader.patch
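The scan order described in this comment (the ClassLoader that loaded the library first, then the thread's context ClassLoader, with the first provider found taking precedence) can be sketched with the JDK's `java.util.ServiceLoader`. This is an assumption-laden illustration and not the attached patch: Lucene uses its own SPI classes rather than `ServiceLoader`, and `SpiScanOrderSketch`, `scanOrder` and `loadAll` are hypothetical names. Providers are keyed by class name here, which is also an assumption made for the sketch.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.ServiceLoader;

/** Hypothetical sketch: scan the library's own ClassLoader before the context one. */
class SpiScanOrderSketch {

    /** ClassLoaders to scan, in precedence order, without duplicates or nulls. */
    static List<ClassLoader> scanOrder(Class<?> anchor) {
        ClassLoader own = anchor.getClassLoader();
        ClassLoader context = Thread.currentThread().getContextClassLoader();
        List<ClassLoader> order = new ArrayList<>();
        if (own != null) {
            order.add(own);
        }
        if (context != null && context != own) {
            order.add(context);
        }
        return order;
    }

    /** Loads providers of 'service'; a provider found earlier is never replaced. */
    static <S> Map<String, S> loadAll(Class<S> service, Class<?> anchor) {
        Map<String, S> byClassName = new LinkedHashMap<>();
        for (ClassLoader loader : scanOrder(anchor)) {
            for (S provider : ServiceLoader.load(service, loader)) {
                byClassName.putIfAbsent(provider.getClass().getName(), provider);
            }
        }
        return byClassName;
    }
}
```

Because earlier entries are never replaced, a provider visible only through the context ClassLoader cannot override one shipped next to the anchor class, which matches the trade-off noted in the comment.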
[jira] [Updated] (LUCENE-4713) SPI: Allow fallback to default ClassLoader if Thread#getContextClassLoader fails
[ https://issues.apache.org/jira/browse/LUCENE-4713?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Uwe Schindler updated LUCENE-4713:
---

Attachment: LUCENE-4713.patch

SPI: Allow fallback to default ClassLoader if Thread#getContextClassLoader fails
---

Key: LUCENE-4713
URL: https://issues.apache.org/jira/browse/LUCENE-4713
Project: Lucene - Core
Issue Type: Improvement
Affects Versions: 4.0, 4.1, 4.2
Reporter: Christian Kohlschütter
Priority: Minor
Labels: ClassLoader, Thread
Fix For: 4.3
Attachments: LUCENE-4713.patch, LUCENE-4713.patch, LuceneContextClassLoader.patch
[jira] [Comment Edited] (LUCENE-4713) SPI: Allow fallback to default ClassLoader if Thread#getContextClassLoader fails
[ https://issues.apache.org/jira/browse/LUCENE-4713?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13599858#comment-13599858 ]

Uwe Schindler edited comment on LUCENE-4713 at 3/12/13 9:30 AM:
---

This is the easiest patch possible. Still lacks some documentation (to actually document that the Lucene class loader is scanned), but ensures that at least all SPIs shipped with Lucene are visible. If a user has additional SPIs outside Lucene core, then it's his turn to make them correctly available.

The Lucene classloader is scanned before the context one, because the classes shipped with lucene should take precedence. On the other hand, this makes it impossible to override Lucene's default codec unless you place the jar file next to lucene-core.jar in same classloader.

was (Author: thetaphi):

This is the easiest patch possible. Still lacks some documentation (to actually document that the Lucene class loader is scanned), but ensures that at least all SPIs shipped with Lucene are visible. If a user has additional SPIs outside Lucene core, then it's his turn to make them correctly available.

The Lucene classloader is scanned before the core one, because the classes shipped with lucene should take precedence. On the other hand, this makes it impossible to override Lucene's default codec unless you place the jar file next to lucene-core.jar in same classloader.

SPI: Allow fallback to default ClassLoader if Thread#getContextClassLoader fails
---

Key: LUCENE-4713
URL: https://issues.apache.org/jira/browse/LUCENE-4713
Project: Lucene - Core
Issue Type: Improvement
Affects Versions: 4.0, 4.1, 4.2
Reporter: Christian Kohlschütter
Priority: Minor
Labels: ClassLoader, Thread
Fix For: 4.3
Attachments: LUCENE-4713.patch, LUCENE-4713.patch, LuceneContextClassLoader.patch
[jira] [Updated] (LUCENE-4713) SPI: Allow fallback to default ClassLoader if Thread#getContextClassLoader fails
[ https://issues.apache.org/jira/browse/LUCENE-4713?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Uwe Schindler updated LUCENE-4713:
    Attachment: (was: LUCENE-4713.patch)
[jira] [Updated] (LUCENE-4713) SPI: Allow fallback to default ClassLoader if Thread#getContextClassLoader fails
[ https://issues.apache.org/jira/browse/LUCENE-4713?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Uwe Schindler updated LUCENE-4713:
    Fix Version/s: 4.3
[jira] [Commented] (SOLR-4561) CachedSqlEntityProcessor with parametarized query is broken
[ https://issues.apache.org/jira/browse/SOLR-4561?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13599868#comment-13599868 ]

Ahmet Arslan commented on SOLR-4561:

I ran into this bug too.

CachedSqlEntityProcessor with parametarized query is broken

Key: SOLR-4561
URL: https://issues.apache.org/jira/browse/SOLR-4561
Project: Solr
Issue Type: Bug
Components: contrib - DataImportHandler
Affects Versions: 4.1
Reporter: Sudheer Prem
Original Estimate: 1m
Remaining Estimate: 1m

When child entities are created and the child entity is given a parametrized query as below,

{code:xml}
<entity name="x" query="select * from x">
  <entity name="y" query="select * from y where xid=${x.id}" processor="CachedSqlEntityProcessor">
  </entity>
</entity>
{code}

the entity processor always returns the result of the first query, even though the parameter has changed. This happens because the EntityProcessorBase.getNext() method doesn't reset query and rowIterator after calling DIHCacheSupport.getCacheData(). This can be fixed by changing the else block in the getNext() method of EntityProcessorBase from

{code}
else {
  return cacheSupport.getCacheData(context, query, rowIterator);
}
{code}

to the code below:

{code}
else {
  Map<String,Object> cacheData = cacheSupport.getCacheData(context, query, rowIterator);
  query = null;
  rowIterator = null;
  return cacheData;
}
{code}

Update: But then, the caching doesn't seem to be working...
[jira] [Commented] (SOLR-4561) CachedSqlEntityProcessor with parametarized query is broken
[ https://issues.apache.org/jira/browse/SOLR-4561?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13599872#comment-13599872 ]

Ahmet Arslan commented on SOLR-4561:

It seems that it was reported before by James in SOLR-3857
[jira] [Commented] (LUCENE-4713) SPI: Allow fallback to default ClassLoader if Thread#getContextClassLoader fails
[ https://issues.apache.org/jira/browse/LUCENE-4713?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13599885#comment-13599885 ]

Christian Kohlschütter commented on LUCENE-4713:

Thanks, Uwe! Looks good and works well in our setup. Regarding overriding Lucene's default codec implementations: we have to place any other modified, non-SPI Lucene classes in the same ClassLoader anyway, so I really appreciate that this patch enforces this.
[jira] [Updated] (LUCENE-4713) SPI: Allow fallback to default ClassLoader if Thread#getContextClassLoader fails
[ https://issues.apache.org/jira/browse/LUCENE-4713?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Uwe Schindler updated LUCENE-4713:
    Attachment: LUCENE-4713.patch

Hi Christian, here is another patch, with some optimization. The clazz's classloader is only scanned if it is neither the same as the context classloader nor one of its parents. If Lucene's clazz classloader is a parent of the context one, there is no need to scan it separately. This also works around the problems with hiding classes: to override the Lucene core codecs, e.g., Tomcat's classloader (J2EE) will use parent-last semantics, and in that case the precedence goes to the webapp. The Lucene classloader is only scanned if it is not related to the context one at all.
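The "same or parent" check described in this comment can be pictured as a walk up the parent chain of the context ClassLoader. The following is a minimal sketch of that idea, not the code from the attached patch. The method name `isParentClassLoader` is borrowed from a later comment in this thread, while the class name `LoaderRelations` and the implementation are assumptions.

```java
/** Illustrative sketch; the actual patch may implement this differently. */
public final class LoaderRelations {
    private LoaderRelations() {}

    /**
     * Returns true if {@code parent} is the same loader as {@code child}
     * or appears somewhere in child's parent chain.
     * Returns false when {@code child} is null.
     */
    public static boolean isParentClassLoader(ClassLoader parent, ClassLoader child) {
        for (ClassLoader cl = child; cl != null; cl = cl.getParent()) {
            if (cl == parent) {
                return true;
            }
        }
        return false;
    }
}
```

Under this reading, the class's own loader would be scanned separately only when `isParentClassLoader(clazzLoader, contextLoader)` returns false, which is the "not at all related" case in the comment.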
[jira] [Comment Edited] (LUCENE-4713) SPI: Allow fallback to default ClassLoader if Thread#getContextClassLoader fails
[ https://issues.apache.org/jira/browse/LUCENE-4713?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13599891#comment-13599891 ]

Uwe Schindler edited comment on LUCENE-4713 at 3/12/13 10:22 AM:

Hi Christian, here is another patch, with some optimization. The clazz's classloader is only scanned if it is neither the same as the context classloader nor one of its parents. If Lucene's clazz classloader is a parent of the context one, there is no need to scan it separately. This also works around the problems with hiding classes: to override the Lucene core codecs, e.g., Tomcat's classloader (J2EE) will use parent-last semantics, and in that case the precedence goes to the webapp. The Lucene classloader is only scanned if it is not related to the context one at all.

Can you try this, too? Unfortunately it's hard to write a good test case without some fake classes in separate compilation units, which complicates the Lucene build :-)

was (Author: thetaphi):
Hi Christian, here is another patch, with some optimization. The clazz's classloader is only scanned if it is neither the same as the context classloader nor one of its parents. If Lucene's clazz classloader is a parent of the context one, there is no need to scan it separately. This also works around the problems with hiding classes: to override the Lucene core codecs, e.g., Tomcat's classloader (J2EE) will use parent-last semantics, and in that case the precedence goes to the webapp. The Lucene classloader is only scanned if it is not related to the context one at all.
[jira] [Commented] (LUCENE-4713) SPI: Allow fallback to default ClassLoader if Thread#getContextClassLoader fails
[ https://issues.apache.org/jira/browse/LUCENE-4713?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13599912#comment-13599912 ]

Uwe Schindler commented on LUCENE-4713:

bq. Regarding overriding Lucene's default codec implementations: we have to place any other modified, non-SPI Lucene classes in the same ClassLoader anyway, so I really appreciate that this patch enforces this.

Overriding default Lucene Codecs does not necessarily require the same class name. Codecs are identified by their name as written into the index files (e.g., Lucene42). If you implement another subclass of Codec with the same name but a different class name, it is also taken into account. In any case, the class file must be listed before the lucene-core.jar one in the classpath. (By the way, this is used in Lucene 4.x to allow a READ/WRITE variant of the Lucene3x codec for testing only. The test-framework.jar simply exposes another class that extends the original READONLY Lucene3x codec to support WRITE, while also making it available to the loader under the Lucene3x name.)
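The name-based lookup described in this comment can be illustrated with a small registry that indexes services by their declared name rather than by class name. This is only a sketch: it assumes the loader keeps the first service it encounters for each name, which is what the classpath-order remark suggests, and `FirstWinsRegistry` and `Named` are hypothetical stand-ins, not Lucene classes.

```java
import java.util.LinkedHashMap;
import java.util.Map;

/** Illustrative sketch of a name-keyed service index; not Lucene's NamedSPILoader. */
public final class FirstWinsRegistry {
    private FirstWinsRegistry() {}

    /** Hypothetical stand-in for a named service such as a codec. */
    public interface Named {
        String getName();
    }

    /**
     * Indexes services by name, in iteration order. If two services report
     * the same name, the one seen first is kept and later ones are ignored
     * (assumed behavior, see the note above).
     */
    public static <S extends Named> Map<String, S> index(Iterable<S> servicesInClasspathOrder) {
        Map<String, S> byName = new LinkedHashMap<>();
        for (S service : servicesInClasspathOrder) {
            byName.putIfAbsent(service.getName(), service);
        }
        return byName;
    }
}
```

With such an index, a test-only class that reports the name Lucene3x and is listed earlier on the classpath would shadow the stock implementation of the same name, regardless of its Java class name.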
[jira] [Updated] (SOLR-4412) LanguageIdentifier lcmap for language field
[ https://issues.apache.org/jira/browse/SOLR-4412?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jan Høydahl updated SOLR-4412: -- Attachment: SOLR-4412.patch First patch (git diff format) LanguageIdentifier lcmap for language field --- Key: SOLR-4412 URL: https://issues.apache.org/jira/browse/SOLR-4412 Project: Solr Issue Type: Bug Components: contrib - LangId Affects Versions: 4.1 Reporter: Jan Høydahl Fix For: 4.3 Attachments: SOLR-4412.patch For some languages, the detector will detect sub-languages, such as LangDetect detecting zh-tw or zh-cn for Chinese. Tika detector only detects zh. Today you can use {{lcmap}} to map these two into one code, e.g. {{langid.map.lcmap=zh-cn:zh zh-tw:zh}}. But the {{langField}} output is not changed. We need an option for {{langField}} as well. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
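To make the mapping syntax in the description concrete, here is a small sketch of how a value such as `zh-cn:zh zh-tw:zh` could be parsed and applied to a detected language code. It is not the LangId contrib's code; `LangCodeMap`, `parse` and `normalize` are hypothetical names, and the parsing rules (whitespace-separated `from:to` pairs) are inferred from the example in the issue.

```java
import java.util.HashMap;
import java.util.Map;

/** Illustrative sketch only; not the Solr LangId implementation. */
public final class LangCodeMap {
    private LangCodeMap() {}

    /** Parses whitespace-separated "from:to" pairs, e.g. "zh-cn:zh zh-tw:zh". */
    public static Map<String, String> parse(String spec) {
        Map<String, String> map = new HashMap<>();
        for (String pair : spec.trim().split("\\s+")) {
            if (pair.isEmpty()) {
                continue; // empty spec yields an empty map
            }
            int colon = pair.indexOf(':');
            if (colon <= 0 || colon == pair.length() - 1) {
                throw new IllegalArgumentException("Expected from:to, got: " + pair);
            }
            map.put(pair.substring(0, colon), pair.substring(colon + 1));
        }
        return map;
    }

    /** Returns the mapped code, or the detected code unchanged if no mapping exists. */
    public static String normalize(Map<String, String> map, String detected) {
        return map.getOrDefault(detected, detected);
    }
}
```

The issue asks for this kind of normalization to be available for the value written to {{langField}} as well, so that zh-tw and zh-cn both end up as zh there.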
[jira] [Assigned] (SOLR-4412) LanguageIdentifier lcmap for language field
[ https://issues.apache.org/jira/browse/SOLR-4412?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Jan Høydahl reassigned SOLR-4412:
    Assignee: Jan Høydahl
[jira] [Commented] (LUCENE-4713) SPI: Allow fallback to default ClassLoader if Thread#getContextClassLoader fails
[ https://issues.apache.org/jira/browse/LUCENE-4713?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13599920#comment-13599920 ]

Christian Kohlschütter commented on LUCENE-4713:

Works for me, too. Those corner cases... One thing that I stumbled upon is that Thread#getContextClassLoader may actually return null. We currently throw an IllegalArgumentException in this case, which can be considered a bug in itself. If we decide that the fix for this bug is to check for null and use the class's default ClassLoader instead, we would actually call #reload twice (because isParentClassLoader will return false if child==null). See the attached patch for a proposed fix.
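One way to read the null corner case raised in this comment: if the null context ClassLoader is replaced by the class's own loader before the parent check runs, the own loader is scanned only once. The sketch below illustrates that ordering under the assumption that the class was loaded by a non-bootstrap loader. `ScanOrder`, `loadersToScan` and `isSameOrAncestor` are hypothetical names, and this is not the attached patch.

```java
import java.util.ArrayList;
import java.util.List;

/** Illustrative sketch of the scan order discussed in this thread. */
public final class ScanOrder {
    private ScanOrder() {}

    /**
     * Returns the loaders to scan, in order. The class's own loader comes
     * first, but only if it is unrelated to the context loader. A null
     * context loader is replaced by the class's own loader up front, so the
     * own loader is never listed twice.
     */
    public static List<ClassLoader> loadersToScan(Class<?> clazz, ClassLoader context) {
        ClassLoader own = clazz.getClassLoader();
        ClassLoader ctx = (context != null) ? context : own;
        List<ClassLoader> order = new ArrayList<>();
        if (!isSameOrAncestor(own, ctx)) {
            order.add(own);
        }
        order.add(ctx);
        return order;
    }

    /** True if candidate is the loader itself or appears in its parent chain. */
    private static boolean isSameOrAncestor(ClassLoader candidate, ClassLoader loader) {
        if (candidate == loader) {
            return true;
        }
        for (ClassLoader cl = loader; cl != null; cl = cl.getParent()) {
            if (cl == candidate) {
                return true;
            }
        }
        return false;
    }
}
```

If the substitution happened after the parent check instead, a null context loader would fail the check and the own loader would be scanned twice, which matches the double #reload the comment warns about.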
[jira] [Assigned] (LUCENE-4713) SPI: Allow fallback to default ClassLoader if Thread#getContextClassLoader fails
[ https://issues.apache.org/jira/browse/LUCENE-4713?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Uwe Schindler reassigned LUCENE-4713:
    Assignee: Uwe Schindler
[jira] [Updated] (LUCENE-4713) SPI: Allow fallback to default ClassLoader if Thread#getContextClassLoader fails
[ https://issues.apache.org/jira/browse/LUCENE-4713?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Christian Kohlschütter updated LUCENE-4713:

Attachment: LUCENE-4713.patch
Re: initVM segmentation fault
On Mar 12, 2013, at 2:51, Jeune Asuncion je...@bright.com wrote:

Dear *Pylucene*,

I am trying to run pylucene on my Fedora box but I get a segmentation fault when I do so. I was able to trace the cause of this error to initVM(). In the Python interpreter when I execute the lines of code below I get the segmentation fault:

import lucene
lucene.initVM()
Segmentation fault

I thought this was because jcc isn't installed because I have pylucene installed on another box and it returns a jcc object. However, I have jcc installed as well on the box where lucene.initVM() isn't working:

import jcc
jcc.initVM()
jcc.JCCEnv object at 0x7f7162e12138

Would like to get some pointers as to why this is happening.

Did you build PyLucene and JCC on this box ?

Andi..

Thanks, Jeune
[jira] [Updated] (LUCENE-4713) SPI: Allow fallback to default ClassLoader if Thread#getContextClassLoader fails
[ https://issues.apache.org/jira/browse/LUCENE-4713?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Christian Kohlschütter updated LUCENE-4713:

Attachment: LUCENE-4713.patch
[jira] [Updated] (LUCENE-4713) SPI: Allow fallback to default ClassLoader if Thread#getContextClassLoader fails
[ https://issues.apache.org/jira/browse/LUCENE-4713?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Christian Kohlschütter updated LUCENE-4713:

Attachment: (was: LUCENE-4713.patch)
[jira] [Commented] (LUCENE-4713) SPI: Allow fallback to default ClassLoader if Thread#getContextClassLoader fails
[ https://issues.apache.org/jira/browse/LUCENE-4713?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13599927#comment-13599927 ]

Uwe Schindler commented on LUCENE-4713:

There is another problem: The abstract clazz's classloader may be null, too (although this never happens in recent JDKs): The bootstrap class loader may be null. But we don't have the problem here, as Lucene classes are never ever loaded through the boot class loader (but e.g. String.class.getClassLoader() may return null). I don't like hooking into reload() as well; I will think of another, more elegant solution.

But to mention: If the context class loader is null (which cannot happen unless you explicitly set it to null), Java's own classloading for SPIs would be broken, too (see the implementation of java.util.ServiceLoader).
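The remark about String.class.getClassLoader() can be checked with a few lines of plain JDK code. The class name BootstrapLoaderDemo and the helper isBootstrapLoaded are made up for this illustration.

```java
final class BootstrapLoaderDemo {

    /** Hypothetical helper: true if the class reports no ClassLoader, i.e. it came from the bootstrap loader. */
    static boolean isBootstrapLoaded(Class<?> c) {
        return c.getClassLoader() == null;
    }

    public static void main(String[] args) {
        // Core JDK classes such as String are loaded by the bootstrap loader,
        // which getClassLoader() represents as null.
        System.out.println("String bootstrap-loaded: " + isBootstrapLoaded(String.class));

        // Classes loaded from the application classpath report a real loader.
        System.out.println("demo class bootstrap-loaded: "
                + isBootstrapLoaded(BootstrapLoaderDemo.class));
    }
}
```

Any code that calls getClassLoader() on an arbitrary class therefore needs a null check, even though, as the comment says, Lucene's own classes are not expected to hit this case.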
[jira] [Updated] (LUCENE-4713) SPI: Allow fallback to default ClassLoader if Thread#getContextClassLoader fails
[ https://issues.apache.org/jira/browse/LUCENE-4713?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Christian Kohlschütter updated LUCENE-4713:

Attachment: LUCENE-4713.patch

This patch keeps #reload untouched.
[jira] [Updated] (LUCENE-4713) SPI: Allow fallback to default ClassLoader if Thread#getContextClassLoader fails
[ https://issues.apache.org/jira/browse/LUCENE-4713?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Uwe Schindler updated LUCENE-4713:

Attachment: LUCENE-4713.patch

Here is the patch that mimics what the original java.util.ServiceLoader does: If the classloader (e.g. the context classloader) is null, it uses the system classloader. The exception on null classloader was removed. The patch then also adds some null checks, so the fallback case is only used if both possible loaders are != null. If all class loaders are null, the system loader is used, which should never happen, as Lucene is not part of rt.jar.

I think this is ready. Unfortunately we had some overlap, Christian :-)
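The behaviour described for this patch could look roughly like the sketch below. It is not the attached patch: SpiLoaderFallback, orSystem and loadWithFallback are hypothetical names, and the fallback order (context loader first, then the loader of a given anchor class) is an assumption based on the issue title.

```java
final class SpiLoaderFallback {

    /** ServiceLoader-style convention: a null loader means "use the system class loader". */
    static ClassLoader orSystem(ClassLoader cl) {
        return (cl != null) ? cl : ClassLoader.getSystemClassLoader();
    }

    /**
     * Tries the thread's context loader first. If that loader cannot see the
     * class, retries once with the loader that loaded the anchor class.
     */
    static Class<?> loadWithFallback(String className, Class<?> anchor)
            throws ClassNotFoundException {
        ClassLoader primary = orSystem(Thread.currentThread().getContextClassLoader());
        try {
            return Class.forName(className, false, primary);
        } catch (ClassNotFoundException e) {
            ClassLoader fallback = anchor.getClassLoader();
            if (fallback == null || fallback == primary) {
                throw e; // no distinct second loader to try
            }
            return Class.forName(className, false, fallback);
        }
    }

    public static void main(String[] args) throws ClassNotFoundException {
        Class<?> c = loadWithFallback("java.lang.String", SpiLoaderFallback.class);
        System.out.println("loaded " + c.getName());
    }
}
```

With this shape, an isolated context loader that cannot see the SPI implementation no longer causes a hard failure as long as the anchor class's own loader can see it.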
[jira] [Comment Edited] (LUCENE-4713) SPI: Allow fallback to default ClassLoader if Thread#getContextClassLoader fails
[ https://issues.apache.org/jira/browse/LUCENE-4713?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13599927#comment-13599927 ]

Uwe Schindler edited comment on LUCENE-4713 at 3/12/13 11:44 AM:

There is another problem: The abstract clazz's classloader may be null, too (although this never happens in recent JDKs): The bootstrap class loader may be null. But we don't have the problem here, as Lucene classes are never ever loaded through the boot class loader (but e.g. String.class.getClassLoader() may return null). I don't like hooking into reload() as well; I will think of another, more elegant solution.

-But to mention: If the context class loader is null (which cannot happen unless you explicitly set it to null), Java's own classloading for SPIs would be broken, too (see the implementation of java.util.ServiceLoader).-

(EDIT: Java's ServiceLoader uses SystemClassLoader if context loader is null)
[jira] [Commented] (LUCENE-4795) Add FacetsCollector based on SortedSetDocValues
[ https://issues.apache.org/jira/browse/LUCENE-4795?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13599945#comment-13599945 ]

Michael McCandless commented on LUCENE-4795:

bq. Mike, why do you need to initialize a FacetRequest like so: requests.add(new CountFacetRequest(new CategoryPath(a, sep), 10));?

Woops, that's just silly: I'll remove the sep there.

bq. Mike, should you also check in SortedSetDocValuesAccumulator that FR.getDepth() == 1? I don't think that you support counting up to depth N, right?

Right, it only supports flat (dim / label) today ... ok, I'll add that check.

Add FacetsCollector based on SortedSetDocValues

Key: LUCENE-4795
URL: https://issues.apache.org/jira/browse/LUCENE-4795
Project: Lucene - Core
Issue Type: Improvement
Components: modules/facet
Reporter: Michael McCandless
Assignee: Michael McCandless
Attachments: LUCENE-4795.patch, LUCENE-4795.patch, LUCENE-4795.patch, LUCENE-4795.patch, pleaseBenchmarkMe.patch

Recently (LUCENE-4765) we added multi-valued DocValues field (SortedSetDocValuesField), and this can be used for faceting in Solr (SOLR-4490). I think we should also add support in the facet module?

It'd be an option with different tradeoffs. Eg, it wouldn't require the taxonomy index, since the main index handles label/ord resolving.

There are at least two possible approaches:

* On every reopen, build the seg -> global ord map, and then on every collect, get the seg ord, map it to the global ord space, and increment counts. This adds cost during reopen in proportion to number of unique terms ...

* On every collect, increment counts based on the seg ords, and then do a merge in the end just like distributed faceting does.

The first approach is much easier so I built a quick prototype using that. The prototype does the counting, but it does NOT do the top K facets gathering in the end, and it doesn't know parent/child ord relationships, so there's tons more to do before this is real.
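To make the first approach concrete, here is a toy sketch of counting through a segment-to-global ordinal map. It uses plain int arrays in place of SortedSetDocValues and is not taken from the patch; the class name GlobalOrdCounting and the array layout are assumptions made for illustration only.

```java
import java.util.Arrays;

final class GlobalOrdCounting {

    /**
     * segToGlobal[seg][segOrd] gives the global ordinal for a segment ordinal
     * (the map that would be rebuilt on every reopen).
     * docOrds[seg][doc] lists the segment ordinals attached to one document.
     * Returns counts indexed by global ordinal.
     */
    static int[] count(int[][] segToGlobal, int[][][] docOrds, int globalOrdCount) {
        int[] counts = new int[globalOrdCount];
        for (int seg = 0; seg < docOrds.length; seg++) {
            int[] ordMap = segToGlobal[seg];
            for (int[] ordsOfDoc : docOrds[seg]) {
                for (int segOrd : ordsOfDoc) {
                    counts[ordMap[segOrd]]++; // collect: map to global space, then increment
                }
            }
        }
        return counts;
    }

    public static void main(String[] args) {
        // Global labels: 0=a, 1=b, 2=c. Segment 0 holds {a, c}, segment 1 holds {b, c}.
        int[][] segToGlobal = { {0, 2}, {1, 2} };
        int[][][] docOrds = {
            { {0}, {0, 1} },   // segment 0: doc0 -> a, doc1 -> a and c
            { {0, 1} }         // segment 1: doc0 -> b and c
        };
        System.out.println(Arrays.toString(count(segToGlobal, docOrds, 3))); // [2, 1, 2]
    }
}
```

The second approach would instead keep one counts array per segment, indexed by segment ordinal, and merge those arrays by label at the end.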
I also was unsure how to properly integrate it since the existing classes seem to expect that you use a taxonomy index to resolve ords.

I ran a quick performance test. base = trunk except I disabled the compute top-K in FacetsAccumulator to make the comparison fair; comp = using the prototype collector in the patch:

{noformat}
Task              QPS base  StdDev  QPS comp  StdDev  Pct diff
OrHighLow            18.79  (2.5%)     14.36  (3.3%)  -23.6% ( -28% - -18%)
HighTerm             21.58  (2.4%)     16.53  (3.7%)  -23.4% ( -28% - -17%)
OrHighMed            18.20  (2.5%)     13.99  (3.3%)  -23.2% ( -28% - -17%)
Prefix3              14.37  (1.5%)     11.62  (3.5%)  -19.1% ( -23% - -14%)
LowTerm             130.80  (1.6%)    106.95  (2.4%)  -18.2% ( -21% - -14%)
OrHighHigh            9.60  (2.6%)      7.88  (3.5%)  -17.9% ( -23% - -12%)
AndHighHigh          24.61  (0.7%)     20.74  (1.9%)  -15.7% ( -18% - -13%)
Fuzzy1               49.40  (2.5%)     43.48  (1.9%)  -12.0% ( -15% -  -7%)
MedSloppyPhrase      27.06  (1.6%)     23.95  (2.3%)  -11.5% ( -15% -  -7%)
MedTerm              51.43  (2.0%)     46.21  (2.7%)  -10.2% ( -14% -  -5%)
IntNRQ                4.02  (1.6%)      3.63  (4.0%)   -9.7% ( -15% -  -4%)
Wildcard             29.14  (1.5%)     26.46  (2.5%)   -9.2% ( -13% -  -5%)
HighSloppyPhrase      0.92  (4.5%)      0.87  (5.8%)   -5.4% ( -15% -   5%)
MedSpanNear          29.51  (2.5%)     27.94  (2.2%)   -5.3% (  -9% -   0%)
HighSpanNear          3.55  (2.4%)      3.38  (2.0%)   -4.9% (  -9% -   0%)
AndHighMed          108.34  (0.9%)    104.55  (1.1%)   -3.5% (  -5% -  -1%)
LowSloppyPhrase      20.50  (2.0%)     20.09  (4.2%)   -2.0% (  -8% -   4%)
LowPhrase            21.60  (6.0%)     21.26  (5.1%)   -1.6% ( -11% -  10%)
Fuzzy2               53.16  (3.9%)     52.40  (2.7%)   -1.4% (  -7% -   5%)
LowSpanNear           8.42  (3.2%)      8.45  (3.0%)    0.3% (  -5% -   6%)
Respell              45.17  (4.3%)     45.38  (4.4%)    0.5% (  -7% -   9%)
MedPhrase           113.93  (5.8%)    115.02  (4.9%)    1.0% (  -9% -  12%)
AndHighLow          596.42  (2.5%)    617.12  (2.8%)    3.5% (  -1% -   8%)
[jira] [Updated] (LUCENE-4713) SPI: Allow fallback to default ClassLoader if Thread#getContextClassLoader fails
[ https://issues.apache.org/jira/browse/LUCENE-4713?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Uwe Schindler updated LUCENE-4713:

Attachment: LUCENE-4713.patch

Sorry, again a new patch. Now the case where the context class loader is null is handled correctly.
Re: Improving DirectSpellChecker
On Tue, Mar 12, 2013 at 7:22 AM, Varun Thacker varunthacker1...@gmail.com wrote:

I was looking at the results from the spellchecker. So If I have a field where the terms get analyzed the results shown are the analyzed form as a suggestion. Example, for Battery the spell suggestion if one makes a mistake would be batteri.

I don't think you should use such a field for spellchecking, instead just something very simple like standardtokenizer + lowercase for the spellcheck field.
[jira] [Commented] (LUCENE-4713) SPI: Allow fallback to default ClassLoader if Thread#getContextClassLoader fails
[ https://issues.apache.org/jira/browse/LUCENE-4713?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13599965#comment-13599965 ]

Uwe Schindler commented on LUCENE-4713:
---
Just for reference: see line 336+ of http://www.docjar.com/html/api/java/util/ServiceLoader.java.html

SPI: Allow fallback to default ClassLoader if Thread#getContextClassLoader fails
Key: LUCENE-4713
URL: https://issues.apache.org/jira/browse/LUCENE-4713
Project: Lucene - Core
Issue Type: Improvement
Affects Versions: 4.0, 4.1, 4.2
Reporter: Christian Kohlschütter
Assignee: Uwe Schindler
Priority: Minor
Labels: ClassLoader, Thread
Fix For: 4.3
Attachments: LUCENE-4713.patch, LUCENE-4713.patch, LUCENE-4713.patch, LUCENE-4713.patch, LUCENE-4713.patch, LUCENE-4713.patch, LUCENE-4713.patch, LuceneContextClassLoader.patch

NOTE: This issue has been renamed from "Replace calls to Thread#getContextClassLoader with the ClassLoader of the current class" because the revised patch provides a clean fallback path.

I am not sure whether it is a design decision or if we can indeed consider this a bug: In core and analysis-common some classes provide on-demand class loading using SPI. In NamedSPILoader, SPIClassIterator, ClasspathResourceLoader and AnalysisSPILoader there are constructors that use the Thread's context ClassLoader by default whenever no particular other ClassLoader was specified. Unfortunately this does not work as expected when the Thread's ClassLoader can't see the required classes that are instantiated downstream with the help of Class.forName (e.g., Codecs, Analyzers, etc.).

That's what happened to us here. We currently experiment with running Lucene 2.9 and 4.x in one JVM, both being separated by custom ClassLoaders, each seeing only the corresponding Lucene version and the upstream classpath. While NamedSPILoader and company get successfully loaded by our custom ClassLoader, their instantiation fails because our Thread's Context-ClassLoader cannot find the additionally required classes.

We could probably work around this by using Thread#setContextClassLoader at construction time (and quickly reverting back afterwards), but I have the impression this might just hide the actual problem and cause further trouble when lazy-loading classes later on, and potentially from another Thread. Removing the call to Thread#getContextClassLoader would also align with the behavior of AttributeSource.DEFAULT_ATTRIBUTE_FACTORY, which in fact uses Attribute#getClass().getClassLoader() instead.

A simple patch is attached. All tests pass.

-- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
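The fallback idea in this issue can be sketched in plain Java, without Lucene on the classpath. This is only an illustration under assumptions: the class LoaderFallback and its method names are hypothetical and are not taken from the attached patch. It tries the thread's context ClassLoader first, falls back to the loader of a reference class, and also shows the set-and-restore workaround mentioned in the description.

```java
import java.util.concurrent.Callable;

// Hypothetical helper; names are illustrative and not from LUCENE-4713.patch.
final class LoaderFallback {
  private LoaderFallback() {}

  // Load `name` via the context loader if it can see the class,
  // otherwise via the loader that defined `anchor`.
  static Class<?> load(String name, Class<?> anchor) throws ClassNotFoundException {
    ClassLoader ctx = Thread.currentThread().getContextClassLoader();
    if (ctx != null) {
      try {
        return Class.forName(name, false, ctx);
      } catch (ClassNotFoundException e) {
        // The context loader cannot see the class: fall through to the anchor's loader.
      }
    }
    return Class.forName(name, false, anchor.getClassLoader());
  }

  // The workaround from the description: swap the context loader for the
  // duration of `body` and always restore the previous one afterwards.
  static <T> T withContextLoader(ClassLoader loader, Callable<T> body) throws Exception {
    Thread thread = Thread.currentThread();
    ClassLoader previous = thread.getContextClassLoader();
    thread.setContextClassLoader(loader);
    try {
      return body.call();
    } finally {
      thread.setContextClassLoader(previous);
    }
  }
}
```

As the reporter notes, the set-and-restore variant only covers classes resolved inside `body`; anything loaded lazily later, possibly on another thread, would still see that thread's own context loader.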
[jira] [Created] (LUCENE-4823) Add a separate registration singleton for Lucene's SPI, so there is only one central instance to request rescanning of classpath (e.g. from Solr's ResourceLoader)
Uwe Schindler created LUCENE-4823:
-
Summary: Add a separate registration singleton for Lucene's SPI, so there is only one central instance to request rescanning of classpath (e.g. from Solr's ResourceLoader)
Key: LUCENE-4823
URL: https://issues.apache.org/jira/browse/LUCENE-4823
Project: Lucene - Core
Issue Type: Bug
Components: core/other
Reporter: Uwe Schindler
Assignee: Uwe Schindler
Fix For: 5.0, 4.3

Currently there is no easy way to do a global rescan/reload of all of Lucene's SPIs in the right order. In Solr there is a long list of reload instructions in the ResourceLoader. If somebody adds a new SPI type, you have to add it there. It would be good to have a central instance in oal.util that keeps track of all NamedSPILoaders and AnalysisSPILoaders (in the order they were instantiated), so you have one central entry point to trigger a reload.

This issue will introduce:
- A singleton that makes reloading possible. The singleton keeps weak refs to all loaders (of any kind) in the order they were created.
- NamedSPILoader and AnalysisSPILoader (unfortunately we need both instances, as they differ in the internals: one keeps classes, the other one instances). Both should implement a reloadable interface.
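The registry described in this issue could look roughly like the following. This is a sketch under assumptions, not the code that was committed: the names Reloadable, SpiRegistry, register and reloadAll are hypothetical, and the issue itself only says that a singleton keeps weak refs to the loaders in creation order and that both loader types implement a reloadable interface.

```java
import java.lang.ref.WeakReference;
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;

// Hypothetical interface; the issue only asks for "a reloadable interface".
interface Reloadable {
  void reload(ClassLoader classLoader);
}

// Hypothetical registry singleton, for illustration only.
final class SpiRegistry {
  static final SpiRegistry INSTANCE = new SpiRegistry();

  // Weak references in creation order, so the registry never keeps a loader alive.
  private final List<WeakReference<Reloadable>> loaders = new ArrayList<>();

  private SpiRegistry() {}

  synchronized void register(Reloadable loader) {
    loaders.add(new WeakReference<>(loader));
  }

  // Reload every live loader in registration order and return how many were reloaded.
  int reloadAll(ClassLoader classLoader) {
    List<Reloadable> live = new ArrayList<>();
    synchronized (this) {
      for (Iterator<WeakReference<Reloadable>> it = loaders.iterator(); it.hasNext();) {
        Reloadable loader = it.next().get();
        if (loader == null) {
          it.remove(); // the loader was garbage collected
        } else {
          live.add(loader);
        }
      }
    }
    // Call reload() outside the lock so a loader may register others while reloading.
    for (Reloadable loader : live) {
      loader.reload(classLoader);
    }
    return live.size();
  }
}
```

A caller such as Solr's ResourceLoader would then need a single `SpiRegistry.INSTANCE.reloadAll(loader)` call instead of one reload instruction per SPI type.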
[jira] [Commented] (LUCENE-4713) SPI: Allow fallback to default ClassLoader if Thread#getContextClassLoader fails
[ https://issues.apache.org/jira/browse/LUCENE-4713?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13599968#comment-13599968 ] Christian Kohlschütter commented on LUCENE-4713: Overlap and coverage, Uwe :) One thing is still unclear to me. Given loader is null, in SPIClassIterator line 143 we call Class.forName with a null ClassLoader. However (at least in the Oracle 1.7 JDK) Class#forName(String,boolean,ClassLoader) does not use ClassLoader#getSystemClassLoader but ClassLoader#getCallerClassLoader instead (which IMHO contradicts the JavaDocs description, where they claim to use the bootstrap classloader...) Given that it is very unlikely that we're running into any problems with bootstrap resources, I would actually just check for loader==null in SPIClassIterator and assign loader=ClassLoader.getSystemClassLoader() in this case. This will use the System ClassLoader by default and only falls back to getCallerClassLoader if there is no System ClassLoader.
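The null check suggested in Christian's comment can be sketched as follows. The helper class NullSafeLoading is hypothetical and is not code from SPIClassIterator; it only shows the shape of the idea, namely that a null loader is replaced by the system ClassLoader before Class.forName is called.

```java
// Hypothetical helper illustrating the suggested null check; not Lucene code.
final class NullSafeLoading {
  private NullSafeLoading() {}

  // Substitute the system ClassLoader when no loader was supplied.
  static ClassLoader orSystem(ClassLoader loader) {
    return (loader != null) ? loader : ClassLoader.getSystemClassLoader();
  }

  // Resolve a class without initializing it, never passing null to Class.forName.
  static Class<?> load(String name, ClassLoader loader) throws ClassNotFoundException {
    return Class.forName(name, false, orSystem(loader));
  }
}
```

With this in place the behavior no longer depends on how a particular JDK treats a null loader argument, which is the uncertainty the comment raises.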
[jira] [Updated] (LUCENE-4713) SPI: Allow fallback to default ClassLoader if Thread#getContextClassLoader fails
[ https://issues.apache.org/jira/browse/LUCENE-4713?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Uwe Schindler updated LUCENE-4713: -- Attachment: (was: LUCENE-4713.patch)
[jira] [Comment Edited] (LUCENE-4713) SPI: Allow fallback to default ClassLoader if Thread#getContextClassLoader fails
[ https://issues.apache.org/jira/browse/LUCENE-4713?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13599968#comment-13599968 ] Christian Kohlschütter edited comment on LUCENE-4713 at 3/12/13 12:13 PM: -- Overlap and coverage, Uwe :) was (Author: c...@newsclub.de): Overlap and coverage, Uwe :) One thing is still unclear to me. Given loader is null, in SPIClassIterator line 143 we call Class.forName with a null ClassLoader. However (at least in the Oracle 1.7 JDK) Class#forName(String,boolean,ClassLoader) does not use ClassLoader#getSystemClassLoader but ClassLoader#getCallerClassLoader instead (which IMHO contradicts the JavaDocs description, where they claim to use the bootstrap classloader...) Given that it is very unlikely that we're running into any problems with bootstrap resources, I would actually just check for loader==null in SPIClassIterator and assign loader=ClassLoader.getSystemClassLoader() in this case. This will use the System ClassLoader by default and only falls back to getCallerClassLoader if there is no System ClassLoader.
[jira] [Comment Edited] (LUCENE-4713) SPI: Allow fallback to default ClassLoader if Thread#getContextClassLoader fails
[ https://issues.apache.org/jira/browse/LUCENE-4713?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13599968#comment-13599968 ] Christian Kohlschütter edited comment on LUCENE-4713 at 3/12/13 12:13 PM: -- Overlap and coverage, Uwe :) Looks good to me! was (Author: c...@newsclub.de): Overlap and coverage, Uwe :)
[jira] [Updated] (SOLR-4563) RSS DIH-example not working
[ https://issues.apache.org/jira/browse/SOLR-4563?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jan Høydahl updated SOLR-4563: -- Attachment: SOLR-4563.patch Simple patch RSS DIH-example not working --- Key: SOLR-4563 URL: https://issues.apache.org/jira/browse/SOLR-4563 Project: Solr Issue Type: Bug Affects Versions: 4.2 Reporter: Jan Høydahl Fix For: 4.3, 5.0 Attachments: SOLR-4563.patch The xpath paths of /rss/item do not match the real world RSS feed which uses /rss/channel/item
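The path mismatch reported here is easy to reproduce outside Solr. The sketch below uses the JDK's javax.xml.xpath API rather than DIH's own XPath handling, and the feed string is made up for illustration; it only shows that /rss/item selects nothing when the items are nested under a channel element.

```java
import java.io.StringReader;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

// Illustration only; not part of the DIH example configuration.
final class RssXPathCheck {
  private RssXPathCheck() {}

  // Made-up feed: as in real RSS 2.0 feeds, <item> is nested under <channel>.
  static final String FEED =
      "<rss version=\"2.0\"><channel><title>demo</title>"
          + "<item><title>first</title></item>"
          + "<item><title>second</title></item>"
          + "</channel></rss>";

  // Count the nodes that `expression` selects in FEED.
  static int count(String expression) throws Exception {
    XPath xpath = XPathFactory.newInstance().newXPath();
    NodeList nodes = (NodeList) xpath.evaluate(
        expression, new InputSource(new StringReader(FEED)), XPathConstants.NODESET);
    return nodes.getLength();
  }
}
```

Here `RssXPathCheck.count("/rss/item")` finds no nodes, while `RssXPathCheck.count("/rss/channel/item")` finds both items.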
[jira] [Updated] (LUCENE-4823) Add a separate registration singleton for Lucene's SPI, so there is only one central instance to request rescanning of classpath (e.g. from Solr's ResourceLoader)
[ https://issues.apache.org/jira/browse/LUCENE-4823?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Uwe Schindler updated LUCENE-4823:
--
Description:
Currently there is no easy way to do a global rescan/reload of all of Lucene's SPIs in the right order. In Solr there is a long list of reload instructions in the ResourceLoader. If somebody adds a new SPI type, you have to add it there. It would be good to have a central instance in oal.util that keeps track of all NamedSPILoaders and AnalysisSPILoaders (in the order they were instantiated), so you have one central entry point to trigger a reload.

This issue will introduce:
- A singleton that makes reloading possible. The singleton keeps weak refs to all loaders (of any kind) in the order they were created.
- NamedSPILoader and AnalysisSPILoader cleanup (unfortunately we need both instances, as they differ in the internals: one keeps classes, the other one instances). Both should implement a reloadable interface.

was:
Currently there is no easy way to do a global rescan/reload of all of Lucene's SPIs in the right order. In Solr there is a long list of reload instructions in the ResourceLoader. If somebody adds a new SPI type, you have to add it there. It would be good to have a central instance in oal.util that keeps track of all NamedSPILoaders and AnalysisSPILoaders (in the order they were instantiated), so you have one central entry point to trigger a reload.

This issue will introduce:
- A singleton that makes reloading possible. The singleton keeps weak refs to all loaders (of any kind) in the order they were created.
- NamedSPILoader and AnalysisSPILoader (unfortunately we need both instances, as they differ in the internals: one keeps classes, the other one instances). Both should implement a reloadable interface.
[jira] [Updated] (LUCENE-4642) Add create(AttributeFactory) to TokenizerFactory and subclasses with ctors taking AttributeFactory, and remove Tokenizer's and subclasses' ctors taking AttributeSource
[ https://issues.apache.org/jira/browse/LUCENE-4642?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Steve Rowe updated LUCENE-4642: --- Summary: Add create(AttributeFactory) to TokenizerFactory and subclasses with ctors taking AttributeFactory, and remove Tokenizer's and subclasses' ctors taking AttributeSource (was: TokenizerFactory should provide a create method with a given AttributeSource) Add create(AttributeFactory) to TokenizerFactory and subclasses with ctors taking AttributeFactory, and remove Tokenizer's and subclasses' ctors taking AttributeSource --- Key: LUCENE-4642 URL: https://issues.apache.org/jira/browse/LUCENE-4642 Project: Lucene - Core Issue Type: Improvement Components: modules/analysis Affects Versions: 4.1 Reporter: Renaud Delbru Assignee: Steve Rowe Labels: analysis, attribute, tokenizer Fix For: 4.3 Attachments: LUCENE-4642.patch, LUCENE-4642.patch, LUCENE-4642.patch, TrieTokenizerFactory.java.patch All tokenizer implementations have a constructor that takes a given AttributeSource as parameter (LUCENE-1826). However, the TokenizerFactory does not provide an API to create tokenizers with a given AttributeSource. Side note: There are still a lot of tokenizers that do not provide constructors that take AttributeSource and AttributeFactory.
[jira] [Resolved] (SOLR-2595) Split and migrate indexes
[ https://issues.apache.org/jira/browse/SOLR-2595?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Shalin Shekhar Mangar resolved SOLR-2595. - Resolution: Duplicate Split and migrate indexes - Key: SOLR-2595 URL: https://issues.apache.org/jira/browse/SOLR-2595 Project: Solr Issue Type: New Feature Components: multicore, replication (java), SolrCloud Reporter: Shalin Shekhar Mangar Fix For: 4.3 When a shard's index grows too large or a shard becomes too loaded, it should be possible to split parts of a shard's index and migrate/merge to a less loaded node.
[jira] [Resolved] (SOLR-2593) A new core admin action 'split' for splitting index
[ https://issues.apache.org/jira/browse/SOLR-2593?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Shalin Shekhar Mangar resolved SOLR-2593.
-
Resolution: Duplicate
Committed as part of SOLR-3755 changes.

A new core admin action 'split' for splitting index
---
Key: SOLR-2593
URL: https://issues.apache.org/jira/browse/SOLR-2593
Project: Solr
Issue Type: New Feature
Reporter: Noble Paul
Fix For: 4.3

If an index is too large/hot it would be desirable to split it out to another core. This core may eventually be replicated out to another host. There can be multiple strategies:
* random split of x or x%
* fq=user:johndoe
Example: action=split&split=20percent&newcore=my_new_index or action=split&fq=user:johndoe&newcore=john_doe_index
[jira] [Comment Edited] (LUCENE-4713) SPI: Allow fallback to default ClassLoader if Thread#getContextClassLoader fails
[ https://issues.apache.org/jira/browse/LUCENE-4713?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13599968#comment-13599968 ] Christian Kohlschütter edited comment on LUCENE-4713 at 3/12/13 12:16 PM: -- Overlap and coverage, Uwe :) Looks good to me! Nit: What you could do to be 100% safe that we're using the correct ClassLoader is to check for loader==null in SPIClassIterator and assign it to ClassLoader.getSystemClassLoader() in this case. was (Author: c...@newsclub.de): Overlap and coverage, Uwe :) Looks good to me!
[jira] [Updated] (LUCENE-4642) Add create(AttributeFactory) to TokenizerFactory and subclasses with ctors taking AttributeFactory, and remove Tokenizer's and subclasses' ctors taking AttributeSource
[ https://issues.apache.org/jira/browse/LUCENE-4642?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Steve Rowe updated LUCENE-4642: --- Attachment: LUCENE-4642.patch Patch, narrows one or two more create(AttributeFactory) return types, minor cosmetic mods, removed unused imports. Committing shortly. Add create(AttributeFactory) to TokenizerFactory and subclasses with ctors taking AttributeFactory, and remove Tokenizer's and subclasses' ctors taking AttributeSource --- Key: LUCENE-4642 URL: https://issues.apache.org/jira/browse/LUCENE-4642 Project: Lucene - Core Issue Type: Improvement Components: modules/analysis Affects Versions: 4.1 Reporter: Renaud Delbru Assignee: Steve Rowe Labels: analysis, attribute, tokenizer Fix For: 4.3 Attachments: LUCENE-4642.patch, LUCENE-4642.patch, LUCENE-4642.patch, LUCENE-4642.patch, TrieTokenizerFactory.java.patch All tokenizer implementations have a constructor that takes a given AttributeSource as parameter (LUCENE-1826). These should be removed. TokenizerFactory does not provide an API to create tokenizers with a given AttributeFactory, but quite a few tokenizers have constructors that take an AttributeFactory. TokenizerFactory should add a create(AttributeFactory) method, as should subclasses for tokenizers with AttributeFactory accepting ctors.
[jira] [Comment Edited] (SOLR-4561) CachedSqlEntityProcessor with parametarized query is broken
[ https://issues.apache.org/jira/browse/SOLR-4561?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13600078#comment-13600078 ] Sudheer Prem edited comment on SOLR-4561 at 3/12/13 2:54 PM: - I have a scenario where table A contain 5 million rows and table B contain more than a million rows. The join condition matches for only a couple of thousands of records. I had been using this feature in earlier version of Solr. Suddenly due to this change, it took the wrong join (one which matches the first condition) and populate that value to all documents. After debugging, my thought for the fix is like this: This is happening because, in the method SqlEntityProcessor.nextRow(), the query is initialized and loaded only if the the rowIterator is null. Actually, the query should be initialized if the query is different than the previous query. If the logic is changed in that way, i think this issue will be fixed. To apply this logic, change the SqlEntityProcessor.nextRow() method from {code} if (rowIterator == null) { String q = getQuery(); initQuery(context.replaceTokens(q)); } {code} to the code mentioned below: {code} String q = context.replaceTokens(getQuery()); if(!q.equals(this.query)){ initQuery(q); } {code} Initial testing shows that, it seems working as expected. was (Author: sudheerprem): I have a scenario where table A contain 5 million rows and table B contain more than a million rows. The join condition matches for only a couple of thousands of records. I had been using this feature in earlier version of Solr. Suddenly due to this change, it took the wrong join (one which matches the first condition) and populate that value to all documents. After debugging, my thought for the fix is like this: This is happening because, in the method SqlEntityProcessor.nextRow(), the query is initialized and loaded only if the the rowIterator is null. Actually, the query should be initialized if the query is different than the previous query. 
If the logic is changed in that way, i think this issue will be fixed. To apply this logic, change the SqlEntityProcessor.nextRow() method from {code} if (rowIterator == null) { String q = getQuery(); initQuery(context.replaceTokens(q)); } {code} to the code mentioned below: {code} String q = context.replaceTokens(getQuery()); if(!q.equals(this.query)){ initQuery(context.replaceTokens(q)); } {code} Initial testing shows that, it seems working as expected. CachedSqlEntityProcessor with parametarized query is broken --- Key: SOLR-4561 URL: https://issues.apache.org/jira/browse/SOLR-4561 Project: Solr Issue Type: Bug Components: contrib - DataImportHandler Affects Versions: 4.1 Reporter: Sudheer Prem Original Estimate: 1m Remaining Estimate: 1m When child entities are created and the child entity is provided with a parametrized query as below, {code:xml} entity name=x query=select * from x entity name=y query=select * from y where xid=${x.id} processor=CachedSqlEntityProcessor /entity entity {code} the Entity Processor always return the result from the fist query even though the parameter is changed, It is happening because, EntityProcessorBase.getNext() method doesn't reset the query and rowIterator after calling DIHCacheSupport.getCacheData() method. This can be fixed by changing the else block in getNext() method of EntityProcessorBase from {code} else { return cacheSupport.getCacheData(context, query, rowIterator); } {code} to the code mentioned below: {code} else { MapString,Object cacheData = cacheSupport.getCacheData(context, query, rowIterator); query = null; rowIterator = null; return cacheData; } {code} Update: But then, the caching doesn't seem to be working... -- This message is automatically generated by JIRA. 
If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
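The check proposed in the comment above (re-initialize the query whenever the resolved query string differs from the previous one) can be illustrated in isolation. The following is only a minimal sketch under stated assumptions: QueryChangeSketch, its toy replaceTokens helper, and the executed list are hypothetical stand-ins and not the DataImportHandler classes; only the comparison against the previously initialized query mirrors the suggestion in the comment.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

/** Hypothetical stand-in for an entity processor; NOT the real DIH class. */
final class QueryChangeSketch {
    private final String template;   // e.g. "select * from y where xid=${x.id}"
    private String query;            // last query that was initialized (null at start)
    final List<String> executed = new ArrayList<>(); // every query passed to initQuery

    QueryChangeSketch(String template) {
        this.template = template;
    }

    /** Toy token replacement (assumption): substitutes ${name} with the value from vars. */
    static String replaceTokens(String template, Map<String, String> vars) {
        String out = template;
        for (Map.Entry<String, String> e : vars.entrySet()) {
            out = out.replace("${" + e.getKey() + "}", e.getValue());
        }
        return out;
    }

    private void initQuery(String q) {
        query = q;
        executed.add(q);
    }

    /** Re-initialize only when the resolved query differs from the previous one. */
    void nextRow(Map<String, String> vars) {
        String q = replaceTokens(template, vars);
        if (!q.equals(query)) {
            initQuery(q);
        }
    }

    public static void main(String[] args) {
        QueryChangeSketch p = new QueryChangeSketch("select * from y where xid=${x.id}");
        p.nextRow(Map.of("x.id", "1"));
        p.nextRow(Map.of("x.id", "1")); // same parent row: no re-initialization
        p.nextRow(Map.of("x.id", "2")); // parent row changed: query is re-initialized
        System.out.println(p.executed);
    }
}
```

With a check on a null iterator alone, the second parent row would keep reusing the first query, which matches the symptom described in the issue.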
[jira] [Commented] (SOLR-4557) Fix broken CoreContainerTest.testReload
[ https://issues.apache.org/jira/browse/SOLR-4557?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13600154#comment-13600154 ]

Erick Erickson commented on SOLR-4557:
--

trunk r: 1455606. Fixed the root cause of the tests failing, and also took more care with core reloads so they don't happen simultaneously with loads/unloads.

Fix broken CoreContainerTest.testReload
---

Key: SOLR-4557
URL: https://issues.apache.org/jira/browse/SOLR-4557
Project: Solr
Issue Type: Test
Affects Versions: 4.2, 5.0
Reporter: Erick Erickson
Assignee: Erick Erickson
Attachments: SOLR-4557.patch, SOLR-4557_posthshutdown_stack.txt

I was chasing down a test failure, and it turns out that CoreContainerTest.testReload has only succeeded by chance. The test fires up 4 threads that go out and reload the same core all at once. This caused me to look at properly synchronizing reloading cores pursuant to SOLR-4196, on the theory that we should serialize loading, unloading and reloading cores; we shouldn't be doing _any_ of those operations from different threads on the same core at the same time.

It turns out that if you fire up multiple reloads at once without serializing them, an error is thrown instead of proper reloading occurring, and that's the only reason the test doesn't hang. The stack trace of the exception is below for reference, but it doesn't occur with the code I'll attach to this patch:

[junit4:junit4] 2> at com.sun.jmx.interceptor.DefaultMBeanServerInterceptor.getMBean(DefaultMBeanServerInterceptor.java:1095)
[junit4:junit4] 2> at com.sun.jmx.interceptor.DefaultMBeanServerInterceptor.exclusiveUnregisterMBean(DefaultMBeanServerInterceptor.java:427)
[junit4:junit4] 2> at com.sun.jmx.interceptor.DefaultMBeanServerInterceptor.unregisterMBean(DefaultMBeanServerInterceptor.java:415)
[junit4:junit4] 2> at com.sun.jmx.mbeanserver.JmxMBeanServer.unregisterMBean(JmxMBeanServer.java:536)
[junit4:junit4] 2> at org.apache.solr.core.JmxMonitoredMap.put(JmxMonitoredMap.java:138)
[junit4:junit4] 2> at org.apache.solr.core.JmxMonitoredMap.put(JmxMonitoredMap.java:51)
[junit4:junit4] 2> at org.apache.solr.core.RequestHandlers.register(RequestHandlers.java:106)
[junit4:junit4] 2> at org.apache.solr.core.RequestHandlers.initHandlersFromConfig(RequestHandlers.java:157)
[junit4:junit4] 2> at org.apache.solr.core.SolrCore.<init>(SolrCore.java:757)
[junit4:junit4] 2> at org.apache.solr.core.SolrCore.reload(SolrCore.java:408)
[junit4:junit4] 2> at org.apache.solr.core.CoreContainer.reload(CoreContainer.java:1076)
[junit4:junit4] 2> at org.apache.solr.core.TestCoreContainer$1TestThread.run(TestCoreContainer.java:90)
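The idea of serializing load, unload and reload operations on the same core can be sketched with one lock per core name. This is only an illustration under assumptions: SerializedReloadSketch, its counters, and the core name used in the demo are hypothetical, and the sketch does not reproduce how CoreContainer or the attached patch actually synchronizes.

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.concurrent.locks.ReentrantLock;

/** Hypothetical container that serializes reloads per core name; NOT Solr's CoreContainer. */
final class SerializedReloadSketch {
    private final ConcurrentMap<String, ReentrantLock> locks = new ConcurrentHashMap<>();
    private final AtomicInteger active = new AtomicInteger(); // reloads currently in progress
    final AtomicInteger maxActive = new AtomicInteger();      // highest concurrency observed
    final AtomicInteger reloads = new AtomicInteger();        // completed reloads

    void reload(String coreName) {
        // One lock per core: operations on the same core run one at a time.
        ReentrantLock lock = locks.computeIfAbsent(coreName, k -> new ReentrantLock());
        lock.lock();
        try {
            int now = active.incrementAndGet();
            maxActive.accumulateAndGet(now, Math::max);
            reloads.incrementAndGet(); // placeholder for the real reload work
            active.decrementAndGet();
        } finally {
            lock.unlock();
        }
    }

    /** Fires several threads that all reload the same (hypothetical) core. */
    static SerializedReloadSketch runDemo(int threads) {
        SerializedReloadSketch c = new SerializedReloadSketch();
        Thread[] ts = new Thread[threads];
        for (int i = 0; i < threads; i++) {
            ts[i] = new Thread(() -> c.reload("collection1"));
            ts[i].start();
        }
        for (Thread t : ts) {
            try {
                t.join();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new IllegalStateException(e);
            }
        }
        return c;
    }

    public static void main(String[] args) {
        SerializedReloadSketch c = runDemo(4);
        System.out.println("reloads=" + c.reloads.get() + " maxActive=" + c.maxActive.get());
    }
}
```

Because every reload of the same core takes the same lock, the four threads in the demo never overlap inside the critical section, which is the property the issue argues the test should have relied on.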
[jira] [Commented] (LUCENE-4713) SPI: Allow fallback to default ClassLoader if Thread#getContextClassLoader fails
[ https://issues.apache.org/jira/browse/LUCENE-4713?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13599975#comment-13599975 ]

Uwe Schindler commented on LUCENE-4713:
---

I also opened LUCENE-4823 to make the reloading (which is done on Solr startup to load codecs from plugin folders) more centralized. This is not really related, but it might move the isParentClassLoader helper method into the new base class for all SPILoaders (and hide it).

SPI: Allow fallback to default ClassLoader if Thread#getContextClassLoader fails

Key: LUCENE-4713
URL: https://issues.apache.org/jira/browse/LUCENE-4713
Project: Lucene - Core
Issue Type: Improvement
Affects Versions: 4.0, 4.1, 4.2
Reporter: Christian Kohlschütter
Assignee: Uwe Schindler
Priority: Minor
Labels: ClassLoader, Thread
Fix For: 4.3
Attachments: LUCENE-4713.patch, LUCENE-4713.patch, LUCENE-4713.patch, LUCENE-4713.patch, LUCENE-4713.patch, LUCENE-4713.patch, LuceneContextClassLoader.patch

NOTE: This issue has been renamed from "Replace calls to Thread#getContextClassLoader with the ClassLoader of the current class" because the revised patch provides a clean fallback path.

I am not sure whether it is a design decision or if we can indeed consider this a bug: In core and analysis-common some classes provide on-demand class loading using SPI. In NamedSPILoader, SPIClassIterator, ClasspathResourceLoader and AnalysisSPILoader there are constructors that use the Thread's context ClassLoader by default whenever no other ClassLoader was specified. Unfortunately this does not work as expected when the Thread's ClassLoader can't see the required classes that are instantiated downstream with the help of Class.forName (e.g., Codecs, Analyzers, etc.).

That's what happened to us here. We are currently experimenting with running Lucene 2.9 and 4.x in one JVM, both separated by custom ClassLoaders, each seeing only the corresponding Lucene version and the upstream classpath. While NamedSPILoader and company get successfully loaded by our custom ClassLoader, their instantiation fails because our Thread's context ClassLoader cannot find the additionally required classes.

We could probably work around this by using Thread#setContextClassLoader at construction time (and quickly reverting back afterwards), but I have the impression this might just hide the actual problem and cause further trouble when lazy-loading classes later on, potentially from another Thread. Removing the call to Thread#getContextClassLoader would also align with the behavior of AttributeSource.DEFAULT_ATTRIBUTE_FACTORY, which in fact uses Attribute#getClass().getClassLoader() instead.

A simple patch is attached. All tests pass.
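The fallback idea named in the issue title can be sketched as follows: try a preferred loader (for example the thread's context ClassLoader) and, if it is missing or cannot see the class, fall back to the loader that defined a known anchor class. This is a minimal sketch under assumptions: ClassLoaderFallbackSketch and its load and demo methods are hypothetical names, and the attached Lucene patch may well take a different approach (the comment above mentions an isParentClassLoader helper that is not modeled here).

```java
/** Hypothetical helper illustrating a ClassLoader fallback; NOT the Lucene patch. */
final class ClassLoaderFallbackSketch {

    /**
     * Try the preferred loader first (for example the thread's context loader);
     * if it is null or cannot see the class, fall back to the loader that
     * defined the anchor class.
     */
    static Class<?> load(String name, ClassLoader preferred, Class<?> anchor)
            throws ClassNotFoundException {
        if (preferred != null) {
            try {
                return Class.forName(name, false, preferred);
            } catch (ClassNotFoundException e) {
                // The preferred loader cannot see the class; use the fallback below.
            }
        }
        return Class.forName(name, false, anchor.getClassLoader());
    }

    /** Demo: an isolated loader (bootstrap parent only) cannot see this class. */
    static Class<?> demo() {
        ClassLoader isolated = new ClassLoader(null) { };
        try {
            return load(ClassLoaderFallbackSketch.class.getName(), isolated,
                        ClassLoaderFallbackSketch.class);
        } catch (ClassNotFoundException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        // In real use the preferred loader would typically be
        // Thread.currentThread().getContextClassLoader().
        System.out.println(demo() == ClassLoaderFallbackSketch.class);
    }
}
```

The design choice in this sketch is to keep the context loader as the first candidate, so plugin-style setups keep working, while a loader that cannot see the class no longer makes the lookup fail outright.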
[jira] [Commented] (SOLR-3857) DIH: SqlEntityProcessor with simple cache broken
[ https://issues.apache.org/jira/browse/SOLR-3857?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13600088#comment-13600088 ]

Sudheer Prem commented on SOLR-3857:

Updated SOLR-4561 with a valid fix.

DIH: SqlEntityProcessor with simple cache broken
--

Key: SOLR-3857
URL: https://issues.apache.org/jira/browse/SOLR-3857
Project: Solr
Issue Type: Bug
Affects Versions: 3.6.1, 4.0-BETA
Reporter: James Dyer

The wiki describes a usage of CachedSqlEntityProcessor like this:
{code:xml}
<entity name="y" query="select * from y where xid=${x.id}" processor="CachedSqlEntityProcessor">
{code}
This creates what the code refers to as a simple cache. Rather than building the entire cache up front, the cache is built on the go. I think this has limited use cases, but it would be nice to preserve the feature if possible. Unfortunately this was not included in any (effective) unit tests, and SOLR-2382 entirely broke the functionality for 3.6/4.0-alpha+. At first glance, the fix may not be entirely straightforward.

This was found while writing tests for SOLR-3856.
[jira] [Updated] (SOLR-4562) core selector not working in Chrome
[ https://issues.apache.org/jira/browse/SOLR-4562?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Maciej Lizewski updated SOLR-4562:
--
Attachment: Przechwytywanie.PNG

core selector not working in Chrome
---

Key: SOLR-4562
URL: https://issues.apache.org/jira/browse/SOLR-4562
Project: Solr
Issue Type: Bug
Affects Versions: 4.2
Reporter: Maciej Lizewski
Attachments: Przechwytywanie.PNG

After a fresh installation of Solr 4.2 on Windows 7 64-bit, I do not see any cores to select in the combobox in Google Chrome. Also, when trying to prepare the URI by hand, I see an error that there is no such core. In Firefox, the default 'collection1' core is visible without problems.

My Chrome version: 26.0.1410.28 beta-m

I cannot see any errors in the JS console...
[jira] [Created] (SOLR-4562) core selector not working in Chrome
Maciej Lizewski created SOLR-4562:
-

Summary: core selector not working in Chrome
Key: SOLR-4562
URL: https://issues.apache.org/jira/browse/SOLR-4562
Project: Solr
Issue Type: Bug
Affects Versions: 4.2
Reporter: Maciej Lizewski

After a fresh installation of Solr 4.2 on Windows 7 64-bit, I do not see any cores to select in the combobox in Google Chrome. Also, when trying to prepare the URI by hand, I see an error that there is no such core. In Firefox, the default 'collection1' core is visible without problems.

My Chrome version: 26.0.1410.28 beta-m

I cannot see any errors in the JS console...
[jira] [Created] (SOLR-4564) Admin UI fails to load properly on Chrome
Aditya created SOLR-4564:

Summary: Admin UI fails to load properly on Chrome
Key: SOLR-4564
URL: https://issues.apache.org/jira/browse/SOLR-4564
Project: Solr
Issue Type: Bug
Components: web gui
Affects Versions: 4.2
Environment: Jboss 7.1.1 and Solr 4.2
Reporter: Aditya

The Admin UI fails to load the collection list on Chrome. The dropdown is empty. Clicking on Logging and Threads throws a JavaScript error in the console:

GET http://10.124.55.84/solr/undefined/admin/logging?wt=json&since=0 404 (Not Found) {require.js:10157}
GET http://10.124.55.84/solr/undefined/admin/threads?wt=json 404 (Not Found) require.js:10157

Checked on IE9 and the UI looks good, but the Schema browser is sluggish while searching fields. Every keystroke creates a pause for the field look-up. We have around 290 fields (including dynamic) defined in the schema.
[jira] [Commented] (LUCENE-4795) Add FacetsCollector based on SortedSetDocValues
[ https://issues.apache.org/jira/browse/LUCENE-4795?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13600026#comment-13600026 ]

Robert Muir commented on LUCENE-4795:
-

{quote}
You didn't answer my question though, and perhaps it doesn't belong in this issue, but is there a way to utilize the ordinal given to a DV value somehow? Or is it internal to the SortedSet DV?
{quote}

Because I don't want to encourage crazy software designs to support fringe features.

Want weighted faceting? use the tax index: pretty simple.

Add FacetsCollector based on SortedSetDocValues
---

Key: LUCENE-4795
URL: https://issues.apache.org/jira/browse/LUCENE-4795
Project: Lucene - Core
Issue Type: Improvement
Components: modules/facet
Reporter: Michael McCandless
Assignee: Michael McCandless
Attachments: LUCENE-4795.patch, LUCENE-4795.patch, LUCENE-4795.patch, LUCENE-4795.patch, pleaseBenchmarkMe.patch

Recently (LUCENE-4765) we added multi-valued DocValues field (SortedSetDocValuesField), and this can be used for faceting in Solr (SOLR-4490). I think we should also add support in the facet module?

It'd be an option with different tradeoffs. Eg, it wouldn't require the taxonomy index, since the main index handles label/ord resolving.

There are at least two possible approaches:
* On every reopen, build the seg -> global ord map, and then on every collect, get the seg ord, map it to the global ord space, and increment counts. This adds cost during reopen in proportion to number of unique terms ...
* On every collect, increment counts based on the seg ords, and then do a merge in the end just like distributed faceting does.

The first approach is much easier so I built a quick prototype using that. The prototype does the counting, but it does NOT do the top K facets gathering in the end, and it doesn't know parent/child ord relationships, so there's tons more to do before this is real. I also was unsure how to properly integrate it since the existing classes seem to expect that you use a taxonomy index to resolve ords.

I ran a quick performance test. base = trunk except I disabled the compute top-K in FacetsAccumulator to make the comparison fair; comp = using the prototype collector in the patch:
{noformat}
            Task  QPS base  StdDev  QPS comp  StdDev  Pct diff
       OrHighLow     18.79  (2.5%)     14.36  (3.3%)  -23.6% ( -28% - -18%)
        HighTerm     21.58  (2.4%)     16.53  (3.7%)  -23.4% ( -28% - -17%)
       OrHighMed     18.20  (2.5%)     13.99  (3.3%)  -23.2% ( -28% - -17%)
         Prefix3     14.37  (1.5%)     11.62  (3.5%)  -19.1% ( -23% - -14%)
         LowTerm    130.80  (1.6%)    106.95  (2.4%)  -18.2% ( -21% - -14%)
      OrHighHigh      9.60  (2.6%)      7.88  (3.5%)  -17.9% ( -23% - -12%)
     AndHighHigh     24.61  (0.7%)     20.74  (1.9%)  -15.7% ( -18% - -13%)
          Fuzzy1     49.40  (2.5%)     43.48  (1.9%)  -12.0% ( -15% -  -7%)
 MedSloppyPhrase     27.06  (1.6%)     23.95  (2.3%)  -11.5% ( -15% -  -7%)
         MedTerm     51.43  (2.0%)     46.21  (2.7%)  -10.2% ( -14% -  -5%)
          IntNRQ      4.02  (1.6%)      3.63  (4.0%)   -9.7% ( -15% -  -4%)
        Wildcard     29.14  (1.5%)     26.46  (2.5%)   -9.2% ( -13% -  -5%)
HighSloppyPhrase      0.92  (4.5%)      0.87  (5.8%)   -5.4% ( -15% -   5%)
     MedSpanNear     29.51  (2.5%)     27.94  (2.2%)   -5.3% (  -9% -   0%)
    HighSpanNear      3.55  (2.4%)      3.38  (2.0%)   -4.9% (  -9% -   0%)
      AndHighMed    108.34  (0.9%)    104.55  (1.1%)   -3.5% (  -5% -  -1%)
 LowSloppyPhrase     20.50  (2.0%)     20.09  (4.2%)   -2.0% (  -8% -   4%)
       LowPhrase     21.60  (6.0%)     21.26  (5.1%)   -1.6% ( -11% -  10%)
          Fuzzy2     53.16  (3.9%)     52.40  (2.7%)   -1.4% (  -7% -   5%)
     LowSpanNear      8.42  (3.2%)      8.45  (3.0%)    0.3% (  -5% -   6%)
         Respell     45.17  (4.3%)     45.38  (4.4%)    0.5% (  -7% -   9%)
       MedPhrase    113.93  (5.8%)    115.02  (4.9%)    1.0% (  -9% -  12%)
      AndHighLow    596.42  (2.5%)    617.12  (2.8%)    3.5% (  -1% -   8%)
      HighPhrase     17.30 (10.5%)     18.36  (9.1%)
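The first approach listed in the issue description (build a segment-to-global ord map once per reopen, then count in the global ord space during collection) can be sketched without any Lucene classes. This is only an illustration under assumptions: GlobalOrdSketch, the per-segment label arrays, and the docOrds layout are hypothetical and stand in for what SortedSetDocValues would provide; the prototype in the attached patch is not reproduced here.

```java
import java.util.Arrays;
import java.util.TreeSet;

/** Hypothetical, Lucene-free sketch of seg -> global ord mapping and counting. */
final class GlobalOrdSketch {

    /** Sorted, de-duplicated union of all segment labels; its indexes are the global ords. */
    static String[] globalLabels(String[][] segLabels) {
        TreeSet<String> all = new TreeSet<>();
        for (String[] seg : segLabels) {
            all.addAll(Arrays.asList(seg));
        }
        return all.toArray(new String[0]);
    }

    /** For each segment, map segment ord -> global ord (cost proportional to unique terms). */
    static int[][] buildOrdMaps(String[][] segLabels, String[] global) {
        int[][] maps = new int[segLabels.length][];
        for (int s = 0; s < segLabels.length; s++) {
            maps[s] = new int[segLabels[s].length];
            for (int ord = 0; ord < segLabels[s].length; ord++) {
                maps[s][ord] = Arrays.binarySearch(global, segLabels[s][ord]);
            }
        }
        return maps;
    }

    /** Collect step: docOrds[s][d] holds the segment ords of document d in segment s. */
    static int[] count(int[][][] docOrds, int[][] ordMaps, int globalSize) {
        int[] counts = new int[globalSize];
        for (int s = 0; s < docOrds.length; s++) {
            for (int[] doc : docOrds[s]) {
                for (int segOrd : doc) {
                    counts[ordMaps[s][segOrd]]++;
                }
            }
        }
        return counts;
    }

    /** Two toy segments with overlapping labels. */
    static int[] demo() {
        String[][] segLabels = {
            {"a/bar", "a/foo"},   // segment 0: ord 0 = a/bar, ord 1 = a/foo
            {"a/foo", "b/baz"}    // segment 1: ord 0 = a/foo, ord 1 = b/baz
        };
        int[][][] docOrds = {
            {{1}, {0, 1}},        // segment 0: doc 0 has a/foo; doc 1 has a/bar and a/foo
            {{0}, {1}}            // segment 1: doc 0 has a/foo; doc 1 has b/baz
        };
        String[] global = globalLabels(segLabels);
        int[][] maps = buildOrdMaps(segLabels, global);
        return count(docOrds, maps, global.length);
    }

    public static void main(String[] args) {
        System.out.println(Arrays.toString(demo())); // counts for a/bar, a/foo, b/baz
    }
}
```

In this toy layout the map is rebuilt from scratch for every segment, which mirrors the reopen cost the issue mentions; the second approach would instead count per segment and merge at the end.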
[jira] [Updated] (LUCENE-4642) Add create(AttributeFactory) to TokenizerFactory and subclasses with ctors taking AttributeFactory, and remove Tokenizer's and subclasses' ctors taking AttributeSource
[ https://issues.apache.org/jira/browse/LUCENE-4642?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Steve Rowe updated LUCENE-4642:
---

Description:
All tokenizer implementations have a constructor that takes a given AttributeSource as parameter (LUCENE-1826). These should be removed.

TokenizerFactory does not provide an API to create tokenizers with a given AttributeFactory, but quite a few tokenizers have constructors that take an AttributeFactory. TokenizerFactory should add a create(AttributeFactory) method, as should subclasses for tokenizers with AttributeFactory-accepting ctors.

was:
All tokenizer implementations have a constructor that takes a given AttributeSource as parameter (LUCENE-1826). However, the TokenizerFactory does not provide an API to create tokenizers with a given AttributeSource.

Side note: There are still a lot of tokenizers that do not provide constructors that take AttributeSource and AttributeFactory.

Add create(AttributeFactory) to TokenizerFactory and subclasses with ctors taking AttributeFactory, and remove Tokenizer's and subclasses' ctors taking AttributeSource
---

Key: LUCENE-4642
URL: https://issues.apache.org/jira/browse/LUCENE-4642
Project: Lucene - Core
Issue Type: Improvement
Components: modules/analysis
Affects Versions: 4.1
Reporter: Renaud Delbru
Assignee: Steve Rowe
Labels: analysis, attribute, tokenizer
Fix For: 4.3
Attachments: LUCENE-4642.patch, LUCENE-4642.patch, LUCENE-4642.patch, TrieTokenizerFactory.java.patch

All tokenizer implementations have a constructor that takes a given AttributeSource as parameter (LUCENE-1826). These should be removed.

TokenizerFactory does not provide an API to create tokenizers with a given AttributeFactory, but quite a few tokenizers have constructors that take an AttributeFactory. TokenizerFactory should add a create(AttributeFactory) method, as should subclasses for tokenizers with AttributeFactory-accepting ctors.
[JENKINS] Lucene-Solr-trunk-MacOSX (64bit/jdk1.7.0) - Build # 313 - Failure!
Build: http://jenkins.thetaphi.de/job/Lucene-Solr-trunk-MacOSX/313/ Java: 64bit/jdk1.7.0 -XX:+UseParallelGC All tests passed Build Log: [...truncated 26442 lines...] [javadoc] Generating Javadoc [javadoc] Javadoc execution [javadoc] Loading source files for package org.apache.lucene... [javadoc] Loading source files for package org.apache.lucene.analysis... [javadoc] warning: [options] bootstrap class path not set in conjunction with -source 1.6 [javadoc] Loading source files for package org.apache.lucene.analysis.tokenattributes... [javadoc] Loading source files for package org.apache.lucene.codecs... [javadoc] Loading source files for package org.apache.lucene.codecs.compressing... [javadoc] Loading source files for package org.apache.lucene.codecs.lucene40... [javadoc] Loading source files for package org.apache.lucene.codecs.lucene41... [javadoc] Loading source files for package org.apache.lucene.codecs.lucene42... [javadoc] Loading source files for package org.apache.lucene.codecs.perfield... [javadoc] Loading source files for package org.apache.lucene.document... [javadoc] Loading source files for package org.apache.lucene.index... [javadoc] Loading source files for package org.apache.lucene.search... [javadoc] Loading source files for package org.apache.lucene.search.payloads... [javadoc] Loading source files for package org.apache.lucene.search.similarities... [javadoc] Loading source files for package org.apache.lucene.search.spans... [javadoc] Loading source files for package org.apache.lucene.store... [javadoc] Loading source files for package org.apache.lucene.util... [javadoc] Loading source files for package org.apache.lucene.util.automaton... [javadoc] Loading source files for package org.apache.lucene.util.fst... [javadoc] Loading source files for package org.apache.lucene.util.mutable... [javadoc] Loading source files for package org.apache.lucene.util.packed... [javadoc] Constructing Javadoc information... 
[javadoc] Standard Doclet version 1.7.0_15 [javadoc] Building tree for all the packages and classes... [javadoc] Generating /Users/jenkins/jenkins-slave/workspace/Lucene-Solr-trunk-MacOSX/lucene/build/docs/core/org/apache/lucene/search/package-summary.html... [javadoc] Copying file /Users/jenkins/jenkins-slave/workspace/Lucene-Solr-trunk-MacOSX/lucene/core/src/java/org/apache/lucene/search/doc-files/nrq-formula-1.png to directory /Users/jenkins/jenkins-slave/workspace/Lucene-Solr-trunk-MacOSX/lucene/build/docs/core/org/apache/lucene/search/doc-files... [javadoc] Copying file /Users/jenkins/jenkins-slave/workspace/Lucene-Solr-trunk-MacOSX/lucene/core/src/java/org/apache/lucene/search/doc-files/nrq-formula-2.png to directory /Users/jenkins/jenkins-slave/workspace/Lucene-Solr-trunk-MacOSX/lucene/build/docs/core/org/apache/lucene/search/doc-files... [javadoc] Building index for all the packages and classes... [javadoc] Building index for all classes... [javadoc] Generating /Users/jenkins/jenkins-slave/workspace/Lucene-Solr-trunk-MacOSX/lucene/build/docs/core/help-doc.html... [javadoc] 1 warning [...truncated 33 lines...] [javadoc] Generating Javadoc [javadoc] Javadoc execution [javadoc] warning: [options] bootstrap class path not set in conjunction with -source 1.6 [javadoc] Loading source files for package org.apache.lucene.analysis.ar... [javadoc] Loading source files for package org.apache.lucene.analysis.bg... [javadoc] Loading source files for package org.apache.lucene.analysis.br... [javadoc] Loading source files for package org.apache.lucene.analysis.ca... [javadoc] Loading source files for package org.apache.lucene.analysis.charfilter... [javadoc] Loading source files for package org.apache.lucene.analysis.cjk... [javadoc] Loading source files for package org.apache.lucene.analysis.commongrams... [javadoc] Loading source files for package org.apache.lucene.analysis.compound... 
[javadoc] Loading source files for package org.apache.lucene.analysis.compound.hyphenation... [javadoc] Loading source files for package org.apache.lucene.analysis.core... [javadoc] Loading source files for package org.apache.lucene.analysis.cz... [javadoc] Loading source files for package org.apache.lucene.analysis.da... [javadoc] Loading source files for package org.apache.lucene.analysis.de... [javadoc] Loading source files for package org.apache.lucene.analysis.el... [javadoc] Loading source files for package org.apache.lucene.analysis.en... [javadoc] Loading source files for package org.apache.lucene.analysis.es... [javadoc] Loading source files for package org.apache.lucene.analysis.eu... [javadoc] Loading source files for package org.apache.lucene.analysis.fa... [javadoc] Loading source files for package org.apache.lucene.analysis.fi... [javadoc] Loading source files for package org.apache.lucene.analysis.fr... [javadoc] Loading source files for package
[jira] [Commented] (LUCENE-4795) Add FacetsCollector based on SortedSetDocValues
[ https://issues.apache.org/jira/browse/LUCENE-4795?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=1350#comment-1350 ]

Shai Erera commented on LUCENE-4795:
---

Thanks. Also (sorry that it comes in parts), I find this confusing: {{new SortedSetDocValuesField("myfacets", new BytesRef("a" + sep + "foo"))}}. The user needs to decide under which field all facets will be indexed. This could lead users to do {{new SSDVF("author", new BytesRef("shai"))}} and {{new SSDVF("date", new BytesRef("2010/March/13"))}}. We know, from past results, that this will result in worse search performance. Also, this doesn't take a CP, which is not consistent with, e.g., the FacetRequest, where you need to pass a CP.

So perhaps we should rather:
* Add a FacetField (extends SSDVF) which takes a CP (potentially FacetIndexingParams as well).
* It will call super(CLP.DEFAULT_FIELD, new BytesRef(cp.toString())) (we can optimize that later, e.g. have CP expose a BytesRef API too if we want).
* Potentially, allow (or not) to define the field type.

What do you think?

Add FacetsCollector based on SortedSetDocValues
---

Key: LUCENE-4795
URL: https://issues.apache.org/jira/browse/LUCENE-4795
Project: Lucene - Core
Issue Type: Improvement
Components: modules/facet
Reporter: Michael McCandless
Assignee: Michael McCandless
Attachments: LUCENE-4795.patch, LUCENE-4795.patch, LUCENE-4795.patch, LUCENE-4795.patch, pleaseBenchmarkMe.patch

Recently (LUCENE-4765) we added multi-valued DocValues field (SortedSetDocValuesField), and this can be used for faceting in Solr (SOLR-4490). I think we should also add support in the facet module?

It'd be an option with different tradeoffs. Eg, it wouldn't require the taxonomy index, since the main index handles label/ord resolving.

There are at least two possible approaches:
* On every reopen, build the seg -> global ord map, and then on every collect, get the seg ord, map it to the global ord space, and increment counts. This adds cost during reopen in proportion to number of unique terms ...
* On every collect, increment counts based on the seg ords, and then do a merge in the end just like distributed faceting does.

The first approach is much easier so I built a quick prototype using that. The prototype does the counting, but it does NOT do the top K facets gathering in the end, and it doesn't know parent/child ord relationships, so there's tons more to do before this is real. I also was unsure how to properly integrate it since the existing classes seem to expect that you use a taxonomy index to resolve ords.

I ran a quick performance test. base = trunk except I disabled the compute top-K in FacetsAccumulator to make the comparison fair; comp = using the prototype collector in the patch:
{noformat}
            Task  QPS base  StdDev  QPS comp  StdDev  Pct diff
       OrHighLow     18.79  (2.5%)     14.36  (3.3%)  -23.6% ( -28% - -18%)
        HighTerm     21.58  (2.4%)     16.53  (3.7%)  -23.4% ( -28% - -17%)
       OrHighMed     18.20  (2.5%)     13.99  (3.3%)  -23.2% ( -28% - -17%)
         Prefix3     14.37  (1.5%)     11.62  (3.5%)  -19.1% ( -23% - -14%)
         LowTerm    130.80  (1.6%)    106.95  (2.4%)  -18.2% ( -21% - -14%)
      OrHighHigh      9.60  (2.6%)      7.88  (3.5%)  -17.9% ( -23% - -12%)
     AndHighHigh     24.61  (0.7%)     20.74  (1.9%)  -15.7% ( -18% - -13%)
          Fuzzy1     49.40  (2.5%)     43.48  (1.9%)  -12.0% ( -15% -  -7%)
 MedSloppyPhrase     27.06  (1.6%)     23.95  (2.3%)  -11.5% ( -15% -  -7%)
         MedTerm     51.43  (2.0%)     46.21  (2.7%)  -10.2% ( -14% -  -5%)
          IntNRQ      4.02  (1.6%)      3.63  (4.0%)   -9.7% ( -15% -  -4%)
        Wildcard     29.14  (1.5%)     26.46  (2.5%)   -9.2% ( -13% -  -5%)
HighSloppyPhrase      0.92  (4.5%)      0.87  (5.8%)   -5.4% ( -15% -   5%)
     MedSpanNear     29.51  (2.5%)     27.94  (2.2%)   -5.3% (  -9% -   0%)
    HighSpanNear      3.55  (2.4%)      3.38  (2.0%)   -4.9% (  -9% -   0%)
      AndHighMed    108.34  (0.9%)    104.55  (1.1%)   -3.5% (  -5% -  -1%)
 LowSloppyPhrase     20.50  (2.0%)     20.09  (4.2%)   -2.0% (  -8% -   4%)
       LowPhrase     21.60  (6.0%)     21.26  (5.1%)   -1.6% ( -11% -  10%)
          Fuzzy2     53.16  (3.9%)
[jira] [Comment Edited] (SOLR-4561) CachedSqlEntityProcessor with parametarized query is broken
[ https://issues.apache.org/jira/browse/SOLR-4561?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13600078#comment-13600078 ]

Sudheer Prem edited comment on SOLR-4561 at 3/12/13 5:06 PM:
-

I have a scenario where table A contains 5 million rows and table B contains more than a million rows. The join condition matches only a couple of thousand records. I had been using this feature in an earlier version of Solr. After this change, it suddenly takes the wrong join result (the one that matches the first condition) and populates that value into all documents.

After debugging, my thought for the fix is this: the problem occurs because, in the method SqlEntityProcessor.nextRow(), the query is initialized and loaded only if the rowIterator is null. Instead, the query can be initialized whenever it differs from the previous query. If the logic is changed in that way, I think this issue will be fixed.

To apply this logic, change the SqlEntityProcessor.nextRow() method from
{code}
if (rowIterator == null) {
  String q = getQuery();
  initQuery(context.replaceTokens(q));
}
{code}
to the code below:
{code}
String q = context.replaceTokens(getQuery());
if (!q.equals(this.query)) {
  initQuery(q);
}
{code}
Initial testing shows that it seems to work as expected.

was (Author: sudheerprem):

I have a scenario where table A contains 5 million rows and table B contains more than a million rows. The join condition matches only a couple of thousand records. I had been using this feature in an earlier version of Solr. After this change, it suddenly takes the wrong join result (the one that matches the first condition) and populates that value into all documents.

After debugging, my thought for the fix is this: the problem occurs because, in the method SqlEntityProcessor.nextRow(), the query is initialized and loaded only if the rowIterator is null. Instead, the query should be initialized whenever it differs from the previous query. If the logic is changed in that way, I think this issue will be fixed.

To apply this logic, change the SqlEntityProcessor.nextRow() method from
{code}
if (rowIterator == null) {
  String q = getQuery();
  initQuery(context.replaceTokens(q));
}
{code}
to the code below:
{code}
String q = context.replaceTokens(getQuery());
if (!q.equals(this.query)) {
  initQuery(q);
}
{code}
Initial testing shows that it seems to work as expected.

CachedSqlEntityProcessor with parametarized query is broken
---

Key: SOLR-4561
URL: https://issues.apache.org/jira/browse/SOLR-4561
Project: Solr
Issue Type: Bug
Components: contrib - DataImportHandler
Affects Versions: 4.1
Reporter: Sudheer Prem
Original Estimate: 1m
Remaining Estimate: 1m

When child entities are created and the child entity is given a parametrized query as below,
{code:xml}
<entity name="x" query="select * from x">
  <entity name="y" query="select * from y where xid=${x.id}" processor="CachedSqlEntityProcessor">
  </entity>
</entity>
{code}
the entity processor always returns the result from the first query even though the parameter has changed. This happens because the EntityProcessorBase.getNext() method doesn't reset the query and rowIterator after calling the DIHCacheSupport.getCacheData() method. This can be fixed by changing the else block in the getNext() method of EntityProcessorBase from
{code}
else {
  return cacheSupport.getCacheData(context, query, rowIterator);
}
{code}
to the code below:
{code}
else {
  Map<String,Object> cacheData = cacheSupport.getCacheData(context, query, rowIterator);
  query = null;
  rowIterator = null;
  return cacheData;
}
{code}
Update: But then, the caching doesn't seem to be working...
[jira] [Commented] (SOLR-4561) CachedSqlEntityProcessor with parametarized query is broken
[ https://issues.apache.org/jira/browse/SOLR-4561?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13600078#comment-13600078 ]

Sudheer Prem commented on SOLR-4561:

I have a scenario where table A contains 5 million rows and table B contains more than a million rows. The join condition matches only a couple of thousand records. I had been using this feature in an earlier version of Solr. After this change, it suddenly takes the wrong join (the one that matches the first condition) and populates that value into all documents.

After debugging, my thought for the fix is this: the problem occurs because, in the method SqlEntityProcessor.nextRow(), the query is initialized and loaded only if the rowIterator is null. Instead, the query should be initialized whenever it differs from the previous query. If the logic is changed that way, I think this issue will be fixed. To apply this logic, change the SqlEntityProcessor.nextRow() method from

{code}
if (rowIterator == null) {
  String q = getQuery();
  initQuery(context.replaceTokens(q));
}
{code}

to the code below:

{code}
String q = context.replaceTokens(getQuery());
if (!q.equals(this.query)) {
  initQuery(context.replaceTokens(q));
}
{code}

Initial testing shows that it seems to work as expected.
[jira] [Updated] (SOLR-4557) Fix broken CoreContainerTest.testReload
[ https://issues.apache.org/jira/browse/SOLR-4557?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Erick Erickson updated SOLR-4557:

Attachment: SOLR-4557.patch

Fix for trunk corresponding to the checkin.

Fix broken CoreContainerTest.testReload
---
Key: SOLR-4557
URL: https://issues.apache.org/jira/browse/SOLR-4557
Project: Solr
Issue Type: Test
Affects Versions: 4.2, 5.0
Reporter: Erick Erickson
Assignee: Erick Erickson
Attachments: SOLR-4557.patch, SOLR-4557.patch, SOLR-4557_posthshutdown_stack.txt

I was chasing down a test failure, and it turns out that CoreContainerTest.testReload has only succeeded by chance. The test fires up 4 threads that go out and reload the same core all at once. This caused me to look at properly synchronizing reloading cores pursuant to SOLR-4196, on the theory that we should serialize loading, unloading and reloading cores; we shouldn't be doing _any_ of those operations from different threads on the same core at the same time.

It turns out that if you fire up multiple reloads at once without serializing them, an error is thrown instead of proper reloading occurring, and that's the only reason the test doesn't hang. The stack trace of the exception is below for reference, but it doesn't occur with the code I'll attach to this patch:

[junit4:junit4] 2> at com.sun.jmx.interceptor.DefaultMBeanServerInterceptor.getMBean(DefaultMBeanServerInterceptor.java:1095)
[junit4:junit4] 2> at com.sun.jmx.interceptor.DefaultMBeanServerInterceptor.exclusiveUnregisterMBean(DefaultMBeanServerInterceptor.java:427)
[junit4:junit4] 2> at com.sun.jmx.interceptor.DefaultMBeanServerInterceptor.unregisterMBean(DefaultMBeanServerInterceptor.java:415)
[junit4:junit4] 2> at com.sun.jmx.mbeanserver.JmxMBeanServer.unregisterMBean(JmxMBeanServer.java:536)
[junit4:junit4] 2> at org.apache.solr.core.JmxMonitoredMap.put(JmxMonitoredMap.java:138)
[junit4:junit4] 2> at org.apache.solr.core.JmxMonitoredMap.put(JmxMonitoredMap.java:51)
[junit4:junit4] 2> at org.apache.solr.core.RequestHandlers.register(RequestHandlers.java:106)
[junit4:junit4] 2> at org.apache.solr.core.RequestHandlers.initHandlersFromConfig(RequestHandlers.java:157)
[junit4:junit4] 2> at org.apache.solr.core.SolrCore.<init>(SolrCore.java:757)
[junit4:junit4] 2> at org.apache.solr.core.SolrCore.reload(SolrCore.java:408)
[junit4:junit4] 2> at org.apache.solr.core.CoreContainer.reload(CoreContainer.java:1076)
[junit4:junit4] 2> at org.apache.solr.core.TestCoreContainer$1TestThread.run(TestCoreContainer.java:90)
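The idea of serializing load, unload and reload operations on the same core can be sketched with one lock object per core name. This is only an illustration under assumptions: SerializedReloadSketch and its counters are invented for the example and say nothing about how the attached patch or CoreContainer actually synchronizes.

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;
import java.util.concurrent.atomic.AtomicInteger;

/** Hypothetical illustration; not CoreContainer's actual code. */
class SerializedReloadSketch {
    // One lock object per core name, created on first use.
    private final ConcurrentMap<String, Object> locks = new ConcurrentHashMap<>();
    // Reloads currently in flight, counted across all cores.
    private final AtomicInteger inFlight = new AtomicInteger();
    // Highest value inFlight ever reached; stays at 1 when only one core is used.
    final AtomicInteger maxInFlight = new AtomicInteger();
    final AtomicInteger reloads = new AtomicInteger();

    void reload(String coreName) {
        Object lock = locks.computeIfAbsent(coreName, k -> new Object());
        synchronized (lock) { // only one reload per core at a time
            int now = inFlight.incrementAndGet();
            maxInFlight.accumulateAndGet(now, Math::max);
            try {
                Thread.sleep(10); // stand-in for the real reload work
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
            reloads.incrementAndGet();
            inFlight.decrementAndGet();
        }
    }
}
```

Four threads reloading the same core name, as the test does, would all complete here, one after another, while reloads of different cores could still overlap.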
[jira] [Created] (SOLR-4563) RSS DIH-example not working
Jan Høydahl created SOLR-4563:

Summary: RSS DIH-example not working
Key: SOLR-4563
URL: https://issues.apache.org/jira/browse/SOLR-4563
Project: Solr
Issue Type: Bug
Affects Versions: 4.2
Reporter: Jan Høydahl
Fix For: 4.3, 5.0
Attachments: SOLR-4563.patch

The xpath paths of /rss/item do not match the real-world RSS feed, which uses /rss/channel/item.
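The path mismatch is easy to reproduce outside Solr. The sketch below uses the plain JDK XPath API on a made-up two-item feed; it is not the DataImportHandler's own XPath reader, and the RssXPathSketch class and FEED string are assumptions for illustration only.

```java
import java.io.StringReader;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathExpressionException;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

/** Hypothetical demo of why /rss/item finds nothing in an RSS 2.0 style feed. */
class RssXPathSketch {
    // Minimal made-up feed: items are nested inside <channel>.
    static final String FEED =
        "<rss version=\"2.0\"><channel><title>demo</title>"
        + "<item><title>one</title></item>"
        + "<item><title>two</title></item>"
        + "</channel></rss>";

    /** Returns how many nodes the expression selects in FEED. */
    static int count(String expression) {
        try {
            XPath xpath = XPathFactory.newInstance().newXPath();
            NodeList nodes = (NodeList) xpath.evaluate(
                expression, new InputSource(new StringReader(FEED)), XPathConstants.NODESET);
            return nodes.getLength();
        } catch (XPathExpressionException e) {
            throw new IllegalStateException(e);
        }
    }
}
```

On this feed, count("/rss/item") selects no nodes, while count("/rss/channel/item") selects both items.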
[jira] [Commented] (LUCENE-4713) SPI: Allow fallback to default ClassLoader if Thread#getContextClassLoader fails
[ https://issues.apache.org/jira/browse/LUCENE-4713?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13599977#comment-13599977 ]

Uwe Schindler commented on LUCENE-4713:

bq. Nit: What you could do to be 100% safe that we're using the correct ClassLoader is to check for loader==null in SPIClassIterator and assign it to ClassLoader.getSystemClassLoader() in this case.

I want to keep as close to Java's original. This is not a problem at all: Class.forName(name, ..., NULL) loads automatically using the bootstrap / system loader.

SPI: Allow fallback to default ClassLoader if Thread#getContextClassLoader fails
---
Key: LUCENE-4713
URL: https://issues.apache.org/jira/browse/LUCENE-4713
Project: Lucene - Core
Issue Type: Improvement
Affects Versions: 4.0, 4.1, 4.2
Reporter: Christian Kohlschütter
Assignee: Uwe Schindler
Priority: Minor
Labels: ClassLoader, Thread
Fix For: 4.3
Attachments: LUCENE-4713.patch, LUCENE-4713.patch, LUCENE-4713.patch, LUCENE-4713.patch, LUCENE-4713.patch, LUCENE-4713.patch, LuceneContextClassLoader.patch

NOTE: This issue has been renamed from "Replace calls to Thread#getContextClassLoader with the ClassLoader of the current class" because the revised patch provides a clean fallback path.

I am not sure whether it is a design decision or if we can indeed consider this a bug: In core and analysis-common some classes provide on-demand class loading using SPI. In NamedSPILoader, SPIClassIterator, ClasspathResourceLoader and AnalysisSPILoader there are constructors that use the Thread's context ClassLoader by default whenever no particular other ClassLoader was specified.

Unfortunately this does not work as expected when the Thread's ClassLoader can't see the required classes that are instantiated downstream with the help of Class.forName (e.g., Codecs, Analyzers, etc.). That's what happened to us here. We are currently experimenting with running Lucene 2.9 and 4.x in one JVM, both being separated by custom ClassLoaders, each seeing only the corresponding Lucene version and the upstream classpath. While NamedSPILoader and company get successfully loaded by our custom ClassLoader, their instantiation fails because our Thread's context ClassLoader cannot find the additionally required classes.

We could probably work around this by using Thread#setContextClassLoader at construction time (and quickly reverting back afterwards), but I have the impression this might just hide the actual problem and cause further trouble when lazy-loading classes later on, and potentially from another Thread.

Removing the call to Thread#getContextClassLoader would also align with the behavior of AttributeSource.DEFAULT_ATTRIBUTE_FACTORY, which in fact uses Attribute#getClass().getClassLoader() instead.

A simple patch is attached. All tests pass.
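A fallback of the kind the issue title describes might look roughly like the following. This is a sketch under assumptions, not the code in the attached patches: ClassLoaderFallbackSketch and its load method are hypothetical, and the only JDK behaviour relied on is that Class.forName accepts a null loader and then uses the bootstrap class loader.

```java
/** Hypothetical sketch of a context-ClassLoader lookup with a fallback. */
class ClassLoaderFallbackSketch {
    /**
     * Tries the thread's context class loader first; if it is missing or
     * cannot see the class, falls back to the loader of the given anchor class.
     */
    static Class<?> load(String name, Class<?> anchor) throws ClassNotFoundException {
        ClassLoader context = Thread.currentThread().getContextClassLoader();
        if (context != null) {
            try {
                return Class.forName(name, false, context);
            } catch (ClassNotFoundException e) {
                // The context loader cannot see the class; try the anchor's loader.
            }
        }
        // anchor.getClassLoader() may be null for bootstrap classes;
        // Class.forName treats a null loader as the bootstrap loader.
        return Class.forName(name, false, anchor.getClassLoader());
    }
}
```

In the two-Lucene-versions setup described above, the anchor would be a class loaded by the custom ClassLoader, so the fallback would resolve classes from the matching Lucene version even when the thread's context loader cannot see them.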
Re: [ANNOUNCE] Apache Solr 4.2 released
We presently have indexes generated from Solr 4.1. What is the upgrade path to Solr 4.2?

On 3/11/13 8:37 PM, Robert Muir rm...@apache.org wrote:

March 2013, Apache Solr 4.2 available

The Lucene PMC is pleased to announce the release of Apache Solr 4.2

Solr is the popular, blazing fast, open source NoSQL search platform from the Apache Lucene project. Its major features include powerful full-text search, hit highlighting, faceted search, dynamic clustering, database integration, rich document (e.g., Word, PDF) handling, and geospatial search. Solr is highly scalable, providing fault tolerant distributed search and indexing, and powers the search and navigation features of many of the world's largest internet sites.

Solr 4.2 is available for immediate download at:
http://lucene.apache.org/solr/mirrors-solr-latest-redir.html

See the CHANGES.txt file included with the release for a full list of details.

Solr 4.2 Release Highlights:

* A read side REST API for the schema. Always wanted to introspect the schema over http? Now you can. Looks like the write side will be coming next.

* DocValues have been integrated into Solr. DocValues can be loaded up a lot faster than the field cache and can also use different compression algorithms as well as in RAM or on Disk representations. Faceting, sorting, and function queries all get to benefit. How about the OS handling faceting and sorting caches off heap? No more tuning 60 gigabyte heaps? How about a snappy new per segment DocValues faceting method? Improved numeric faceting? Sweet.

* Collection Aliasing. Got time based data? Want to re-index in a temporary collection and then swap it into production? Done. Stay tuned for Shard Aliasing.

* Collection API responses. The collections API was still very new in 4.0, and while it improved a fair bit in 4.1, responses were certainly needed, but missed the cut off. Initially, we made the decision to make the Collection API super fault tolerant, which made responses tougher to do. No one wants to hunt through log files to see how things turned out. Done in 4.2.

* Interact with any collection on any node. Until 4.2, you could only interact with a node in your cluster if it hosted at least one replica of the collection you wanted to query/update. No longer - query any node, whether it has a piece of your intended collection or not and get a proxied response.

* Allow custom shard names so that new host addresses can take over for retired shards. Working on Amazon without elastic ips? This is for you.

* Lucene 4.2 optimizations such as compressed term vectors.

Solr 4.2 also includes many other new features as well as numerous optimizations and bugfixes.

Please report any feedback to the mailing lists (http://lucene.apache.org/solr/discussion.html)

Note: The Apache Software Foundation uses an extensive mirroring network for distributing releases. It is possible that the mirror you are using may not have replicated the release yet. If that is the case, please try another mirror. This also goes for Maven access.

Happy searching,
Lucene/Solr developers
[jira] [Commented] (SOLR-4563) RSS DIH-example not working
[ https://issues.apache.org/jira/browse/SOLR-4563?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13600234#comment-13600234 ]

Walter Underwood commented on SOLR-4563:

Given the wild variety of things called RSS, it is probably a better idea to parse Atom.
[jira] [Commented] (SOLR-4562) core selector not working in Chrome
[ https://issues.apache.org/jira/browse/SOLR-4562?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13600249#comment-13600249 ]

Mark Miller commented on SOLR-4562:

As a data point, I've seen this work on Chrome on Linux and OS X.

core selector not working in Chrome
---
Key: SOLR-4562
URL: https://issues.apache.org/jira/browse/SOLR-4562
Project: Solr
Issue Type: Bug
Affects Versions: 4.2
Reporter: Maciej Lizewski
Attachments: Przechwytywanie.PNG

After a fresh installation of Solr 4.2 on Windows 7 64-bit, I do not see any cores to select in the combobox in Google Chrome. Also, when trying to prepare the URI by hand, I see an error that there is no such core. In Firefox, the default 'collection1' core is visible without problems.

My Chrome version: 26.0.1410.28 beta-m

I cannot see any errors in the JS console...
Re: Improving DirectSpellChecker
On Tue, Mar 12, 2013 at 9:39 AM, Varun Thacker varunthacker1...@gmail.com wrote:

Actually that was what I ended up doing, although I thought this approach could have its merits. Just for argument's sake, if we could have complex analyzers on a field, wouldn't it have better recall for spell suggestions, although sacrificing some precision? Would that be a bad idea? Also, DirectSpellChecker is probably not where this should be. Maybe in SpellChecker or a new spell checker. Or do you think it's possible that something like this should sit outside Lucene?

I think the idea makes sense (basically it would be like analyzing/fuzzysuggester, but for spellchecking?). So it could maybe even use the same data structures but different logic. This means someone could use it to do spellchecking (not just suggest) on languages like Japanese too. So this would be a really nice option to add, in my opinion.

But DirectSpellChecker is pretty simple and limited essentially by what the term dictionary can do. So you can't use fancy data structures like FST weights; that's why I was confused about the email. The overall approach is a good idea though.
[jira] [Commented] (SOLR-4564) Admin UI fails to load properly on Chrome
[ https://issues.apache.org/jira/browse/SOLR-4564?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13600262#comment-13600262 ]

Steve Rowe commented on SOLR-4564:

I think this is a duplicate of SOLR-4562 - Aditya, what version of Windows?

Admin UI fails to load properly on Chrome
---
Key: SOLR-4564
URL: https://issues.apache.org/jira/browse/SOLR-4564
Project: Solr
Issue Type: Bug
Components: web gui
Affects Versions: 4.2
Environment: Jboss 7.1.1 and Solr 4.2
Reporter: Aditya

Admin UI fails to load the collection list on Chrome. The dropdown is empty. Clicking on Logging and Threads throws a JavaScript error in the console:

GET http://10.124.55.84/solr/undefined/admin/logging?wt=json&since=0 404 (Not Found) {require.js:10157}
GET http://10.124.55.84/solr/undefined/admin/threads?wt=json 404 (Not Found) require.js:10157

Checked on IE9 and the UI looks good, but the Schema browser is sluggish while searching fields. Every keystroke creates a pause for field look-up. We have around 290 fields (including dynamic) defined in the schema.
[jira] [Commented] (SOLR-4562) core selector not working in Chrome
[ https://issues.apache.org/jira/browse/SOLR-4562?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13600263#comment-13600263 ]

Steve Rowe commented on SOLR-4562:

SOLR-4564 looks like it's the same issue.
[jira] [Commented] (SOLR-4562) core selector not working in Chrome
[ https://issues.apache.org/jira/browse/SOLR-4562?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13600290#comment-13600290 ]

Stefan Matheis (steffkes) commented on SOLR-4562:

[~redguy666] Did you upgrade from an earlier version? If so, can you try to clear your browser-cache? We had this [Thread|http://mail-archives.apache.org/mod_mbox/lucene-solr-user/201303.mbox/%3ccaeemfb210ehgc9v5cjgj6yrjrkdwg+9roqpevfk4jtaq4tk...@mail.gmail.com%3E] on the list two weeks ago and that solved the Problem
[jira] [Commented] (SOLR-4564) Admin UI fails to load properly on Chrome
[ https://issues.apache.org/jira/browse/SOLR-4564?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13600296#comment-13600296 ]

Stefan Matheis (steffkes) commented on SOLR-4564:

[~abakle] Did you upgrade from an earlier version? If so, can you try to clear your browser-cache? We had this [Thread|http://mail-archives.apache.org/mod_mbox/lucene-solr-user/201303.mbox/%3ccaeemfb210ehgc9v5cjgj6yrjrkdwg+9roqpevfk4jtaq4tk...@mail.gmail.com%3E] on the list two weeks ago and that solved the Problem

For the Schema-Browser: Would you mind opening another/separate Issue and including the output of {{/solr/collection1/admin/luke?numTerms=0&wt=json}} and {{/solr/collection1/admin/luke?show=schema&wt=json}} as attachments? That would simplify testing with a real-world configuration.
[jira] [Updated] (SOLR-4562) core selector not working in Chrome
[ https://issues.apache.org/jira/browse/SOLR-4562?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Stefan Matheis (steffkes) updated SOLR-4562:

Component/s: web gui
[jira] [Commented] (SOLR-4465) Configurable Collectors
[ https://issues.apache.org/jira/browse/SOLR-4465?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13600324#comment-13600324 ]

Greg Bowyer commented on SOLR-4465:

Does the CollectorSpec serve the same purpose as, say, the GroupingSpecification, that is, to provide underlying collectors (and the search in general) with the right requirements information?

I ask because maybe it would be easier to make the CollectorSpec support a map of String -> Object or String -> CollectorProperty.

I am trying to think how we can do grouping with this, but I might have misinterpreted what it's for.

Configurable Collectors
---
Key: SOLR-4465
URL: https://issues.apache.org/jira/browse/SOLR-4465
Project: Solr
Issue Type: New Feature
Components: search
Affects Versions: 4.1
Reporter: Joel Bernstein
Fix For: 4.3
Attachments: SOLR-4465.patch, SOLR-4465.patch, SOLR-4465.patch, SOLR-4465.patch, SOLR-4465.patch, SOLR-4465.patch, SOLR-4465.patch, SOLR-4465.patch, SOLR-4465.patch, SOLR-4465.patch, SOLR-4465.patch

This issue is to add configurable custom collectors to Solr. This expands the design and work done in issue SOLR-1680 to include:

1) CollectorFactory configuration in solrconfig.xml
2) HTTP parameters to allow clients to dynamically select a CollectorFactory and construct a custom Collector.
3) Make aspects of QueryComponent pluggable so that the output from distributed search can conform with custom collectors at the shard level.
[jira] [Created] (LUCENE-4824) Query time join returns different results based on the field type
Akos Kitta created LUCENE-4824:

Summary: Query time join returns different results based on the field type
Key: LUCENE-4824
URL: https://issues.apache.org/jira/browse/LUCENE-4824
Project: Lucene - Core
Issue Type: Bug
Components: modules/join
Affects Versions: 4.1
Reporter: Akos Kitta

I'm experiencing different query time joining behavior based on the type of the 'toField' and 'fromField'. Basically I got correct results when both 'toField' and 'fromField' are StringField, but incorrect in case of LongField.
[jira] [Updated] (LUCENE-4824) Query time join returns different results based on the field type
[ https://issues.apache.org/jira/browse/LUCENE-4824?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Akos Kitta updated LUCENE-4824:

Attachment: QueryTimeJoinTest.java
Labels: newbie

Attaching simple test case.
[jira] [Updated] (LUCENE-4824) Query time join returns different results based on the field type
[ https://issues.apache.org/jira/browse/LUCENE-4824?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Akos Kitta updated LUCENE-4824:

Attachment: (was: QueryTimeJoinTest.java)
[jira] [Updated] (LUCENE-4824) Query time join returns different results based on the field type
[ https://issues.apache.org/jira/browse/LUCENE-4824?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Akos Kitta updated LUCENE-4824:

Attachment: QueryTimeJoinTest.java
[jira] [Commented] (SOLR-4562) core selector not working in Chrome
[ https://issues.apache.org/jira/browse/SOLR-4562?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13600352#comment-13600352 ]

Maciej Lizewski commented on SOLR-4562:

You were right. After clearing the browser cache everything is working OK. Sorry for the duplicate issue - I searched for something similar but did not find that one.

Funny thing is that earlier I tried refreshing the page with SHIFT, which *should* reload all resources from the server... :)
[jira] [Resolved] (SOLR-4562) core selector not working in Chrome
[ https://issues.apache.org/jira/browse/SOLR-4562?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Maciej Lizewski resolved SOLR-4562.

Resolution: Not A Problem
[jira] [Commented] (LUCENE-4795) Add FacetsCollector based on SortedSetDocValues
[ https://issues.apache.org/jira/browse/LUCENE-4795?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13600355#comment-13600355 ] Michael McCandless commented on LUCENE-4795: {quote} So rather perhaps we should: * Add a FacetField (extends SSDVF) which takes a CP (potentially FacetIndexingParams as well). * It will call super(CLP.DEFAULT_FIELD, new BytesRef(cp.toString())) (we can optimize that later, e.g. have CP expose a BytesRef API too if we want). * Potentially, allow (or not) to define the field type. {quote} I agree it's awkward now. But ... FacetField makes me nervous, just because it's too close to FacetFields and users may think they can mix match the two approaches. It's trappy ... maybe SortedSetDocValuesFacetField instead? But you'd need to provide it with this separator... hmm, or maybe we can use the same sep as FIP. Separately, I wonder whether facet module should escape the delimiter when it appears in a cat path label, in general (and, here)? This way the app does not have to ensure it never appears in any label (which I think is tricky for some apps to do, eg a search server like ElasticSearch/Solr can't do this). bq. Any reason why you don't get a hold of the returned FRN? I wanted to keep it simple for starters ... but I'll fix to reuse the rejected entry. Add FacetsCollector based on SortedSetDocValues --- Key: LUCENE-4795 URL: https://issues.apache.org/jira/browse/LUCENE-4795 Project: Lucene - Core Issue Type: Improvement Components: modules/facet Reporter: Michael McCandless Assignee: Michael McCandless Attachments: LUCENE-4795.patch, LUCENE-4795.patch, LUCENE-4795.patch, LUCENE-4795.patch, pleaseBenchmarkMe.patch Recently (LUCENE-4765) we added multi-valued DocValues field (SortedSetDocValuesField), and this can be used for faceting in Solr (SOLR-4490). I think we should also add support in the facet module? It'd be an option with different tradeoffs. 
Eg, it wouldn't require the taxonomy index, since the main index handles label/ord resolving. There are at least two possible approaches:
* On every reopen, build the seg -> global ord map, and then on every collect, get the seg ord, map it to the global ord space, and increment counts. This adds cost during reopen in proportion to number of unique terms ...
* On every collect, increment counts based on the seg ords, and then do a merge in the end just like distributed faceting does.
The first approach is much easier so I built a quick prototype using that. The prototype does the counting, but it does NOT do the top K facets gathering in the end, and it doesn't know parent/child ord relationships, so there's tons more to do before this is real. I also was unsure how to properly integrate it since the existing classes seem to expect that you use a taxonomy index to resolve ords. I ran a quick performance test. base = trunk except I disabled the compute top-K in FacetsAccumulator to make the comparison fair; comp = using the prototype collector in the patch:
{noformat}
Task                QPS base   StdDev   QPS comp   StdDev   Pct diff
OrHighLow              18.79   (2.5%)      14.36   (3.3%)   -23.6% ( -28% - -18%)
HighTerm               21.58   (2.4%)      16.53   (3.7%)   -23.4% ( -28% - -17%)
OrHighMed              18.20   (2.5%)      13.99   (3.3%)   -23.2% ( -28% - -17%)
Prefix3                14.37   (1.5%)      11.62   (3.5%)   -19.1% ( -23% - -14%)
LowTerm               130.80   (1.6%)     106.95   (2.4%)   -18.2% ( -21% - -14%)
OrHighHigh              9.60   (2.6%)       7.88   (3.5%)   -17.9% ( -23% - -12%)
AndHighHigh            24.61   (0.7%)      20.74   (1.9%)   -15.7% ( -18% - -13%)
Fuzzy1                 49.40   (2.5%)      43.48   (1.9%)   -12.0% ( -15% -  -7%)
MedSloppyPhrase        27.06   (1.6%)      23.95   (2.3%)   -11.5% ( -15% -  -7%)
MedTerm                51.43   (2.0%)      46.21   (2.7%)   -10.2% ( -14% -  -5%)
IntNRQ                  4.02   (1.6%)       3.63   (4.0%)    -9.7% ( -15% -  -4%)
Wildcard               29.14   (1.5%)      26.46   (2.5%)    -9.2% ( -13% -  -5%)
HighSloppyPhrase        0.92   (4.5%)       0.87   (5.8%)    -5.4% ( -15% -   5%)
MedSpanNear            29.51   (2.5%)      27.94   (2.2%)    -5.3% (  -9% -   0%)
HighSpanNear            3.55   (2.4%)       3.38   (2.0%)    -4.9% (  -9% -   0%)
AndHighMed            108.34   (0.9%)     104.55   (1.1%)
[jira] [Created] (LUCENE-4825) PostingsHighlighter support for positional queries
Luca Cavanna created LUCENE-4825: Summary: PostingsHighlighter support for positional queries Key: LUCENE-4825 URL: https://issues.apache.org/jira/browse/LUCENE-4825 Project: Lucene - Core Issue Type: Improvement Components: modules/highlighter Affects Versions: 4.2 Reporter: Luca Cavanna I've been playing around with the brand new PostingsHighlighter. I'm really happy with the result in terms of quality of the snippets and performance. On the other hand, I noticed it doesn't support positional queries. If you make a span query, for example, all the single terms will be highlighted, even though they haven't contributed to the match. That reminds me of the difference between the QueryTermScorer and the QueryScorer (using the standard Highlighter). I've been trying to adapt what the QueryScorer does, especially the extraction of the query terms together with their positions (what WeightedSpanTermExtractor does). Next step would be to take that information into account within the formatter and highlight only the terms that actually contributed to the match. I'm not quite ready yet with a patch to contribute this back, but I certainly intend to do so. That's why I opened the issue and in the meantime I would like to hear what you guys think about it and discuss how best we can fix it. I think it would be a big improvement for this new highlighter, which is already great! -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Created] (LUCENE-4826) PostingsHighlighter doesn't keep the top N best scoring passages
Michael McCandless created LUCENE-4826: -- Summary: PostingsHighlighter doesn't keep the top N best scoring passages Key: LUCENE-4826 URL: https://issues.apache.org/jira/browse/LUCENE-4826 Project: Lucene - Core Issue Type: Bug Components: modules/highlighter Reporter: Michael McCandless Fix For: 5.0, 4.3 Attachments: LUCENE-4826.patch The comparator we pass to the PQ is just backwards ... -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
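For readers unfamiliar with why a "backwards" comparator breaks top-N selection, here is a minimal sketch of the general technique. It is not the LUCENE-4826 patch: it uses java.util.PriorityQueue rather than Lucene's own queue, and the ScoredPassage class and topN method are hypothetical names invented for illustration. The point is that the queue must order passages worst-first, so that the head is the entry to evict when a better candidate arrives. With the ordering reversed, the head would be the best passage and the best ones would be thrown away.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;
import java.util.PriorityQueue;

final class TopPassages {
    /** Hypothetical stand-in for a scored passage; not Lucene's Passage class. */
    static final class ScoredPassage {
        final String text;
        final float score;

        ScoredPassage(String text, float score) {
            this.text = text;
            this.score = score;
        }
    }

    /** Returns the n best-scoring passages, best first. */
    static List<ScoredPassage> topN(List<ScoredPassage> candidates, int n) {
        if (n <= 0) {
            return new ArrayList<>();
        }
        // Worst-first ordering: the head of the queue is the lowest-scoring
        // passage kept so far, i.e. the one to evict when a better one shows up.
        Comparator<ScoredPassage> worstFirst = (a, b) -> Float.compare(a.score, b.score);
        PriorityQueue<ScoredPassage> pq = new PriorityQueue<>(n, worstFirst);
        for (ScoredPassage p : candidates) {
            if (pq.size() < n) {
                pq.add(p);
            } else if (p.score > pq.peek().score) {
                pq.poll(); // drop the current worst
                pq.add(p);
            }
        }
        List<ScoredPassage> best = new ArrayList<>(pq);
        best.sort(worstFirst.reversed()); // present best first
        return best;
    }
}
```

Feeding it three candidates in the order great, poor, good with n = 2 exercises the eviction path: the poor passage has to be displaced once the queue is full.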
[jira] [Updated] (LUCENE-4826) PostingsHighlighter doesn't keep the top N best scoring passages
[ https://issues.apache.org/jira/browse/LUCENE-4826?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Michael McCandless updated LUCENE-4826: --- Attachment: LUCENE-4826.patch PostingsHighlighter doesn't keep the top N best scoring passages Key: LUCENE-4826 URL: https://issues.apache.org/jira/browse/LUCENE-4826 Project: Lucene - Core Issue Type: Bug Components: modules/highlighter Reporter: Michael McCandless Fix For: 5.0, 4.3 Attachments: LUCENE-4826.patch The comparator we pass to the PQ is just backwards ... -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Commented] (LUCENE-4825) PostingsHighlighter support for positional queries
[ https://issues.apache.org/jira/browse/LUCENE-4825?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13600404#comment-13600404 ] Robert Muir commented on LUCENE-4825: - I think it supports positional queries, just in a different way. I don't really like the way the standard Highlighter does this myself. I would prefer that we avoid the slow stuff those things do in this highlighter (because we already have other ones that do that). This one instead puts more effort into trying to summarize the document with respect to the query terms (which is faster, and for some cases, a better use of CPU time). I think a good improvement would be to let the proximity of terms within passages influence the scoring. It's not necessary to actually gather anything about the query to do this, and it wouldn't be confusing and would still support all queries that support extractTerms(). On the other hand we can always create variants of this highlighter that do as you suggest, so that the user is left with more choices. But I would just prefer that we don't try to force PostingsHighlighter to work like the other highlighters, for the reasons I mentioned. PostingsHighlighter support for positional queries -- Key: LUCENE-4825 URL: https://issues.apache.org/jira/browse/LUCENE-4825 Project: Lucene - Core Issue Type: Improvement Components: modules/highlighter Affects Versions: 4.2 Reporter: Luca Cavanna I've been playing around with the brand new PostingsHighlighter. I'm really happy with the result in terms of quality of the snippets and performance. On the other hand, I noticed it doesn't support positional queries. If you make a span query, for example, all the single terms will be highlighted, even though they haven't contributed to the match. That reminds me of the difference between the QueryTermScorer and the QueryScorer (using the standard Highlighter). 
I've been trying to adapt what the QueryScorer does, especially the extraction of the query terms together with their positions (what WeightedSpanTermExtractor does). Next step would be to take that information into account within the formatter and highlight only the terms that actually contributed to the match. I'm not quite ready yet with a patch to contribute this back, but I certainly intend to do so. That's why I opened the issue and in the meantime I would like to hear what you guys think about it and discuss how best we can fix it. I think it would be a big improvement for this new highlighter, which is already great! -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Updated] (SOLR-4465) Configurable Collectors
[ https://issues.apache.org/jira/browse/SOLR-4465?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Joel Bernstein updated SOLR-4465: - Attachment: SOLR-4465.patch Added support for delegating collectors. This design allows for a topdocs collector to be wrapped by delegating collectors. The topdocs collector collects the doclist and docset. The delegating collectors are designed to collect aggregate data of some kind. The delegating collectors have access to the ResponseBuilder and through that can add Maps directly to the SolrQueryResponse. Both the topdocs collector and the delegating collectors take part in the merge of distributed results from shards. This paves the way for pluggable distributed analytics to be included with searches results. TODO: I believe Maps that are placed in the SolrQueryResponse are automatically output but some work needs to be done get them read in the solrj QueryResponse class so they can be merged. Configurable Collectors --- Key: SOLR-4465 URL: https://issues.apache.org/jira/browse/SOLR-4465 Project: Solr Issue Type: New Feature Components: search Affects Versions: 4.1 Reporter: Joel Bernstein Fix For: 4.3 Attachments: SOLR-4465.patch, SOLR-4465.patch, SOLR-4465.patch, SOLR-4465.patch, SOLR-4465.patch, SOLR-4465.patch, SOLR-4465.patch, SOLR-4465.patch, SOLR-4465.patch, SOLR-4465.patch, SOLR-4465.patch, SOLR-4465.patch This issue is to add configurable custom collectors to Solr. This expands the design and work done in issue SOLR-1680 to include: 1) CollectorFactory configuration in solconfig.xml 2) Http parameters to allow clients to dynamically select a CollectorFactory and construct a custom Collector. 3) Make aspects of QueryComponent pluggable so that the output from distributed search can conform with custom collectors at the shard level. -- This message is automatically generated by JIRA. 
[jira] [Commented] (LUCENE-4826) PostingsHighlighter doesn't keep the top N best scoring passages
[ https://issues.apache.org/jira/browse/LUCENE-4826?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13600411#comment-13600411 ] Robert Muir commented on LUCENE-4826: - +1! Here is a smaller test: in order to trick it to fail, you must have something like "Great Sentence. Crappy Sentence. Good Sentence." otherwise they never make it into the PQ to demonstrate the bug...
{code}
public void testPassageRanking() throws Exception {
  Directory dir = newDirectory();
  IndexWriterConfig iwc = newIndexWriterConfig(TEST_VERSION_CURRENT, new MockAnalyzer(random(), MockTokenizer.SIMPLE, true));
  iwc.setMergePolicy(newLogMergePolicy());
  RandomIndexWriter iw = new RandomIndexWriter(random(), dir, iwc);

  FieldType offsetsType = new FieldType(TextField.TYPE_STORED);
  offsetsType.setIndexOptions(IndexOptions.DOCS_AND_FREQS_AND_POSITIONS_AND_OFFSETS);
  Field body = new Field("body", "", offsetsType);
  Document doc = new Document();
  doc.add(body);

  body.setStringValue("This is a test. Just highlighting from postings. This is also a much sillier test. Feel free to test test test test test test test.");
  iw.addDocument(doc);

  IndexReader ir = iw.getReader();
  iw.close();

  IndexSearcher searcher = newSearcher(ir);
  PostingsHighlighter highlighter = new PostingsHighlighter();
  Query query = new TermQuery(new Term("body", "test"));
  TopDocs topDocs = searcher.search(query, null, 10, Sort.INDEXORDER);
  assertEquals(1, topDocs.totalHits);
  String snippets[] = highlighter.highlight("body", query, searcher, topDocs, 2);
  assertEquals(1, snippets.length);
  assertEquals("This is a <b>test</b>. ... Feel free to <b>test</b> <b>test</b> <b>test</b> <b>test</b> <b>test</b> <b>test</b> <b>test</b>.", snippets[0]);

  ir.close();
  dir.close();
}
{code}
PostingsHighlighter doesn't keep the top N best scoring passages Key: LUCENE-4826 URL: https://issues.apache.org/jira/browse/LUCENE-4826 Project: Lucene - Core Issue Type: Bug Components: modules/highlighter Reporter: Michael McCandless Fix For: 5.0, 4.3 Attachments: LUCENE-4826.patch The comparator we pass to the PQ is just backwards ... -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Created] (SOLR-4565) Extend NorwegianMinimalStemFilter to handle nynorsk
Jan Høydahl created SOLR-4565: - Summary: Extend NorwegianMinimalStemFilter to handle nynorsk Key: SOLR-4565 URL: https://issues.apache.org/jira/browse/SOLR-4565 Project: Solr Issue Type: Improvement Components: Schema and Analysis Reporter: Jan Høydahl Norway has two official languages, both called Norwegian, namely Bokmål (nb_NO) and Nynorsk (nn_NO). The NorwegianMinimalStemFilter and NorwegianLightStemFilter today only work with the larger of the two, namely Bokmål. I propose to incorporate nn support through a new variant config option:
* variant=nb or not configured - Bokmål as today
* variant=nn - Nynorsk only
* variant=no - Remove stems for both nb and nn
-- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
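The three proposed values of the variant option can be sketched as a small enum. This is only an illustration of the proposal's semantics, assuming that a missing value falls back to nb as the issue says. The NorwegianVariant type, its fields, and the parse method are hypothetical names and are not taken from any Solr or Lucene patch.

```java
import java.util.Locale;

/** Hypothetical model of the proposed "variant" option; not an actual Solr API. */
enum NorwegianVariant {
    NB(true, false),  // Bokmål only, the behaviour "as today"
    NN(false, true),  // Nynorsk only
    NO(true, true);   // remove stems for both nb and nn

    final boolean bokmaal;
    final boolean nynorsk;

    NorwegianVariant(boolean bokmaal, boolean nynorsk) {
        this.bokmaal = bokmaal;
        this.nynorsk = nynorsk;
    }

    /** Maps the configured string to a variant; a missing value means nb. */
    static NorwegianVariant parse(String value) {
        if (value == null || value.trim().isEmpty()) {
            return NB;
        }
        // valueOf throws IllegalArgumentException for anything but nb, nn, no.
        return valueOf(value.trim().toUpperCase(Locale.ROOT));
    }
}
```

Rejecting unknown values with an exception, rather than silently falling back to nb, is a design choice of this sketch; the issue does not say how a misconfigured value should be handled.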
[jira] [Comment Edited] (SOLR-4465) Configurable Collectors
[ https://issues.apache.org/jira/browse/SOLR-4465?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13600410#comment-13600410 ] Joel Bernstein edited comment on SOLR-4465 at 3/12/13 8:18 PM: --- Added support for delegating collectors. This design allows for a topdocs collector to be wrapped by delegating collectors. The topdocs collector collects the doclist and docset. The delegating collectors are designed to collect aggregate data of some kind. Now there are two collect parameters: cl.topdocs=topdocs collector name cl.delegating=comma separated list of delegating collectors Both of these parameters refer to collectorFactories configured in the solrconfig.xml. Parameters are passed to the collectors by name. For example: cl.topdocs=defaultcl.delegating=sumcl.sum.groupby=field1cl.sum.column=field2 In this example the topdocs collector is the default. The delegating collector is the sum collector. The sum collector is being passed two parameters groupby and column, telling it to groupby field1 and sum field2. The delegating collectors have access to the ResponseBuilder and through that can add Maps directly to the SolrQueryResponse. Both the topdocs collector and the delegating collectors take part in the merge of distributed results from shards. This paves the way for pluggable distributed analytics to be included with searches results. TODO: I believe Maps that are placed in the SolrQueryResponse are automatically output but some work needs to be done get them read in the solrj QueryResponse class so they can be merged. A simple example delegating collector to test the entire flow needs to be created. was (Author: joel.bernstein): Added support for delegating collectors. This design allows for a topdocs collector to be wrapped by delegating collectors. The topdocs collector collects the doclist and docset. The delegating collectors are designed to collect aggregate data of some kind. 
The delegating collectors have access to the ResponseBuilder and through that can add Maps directly to the SolrQueryResponse. Both the topdocs collector and the delegating collectors take part in the merge of distributed results from shards. This paves the way for pluggable distributed analytics to be included with searches results. TODO: I believe Maps that are placed in the SolrQueryResponse are automatically output but some work needs to be done get them read in the solrj QueryResponse class so they can be merged. Configurable Collectors --- Key: SOLR-4465 URL: https://issues.apache.org/jira/browse/SOLR-4465 Project: Solr Issue Type: New Feature Components: search Affects Versions: 4.1 Reporter: Joel Bernstein Fix For: 4.3 Attachments: SOLR-4465.patch, SOLR-4465.patch, SOLR-4465.patch, SOLR-4465.patch, SOLR-4465.patch, SOLR-4465.patch, SOLR-4465.patch, SOLR-4465.patch, SOLR-4465.patch, SOLR-4465.patch, SOLR-4465.patch, SOLR-4465.patch This issue is to add configurable custom collectors to Solr. This expands the design and work done in issue SOLR-1680 to include: 1) CollectorFactory configuration in solconfig.xml 2) Http parameters to allow clients to dynamically select a CollectorFactory and construct a custom Collector. 3) Make aspects of QueryComponent pluggable so that the output from distributed search can conform with custom collectors at the shard level. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Comment Edited] (SOLR-4465) Configurable Collectors
[ https://issues.apache.org/jira/browse/SOLR-4465?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13600410#comment-13600410 ] Joel Bernstein edited comment on SOLR-4465 at 3/12/13 8:19 PM: --- Added support for delegating collectors. This design allows for a topdocs collector to be wrapped by delegating collectors. The topdocs collector collects the doclist and docset. The delegating collectors are designed to collect aggregate data of some kind. Now there are two collector parameters: cl.topdocs=topdocs collector name cl.delegating=comma separated list of delegating collectors Both of these parameters refer to collectorFactories configured in the solrconfig.xml. Parameters are passed to the collectors by name. For example: cl.topdocs=defaultcl.delegating=sumcl.sum.groupby=field1cl.sum.column=field2 In this example the topdocs collector is the default. The delegating collector is the sum collector. The sum collector is being passed two parameters groupby and column, telling it to groupby field1 and sum field2. The delegating collectors have access to the ResponseBuilder and through that can add Maps directly to the SolrQueryResponse. Both the topdocs collector and the delegating collectors take part in the merge of distributed results from shards. This paves the way for pluggable distributed analytics to be included with searches results. TODO: I believe Maps that are placed in the SolrQueryResponse are automatically output but some work needs to be done get them read in the solrj QueryResponse class so they can be merged. A simple example delegating collector to test the entire flow needs to be created. was (Author: joel.bernstein): Added support for delegating collectors. This design allows for a topdocs collector to be wrapped by delegating collectors. The topdocs collector collects the doclist and docset. The delegating collectors are designed to collect aggregate data of some kind. 
Now there are two collect parameters: cl.topdocs=topdocs collector name cl.delegating=comma separated list of delegating collectors Both of these parameters refer to collectorFactories configured in the solrconfig.xml. Parameters are passed to the collectors by name. For example: cl.topdocs=defaultcl.delegating=sumcl.sum.groupby=field1cl.sum.column=field2 In this example the topdocs collector is the default. The delegating collector is the sum collector. The sum collector is being passed two parameters groupby and column, telling it to groupby field1 and sum field2. The delegating collectors have access to the ResponseBuilder and through that can add Maps directly to the SolrQueryResponse. Both the topdocs collector and the delegating collectors take part in the merge of distributed results from shards. This paves the way for pluggable distributed analytics to be included with searches results. TODO: I believe Maps that are placed in the SolrQueryResponse are automatically output but some work needs to be done get them read in the solrj QueryResponse class so they can be merged. A simple example delegating collector to test the entire flow needs to be created. Configurable Collectors --- Key: SOLR-4465 URL: https://issues.apache.org/jira/browse/SOLR-4465 Project: Solr Issue Type: New Feature Components: search Affects Versions: 4.1 Reporter: Joel Bernstein Fix For: 4.3 Attachments: SOLR-4465.patch, SOLR-4465.patch, SOLR-4465.patch, SOLR-4465.patch, SOLR-4465.patch, SOLR-4465.patch, SOLR-4465.patch, SOLR-4465.patch, SOLR-4465.patch, SOLR-4465.patch, SOLR-4465.patch, SOLR-4465.patch This issue is to add configurable custom collectors to Solr. This expands the design and work done in issue SOLR-1680 to include: 1) CollectorFactory configuration in solconfig.xml 2) Http parameters to allow clients to dynamically select a CollectorFactory and construct a custom Collector. 
3) Make aspects of QueryComponent pluggable so that the output from distributed search can conform with custom collectors at the shard level. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Resolved] (LUCENE-4826) PostingsHighlighter doesn't keep the top N best scoring passages
[ https://issues.apache.org/jira/browse/LUCENE-4826?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Michael McCandless resolved LUCENE-4826. Resolution: Fixed PostingsHighlighter doesn't keep the top N best scoring passages Key: LUCENE-4826 URL: https://issues.apache.org/jira/browse/LUCENE-4826 Project: Lucene - Core Issue Type: Bug Components: modules/highlighter Reporter: Michael McCandless Fix For: 5.0, 4.3 Attachments: LUCENE-4826.patch The comparator we pass to the PQ is just backwards ... -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Comment Edited] (SOLR-4465) Configurable Collectors
[ https://issues.apache.org/jira/browse/SOLR-4465?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13600410#comment-13600410 ] Joel Bernstein edited comment on SOLR-4465 at 3/12/13 8:27 PM: --- Added support for delegating collectors. This design specifies a single topdocs collector and any number of delegating collectors. The topdocs collector collects the doclist and docset. The delegating collectors are designed to collect aggregate data of some kind. Accordingly there are two collector parameters:
cl.topdocs=topdocs collector name
cl.delegating=comma separated list of delegating collectors
Both of these parameters refer to collectorFactories configured in the solrconfig.xml. Parameters are passed to the collectors by name. For example:
cl.topdocs=default&cl.delegating=sum&cl.sum.groupby=field1&cl.sum.column=field2
In this example the topdocs collector is the default collector. The delegating collector is the sum collector. Both of these refer to named collectorFactories in solrconfig.xml. The sum collector is being passed two parameters, groupby and column, telling it to group by field1 and sum field2. The delegating collectors have access to the ResponseBuilder and through that can add Maps directly to the SolrQueryResponse. Both the topdocs collector and the delegating collectors take part in the merge of distributed results from shards. This paves the way for pluggable distributed analytics to be included with search results. TODO: I believe Maps that are placed in the SolrQueryResponse are automatically output, but some work needs to be done to get them read in the solrj QueryResponse class so they can be merged. A simple example delegating collector to test the entire flow needs to be created. was (Author: joel.bernstein): Added support for delegating collectors. This design allows for a topdocs collector to be wrapped by delegating collectors. The topdocs collector collects the doclist and docset. 
The delegating collectors are designed to collect aggregate data of some kind. Now there are two collector parameters: cl.topdocs=topdocs collector name cl.delegating=comma separated list of delegating collectors Both of these parameters refer to collectorFactories configured in the solrconfig.xml. Parameters are passed to the collectors by name. For example: cl.topdocs=defaultcl.delegating=sumcl.sum.groupby=field1cl.sum.column=field2 In this example the topdocs collector is the default collector. The delegating collector is the sum collector. Both of these refer to named collectorFactories in solrconfig.xml. The sum collector is being passed two parameters groupby and column, telling it to groupby field1 and sum field2. The delegating collectors have access to the ResponseBuilder and through that can add Maps directly to the SolrQueryResponse. Both the topdocs collector and the delegating collectors take part in the merge of distributed results from shards. This paves the way for pluggable distributed analytics to be included with searches results. TODO: I believe Maps that are placed in the SolrQueryResponse are automatically output but some work needs to be done get them read in the solrj QueryResponse class so they can be merged. A simple example delegating collector to test the entire flow needs to be created. Configurable Collectors --- Key: SOLR-4465 URL: https://issues.apache.org/jira/browse/SOLR-4465 Project: Solr Issue Type: New Feature Components: search Affects Versions: 4.1 Reporter: Joel Bernstein Fix For: 4.3 Attachments: SOLR-4465.patch, SOLR-4465.patch, SOLR-4465.patch, SOLR-4465.patch, SOLR-4465.patch, SOLR-4465.patch, SOLR-4465.patch, SOLR-4465.patch, SOLR-4465.patch, SOLR-4465.patch, SOLR-4465.patch, SOLR-4465.patch This issue is to add configurable custom collectors to Solr. 
This expands the design and work done in issue SOLR-1680 to include: 1) CollectorFactory configuration in solconfig.xml 2) Http parameters to allow clients to dynamically select a CollectorFactory and construct a custom Collector. 3) Make aspects of QueryComponent pluggable so that the output from distributed search can conform with custom collectors at the shard level. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
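The cl.* parameter naming described in the SOLR-4465 comment above can be illustrated with a small parser. This is a sketch under stated assumptions, not code from the patch: it assumes the example parameters are joined with & as in an ordinary query string (the separators appear to have been lost in this archive), it does no URL decoding, and the CollectorParams class with its fields is a hypothetical name. It shows how cl.topdocs, cl.delegating and the per-collector cl.<name>.<param> entries could be separated.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical holder for the cl.* request parameters; not part of Solr. */
final class CollectorParams {
    final String topDocs;                                  // value of cl.topdocs
    final List<String> delegating;                         // names from cl.delegating
    final Map<String, Map<String, String>> perCollector;   // cl.<name>.<param>=value

    private CollectorParams(String topDocs, List<String> delegating,
                            Map<String, Map<String, String>> perCollector) {
        this.topDocs = topDocs;
        this.delegating = delegating;
        this.perCollector = perCollector;
    }

    /** Parses a raw query string; values are NOT URL-decoded in this sketch. */
    static CollectorParams parse(String queryString) {
        String topDocs = "default"; // assumption: fall back to the default collector
        List<String> delegating = new ArrayList<>();
        Map<String, Map<String, String>> perCollector = new LinkedHashMap<>();
        for (String pair : queryString.split("&")) {
            int eq = pair.indexOf('=');
            if (eq < 0 || !pair.startsWith("cl.")) {
                continue; // not a collector parameter
            }
            String key = pair.substring(3, eq);
            String value = pair.substring(eq + 1);
            if (key.equals("topdocs")) {
                topDocs = value;
            } else if (key.equals("delegating")) {
                for (String name : value.split(",")) {
                    if (!name.isEmpty()) {
                        delegating.add(name);
                    }
                }
            } else {
                int dot = key.indexOf('.');
                if (dot <= 0) {
                    continue; // no collector name before the parameter name
                }
                perCollector
                    .computeIfAbsent(key.substring(0, dot), k -> new LinkedHashMap<>())
                    .put(key.substring(dot + 1), value);
            }
        }
        return new CollectorParams(topDocs, delegating, perCollector);
    }
}
```

Parsing the example from the comment yields the default topdocs collector, one delegating collector named sum, and the two parameters groupby=field1 and column=field2 under sum.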
[jira] [Created] (LUCENE-4827) don't hardcode PostingsHighlighter scoring parameters
Robert Muir created LUCENE-4827: --- Summary: don't hardcode PostingsHighlighter scoring parameters Key: LUCENE-4827 URL: https://issues.apache.org/jira/browse/LUCENE-4827 Project: Lucene - Core Issue Type: Test Components: modules/highlighter Reporter: Robert Muir Attachments: LUCENE-4827.patch Tuning these parameters can be very useful if you want to tweak how sentences are ranked (e.g. you have a strangeish corpus like wikipedia). -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Updated] (LUCENE-4827) don't hardcode PostingsHighlighter scoring parameters
[ https://issues.apache.org/jira/browse/LUCENE-4827?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Robert Muir updated LUCENE-4827: Attachment: LUCENE-4827.patch patch without tests. Would be good to add a simple test that e.g. sets b=0 to disable passage length normalization or something. don't hardcode PostingsHighlighter scoring parameters - Key: LUCENE-4827 URL: https://issues.apache.org/jira/browse/LUCENE-4827 Project: Lucene - Core Issue Type: Test Components: modules/highlighter Reporter: Robert Muir Attachments: LUCENE-4827.patch Tuning these parameters can be very useful if you want to tweak how sentences are ranked (e.g. you have a strangeish corpus like wikipedia). -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
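To make the "set b=0 to disable passage length normalization" remark concrete, here is a minimal sketch of a BM25-style term-frequency normalization with tunable parameters. It is an assumption that the highlighter's scoring has this general shape; the PassageScoring class, the tfNorm method, and the parameter values used in the usage note are illustrative and are not taken from the LUCENE-4827 patch. The parameters k1, b and pivot follow the usual BM25 naming, with pivot playing the role of an average passage length.

```java
/** Hypothetical BM25-style passage scoring with configurable parameters. */
final class PassageScoring {
    final float k1;     // controls how quickly term frequency saturates
    final float b;      // strength of passage length normalization, 0 = off
    final float pivot;  // reference passage length used for normalization

    PassageScoring(float k1, float b, float pivot) {
        this.k1 = k1;
        this.b = b;
        this.pivot = pivot;
    }

    /** Saturated, length-normalized term frequency, in the range [0, 1). */
    float tfNorm(int freq, int passageLen) {
        // With b = 0 the length term vanishes and norm is simply k1.
        float norm = k1 * ((1 - b) + b * (passageLen / pivot));
        return freq / (freq + norm);
    }
}
```

A test along the lines suggested above could construct one instance with b = 0 and check that tfNorm(3, 10) equals tfNorm(3, 1000), then construct another with, say, b = 0.75 and check that the shorter passage scores higher for the same frequency.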
[jira] [Commented] (LUCENE-4827) don't hardcode PostingsHighlighter scoring parameters
[ https://issues.apache.org/jira/browse/LUCENE-4827?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13600450#comment-13600450 ] Michael McCandless commented on LUCENE-4827: +1 don't hardcode PostingsHighlighter scoring parameters - Key: LUCENE-4827 URL: https://issues.apache.org/jira/browse/LUCENE-4827 Project: Lucene - Core Issue Type: Test Components: modules/highlighter Reporter: Robert Muir Attachments: LUCENE-4827.patch Tuning these parameters can be very useful if you want to tweak how sentences are ranked (e.g. you have a strangeish corpus like wikipedia). -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Commented] (LUCENE-4825) PostingsHighlighter support for positional queries
[ https://issues.apache.org/jira/browse/LUCENE-4825?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13600459#comment-13600459 ]

Robert Muir commented on LUCENE-4825:

Also, I think the most efficient way to add this (though it's all in a branch, I think?) would be to add an IntervalHighlighter. This would work with all queries, I think, without the current complexity of rewriting things and so on.

PostingsHighlighter support for positional queries
Key: LUCENE-4825
URL: https://issues.apache.org/jira/browse/LUCENE-4825
Project: Lucene - Core
Issue Type: Improvement
Components: modules/highlighter
Affects Versions: 4.2
Reporter: Luca Cavanna

I've been playing around with the brand new PostingsHighlighter. I'm really happy with the result in terms of quality of the snippets and performance. On the other hand, I noticed it doesn't support positional queries. If you make a span query, for example, all the single terms will be highlighted, even though they haven't contributed to the match. That reminds me of the difference between the QueryTermScorer and the QueryScorer (using the standard Highlighter).

I've been trying to adapt what the QueryScorer does, especially the extraction of the query terms together with their positions (what WeightedSpanTermExtractor does). Next step would be to take that information into account within the formatter and highlight only the terms that actually contributed to the match. I'm not quite ready yet with a patch to contribute this back, but I certainly intend to do so. That's why I opened the issue and in the meantime I would like to hear what you guys think about it and discuss how best we can fix it. I think it would be a big improvement for this new highlighter, which is already great!
[jira] [Resolved] (SOLR-4557) Fix broken CoreContainerTest.testReload
[ https://issues.apache.org/jira/browse/SOLR-4557?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Erick Erickson resolved SOLR-4557.
Resolution: Fixed
Fix Version/s: 5.0, 4.3

Tightened up the core sequencing operations: load/unload/reload are now sequential for any individual core. Operations still happen in parallel for different cores, of course.

4x: r1455710
trunk: r1455606

Fix broken CoreContainerTest.testReload
Key: SOLR-4557
URL: https://issues.apache.org/jira/browse/SOLR-4557
Project: Solr
Issue Type: Test
Affects Versions: 4.2, 5.0
Reporter: Erick Erickson
Assignee: Erick Erickson
Fix For: 4.3, 5.0
Attachments: SOLR-4557.patch, SOLR-4557.patch, SOLR-4557_posthshutdown_stack.txt

I was chasing down a test failure, and it turns out that CoreContainerTest.testReload has only succeeded by chance. The test fires up 4 threads that go out and reload the same core all at once. This caused me to look at properly synchronizing reloading cores pursuant to SOLR-4196, on the theory that we should serialize loading, unloading and reloading cores; we shouldn't be doing _any_ of those operations from different threads on the same core at the same time.

It turns out that if you fire up multiple reloads at once without serializing them, an error is thrown instead of proper reloading occurring, and that's the only reason the test doesn't hang. The stack trace of the exception is below for reference, but it doesn't occur with the code I'll attach to this patch:

[junit4:junit4] 2 at com.sun.jmx.interceptor.DefaultMBeanServerInterceptor.getMBean(DefaultMBeanServerInterceptor.java:1095)
[junit4:junit4] 2 at com.sun.jmx.interceptor.DefaultMBeanServerInterceptor.exclusiveUnregisterMBean(DefaultMBeanServerInterceptor.java:427)
[junit4:junit4] 2 at com.sun.jmx.interceptor.DefaultMBeanServerInterceptor.unregisterMBean(DefaultMBeanServerInterceptor.java:415)
[junit4:junit4] 2 at com.sun.jmx.mbeanserver.JmxMBeanServer.unregisterMBean(JmxMBeanServer.java:536)
[junit4:junit4] 2 at org.apache.solr.core.JmxMonitoredMap.put(JmxMonitoredMap.java:138)
[junit4:junit4] 2 at org.apache.solr.core.JmxMonitoredMap.put(JmxMonitoredMap.java:51)
[junit4:junit4] 2 at org.apache.solr.core.RequestHandlers.register(RequestHandlers.java:106)
[junit4:junit4] 2 at org.apache.solr.core.RequestHandlers.initHandlersFromConfig(RequestHandlers.java:157)
[junit4:junit4] 2 at org.apache.solr.core.SolrCore.init(SolrCore.java:757)
[junit4:junit4] 2 at org.apache.solr.core.SolrCore.reload(SolrCore.java:408)
[junit4:junit4] 2 at org.apache.solr.core.CoreContainer.reload(CoreContainer.java:1076)
[junit4:junit4] 2 at org.apache.solr.core.TestCoreContainer$1TestThread.run(TestCoreContainer.java:90)
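The rule described in SOLR-4557 (operations on one core run one at a time, operations on different cores run in parallel) can be sketched with one lock per core name. This is only an illustration of the idea, not the committed change: `PerCoreSerializer`, `runExclusive` and `maxConcurrentOps` are hypothetical names, and the real CoreContainer code may be organized quite differently.

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical sketch: serialize operations per core name, allow parallelism across cores.
final class PerCoreSerializer {
    private final ConcurrentMap<String, Object> locks = new ConcurrentHashMap<>();

    /** Runs op while holding the lock that belongs to coreName only. */
    void runExclusive(String coreName, Runnable op) {
        Object lock = locks.computeIfAbsent(coreName, k -> new Object());
        synchronized (lock) {
            op.run();
        }
    }

    /** Mimics the test scenario: several threads "reload" the same core at once. */
    static int maxConcurrentOps(String coreName, int threadCount) {
        PerCoreSerializer serializer = new PerCoreSerializer();
        AtomicInteger active = new AtomicInteger();
        AtomicInteger maxSeen = new AtomicInteger();
        Thread[] threads = new Thread[threadCount];
        for (int i = 0; i < threadCount; i++) {
            threads[i] = new Thread(() -> serializer.runExclusive(coreName, () -> {
                // Record how many operations are inside the critical section right now.
                maxSeen.accumulateAndGet(active.incrementAndGet(), Math::max);
                try {
                    Thread.sleep(10); // stand-in for the actual reload work
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                }
                active.decrementAndGet();
            }));
            threads[i].start();
        }
        for (Thread t : threads) {
            try {
                t.join();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        }
        return maxSeen.get();
    }

    public static void main(String[] args) {
        System.out.println("max concurrent reloads of one core: " + maxConcurrentOps("core1", 4));
    }
}
```

With four threads targeting the same core name, at most one of them is ever inside the critical section; calls that use different core names take different locks and do not block each other.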
[jira] [Comment Edited] (SOLR-4465) Configurable Collectors
[ https://issues.apache.org/jira/browse/SOLR-4465?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13600410#comment-13600410 ]

Joel Bernstein edited comment on SOLR-4465 at 3/12/13 9:16 PM:

Added support for delegating collectors. This design specifies a single topdocs collector and any number of delegating collectors. The topdocs collector collects the doclist and docset. The delegating collectors are designed to collect aggregate data of some kind.

Accordingly there are two collector parameters:

cl.topdocs=<topdocs collector name>
cl.delegating=<comma separated list of delegating collectors>

Both of these parameters refer to collectorFactories configured in the solrconfig.xml. Parameters are passed to the collectors by name. For example:

cl.topdocs=default&cl.delegating=sum&cl.sum.groupby=field1&cl.sum.column=field2

In this example the topdocs collector is the default collector. The delegating collector is the sum collector. Both of these refer to named collectorFactories in solrconfig.xml. The sum collector is being passed two parameters, groupby and column, telling it to group by field1 and sum field2.

The delegating collectors have access to the ResponseBuilder and through that can add Maps directly to the SolrQueryResponse. Both the topdocs collector and the delegating collectors take part in the merge of distributed results from shards. This paves the way for pluggable distributed analytics to be included with search results.

TODO: I believe Maps that are placed in the SolrQueryResponse are automatically output, but some work needs to be done to get them read in the solrj QueryResponse class so they can be merged. A simple example delegating collector to test the entire flow needs to be created.

Configurable Collectors
Key: SOLR-4465
URL: https://issues.apache.org/jira/browse/SOLR-4465
Project: Solr
Issue Type: New Feature
Components: search
Affects Versions: 4.1
Reporter: Joel Bernstein
Fix For: 4.3
Attachments: SOLR-4465.patch, SOLR-4465.patch, SOLR-4465.patch, SOLR-4465.patch, SOLR-4465.patch, SOLR-4465.patch, SOLR-4465.patch, SOLR-4465.patch, SOLR-4465.patch, SOLR-4465.patch, SOLR-4465.patch, SOLR-4465.patch

This issue is to add configurable custom collectors to Solr. This expands the design and work done in issue SOLR-1680 to include:

1) CollectorFactory configuration in solrconfig.xml
2) HTTP parameters to allow clients to dynamically select a CollectorFactory and construct a custom Collector.
3) Make aspects of QueryComponent pluggable so that the output from distributed search can conform with custom collectors at the shard level.
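The "parameters are passed to the collectors by name" convention from the comment above can be illustrated with a small parser that picks out the parameters addressed to one collector. This is a hedged sketch, not code from the SOLR-4465 patches: `CollectorParamsSketch`, `parse` and `paramsFor` are hypothetical helpers, the `&` separators in the sample query are assumed, and URL decoding is left out for brevity.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical sketch of routing "cl.<collector>.<param>" request parameters by name.
final class CollectorParamsSketch {
    /** Splits "a=b&c=d" into an ordered map; no URL decoding, for brevity. */
    static Map<String, String> parse(String query) {
        Map<String, String> params = new LinkedHashMap<>();
        for (String pair : query.split("&")) {
            int eq = pair.indexOf('=');
            if (eq > 0) {
                params.put(pair.substring(0, eq), pair.substring(eq + 1));
            }
        }
        return params;
    }

    /** Returns the parameters addressed to one collector, with the "cl.<name>." prefix removed. */
    static Map<String, String> paramsFor(String collectorName, Map<String, String> all) {
        String prefix = "cl." + collectorName + ".";
        Map<String, String> out = new LinkedHashMap<>();
        for (Map.Entry<String, String> e : all.entrySet()) {
            if (e.getKey().startsWith(prefix)) {
                out.put(e.getKey().substring(prefix.length()), e.getValue());
            }
        }
        return out;
    }

    public static void main(String[] args) {
        Map<String, String> all =
            parse("cl.topdocs=default&cl.delegating=sum&cl.sum.groupby=field1&cl.sum.column=field2");
        System.out.println("topdocs collector:    " + all.get("cl.topdocs"));
        System.out.println("delegating collectors: " + all.get("cl.delegating"));
        System.out.println("params for 'sum':      " + paramsFor("sum", all));
    }
}
```

Prefixing each parameter with the collector's configured name keeps several delegating collectors from colliding on common parameter names such as groupby.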
[jira] [Commented] (SOLR-3755) shard splitting
[ https://issues.apache.org/jira/browse/SOLR-3755?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13600489#comment-13600489 ]

Mark Miller commented on SOLR-3755:

I think "collection" might be a better param name than "name" for the shard split API.

shard splitting
Key: SOLR-3755
URL: https://issues.apache.org/jira/browse/SOLR-3755
Project: Solr
Issue Type: New Feature
Components: SolrCloud
Reporter: Yonik Seeley
Attachments: SOLR-3755-combined.patch, SOLR-3755-combinedWithReplication.patch, SOLR-3755-CoreAdmin.patch, SOLR-3755.patch, SOLR-3755.patch, SOLR-3755-testSplitter.patch, SOLR-3755-testSplitter.patch

We can currently easily add replicas to handle increases in query volume, but we should also add a way to add additional shards dynamically by splitting existing shards.
Javadoc bug for Lucene TokenFilter
The Lucene Javadoc for TokenFilter shows only a single Direct Known Subclass when in fact there are dozens of them. The Lucene Javadoc for LowerCaseFilter does in fact show TokenFilter as its direct parent class, even though the Javadoc for TokenFilter does not report LowerCaseFilter as a Direct Known Subclass.

Is there any good reason for this discrepancy, or is this simply a bug in either Lucene’s packaging or the javadoc generation?

4.0, 4.1, and 4.2 all behave consistently, but 3.6 reports a long list of the expected subclasses. I suspect it may have to do with the fact that the subclasses are off in a separate folder from the parent class.

See:
http://lucene.apache.org/core/4_1_0/core/org/apache/lucene/analysis/TokenFilter.html
http://lucene.apache.org/core/4_1_0/analyzers-common/org/apache/lucene/analysis/core/LowerCaseFilter.html
http://lucene.apache.org/core/3_6_0/api/all/org/apache/lucene/analysis/TokenFilter.html

Note: The Lucene Javadoc for TokenFilterFactory does in fact show dozens of Direct Known Subclasses, as expected.

-- Jack Krupansky