null:java.lang.ArrayIndexOutOfBoundsException with solr-trunk
Hi All, I am indexing aspx files into solr-trunk (using ManifoldCF), and I am getting the exception below in a pretty much random manner.

solr-spec : 5.0.0.2012.08.16.22.19.11
solr-impl : 5.0-SNAPSHOT exported - iorixxx - 2012-08-16 22:19:11
lucene-spec : 5.0-SNAPSHOT
lucene-impl : 5.0-SNAPSHOT exported - iorixxx - 2012-08-16 22:16:00

When I downgrade to the version below, everything works fine.

solr-spec : 5.0.0.2012.07.19.18.36.06
solr-impl : 5.0-SNAPSHOT exported - iorixxx - 2012-07-19 18:36:06
lucene-spec : 5.0-SNAPSHOT
lucene-impl : 5.0-SNAPSHOT exported - iorixxx - 2012-07-19 18:35:10

Aug 17, 2012 10:12:46 PM org.apache.solr.common.SolrException log
SEVERE: null:java.lang.ArrayIndexOutOfBoundsException
    at java.lang.System.arraycopy(Native Method)
    at org.apache.solr.common.util.FastOutputStream.write(FastOutputStream.java:79)
    at org.apache.solr.common.util.JavaBinCodec.writeStr(JavaBinCodec.java:470)
    at org.apache.solr.common.util.JavaBinCodec.writePrimitive(JavaBinCodec.java:545)
    at org.apache.solr.common.util.JavaBinCodec.writeKnownType(JavaBinCodec.java:232)
    at org.apache.solr.common.util.JavaBinCodec.writeVal(JavaBinCodec.java:149)
    at org.apache.solr.common.util.JavaBinCodec.writeSolrInputDocument(JavaBinCodec.java:388)
    at org.apache.solr.update.TransactionLog.write(TransactionLog.java:340)
    at org.apache.solr.update.UpdateLog.add(UpdateLog.java:326)
    at org.apache.solr.update.UpdateLog.add(UpdateLog.java:311)
    at org.apache.solr.update.DirectUpdateHandler2.addDoc(DirectUpdateHandler2.java:229)
    at org.apache.solr.update.processor.RunUpdateProcessor.processAdd(RunUpdateProcessorFactory.java:61)
    at org.apache.solr.update.processor.UpdateRequestProcessor.processAdd(UpdateRequestProcessor.java:51)
    at org.apache.solr.update.processor.DistributedUpdateProcessor.doLocalAdd(DistributedUpdateProcessor.java:414)
    at org.apache.solr.update.processor.DistributedUpdateProcessor.versionAdd(DistributedUpdateProcessor.java:535)
    at org.apache.solr.update.processor.DistributedUpdateProcessor.processAdd(DistributedUpdateProcessor.java:315)
    at org.apache.solr.update.processor.UpdateRequestProcessor.processAdd(UpdateRequestProcessor.java:51)
    at org.apache.solr.update.processor.FieldMutatingUpdateProcessor.processAdd(FieldMutatingUpdateProcessor.java:123)
    at org.apache.solr.handler.extraction.ExtractingDocumentLoader.doAdd(ExtractingDocumentLoader.java:121)
    at org.apache.solr.handler.extraction.ExtractingDocumentLoader.addDoc(ExtractingDocumentLoader.java:126)
    at org.apache.solr.handler.extraction.ExtractingDocumentLoader.load(ExtractingDocumentLoader.java:233)
    at org.apache.solr.handler.ContentStreamHandlerBase.handleRequestBody(ContentStreamHandlerBase.java:74)
    at org.apache.solr.handler.RequestHandlerBase.handleRequest(RequestHandlerBase.java:129)
    at org.apache.solr.core.SolrCore.execute(SolrCore.java:1658)
    at org.apache.solr.servlet.SolrDispatchFilter.execute(SolrDispatchFilter.java:454)
    at org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:275)
    at org.eclipse.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1337)
    at org.eclipse.jetty.servlet.ServletHandler.doHandle(ServletHandler.java:484)
    at org.eclipse.jetty.server.handler.ScopedHandler.handle(ScopedHandler.java:119)
    at org.eclipse.jetty.security.SecurityHandler.handle(SecurityHandler.java:524)
    at org.eclipse.jetty.server.session.SessionHandler.doHandle(SessionHandler.java:233)
    at org.eclipse.jetty.server.handler.ContextHandler.doHandle(ContextHandler.java:1065)
    at org.eclipse.jetty.servlet.ServletHandler.doScope(ServletHandler.java:413)
    at org.eclipse.jetty.server.session.SessionHandler.doScope(SessionHandler.java:192)
    at org.eclipse.jetty.server.handler.ContextHandler.doScope(ContextHandler.java:999)
    at org.eclipse.jetty.server.handler.ScopedHandler.handle(ScopedHandler.java:117)
    at org.eclipse.jetty.server.handler.ContextHandlerCollection.handle(ContextHandlerCollection.java:250)
    at org.eclipse.jetty.server.handler.HandlerCollection.handle(HandlerCollection.java:149)
    at org.eclipse.jetty.server.handler.HandlerWrapper.handle(HandlerWrapper.java:111)
    at org.eclipse.jetty.server.Server.handle(Server.java:351)
    at org.eclipse.jetty.server.AbstractHttpConnection.handleRequest(AbstractHttpConnection.java:454)
    at org.eclipse.jetty.server.BlockingHttpConnection.handleRequest(BlockingHttpConnection.java:47)
    at org.eclipse.jetty.server.AbstractHttpConnection.content(AbstractHttpConnection.java:900)
    at
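The trace fails inside System.arraycopy, called from FastOutputStream.write. The sketch below is a hypothetical, simplified buffered writer (not Solr's actual code) showing how that kind of exception arises: arraycopy throws the moment a write's position-plus-length bookkeeping overruns the destination buffer, so a missing or wrong bounds check in the writer surfaces exactly as this trace does.

```java
// Hypothetical sketch (not Solr's actual code): a simplified buffered
// writer in the style of FastOutputStream. System.arraycopy throws
// ArrayIndexOutOfBoundsException the moment pos + len overruns buf,
// which is the kind of bookkeeping error the stack trace points at.
public class BufferSketch {
    private final byte[] buf = new byte[8];
    private int pos = 0;

    // Correct variant: "flush" before the copy would overflow.
    public void writeChecked(byte[] src, int off, int len) {
        if (pos + len > buf.length) {
            pos = 0; // pretend we flushed to the underlying stream
        }
        System.arraycopy(src, off, buf, pos, len);
        pos += len;
    }

    // Buggy variant: skips the bounds check, as a regression might.
    public void writeUnchecked(byte[] src, int off, int len) {
        System.arraycopy(src, off, buf, pos, len);
        pos += len;
    }

    public static void main(String[] args) {
        byte[] data = new byte[6];
        BufferSketch ok = new BufferSketch();
        ok.writeChecked(data, 0, 6);
        ok.writeChecked(data, 0, 6); // flushes first, no exception

        BufferSketch bad = new BufferSketch();
        bad.writeUnchecked(data, 0, 6);
        boolean threw = false;
        try {
            bad.writeUnchecked(data, 0, 6); // 6 + 6 > 8: overruns buf
        } catch (ArrayIndexOutOfBoundsException e) {
            threw = true;
        }
        System.out.println(threw); // prints "true"
    }
}
```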
about example-DIH
Hello, In solr-trunk/solr/example/README.txt it says java -Dsolr.solr.home=example-DIH but it should be java -Dsolr.solr.home=example-DIH/solr (it is correct in example-DIH/README.txt).

When I execute full-import on the mail core, I get this (I am not sure if the mail core needs some extra jars):

Caused by: java.lang.ClassNotFoundException: org.apache.tika.Tika
    at java.net.URLClassLoader$1.run(URLClassLoader.java:202)
    at java.security.AccessController.doPrivileged(Native Method)
    at java.net.URLClassLoader.findClass(URLClassLoader.java:190)
    at java.lang.ClassLoader.loadClass(ClassLoader.java:306)
    at java.net.FactoryURLClassLoader.loadClass(URLClassLoader.java:627)
    at java.lang.ClassLoader.loadClass(ClassLoader.java:247)

Somehow the tika core's Dataimport link does not seem to be working. The weird thing is that the other cores' links work (tested in Firefox and Safari). The db, rss and solr cores have admin-extra.html while tika and mail don't. - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
Re: about example-DIH
In solr-trunk/solr/example/README.txt it says java -Dsolr.solr.home=example-DIH but it should be java -Dsolr.solr.home=example-DIH/solr (it is correct in example-DIH/README.txt). When I execute full-import on the mail core, I get this (I am not sure if the mail core needs some extra jars):

Caused by: java.lang.ClassNotFoundException: org.apache.tika.Tika
    at java.net.URLClassLoader$1.run(URLClassLoader.java:202)
    at java.security.AccessController.doPrivileged(Native Method)
    at java.net.URLClassLoader.findClass(URLClassLoader.java:190)
    at java.lang.ClassLoader.loadClass(ClassLoader.java:306)
    at java.net.FactoryURLClassLoader.loadClass(URLClassLoader.java:627)
    at java.lang.ClassLoader.loadClass(ClassLoader.java:247)

I created SOLR-3759 for these two issues above.
Re: about example-DIH
Somehow the tika core's Dataimport link does not seem to be working. The weird thing is that the other cores' links work (tested in Firefox and Safari). It seems that

<requestHandler name="/admin/" class="org.apache.solr.handler.admin.AdminHandlers" />

is required for the UI.
UUIDField uniqueKey with default=NEW
Hi all, I was following http://wiki.apache.org/solr/UniqueKey#UUID_techniques to set up uuid as my uniqueKey (recent solr-trunk):

<fieldType name="uuid" class="solr.UUIDField" indexed="true" />
<field name="uniqueKey" type="uuid" indexed="true" stored="true" default="NEW" required="true" />
<uniqueKey>uniqueKey</uniqueKey>

I get the following exception:

SEVERE: null:org.apache.solr.common.SolrException: uniqueKey field (null) can not be configured with a default value (NEW)
    at org.apache.solr.schema.IndexSchema.readSchema(IndexSchema.java:496)
    at org.apache.solr.schema.IndexSchema.<init>(IndexSchema.java:113)
    at org.apache.solr.core.CoreContainer.create(CoreContainer.java:851)
    at org.apache.solr.core.CoreContainer.load(CoreContainer.java:539)

I got this working by adding some if checks to IndexSchema.java and UpdateCommand.java:

getType().getClass().getName().equals(UUIDField.class.getName())

But I am not sure if this is the preferred way. How can I use uuid as my uniqueKey without modifying the source code? Thanks,
Re: UUIDField uniqueKey with default=NEW
You're trying to use a feature that was removed from trunk/4x by SOLR-2796 (AddUpdateCommand.getIndexedId doesn't work with schema configured defaults/copyField - UUIDField/copyField can not be used as uniqueKey field). See: https://issues.apache.org/jira/browse/SOLR-2796 This revision: http://svn.apache.org/viewvc?view=revision&revision=1345378 Evidently the wiki was not corrected to note that the feature was removed.

Thanks Jack, that was helpful! So in order to use uuid as the uniqueKey, an update processor chain is the way to go. There are two ways to do it:

1)
<field name="uniqueKey" type="uuid" indexed="true" stored="true" required="true" />
<updateRequestProcessorChain name="default-values">
  <processor class="solr.DefaultValueUpdateProcessorFactory">
    <str name="fieldName">uniqueKey</str>
    <str name="value">NEW</str>
  </processor>
  <processor class="solr.RunUpdateProcessorFactory" />
</updateRequestProcessorChain>

2)
<field name="uniqueKey" type="string" indexed="true" stored="true" required="true" />
<updateRequestProcessorChain name="uuid">
  <processor class="solr.UUIDUpdateProcessorFactory">
    <str name="fieldName">uniqueKey</str>
  </processor>
  <processor class="solr.RunUpdateProcessorFactory" />
</updateRequestProcessorChain>

Correct? I will try to update the wiki.
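Conceptually, the second chain (UUIDUpdateProcessorFactory) just assigns a freshly generated UUID to the key field of any incoming document that lacks one. The sketch below is a hypothetical illustration of that step using java.util.UUID on a plain Map, not Solr's actual processor code; the field name "uniqueKey" is taken from the schema above.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.UUID;

// Hypothetical sketch of what a UUID-assigning update step does
// conceptually: if the incoming document has no value for the key
// field, generate one. Not Solr's actual processor code.
public class UuidStep {
    static final String KEY_FIELD = "uniqueKey"; // field name from the schema above

    public static Map<String, Object> assignKey(Map<String, Object> doc) {
        if (!doc.containsKey(KEY_FIELD)) {
            // randomUUID() yields a version-4 UUID such as
            // a259aa91-353f-4824-9f68-01837b721cf7 (36 characters)
            doc.put(KEY_FIELD, UUID.randomUUID().toString());
        }
        return doc;
    }

    public static void main(String[] args) {
        Map<String, Object> doc = new HashMap<>();
        doc.put("title", "hello");
        assignKey(doc);
        System.out.println(((String) doc.get(KEY_FIELD)).length()); // prints "36"
    }
}
```

A document that already carries a key is left untouched, which is why the chain is safe to apply to every add request.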
Re: UUIDField uniqueKey with default=NEW
: <processor class="solr.DefaultValueUpdateProcessorFactory">
:   <str name="fieldName">uniqueKey</str>
:   <str name="value">NEW</str>
: </processor>

...that approach won't work; it still relies on the UUIDField class accepting NEW as input to generate a new key, and that is no longer supported -- it happens too late in the processing for it to be used as the unique key.

I tested this approach (at revision 1379678) and it seems to work. I can see generated values, e.g. <str name="uniqueKey">a259aa91-353f-4824-9f68-01837b721cf7</str>

You may want to primarily describe the UUIDUpdateProcessorFactory as the way to generate a UUID for new documents, and then as a closing comment mention that in Solr 3 the default=NEW approach can be used instead.

Thanks for the pointer, I will try to do it.
Re: UUIDField uniqueKey with default=NEW
Hmmm... on a single node instance it might work -- You are correct, it was a single node setup.
SolrPluginUtils.docListToSolrDocumentList loads all stored fields
Hello, Regardless of the Set<String> fields parameter, the SolrPluginUtils#docListToSolrDocumentList method loads all of the stored fields. Shouldn't it just load the fields given in the set? Should I file a jira ticket?

When a small bug in a TestCase is seen, what is the preferred way to report it? Open an issue or mention it here? Example: In the SolrPluginUtilsTest.testDocListConversion method, the for loop is not executed because list.size() == 0. The commit should be inside the assertU(), and cmd.setLen() should be called.
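The behavior the report asks for amounts to copying only the stored fields named in the given set. The sketch below is a hypothetical illustration of that filtering on plain Maps (with a null set meaning "all fields"), not the actual SolrPluginUtils implementation.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.Set;

// Hypothetical sketch of the behavior the report asks for: keep only
// the stored fields named in the given set; a null set means "all
// fields". Not the actual SolrPluginUtils code.
public class FieldFilter {
    public static Map<String, Object> select(Map<String, Object> stored,
                                             Set<String> fields) {
        if (fields == null) {
            return new LinkedHashMap<>(stored); // no restriction requested
        }
        Map<String, Object> out = new LinkedHashMap<>();
        for (Map.Entry<String, Object> e : stored.entrySet()) {
            if (fields.contains(e.getKey())) {
                out.put(e.getKey(), e.getValue());
            }
        }
        return out;
    }

    public static void main(String[] args) {
        Map<String, Object> doc = new LinkedHashMap<>();
        doc.put("id", 1);
        doc.put("subject", "hello");
        doc.put("title", "world");
        Map<String, Object> picked =
            select(doc, new HashSet<>(Arrays.asList("id")));
        System.out.println(picked.keySet()); // prints "[id]"
    }
}
```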
phps with SolrDocument
The JSONWriter.writeSolrDocument method calls writeMapOpener(-1); however, its subclass (PHPSerializedWriter) throws an exception in its writeMapOpener method if size < 0. This makes it impossible to use a SolrDocumentList as the response with phps (somehow related to SOLR-2291). Here is a snippet that demonstrates the case:

public class PHPSTest extends SolrTestCaseJ4 {

  @BeforeClass
  public static void beforeClass() throws Exception {
    initCore("solrconfig.xml", "schema.xml");
  }

  @Test
  public void testPHPS() throws Exception {
    SolrQueryRequest req = req(CommonParams.WT, "phps");
    SolrQueryResponse rsp = new SolrQueryResponse();
    PHPSerializedResponseWriter w = new PHPSerializedResponseWriter();

    Set<String> returnFields = new HashSet<String>(1);
    returnFields.add("id");
    returnFields.add("score");
    rsp.setReturnFields(returnFields);

    StringWriter buf = new StringWriter();
    SolrDocument solrDoc = new SolrDocument();
    solrDoc.addField("id", 1);
    solrDoc.addField("subject", "hello2");
    solrDoc.addField("title", "hello3");
    solrDoc.addField("score", 0.7);

    SolrDocumentList list = new SolrDocumentList();
    list.setNumFound(1);
    list.setStart(0);
    list.setMaxScore(0.7f);
    list.add(solrDoc);

    rsp.add("response", list);
    w.write(buf, req, rsp);
    System.out.println(buf.toString());
    req.close();
  }
}
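The underlying conflict is a format constraint: PHP's serialize() encodes an array as a:<count>:{...}, so the element count must be emitted before any entry is written, and a "size unknown" marker of -1 simply cannot be expressed. The sketch below is a hypothetical, minimal serializer for a string map in that format (not the actual PHPSerializedResponseWriter code), showing why the size is needed up front.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical sketch of the phps format constraint: PHP's serialize()
// writes an array as a:<count>:{...}, so the count must be known
// before the first entry. A size of -1 cannot be emitted. Not the
// actual PHPSerializedResponseWriter code.
public class PhpsSketch {
    public static String serializeMap(Map<String, String> map) {
        StringBuilder sb = new StringBuilder();
        // The element count leads the encoding -- this is the value
        // that writeMapOpener(-1) would have to produce.
        sb.append("a:").append(map.size()).append(":{");
        for (Map.Entry<String, String> e : map.entrySet()) {
            sb.append("s:").append(e.getKey().length()).append(":\"")
              .append(e.getKey()).append("\";");
            sb.append("s:").append(e.getValue().length()).append(":\"")
              .append(e.getValue()).append("\";");
        }
        return sb.append("}").toString();
    }

    public static void main(String[] args) {
        Map<String, String> doc = new LinkedHashMap<>();
        doc.put("id", "1");
        // prints a:1:{s:2:"id";s:1:"1";}
        System.out.println(serializeMap(doc));
    }
}
```

JSON has no such constraint (its map opener is just "{"), which is why the base class can get away with passing -1.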
Invalid version (expected 2, but 60) or the data in not in 'javabin' format
Hi, I was hitting the following exception when doing a distributed search. I am faceting on an int field named contentID. For some queries it was giving this error; for some queries it just works fine.

localhost:8080/solr/kanu/select/?shards=localhost:8080/solr/rega,localhost:8080/solr/kanu&indent=true&q=karar&start=0&rows=15&hl=false&wt=xml&facet=true&facet.limit=-1&facet.sort=false&json.nl=arrarr&fq=isXml:false&mm=100%&facet.field=contentID&f.contentID.facet.mincount=2

The same search URL works fine for the cores (kanu and rega) individually. Plus, if I use the rega core as the base search URL it works too, e.g. localhost:8080/solr/rega/select/?shards=localhost:8080...

I see that the rega core has lots of unique values for the contentID field, so my conclusion is that this happens when a shard response is too big. This is a bad usage of faceting, and I will remove faceting on that field since it was added accidentally. I still want to share the stack traces, since the error message is somewhat misleading.

Jan 21, 2013 10:36:53 PM org.apache.solr.common.SolrException log
SEVERE: null:org.apache.solr.common.SolrException: java.lang.RuntimeException: Invalid version (expected 2, but 60) or the data in not in 'javabin' format
    at org.apache.solr.handler.component.SearchHandler.handleRequestBody(SearchHandler.java:300)
    at org.apache.solr.handler.RequestHandlerBase.handleRequest(RequestHandlerBase.java:129)
    at org.apache.solr.core.SolrCore.execute(SolrCore.java:1701)
    at org.apache.solr.servlet.SolrDispatchFilter.execute(SolrDispatchFilter.java:455)
    at org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:276)
    at org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:235)
    at org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:206)
    at org.apache.catalina.core.StandardWrapperValve.invoke(StandardWrapperValve.java:233)
    at org.apache.catalina.core.StandardContextValve.invoke(StandardContextValve.java:191)
    at org.apache.catalina.core.StandardHostValve.invoke(StandardHostValve.java:127)
    at org.apache.catalina.valves.ErrorReportValve.invoke(ErrorReportValve.java:102)
    at org.apache.catalina.core.StandardEngineValve.invoke(StandardEngineValve.java:109)
    at org.apache.catalina.connector.CoyoteAdapter.service(CoyoteAdapter.java:293)
    at org.apache.coyote.http11.Http11Processor.process(Http11Processor.java:859)
    at org.apache.coyote.http11.Http11Protocol$Http11ConnectionHandler.process(Http11Protocol.java:602)
    at org.apache.tomcat.util.net.JIoEndpoint$Worker.run(JIoEndpoint.java:489)
    at java.lang.Thread.run(Thread.java:662)
Caused by: java.lang.RuntimeException: Invalid version (expected 2, but 60) or the data in not in 'javabin' format
    at org.apache.solr.common.util.JavaBinCodec.unmarshal(JavaBinCodec.java:109)
    at org.apache.solr.client.solrj.impl.BinaryResponseParser.processResponse(BinaryResponseParser.java:41)
    at org.apache.solr.client.solrj.impl.HttpSolrServer.request(HttpSolrServer.java:385)
    at org.apache.solr.client.solrj.impl.HttpSolrServer.request(HttpSolrServer.java:182)
    at org.apache.solr.handler.component.HttpShardHandler$1.call(HttpShardHandler.java:166)
    at org.apache.solr.handler.component.HttpShardHandler$1.call(HttpShardHandler.java:133)
    at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:303)
    at java.util.concurrent.FutureTask.run(FutureTask.java:138)
    at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:441)
    at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:303)
    at java.util.concurrent.FutureTask.run(FutureTask.java:138)
    at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
    ... 1 more

When I add shards.tolerant=true, the exception becomes:

Jan 21, 2013 10:51:51 PM org.apache.solr.common.SolrException log
SEVERE: null:java.lang.NullPointerException
    at org.apache.solr.handler.component.QueryComponent.returnFields(QueryComponent.java:967)
    at org.apache.solr.handler.component.QueryComponent.handleRegularResponses(QueryComponent.java:630)
    at org.apache.solr.handler.component.QueryComponent.handleResponses(QueryComponent.java:605)
    at org.apache.solr.handler.component.SearchHandler.handleRequestBody(SearchHandler.java:309)
    at org.apache.solr.handler.RequestHandlerBase.handleRequest(RequestHandlerBase.java:129)
    at org.apache.solr.core.SolrCore.execute(SolrCore.java:1701)
    at org.apache.solr.servlet.SolrDispatchFilter.execute(SolrDispatchFilter.java:455)
    at org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:276)
    at
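One detail worth noting about "expected 2, but 60": byte 60 is the ASCII code of '<'. A plausible reading (an assumption on my part, not confirmed in the thread) is that the overloaded shard returned an XML or HTML error page, whose first byte the javabin parser then misread as a version marker -- consistent with the conclusion that an oversized shard response failed and came back as an error document.

```java
// The mysterious version byte 60 is simply ASCII '<' -- the first
// character of an XML/HTML document. A shard that errors out and
// returns a markup error page instead of a javabin stream would
// produce exactly this message. (Interpretation, not confirmed fact.)
public class VersionByte {
    public static void main(String[] args) {
        byte first = (byte) '<';
        System.out.println(first); // prints "60"
    }
}
```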
Re: Wild card support when stemmers are added
Hi, You have two separate options where you keep both accessorising and accessorise in the index:

1) https://issues.apache.org/jira/browse/SOLR-3231
2) Create an un-stemmed field and run wildcard queries against it too.

--- On Tue, 2/5/13, msreddy.hi msreddy...@gmail.com wrote: From: msreddy.hi msreddy...@gmail.com Subject: Re: Wild card support when stemmers are added To: dev@lucene.apache.org Date: Tuesday, February 5, 2013, 11:40 AM Thanks Jack. I will look at the option of implementing a workaround. --Saida Reddy.
Re: VOTE: RC0 Release apache-solr-ref-guide-4.6.pdf
Hi, On page 293: "rm -r shard*/solr/zoo_data" should be "rm -r node*/solr/zoo_data". On page 297: "... shard, an d forwards ..." should be "... shard, and forwards ...". Thanks, Ahmet

On Wednesday, November 27, 2013 2:47 PM, Cassandra Targett casstarg...@gmail.com wrote: I noticed a couple of small typos and inconsistencies that I've fixed, but I don't think they warrant a respin. They're more for appearance than for any factual problems. +1 Sorry for the delay from me - I've been traveling for holidays.

On Tue, Nov 26, 2013 at 4:22 AM, Jan Høydahl jan@cominvent.com wrote: * Page 5: Screenshots with 4.0.0-beta texts * Page 165: Links to 4.0.0 version of JavaDoc (now fixed in Confluence) * Page 204: Table - group.func - "Supported only in Sol4r 4.0." (should be "Supported since Solr 4.0.") (now fixed in Confluence) * Page 308: Strange xml code box layout, why all the whitespace? But these are minors, so here's my +1 -- Jan Høydahl, search solution architect Cominvent AS - www.cominvent.com

25. nov. 2013 kl. 19:34 skrev Chris Hostetter hossman_luc...@fucit.org: Please VOTE to release the following as apache-solr-ref-guide-4.6.pdf ... https://dist.apache.org/repos/dist/dev/lucene/solr/ref-guide/apache-solr-ref-guide-4.6-RC0/ $ cat apache-solr-ref-guide-4.6.pdf.sha1 7ad494c5a3cdc085e01a54d507ae33a75cc319e6 apache-solr-ref-guide-4.6.pdf -Hoss
Re: Can't seem to build in order to run unit tests from IntelliJ any more.
Hi Erick, Same here, I get this error: import org.apache.lucene.queries.CommonTermsQuery; Cannot resolve CommonTermsQuery

--- On Sun, 2/17/13, Erick Erickson erickerick...@gmail.com wrote: From: Erick Erickson erickerick...@gmail.com Subject: Can't seem to build in order to run unit tests from IntelliJ any more. To: dev@lucene.apache.org Date: Sunday, February 17, 2013, 7:22 PM Anyone else having problems here? I've deleted the ivy cache, cleaned the idea project (and everything else), tried it on a fresh checkout. What am I missing? The problem is that classes are not found; I see messages in IntelliJ like: java: package org.apache.lucene.analysis does not exist java: cannot find symbol symbol: class Query location: package org.apache.lucene.search etc. I can run things from the command line just fine. Did something move? Thanks, Erick
Re: Can't seem to build in order to run unit tests from IntelliJ any more.
Hi Steve, Thanks for the fix, I can run test cases using IntelliJ now. Ahmet

--- On Mon, 2/18/13, Steve Rowe sar...@gmail.com wrote: From: Steve Rowe sar...@gmail.com Subject: Re: Can't seem to build in order to run unit tests from IntelliJ any more. To: dev@lucene.apache.org Date: Monday, February 18, 2013, 10:04 AM Hi Arslan, I just committed a fix for this particular problem (a missing queries module dependency from the highlighter module). I think Erick is having a different problem; not sure what yet. Steve

On Feb 18, 2013, at 2:41 AM, Ahmet Arslan iori...@yahoo.com wrote: Hi Erick, Same here, I get this error: import org.apache.lucene.queries.CommonTermsQuery; Cannot resolve CommonTermsQuery
Re: Subscribe to the mailing list
Hi Abhishek, Since you sent this e-mail, you are successfully subscribed to the dev list. Please see the how-to-contribute wiki pages: http://wiki.apache.org/lucene-java/HowToContribute http://wiki.apache.org/solr/HowToContribute Welcome and happy hacking, Ahmet

On Monday, March 10, 2014 1:56 PM, Abhishek Shah igeniuss...@gmail.com wrote: Hi, I wanted to subscribe to this mailing list and wanted to contribute to the development of Lucene. -- Regards, Abhishek Shah
Re: Reducing the number of warnings in the codebase
Hi Shawn, +1 for the idea, we should take full advantage of Eclipse, IntelliJ etc. Here are some relevant tickets created by Furkan:

https://issues.apache.org/jira/browse/LUCENE-5506
https://issues.apache.org/jira/browse/SOLR-5838
https://issues.apache.org/jira/browse/SOLR-5839

I believe https://issues.apache.org/jira/browse/SOLR-5685 could be expressed as an automatic rule or something. There is already a similar check that detects usage of String.toUpperCase/toLowerCase without a Locale, and StringBuffer versus StringBuilder. Ahmet

On Sunday, March 16, 2014 12:09 PM, Furkan KAMACI furkankam...@gmail.com wrote: Hi; I've run FindBugs for the Lucene/Solr project. If you use IntelliJ IDEA you can group the warnings according to their importance. I've opened issues and attached patches for the top-level warnings/errors (and some others) that FindBugs found. On the other hand, I have another suggestion for the Lucene/Solr project. When I develop or lead projects I use Sonar. It's very good, and it runs really nice open source tools to analyze your code; FindBugs, PMD and JaCoCo are just some of them. It also calculates method complexities, LoC, etc. You can see a live example here: https://sonar.springsource.org/dashboard/index/4824 I can volunteer to integrate Sonar into the Lucene/Solr project. Thanks; Furkan KAMACI

2014-03-16 11:01 GMT+02:00 Shawn Heisey s...@elyograg.org: With the default settings in Eclipse, the Lucene/Solr codebase shows over 6000 warnings. This is the case for both branch_4x and trunk. I'm no expert, but this does seem a little excessive. If I were to take on the task of reducing this number, what advice can the group give me? Is there someone in particular that I should consider a resource for inevitable dumb questions? I haven't done an exhaustive survey, but I would imagine that most of them can be eliminated fairly easily. I'm fully aware that we may not be able to eliminate them all. One problem with fixing warnings is that the resulting patch(es) would be just as invasive as the recent work to move branch_4x to Java 7. This would complicate any ongoing work, especially large-scale work that is happening on change-specific branches. A similar topic that may require a separate discussion: FindBugs. Thanks, Shawn
Re: Reducing the number of warnings in the codebase
Hi, Here are some rules:

* Calls to the following String methods whose return value is discarded (the left-hand side is empty): String.replace(), String.toUpperCase(), String.toLowerCase(), String.replaceFirst(), String.trim()
* In test cases (subclasses of SolrTestCaseJ4), methods called without assertU() (see SOLR-5685): adoc(), optimize(), commit()
* String.toUpperCase() and String.toLowerCase() without a Locale (see SOLR-2281 and LUCENE-2466)

Can ant precommit/forbidden-apis be used to detect the above? Ahmet

On Sunday, March 16, 2014 9:53 PM, Benson Margulies bimargul...@gmail.com wrote: Just because some tool expresses distaste doesn't imply that everyone here agrees that it's a problem we should fix. In my experience, the default Sonar rulesets contain many things that people here are prone to disagree with. Start with serialVersionUID: do we care? Why would we care? In what cases do we really believe that a sane person would be using Java serialization with a Lucene/Solr class? Sonar can also be a bit cranky; it arranges for various tools to run via mechanisms that sometimes conflict with the ways you might run them yourself. So I'd suggest a process like: 1. Someone proposes a set of (e.g.) checkstyle rules to live by. 2. That ruleset is refined by experiment. 3. We make violations fail the build. Then lather, rinse, repeat for other tools. Once we have rulesets we agree are worth enforcing, we can look to Sonar for a pretty way to visualize their results if we like.
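The first rule above is always a real bug, because String is immutable: every one of those methods returns a new string and leaves the receiver untouched, so a call whose result is discarded is a no-op. A small self-contained demonstration:

```java
// Why an ignored return value from String.trim()/toUpperCase()/etc.
// is always a bug: String is immutable, so these methods return a
// new string and never modify the receiver. A rule flagging
// "s.trim();" as a bare statement would have no false positives.
public class ImmutableDemo {
    public static void main(String[] args) {
        String s = "  Hello  ";
        s.trim();          // result discarded: s is unchanged
        s.toUpperCase();   // likewise a no-op as a bare statement
        System.out.println("[" + s + "]"); // prints "[  Hello  ]"

        String t = s.trim(); // the correct pattern: use the result
        System.out.println("[" + t + "]"); // prints "[Hello]"
    }
}
```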
Re: Reducing the number of warnings in the codebase
Hi Uwe, I looked for definitions under the lucene/tools/forbiddenApis/*.txt files but I couldn't find them. Where are those rules defined? I am wondering about the syntax; can you point me to it? Thanks, Ahmet

On Sunday, March 16, 2014 10:40 PM, Uwe Schindler u...@thetaphi.de wrote: : String.toUpperCase() and String.toLowerCase() without Locale. see : SOLR-2281 and LUCENE-2466 Those are already detected by forbidden-apis.
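For reference, forbidden-apis signatures files are plain text listing fully qualified method signatures, optionally preceded by a @defaultMessage line that becomes the build failure message. A sketch of what such a rule for the Locale case could look like (an illustration of the signatures-file syntax, not a copy of Lucene's actual file):

```text
@defaultMessage Use a Locale-aware variant instead
java.lang.String#toLowerCase()
java.lang.String#toUpperCase()
```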
TruncateTokenFilter / FixedPrefixStemFilter
Hello, I would like to ask if there is interest in adding a TruncateTokenFilter to Lucene. I am using this filter as a stemmer for the Turkish language. In much academic research (clustering, classification, retrieval) it is used and called the Fixed Prefix Stemmer, the Simple Truncation Method, or F5 for short. Among F3 to F7, the F5 stemmer (length=5) is found to work well for the Turkish language in this study [1]. It is the same work from which some of stopwords_tr.txt was acquired.

[1] Information Retrieval on Turkish Texts http://www.users.muohio.edu/canf/papers/JASIST2008offPrint.pdf

ElasticSearch has this filter, but it does not respect the keyword attribute. The main advantage of F5 stemming is that it is not affected by the meaning loss caused by ascii folding; it works well with ascii folding. [2] Effects of diacritics on Turkish information retrieval http://journals.tubitak.gov.tr/elektrik/issues/elk-12-20-5/elk-20-5-9-1010-819.pdf

Here is the full type I use for customers:

<fieldType name="text_tr_ascii_f5" class="solr.TextField" positionIncrementGap="100">
  <analyzer>
    <tokenizer class="solr.StandardTokenizerFactory"/>
    <filter class="solr.ApostropheFilterFactory"/>
    <filter class="solr.TurkishLowerCaseFilterFactory"/>
    <filter class="solr.ASCIIFoldingFilterFactory"/>
    <filter class="solr.KeywordRepeatFilterFactory"/>
    <filter class="solr.TruncateTokenFilterFactory" prefixLength="5"/>
    <filter class="solr.RemoveDuplicatesTokenFilterFactory"/>
  </analyzer>
</fieldType>

I would like to get the community's opinions on:
1) Is there interest in this? Should I create a jira issue and attach what I have got?
2) Should the keyword attribute be respected?
3) Package name: analysis.misc versus analysis.tr
4) Name of the class: TruncateTokenFilter versus FixedPrefixStemFilter
Thanks, Ahmet
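The core of F5 stemming, stripped of Lucene's TokenStream machinery, is a one-line truncation; the keyword-attribute question from point 2 is just a guard around it. The sketch below is a hypothetical, plain-method illustration (not the proposed filter implementation) showing the behavior under discussion, including leaving keyword tokens untouched.

```java
// Hypothetical sketch of fixed-prefix (F5) stemming outside Lucene's
// TokenStream machinery: truncate each token to a fixed prefix
// length, but leave tokens flagged as keywords untouched -- the
// behavior the mail argues the ElasticSearch filter lacks.
public class TruncateSketch {
    public static String stem(String token, int prefixLength, boolean keyword) {
        if (keyword || token.length() <= prefixLength) {
            return token; // keywords and short tokens pass through unchanged
        }
        return token.substring(0, prefixLength);
    }

    public static void main(String[] args) {
        System.out.println(stem("bilgisayarlarda", 5, false)); // prints "bilgi"
        System.out.println(stem("solr", 5, false));            // prints "solr"
        System.out.println(stem("bilgisayarlarda", 5, true));  // keyword kept whole
    }
}
```

In the field type above, KeywordRepeatFilterFactory emits each token twice (once keyword-flagged, once not), so after truncation and RemoveDuplicates the index keeps both the full surface form and its 5-character prefix.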
Re: [VOTE] Lucene / Solr 4.7.1 RC2
+1 SUCCESS! [1:28:08.851424] Ahmet

On Saturday, March 29, 2014 10:46 AM, Steve Rowe sar...@gmail.com wrote: Please vote for the second Release Candidate for Lucene/Solr 4.7.1. Download it here: https://people.apache.org/~sarowe/staging_area/lucene-solr-4.7.1-RC2-rev1582953/ Smoke tester cmdline (from the lucene_solr_4_7 branch): python3.2 -u dev-tools/scripts/smokeTestRelease.py \ https://people.apache.org/~sarowe/staging_area/lucene-solr-4.7.1-RC2-rev1582953/ \ 1582953 4.7.1 /tmp/4.7.1-smoke The smoke tester passed for me: SUCCESS! [0:50:29.936732] My vote: +1 Steve
ant test-help does not describe tests.disableHdfs
Hello all, Shouldn't 'ant test-help' mention the -Dtests.disableHdfs=true and -Dtests.badapples=false parameters? Ahmet
Re: [VOTE] Lucene / Solr 4.7.2 (take two)
+1 SUCCESS! [1:56:54.132500] On Friday, April 11, 2014 10:22 PM, Adrien Grand jpou...@gmail.com wrote: +1 SUCCESS! [1:10:37.098259] On Fri, Apr 11, 2014 at 9:05 PM, Anshum Gupta ans...@anshumgupta.net wrote: +1 SUCCESS! [0:57:06.986265] On Thu, Apr 10, 2014 at 7:51 AM, Robert Muir rcm...@gmail.com wrote: artifacts are here: http://people.apache.org/~rmuir/staging_area/lucene_solr_4_7_2_r1586229/ here is my +1 SUCCESS! [0:46:25.014499] - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org -- Anshum Gupta http://www.anshumgupta.net -- Adrien - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
why sha1?
Hello, http://www.us.apache.org/dist/lucene/solr/4.7.2/ uses the .sha1 suffix, and it looks like SHA-1 is used. According to http://www.apache.org/dev/release-signing.html#basic-facts * An SHA checksum should also be created and must be suffixed .sha * SHA-1 should be avoided Is this something that was overlooked? Ahmet
Re: [VOTE] Lucene/Solr 4.8.0 RC1
+1 SUCCESS! [1:50:28.179297] Ahmet On Tuesday, April 22, 2014 11:38 PM, Uwe Schindler u...@thetaphi.de wrote: Here is my own +1: SUCCESS! [2:09:05.595608] Solr tests passed once I raised file descriptor limit. So we should definitely fix this. I will try to reproduce in an isolated way and post stack traces. Uwe -Original Message- From: Uwe Schindler [mailto:u...@thetaphi.de] Sent: Tuesday, April 22, 2014 8:47 PM To: dev@lucene.apache.org Subject: [VOTE] Lucene/Solr 4.8.0 RC1 Hi, I prepared the first release candidate of Lucene and Solr 4.8.0. The artifacts can be found here: = http://people.apache.org/~uschindler/staging_area/lucene-solr-4.8.0- RC1-rev1589150/ It took a bit longer, because we had to fix some remaining bugs regarding NativeFSLockFactory, which did not work correctly and leaked file handles. I also updated the instructions about the preferred Java update versions. See also Mike's blog post: http://www.elasticsearch.org/blog/java-1-7u55-safe- use-elasticsearch-lucene/ Please check the artifacts and give your vote in the next 72 hrs. My +1 will hopefully come a little bit later because Solr tests are failing constantly on my release build and smoke tester machine. The reason: it seems to be lack of file handles. A standard Ubuntu configuration has 1024 file handles and I want a release to pass with that common default configuration. Instead, org.apache.solr.cloud.TestMiniSolrCloudCluster.testBasics fails always with crazy error messages (not about too less file handles, more that Jetty cannot start up or not bind ports or various other stuff). This did not happen on smoking 4.7.x releases. I will run now the smoker again without HDFS (via build.properties) and if that also fails then once again with more file handles. But we really have to fix our tests that they succeed with the default config of 1024 file handles. We can configure that in Jenkins (so the Jenkins job first sets and then runs ANT ulimit -n 1024). 
But this should not block the release, I just say: I gave up running those Solr tests, sorry! Anybody else can test that stuff! Uwe P.S.: Here's my smoker command line: $ JAVA_HOME=$HOME/jdk1.7.0_55 JAVA7_HOME=$HOME/jdk1.7.0_55 python3.2 -u smokeTestRelease.py ' http://people.apache.org/~uschindler/staging_area/lucene-solr-4.8.0-RC1- rev1589150/' 1589150 4.8.0 tmp - Uwe Schindler H.-H.-Meier-Allee 63, D-28213 Bremen http://www.thetaphi.de eMail: u...@thetaphi.de - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
Re: [VOTE] Lucene/Solr 4.8.0 RC2
+1 SUCCESS! [1:46:42.182320] Ahmet On Friday, April 25, 2014 9:55 AM, Uwe Schindler u...@thetaphi.de wrote: Here is my +1: SUCCESS! [1:53:39.982427] Uwe - Uwe Schindler H.-H.-Meier-Allee 63, D-28213 Bremen http://www.thetaphi.de eMail: u...@thetaphi.de -Original Message- From: Uwe Schindler [mailto:u...@thetaphi.de] Sent: Thursday, April 24, 2014 11:54 PM To: dev@lucene.apache.org Subject: [VOTE] Lucene/Solr 4.8.0 RC2 Hi, I prepared a second release candidate of Lucene and Solr 4.8.0. The artifacts can be found here: = http://people.apache.org/~uschindler/staging_area/lucene-solr-4.8.0- RC2-rev1589874/ This RC contains the additional fixes for SOLR-6011, LUCENE-5626, and LUCENE-5630. Please check the artifacts and give your vote in the next 72 hrs. Uwe P.S.: Here's my smoker command line: $ JAVA_HOME=$HOME/jdk1.7.0_55 JAVA7_HOME=$HOME/jdk1.7.0_55 python3.2 -u smokeTestRelease.py ' http://people.apache.org/~uschindler/staging_area/lucene-solr-4.8.0-RC2- rev1589874/' 1589874 4.8.0 tmp - Uwe Schindler H.-H.-Meier-Allee 63, D-28213 Bremen http://www.thetaphi.de eMail: u...@thetaphi.de - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
Re: maxThreads in Jetty
bq. Someone wrote a nice response to this recently - Shawn I think? It's Uwe: http://search-lucene.com/m/WwzTb2XJeVJ1 On Saturday, April 26, 2014 6:59 PM, Mark Miller markrmil...@gmail.com wrote: Someone wrote a nice response to this recently - Shawn I think? The gist is, there are no current plans to move to Netty in 5. So far, 5 simply makes the http layer an implementation detail of Solr and stops promising a WAR. -- Mark Miller about.me/markrmiller On April 26, 2014 at 11:28:38 AM, Toke Eskildsen (t...@statsbiblioteket.dk) wrote: Otis Gospodnetic [otis.gospodne...@gmail.com] wrote: I think moving away from Jetty and going to Netty or something like that is on the radar for 5, no? That is my understanding, but the thread-issue is just as relevant for a non-Web-container setup: It is quite hard to allocate resources if there is no real limit on burst rate. - Toke Eskildsen
Re: [VOTE] Lucene/Solr release 4.8.1
+1 SUCCESS! [1:49:01.551615] Ahmet P.S. Here's my smoker command line: python3 dev-tools/scripts/smokeTestRelease.py 'http://people.apache.org/~rmuir/staging_area/lucene_solr_4_8_1_r1594670/' 1594670 4.8.1 tmp On Friday, May 16, 2014 11:46 PM, Simon Willnauer simon.willna...@gmail.com wrote: +1 SUCCESS! [1:19:33.540237] ES seems to be happy as well simon On Fri, May 16, 2014 at 10:59 AM, Michael McCandless luc...@mikemccandless.com wrote: +1: SUCCESS! [0:39:08.550817] Mike McCandless http://blog.mikemccandless.com On Wed, May 14, 2014 at 8:58 PM, Robert Muir rcm...@gmail.com wrote: Hello, I have created a release candidate at http://people.apache.org/~rmuir/staging_area/lucene_solr_4_8_1_r1594670 Please test and vote. Here is my +1 vote. I smoketested and tried to break things over the past week during the mail outage. SUCCESS! [0:35:43.543536] - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
add consumeAllTokens option to OffsetLimitTokenFilter
Hi, LimitTokenCountFilter and LimitTokenPositionFilter have a consumeAllTokens option. Any interest in adding it to OffsetLimitTokenFilter too? It looks like there is a bug (SOLR-5426) in OffsetLimitTokenFilter that causes: end() called before incrementToken() returned false! Ahmet
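For context, a rough Python sketch of what the consumeAllTokens flag controls in the Limit* filters (an approximation of the Lucene TokenFilter contract, not the real implementation): with the flag off, the upstream stream is abandoned as soon as the limit trips, which is the kind of situation that leaves end() being called before the wrapped stream ever returned false.

```python
def limit_tokens(stream, limit, consume_all_tokens=False):
    """Emit at most `limit` tokens from `stream`. With
    consume_all_tokens=True the upstream iterator is still drained
    to the end (so end-of-stream bookkeeping such as final offsets
    stays correct); with False the remaining tokens are never
    pulled at all. Returns (emitted, number_of_upstream_pulls)."""
    emitted, consumed = [], 0
    for tok in stream:
        consumed += 1
        if len(emitted) < limit:
            emitted.append(tok)
        elif not consume_all_tokens:
            break  # abandon the upstream stream mid-way
    return emitted, consumed

print(limit_tokens(["a", "b", "c", "d"], 2))
print(limit_tokens(["a", "b", "c", "d"], 2, consume_all_tokens=True))
```

Either way the caller sees two tokens; only the second call leaves the upstream source fully consumed.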
Re: Querying all docs
Hi, Could SOLR-5463 be used here? Ahmet On Tuesday, June 3, 2014 2:52 PM, Per Steffensen st...@designware.dk wrote: Thanks for responding On 03/06/14 10:32, Mikhail Khludnev wrote: On Tue, Jun 3, 2014 at 11:12 AM, Per Steffensen st...@designware.dk wrote: It is not desirable to set rows-param to e.g. MAX_VALUE, because I believe Solr will allocate memory dependent on the value of rows-param. not really. it reasonably limits it by maxdocs() https://github.com/apache/lucene-solr/blob/trunk/lucene/core/src/java/org/apache/lucene/search/IndexSearcher.java#L475 Yes I see. But I am not sure when reader.maxDocs is not just the number of docs available - we have way more than Integer.MAX_VALUE documents. * SegmentReader.maxDocs: si.info.getDocCount() * BaseCompositeReader.maxDocs: for (int i = 0; i subReaders.length; i++) { maxDoc += subReaders[i].maxDoc(); } The query I want to get all docs from, might hit 1k, 10k, 100k, 1m, ... , but never even close to Integer.MAX_VALUE. And I really do not like setting rows to something big enough, because I sure the next day someone tries to extract big enough+1 documents :-). I am sure no one will ever try to extract Integer.MAX_VALUE so that would be ok for big enough, but that just seems to use an unreasonable amount of memory. Solr and Lucene does not really suits for such all docs, which usually don't need scores and ranking, but Lucene always intended to allocate results heap for ranking. G, yes Deep paging, might help, but it's not the most achievable performance. see https://issues.apache.org/jira/browse/SOLR-5244 for some discussion, and prototype Thanks! I will definitely vote for that one. The thing I am working on here is actually some kind of export. -- Sincerely yours Mikhail Khludnev Principal Engineer, Grid Dynamics - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
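For the export use case, SOLR-5463's cursorMark API avoids both a huge rows value and deep-paging cost. A hedged sketch of the client loop (the endpoint path and the `id` uniqueKey are assumptions for illustration; the cursorMark/nextCursorMark parameters and the uniqueKey-tiebreaker sort requirement are the documented API):

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen

def export_all(solr_url, query, fetch=None):
    """Stream every matching doc via cursor-based deep paging: the
    first request passes cursorMark=*, each response carries a
    nextCursorMark to feed into the next request, and the loop ends
    when the cursor stops advancing. `fetch` is injectable so the
    loop can be tested without a live Solr."""
    if fetch is None:
        fetch = lambda params: json.load(
            urlopen(solr_url + "/select?" + urlencode(params)))
    cursor = "*"
    while True:
        resp = fetch({"q": query, "sort": "id asc", "rows": 500,
                      "cursorMark": cursor, "wt": "json"})
        yield from resp["response"]["docs"]
        if resp["nextCursorMark"] == cursor:  # no progress: done
            return
        cursor = resp["nextCursorMark"]
```

Memory stays bounded by rows=500 regardless of how many documents the query matches.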
Re: 4.9
Hi Robert, OffsetLimitTokenFilter (used internally by the highlighter) has the following lines:

public boolean incrementToken() throws IOException {
-  if (offsetCount < offsetLimit && input.incrementToken()) {
+  if (input.incrementToken() && offsetCount < offsetLimit) {

Arun Kumar figured this out. Can you confirm this is truly a bug? His above solution fixes all three: SOLR-3193 SOLR-3901 SOLR-5426. Thanks, Ahmet On Friday, June 13, 2014 4:56 AM, Robert Muir rcm...@gmail.com wrote: We have a pretty big release already with lots of good performance improvements. I'd like to release 4.9 soon; I'll be RM. I'm thinking of spinning an RC in a week or so.
Re: [VOTE] 4.9.0
Hi, here is what I do * download solr-4.9.0.tgz * add icu4j-53.1.jar and solr-analysis-extras-4.9.0.jar and lucene-analyzers-icu-4.9.0.jar to solr-4.9.0/example/solr/collection1/lib/ * confirm they are loaded INFO org.apache.solr.core.SolrResourceLoader – Adding 'file:/Volumes/datadisk/Desktop/solr-4.9.0/example/solr/collection1/lib/icu4j-53.1.jar' to class loader INFO org.apache.solr.core.SolrResourceLoader – Adding 'file:/Volumes/datadisk/Desktop/solr-4.9.0/example/solr/collection1/lib/lucene-analyzers-icu-4.9.0.jar' to classloader INFO org.apache.solr.core.SolrResourceLoader – Adding 'file:/Volumes/datadisk/Desktop/solr-4.9.0/example/solr/collection1/lib/solr-analysis-extras-4.9.0.jar' to class loader icu4j-53.1.jar loaded twice INFO org.apache.solr.core.SolrResourceLoader – Adding 'file:/Volumes/datadisk/Desktop/solr-4.9.0/contrib/extraction/lib/icu4j-53.1.jar' to classloader * add filter class=solr.ICUFoldingFilterFactory/ to example schema.xml * java -jar start.jar yields the exception reported in SOLR-6188 When filter class=org.apache.lucene.analysis.icu.ICUFoldingFilterFactory/ is used everything works fine. Thanks, Ahmet On Friday, June 20, 2014 3:55 PM, Michael McCandless luc...@mikemccandless.com wrote: +1 SUCCESS! [0:47:26.115239] Mike McCandless http://blog.mikemccandless.com On Fri, Jun 20, 2014 at 8:13 AM, Robert Muir rcm...@gmail.com wrote: Artifacts here: http://people.apache.org/~rmuir/staging_area/lucene_solr_4_9_0_r1604085/ Here's my +1 SUCCESS! [0:35:36.654925] - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
Re: [VOTE] 4.9.0
Hi, +1 SUCCESS! [1:47:26.786519] python3 dev-tools/scripts/smokeTestRelease.py 'http://people.apache.org/~rmuir/staging_area/lucene_solr_4_9_0_r1604085/' 1604085 4.9.0 tmp Ahmet On Sunday, June 22, 2014 2:11 AM, Walter Underwood wun...@wunderwood.org wrote: Also, isn't JDK 7u51 a known bad release for Lucene? wunder On Jun 21, 2014, at 12:32 PM, Robert Muir rcm...@gmail.com wrote: Not *the* smoketester, instead some outdated arbitrary random smoketester from the past. please, use the latest one from the 4.9 branch. This file is supposed to be there and the smoketester actually looks for it. On Sat, Jun 21, 2014 at 3:16 PM, david.w.smi...@gmail.com david.w.smi...@gmail.com wrote: The smoke tester failed for me: lucene-solr_4x_svn$ python3.3 -u dev-tools/scripts/smokeTestRelease.py http://people.apache.org/~rmuir/staging_area/lucene_solr_4_9_0_r1604085/ 1604085 4.9.0 /Volumes/RamDisk/tmp JAVA7_HOME is /Library/Java/JavaVirtualMachines/jdk1.7.0_51.jdk/Contents/Home NOTE: output encoding is UTF-8 Load release URL http://people.apache.org/~rmuir/staging_area/lucene_solr_4_9_0_r1604085/;... Test Lucene... test basics... get KEYS 0.1 MB in 0.69 sec (0.2 MB/sec) check changes HTML... download lucene-4.9.0-src.tgz... 27.6 MB in 94.12 sec (0.3 MB/sec) verify md5/sha1 digests verify sig verify trust GPG: gpg: WARNING: This key is not certified with a trusted signature! download lucene-4.9.0.tgz... 61.7 MB in 226.09 sec (0.3 MB/sec) verify md5/sha1 digests verify sig verify trust GPG: gpg: WARNING: This key is not certified with a trusted signature! download lucene-4.9.0.zip... 71.3 MB in 217.32 sec (0.3 MB/sec) verify md5/sha1 digests verify sig verify trust GPG: gpg: WARNING: This key is not certified with a trusted signature! unpack lucene-4.9.0.tgz... verify JAR metadata/identity/no javax.* or java.* classes... test demo with 1.7... got 5727 hits for query lucene check Lucene's javadoc JAR unpack lucene-4.9.0.zip... 
verify JAR metadata/identity/no javax.* or java.* classes... test demo with 1.7... got 5727 hits for query lucene check Lucene's javadoc JAR unpack lucene-4.9.0-src.tgz... Traceback (most recent call last): File "dev-tools/scripts/smokeTestRelease.py", line 1347, in <module> File "dev-tools/scripts/smokeTestRelease.py", line 1291, in main File "dev-tools/scripts/smokeTestRelease.py", line 1329, in smokeTest File "dev-tools/scripts/smokeTestRelease.py", line 637, in unpackAndVerify File "dev-tools/scripts/smokeTestRelease.py", line 708, in verifyUnpacked RuntimeError: lucene: unexpected files/dirs in artifact lucene-4.9.0-src.tgz: ['ivy-ignore-conflicts.properties'] And indeed, that file is there. -- Walter Underwood wun...@wunderwood.org
Re: Solr and Maven
Hi Tom, you might find https://issues.apache.org/jira/browse/LUCENE-5755 relevant. Ahmet On Saturday, July 5, 2014 1:26 AM, Tom Chen tomchen1...@gmail.com wrote: Hi, The default tool to build Solr is ant ( plus ivy), while Maven support is provided. Regarding building with Maven, some questions: 1) Is there any difference between the build created by ant and that created by Maven? 2) Any plan for Solr to use Maven as the default building tool? Regards, Tom - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
Re: Solr EventListerner where to add the implementing classes
Hi Meena, OK, try this: delete/nuke all lib directives in solrconfig.xml, then create a directory named lib under the Solr home (where solr.xml lives). This is supposed to be the most robust way of loading plugins. Also, this question is better suited to the user mailing list. Ahmet On Tuesday, January 13, 2015 1:20 AM, meena.sri...@mathworks.com meena.sri...@mathworks.com wrote: Thanks for your reply. I tried adding the plugin and referenced it in the solrconfig.xml file with no luck. -- View this message in context: http://lucene.472066.n3.nabble.com/Solr-EventListerner-where-to-add-the-implementing-classes-tp4178172p4179076.html Sent from the Lucene - Java Developer mailing list archive at Nabble.com.
Re: Solr EventListerner where to add the implementing classes
Hi Meena, They are just like other plugins, please see how to load plugins section : https://wiki.apache.org/solr/SolrPlugins Ahmet On Thursday, January 8, 2015 9:16 PM, meena.sri...@mathworks.com meena.sri...@mathworks.com wrote: I am planning to implement the solr(4.9) EventListener interface to listen to the indexing event using DIH. document onImportStart =com.mathworks.brdb.indexer.StartIndexingEventListener onImportEnd=com.mathworks.brdb.indexer.EndIndexingEventListener I am not sure where to add these classes StartIndexingEventListener and EndIndexingEventListener so that solr could find them and do the necessary. Tried searching, but could not find a solution. Thanks Meena -- View this message in context: http://lucene.472066.n3.nabble.com/Solr-EventListerner-where-to-add-the-implementing-classes-tp4178172.html Sent from the Lucene - Java Developer mailing list archive at Nabble.com. - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
Re: numberOfDocuments in SimilarityBase
Hi Robert, Thanks for chiming in, I created LUCENE-6711 for this. Ahmet On Thursday, July 30, 2015 4:47 PM, Robert Muir rcm...@gmail.com wrote: I think so. When adding this statistic (Lucene 4.0), personally I really wanted to fix it everywhere. But we had the problem of backwards compatibility, and it's bad to use different formulas for different segments even if it works... Nowadays we don't have Lucene 3 segments around anymore, so I think we should fix this. Want to open an issue? On Wed, Jul 29, 2015 at 10:45 AM, Ahmet Arslan iori...@yahoo.com.invalid wrote: Hello List, SimilarityBase uses CollectionStatistics#maxDoc() for numberOfDocuments. Shouldn't it be the field-based CollectionStatistics#docCount()?

--- core/src/java/org/apache/lucene/search/similarities/SimilarityBase.java (revision 1693268)
+++ core/src/java/org/apache/lucene/search/similarities/SimilarityBase.java (working copy)
@@ -102,7 +102,7 @@
   protected void fillBasicStats(BasicStats stats, CollectionStatistics collectionStats, TermStatistics termStats) {
     // #positions(field) must be >= #positions(term)
     assert collectionStats.sumTotalTermFreq() == -1 || collectionStats.sumTotalTermFreq() >= termStats.totalTermFreq();
-    long numberOfDocuments = collectionStats.maxDoc();
+    long numberOfDocuments = collectionStats.docCount();

Thanks, Ahmet
numberOfDocuments in SimilarityBase
Hello List, SimilarityBase uses CollectionStatistics#maxDoc() for numberOfDocuments. Shouldn't it be the field-based CollectionStatistics#docCount()?

--- core/src/java/org/apache/lucene/search/similarities/SimilarityBase.java (revision 1693268)
+++ core/src/java/org/apache/lucene/search/similarities/SimilarityBase.java (working copy)
@@ -102,7 +102,7 @@
   protected void fillBasicStats(BasicStats stats, CollectionStatistics collectionStats, TermStatistics termStats) {
     // #positions(field) must be >= #positions(term)
     assert collectionStats.sumTotalTermFreq() == -1 || collectionStats.sumTotalTermFreq() >= termStats.totalTermFreq();
-    long numberOfDocuments = collectionStats.maxDoc();
+    long numberOfDocuments = collectionStats.docCount();

Thanks, Ahmet
Re: Writing custom Tokenizer
Hi Sid, Can you provide us more details? Usually you can get away without a custom tokenizer; there may be other tricks to achieve your requirements. Ahmet On Sunday, September 27, 2015 11:29 PM, Siddhartha Singh Sandhu wrote: Hi Everyone, I wanted to write a custom tokenizer and wanted a generic direction and some guidance on how I should go about achieving this goal. Your input will be much appreciated. Regards, Sid.
Re: Writing custom Tokenizer
Hi Sid, One way is to use WhitespaceTokenizer and WordDelimiterFilter. In some cases you might want to adjust how WordDelimiterFilter splits on a per-character basis. To do this, you can supply a configuration file with the "types" attribute that specifies custom character categories. An example file is in subversion here. This is especially useful to add "hashtag or currency" searches. Please see: https://wiki.apache.org/solr/AnalyzersTokenizersTokenFilters#solr.WordDelimiterFilterFactory https://issues.apache.org/jira/browse/SOLR-2059 @ => ALPHA # => ALPHA P.S. Maintaining a custom tokenizer will be a burden. It is done with *.jflex files blended with Java files. Please see ClassicTokenizerImpl.jflex in the source tree for an example. Ahmet On Monday, September 28, 2015 1:58 AM, Siddhartha Singh Sandhu <sandhus...@gmail.com> wrote: Hi Ahmet, I want primarily 3 things. 1. To include # and @ as part of the string which is tokenized by the standard tokenizer, which generally strips them off. 2. When a string is tokenized, I just want to keep tokens which are #tags and @mentions. 3. I understand there is PatternTokenizer, but I wanted to leverage the twitter-text GitHub project because I trust their regex more than my own. Not only the above three, but I also need to control the special characters that are stripped from my string while tokenizing. Please let me know of your views. Regards, Sid. On Sun, Sep 27, 2015 at 5:21 PM, Ahmet Arslan <iori...@yahoo.com.invalid> wrote: Hi Sid, > >Can you provide us more details? > >Usually you can get away without a custom tokenizer, there may be other tricks >to achieve your requirements. > >Ahmet > > > > >On Sunday, September 27, 2015 11:29 PM, Siddhartha Singh Sandhu ><sandhus...@gmail.com> wrote: > > > >Hi Everyone, > >I wanted to write a custom tokenizer and wanted a generic direction and some >guidance on how I should go about achieving this goal. > >Your input will be much appreciated. > >Regards, > >Sid.
> >- >To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org >For additional commands, e-mail: dev-h...@lucene.apache.org > > - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
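For requirement 2 above (keep only #tags and @mentions), a pattern-based prototype may be enough before committing to a custom JFlex tokenizer. A hedged Python sketch — the regex is a deliberate oversimplification of twitter-text's real rules (which handle Unicode ranges and edge punctuation far more strictly):

```python
import re

# Simplified pattern: an @ or # followed by word characters.
TAG_OR_MENTION = re.compile(r"[@#]\w+")

def tags_and_mentions(text):
    """Return only the #tags and @mentions found in `text`,
    discarding every other token."""
    return TAG_OR_MENTION.findall(text)

print(tags_and_mentions("RT @bob: #lucene rocks, see #solr!"))
# ['@bob', '#lucene', '#solr']
```

In Solr terms this corresponds roughly to a PatternTokenizer whose pattern matches rather than splits; the twitter-text regexes could be dropped in place of the simplified one.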
Re: discountOverlaps option for QueryParser
Hi Robert, As I understand, with SynonymQuery, all expansion is recommended to be performed on query time only, and SynonymQuery will take care of the below problem : "A query for text:TV will expand into (text:TV text:Television) and the lower docFreq for text:Television will give the documents that match "Television" a much higher score then docs that match "TV" comparably -- which may be somewhat counter intuitive to the client. Index time expansion (or reduction) will result in the same idf for all documents regardless of which term the original text contained." At the end of the query analysis, if there are tokens at the same position, I need to create my SynonymQuery programmatically, right? Let me explain my concern with another example: With above analyzer, the query "foo bör" will boost the term "bör" for no reason. Just because bör will be expanded into two terms : bor and bör. Its contribution to total score is counted two times. I think this is very trappy. With SynonymQuery solution, I will index with StandardTokenizer only. No expansion at index time. I will construct the query : new TermQuery('foo') + new SynonymQuery('bor', 'bör'); Thanks, Ahmet On Monday, September 21, 2015 12:33 AM, Robert Muir <rcm...@gmail.com> wrote: Hi Ahmet, maybe have a look at the SynonymQuery added in https://issues.apache.org/jira/browse/LUCENE-6789 For query-time synonyms, it just tries to approximate what happens if you instead do this work at index-time, by creating a "pseudo-term" (disjunction of all terms at that same position) summing up the term frequency across all matching terms before passing to score(). For the statistics side it takes the maximum DF as the representative DF, and the sum of the TTF as the representative TTF. I did relevance experiments with this and the results were positive over the existing query generated (BooleanQuery with coord disabled), especially for scoring systems that don't do anything with coord. 
On Sun, Sep 20, 2015 at 1:56 PM, Ahmet Arslan <iori...@yahoo.com.invalid> wrote: > Hello, > > Assume that term t1 is expanded into multiple terms (at the same position) > during both indexing and query time. > This is possible with KeywordRepeat, SynonymFilter, or the Filters that have > preserveOriginal option for instance. > > When a two-term query (t1 t2) is executed, term t1 is boosted artificially. > Score contribution of the term t1 is counted multiple times. > It is like the query were issued with boosts : t1^3 t2 > This behaviour boosts expanded terms and may not be always desired. > E.g. (When t2 is a content-bearing word) > > I think there should be a flag/switch which is analogous to relationship > between discountOverlaps & document's length. > With this control, overlapping query terms' (tokens with a position of > increment of zero) scores are counted once. > Remaining terms (not overlapping ones) are not affected. > > Bruno asked for this functionality in the past : > http://find.searchhub.org/document/bb99e435ba35f2b1 > > What do you think about this? How difficult to implement this? > Would this be a Lucene or Solr issue? > > Thanks, > Ahmet > > - > To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org > For additional commands, e-mail: dev-h...@lucene.apache.org > - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
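Robert's description of SynonymQuery's statistics blending (LUCENE-6789) can be shown as a tiny calculation — a sketch of the idea, not Lucene's code:

```python
def synonym_stats(term_stats):
    """term_stats maps term -> (doc_freq, total_term_freq).
    SynonymQuery-style blending for the pseudo-term: docFreq is the
    maximum over the synonym set, totalTermFreq is the sum."""
    dfs  = [df  for df, _  in term_stats.values()]
    ttfs = [ttf for _, ttf in term_stats.values()]
    return max(dfs), sum(ttfs)

# "bor" is frequent, "bör" rare: blended stats stop the rare variant
# from getting an artificially high IDF at query time.
print(synonym_stats({"bor": (5000, 12000), "bör": (40, 90)}))
# (5000, 12090)
```

Per-document term frequencies are likewise summed across the matching terms before scoring, so a document matching either spelling is scored as if one pseudo-term occurred.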
Re: discountOverlaps option for QueryParser
Hi Doug, Boosting exact matches is not my primary concern. By the way, the ideal way to aggregate scores coming from different fields remains unclear. Maybe the geometric mean is better than summing the field scores? I just want to warn people: if filters that produce multiple tokens at the same position are used carelessly, they can cause some non-obvious boosting in a query. Thanks, Ahmet On Monday, September 21, 2015 2:38 AM, Doug Turnbull <dturnb...@opensourceconnections.com> wrote: Another option Ahmet would be to create two fields, one that didn't do ASCII folding *without* preserving the original and another that did. The ASCII folded version is a less exacting representation of the text, and the version without ASCII folding would be more exacting. My first pass at a solution to your problem would be summing the two fields' scores. Scoring the ASCII folded field provides a higher recall signal. I'll call this the "base score." Scoring the non-ASCII folded provides a more precise ranking signal. It kicks in only when the searcher types the exact non ASCII folded term in. In a sense it acts like how most people think of a boost: bonus points for harder to meet but valuable criteria. In other words, if you match on just bor, you just get the base score. If you match on bör you'd gain the benefit of the base and the additional boost scores. The more exacting, non ASCII folded version of the field acts as a boost. On the other hand, if you don't care to differentiate between a match on an ASCII folded or non-folded version, then simply create the base ASCII folded field and score against that. Shameless plug, this is exactly the sort of thing we talk quite a bit about in John Berryman's and my book, Relevant Search (http://manning.com/turnbull). You might find it useful.
Cheers -Doug On Sunday, September 20, 2015, Ahmet Arslan <iori...@yahoo.com.invalid> wrote: Hi Robert, > >As I understand, with SynonymQuery, all expansion is recommended to be >performed on query time only, >and SynonymQuery will take care of the below problem : > >"A query for text:TV will expand into (text:TV text:Television) and the lower >docFreq for text:Television will give the documents that match "Television" a >much higher score then docs that match "TV" comparably -- which may be >somewhat counter intuitive to the client. Index time expansion (or reduction) >will result in the same idf for all documents regardless of which term the >original text contained." > > >At the end of the query analysis, if there are tokens at the same position, I >need to create my SynonymQuery programmatically, right? > > >Let me explain my concern with another example: > > > > > > > >With above analyzer, the query "foo bör" will boost the term "bör" for no >reason. >Just because bör will be expanded into two terms : bor and bör. >Its contribution to total score is counted two times. I think this is very >trappy. > >With SynonymQuery solution, I will index with StandardTokenizer only. >No expansion at index time. >I will construct the query : new TermQuery('foo') + new SynonymQuery('bor', >'bör'); > >Thanks, >Ahmet > > > > >On Monday, September 21, 2015 12:33 AM, Robert Muir <rcm...@gmail.com> wrote: >Hi Ahmet, maybe have a look at the SynonymQuery added in >https://issues.apache.org/jira/browse/LUCENE-6789 > >For query-time synonyms, it just tries to approximate what happens if >you instead do this work at index-time, by creating a "pseudo-term" >(disjunction of all terms at that same position) summing up the term >frequency across all matching terms before passing to score(). For the >statistics side it takes the maximum DF as the representative DF, and >the sum of the TTF as the representative TTF. 
> >I did relevance experiments with this and the results were positive >over the existing query generated (BooleanQuery with coord disabled), >especially for scoring systems that don't do anything with coord. > > >On Sun, Sep 20, 2015 at 1:56 PM, Ahmet Arslan <iori...@yahoo.com.invalid> >wrote: >> Hello, >> >> Assume that term t1 is expanded into multiple terms (at the same position) >> during both indexing and query time. >> This is possible with KeywordRepeat, SynonymFilter, or the Filters that have >> preserveOriginal option for instance. >> >> When a two-term query (t1 t2) is executed, term t1 is boosted artificially. >> Score contribution of the term t1 is counted multiple times. >> It is like the query were issued with boosts : t1^3 t2 >> This behaviour boosts expanded terms and may not be always desired. >> E.g. (When t2 is a content-bearing word) >> >>
checkJavadocLinks.py fails with Python 3.5.0
Hi, In an effort to run "ant precommit" I have installed Python 3.5.0. However, it fails with the following: [exec] File "/Volumes/data/workspace/solr-trunk/dev-tools/scripts/checkJavadocLinks.py", line 20, in <module> [exec] from html.parser import HTMLParser, HTMLParseError [exec] ImportError: cannot import name 'HTMLParseError' Python 3.5.0 (v3.5.0:374f501f4567, Sep 12 2015, 11:00:19) [GCC 4.2.1 (Apple Inc. build 5666) (dot 3)] on darwin I tried to solve this myself and found something like: "HTMLParseError has been removed from Python 3.5". Any suggestions, given that I am Python-ignorant? Thanks, Ahmet - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
discountOverlaps option for QueryParser
Hello, Assume that term t1 is expanded into multiple terms (at the same position) during both indexing and query time. This is possible with KeywordRepeat, SynonymFilter, or the Filters that have the preserveOriginal option, for instance. When a two-term query (t1 t2) is executed, term t1 is boosted artificially. The score contribution of term t1 is counted multiple times. It is as if the query were issued with boosts: t1^3 t2. This behaviour boosts expanded terms and may not always be desired, e.g. when t2 is a content-bearing word. I think there should be a flag/switch analogous to the relationship between discountOverlaps and document length. With this control, overlapping query terms' (tokens with a position increment of zero) scores are counted once. The remaining (non-overlapping) terms are not affected. Bruno asked for this functionality in the past: http://find.searchhub.org/document/bb99e435ba35f2b1 What do you think about this? How difficult would it be to implement? Would this be a Lucene or a Solr issue? Thanks, Ahmet
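The over-counting described above can be sketched with a toy scoring function. This is plain Python, not Lucene code, and the flat per-term weight of 1.0 is an assumption standing in for a term's real tf-idf contribution:

```python
# Toy illustration of the over-counting described above -- not Lucene code.
# Assume every query term that matches a document contributes a flat
# per-term weight of 1.0 (a stand-in for its real tf-idf contribution).
def toy_score(query_terms, doc_terms):
    return sum(1.0 for t in query_terms if t in doc_terms)

# With index- and query-time expansion, t1 ("bör") becomes two tokens at
# the same position, so a two-term query (t1 t2) effectively carries three.
expanded_query = ["bor", "bör", "foo"]   # t1 expanded into two forms, plus t2
doc_terms = {"bor", "bör", "foo"}        # the document got the same expansion

score = toy_score(expanded_query, doc_terms)
print(score)  # 3.0 -- t1 contributed twice, as if the query were t1^2 t2
```

A flag like the one proposed would collapse the contributions of the two overlapping tokens ("bor", "bör") into one, giving 2.0 here instead of 3.0.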
Re: checkJavadocLinks.py fails with Python 3.5.0
Thanks Mike, it's working now. Ahmet On Wednesday, September 23, 2015 10:10 PM, Michael McCandless <luc...@mikemccandless.com> wrote: Looks like you can't be strict when parsing HTML anymore in Python 3.5: http://bugs.python.org/issue15114 I'll fix checkJavadocLinks... Mike McCandless http://blog.mikemccandless.com On Wed, Sep 23, 2015 at 2:58 PM, Alan Woodward <a...@flax.co.uk> wrote: > I hit this a couple of weeks back, when homebrew automatically upgraded me > to python 3.5. I have a separate python 3.2 installation, and added this > line to ~/build.properties: > > python32.exe=/path/to/python3.2 > > Alan Woodward > www.flax.co.uk > > > On 23 Sep 2015, at 18:06, Ahmet Arslan wrote: > > Hi, > > In an effort to run "ant precommit" I have installed Python 3.5.0. > However, it fails with the following: > > [exec] File > "/Volumes/data/workspace/solr-trunk/dev-tools/scripts/checkJavadocLinks.py", > line 20, in <module> > [exec] from html.parser import HTMLParser, HTMLParseError > [exec] ImportError: cannot import name 'HTMLParseError' > > Python 3.5.0 (v3.5.0:374f501f4567, Sep 12 2015, 11:00:19) > [GCC 4.2.1 (Apple Inc. build 5666) (dot 3)] on darwin > > I tried to solve this myself and found something like: > "HTMLParseError has been removed from Python 3.5" > > Any suggestions, given that I am Python-ignorant? > > Thanks, > Ahmet
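For scripts that must run on both older and newer Pythons, a common compatibility shim looks like the following. This is an assumption for illustration, not necessarily the change Mike committed to checkJavadocLinks.py:

```python
# Compatibility shim: html.parser.HTMLParseError (and strict mode) were
# removed in Python 3.5 -- see http://bugs.python.org/issue15114.
from html.parser import HTMLParser

try:
    from html.parser import HTMLParseError
except ImportError:
    # Define a stand-in so existing "except HTMLParseError" blocks still
    # compile; HTMLParser no longer raises it on Python 3.5+.
    class HTMLParseError(Exception):
        pass

parser = HTMLParser()
parser.feed("<p>hello")  # non-strict parsing tolerates malformed HTML
```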
Re: [VOTE] Release Lucene/Solr 5.4.0-RC1
Hi, python3 -u dev-tools/scripts/smokeTestRelease.py https://dist.apache.org/repos/dist/dev/lucene/lucene-solr-5.4.0-RC1-rev1718046 gives me following exception: RuntimeError: JAR file "/private/tmp/smoke_lucene_5.4.0_1718046_1/unpack/lucene-5.4.0/analysis/common/lucene-analyzers-common-5.4.0.jar" is missing "X-Compile-Source-JDK: 1.8" inside its META-INF/MANIFEST.MF I am doing something wrong? Thanks, Ahmet On Tuesday, December 8, 2015 3:15 AM, "david.w.smi...@gmail.com"wrote: +1 for release. (tested with Java 7) SUCCESS! [0:56:31.943245] On Mon, Dec 7, 2015 at 8:05 PM Steve Rowe wrote: +1 > >Docs, javadocs, and changes look good. > >Smoke tester was happy with Java7 and Java8: > >SUCCESS! [1:53:58.550314] > >Steve > >> On Dec 7, 2015, at 5:31 AM, Upayavira wrote: >> >> Yes, Shalin, you are right. My fix was still required, but I clearly >> manually entered the SVN commit command wrong. Seeing as it does not >> impact upon the contents of the files, I have executed an SVN mv >> command, rerun the smoke test with the below, which worked: >> >> python3 -u dev-tools/scripts/smokeTestRelease.py >> https://dist.apache.org/repos/dist/dev/lucene/lucene-solr-5.4.0-RC1-rev1718046 >> >> Please, folks, use the above to run the smoke test for this release. >> >> Upayavira >> >> On Mon, Dec 7, 2015, at 04:00 AM, Shalin Shekhar Mangar wrote: >>> Hi Upayavira, >>> >>> The svn revision in the URL is wrong. It should be 1718046 but it is >>> 178046 which makes the smoke tester fail with the following message: >>> >>> RuntimeError: JAR file >>> "/tmp/smoke_lucene_5.4.0_178046_1/unpack/lucene-5.4.0/analysis/common/lucene-analyzers-common-5.4.0.jar" >>> is missing "Implementation-Version: 5.4.0 178046 " inside its >>> META-INF/MANIFEST.MF (wrong svn revision?) >>> >>> I think you may need to generate a new RC. But perhaps an svn move to >>> a path with the right revision number may also suffice? 
>>> >>> On Mon, Dec 7, 2015 at 9:12 AM, Shalin Shekhar Mangar >>> wrote: Thanks Upayavira. I guess Apache has started redirecting http traffic to https recently on dist.apache.org which must have broken this script. I am able to run smoke tester after applying your patch. On Mon, Dec 7, 2015 at 2:08 AM, Upayavira wrote: > The getHREFs() method is taking in an HTTPS URL, but failing to preserve > the protocol, resulting in an HTTP call that the server naturally > bounces to HTTPS. Unfortunately, the next loop round also forgets the > HTTPS, and hence we're stuck in an endless loop. Below is a patch that > fixes this issue. I'd rather someone with more knowledge of this script > confirm my suspicion and apply the patch for us all to use, as I cannot > see how this ever worked. > > I personally ran the smoke test on my local copy, so did not hit this > HTTP/HTTPS code. I'm running the HTTP version now, and will check on it > in the morning. > > Index: dev-tools/scripts/smokeTestRelease.py > === > --- dev-tools/scripts/smokeTestRelease.py (revision 1718046) > +++ dev-tools/scripts/smokeTestRelease.py (working copy) > @@ -84,7 +84,12 @@ > # Deref any redirects > while True: > url = urllib.parse.urlparse(urlString) > -h = http.client.HTTPConnection(url.netloc) > +if url.scheme == "http": > + h = http.client.HTTPConnection(url.netloc) > +elif url.scheme == "https": > + h = http.client.HTTPSConnection(url.netloc) > +else: > + raise RuntimeError("Unknown protocol: %s" % url.scheme) > h.request('GET', url.path) > r = h.getresponse() > newLoc = r.getheader('location') > > Upayavira > > On Sun, Dec 6, 2015, at 06:26 PM, Noble Paul wrote: >> Same here. >> >> On Sun, Dec 6, 2015 at 2:36 PM, Shalin Shekhar Mangar >> wrote: >>> Is anyone able to run the smoke tester on this RC? It just hangs for a >>> long time on "loading release URL" for me. 
>>> >>> python3 -u dev-tools/scripts/smokeTestRelease.py --tmp-dir >>> ../smoke-5.4 --revision 178046 --version 5.4.0 --test-java8 >>> ~/programs/jdk8 >>> https://dist.apache.org/repos/dist/dev/lucene/lucene-solr-5.4.0-RC1-rev178046/ >>> Java 1.7 JAVA_HOME=/home/shalin/programs/jdk7 >>> Java 1.8 JAVA_HOME=/home/shalin/programs/jdk8 >>> NOTE: output encoding is UTF-8 >>> >>> Load release URL >>> "https://dist.apache.org/repos/dist/dev/lucene/lucene-solr-5.4.0-RC1-rev178046/;... >>> >>> I did a strace and found that the server is returning a HTTP 301 moved >>> permanently response to the http request. >>> >>> On Sat, Dec 5, 2015 at 4:28 PM, Upayavira wrote: Please
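Upayavira's patch can be read more easily extracted into a standalone helper (the function name is illustrative; the scheme dispatch is exactly what the diff above adds):

```python
import http.client
import urllib.parse

def open_connection(url_string):
    # Pick the connection class from the URL scheme so that an https://
    # release URL is not silently requested over plain HTTP -- which the
    # server answers with a 301 redirect, causing the endless loop above.
    url = urllib.parse.urlparse(url_string)
    if url.scheme == "http":
        return http.client.HTTPConnection(url.netloc)
    elif url.scheme == "https":
        return http.client.HTTPSConnection(url.netloc)
    else:
        raise RuntimeError("Unknown protocol: %s" % url.scheme)
```

Constructing the connection object does not open a socket yet, so the helper can be tested without network access.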
Re: [VOTE] Release Lucene/Solr 5.3.2-RC2
+1 SUCCESS! [1:38:55.940645] On Tuesday, January 19, 2016 10:25 PM, Yonik Seeley wrote: +1 -Yonik On Mon, Jan 18, 2016 at 11:23 AM, Anshum Gupta wrote: > Please vote for the RC2 release candidate for Lucene/Solr 5.3.2 > > The artifacts can be downloaded from: > https://dist.apache.org/repos/dist/dev/lucene/lucene-solr-5.3.2-RC2-rev1725196 > > You can run the smoke tester directly with this command: > python3 -u dev-tools/scripts/smokeTestRelease.py > https://dist.apache.org/repos/dist/dev/lucene/lucene-solr-5.3.2-RC2-rev1725196 > > Here's my +1 > > SUCCESS! [0:26:22.094521] > > -- > Anshum Gupta
Re: [VOTE] Release Lucene/Solr 5.4.1 RC2
+1 SUCCESS! [1:50:21.498224] On Wednesday, January 20, 2016 1:28 AM, Tomás Fernández Löbbe wrote: +1 SUCCESS! [1:27:55.987215] On Tue, Jan 19, 2016 at 12:25 PM, Yonik Seeley wrote: +1 > >-Yonik > >On Mon, Jan 18, 2016 at 9:38 AM, Adrien Grand wrote: >> Please vote for the RC2 release candidate for Lucene/Solr 5.4.1 >> >> This release candidate contains 3 additional changes compared to the RC1: >> - SOLR-8496: multi-select faceting and getDocSet(List) can match >> deleted docs >> - SOLR-8418: Adapt to changes in LUCENE-6590 for use of boosts with >> MLTHandler and Simple/CloudMLTQParser >> - SOLR-8561: Add fallback to ZkController.getLeaderProps for mixed >> 5.4-pre-5.4 deployments >> >> The artifacts can be downloaded from: >> https://dist.apache.org/repos/dist/dev/lucene/lucene-solr-5.4.1-RC2-rev1725212 >> >> You can run the smoke tester directly with this command: >> python3 -u dev-tools/scripts/smokeTestRelease.py >> https://dist.apache.org/repos/dist/dev/lucene/lucene-solr-5.4.1-RC2-rev1725212 >> >> The smoke tester already passed for me both with the local and remote >> artifacts, so here is my +1.
Re: [VOTE] Release Lucene/Solr 6.0.0 RC2
+1 SUCCESS! [1:42:49.802039] Ahmet On Tuesday, April 5, 2016 1:09 AM, Anshum Gupta wrote: Thanks for taking this up Nick! Here's my +1: SUCCESS! [0:38:14.023246] On Fri, Apr 1, 2016 at 1:44 PM, Nicholas Knize wrote: Please vote for the RC2 release candidate for Lucene/Solr 6.0.0. > >Artifacts: > > https://dist.apache.org/repos/dist/dev/lucene/lucene-solr-6.0.0-RC2-rev48c80f91b8e5cd9b3a9b48e6184bd53e7619e7e3 > >Smoke tester: > > python3 -u dev-tools/scripts/smokeTestRelease.py > https://dist.apache.org/repos/dist/dev/lucene/lucene-solr-6.0.0-RC2-rev48c80f91b8e5cd9b3a9b48e6184bd53e7619e7e3 > >Here's my +1: > >SUCCESS! [0:28:59.770357] > >- Nick Knize -- Anshum Gupta
Re: Welcome Karl Wright as a Lucene/Solr committer!
Welcome Karl! On Monday, April 4, 2016 6:54 PM, Robert Muirwrote: Welcome Karl! On Mon, Apr 4, 2016 at 10:40 AM, Karl Wright wrote: > Hi all, > > Professionally, I've been active in software development since the 1970's. > My interests include many things related to software development, as well as > areas as varied as geology, carpentry, and gardening. I'm the PMC chair for > the ManifoldCF project, as well as a committer on other Apache projects such > as Http Components. > > My current employer is HERE, Inc, who is a spin-off from Nokia, who sells > map data, services, and search capabilities. > > I'm also the contributor and principal author of the Geo3D package, which is > now part of Lucene under the spatial3d module. I intend to continue to > contribute to this package for the foreseeable future. > > Thanks!! > Karl > > > On Mon, Apr 4, 2016 at 10:28 AM, Michael McCandless > wrote: >> >> I'm pleased to announce that Karl Wright has accepted the Lucene PMC's >> invitation to become a committer. >> >> Karl, it's tradition that you introduce yourself with a brief bio. >> >> Karma has been granted to your pre-existing account, so that you can >> add yourself to the committers section of the Who We Are page on the >> website: http://lucene.apache.org/whoweare.html >> >> Congratulations and welcome! >> >> Mike McCandless >> >> http://blog.mikemccandless.com > > - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
Re: Using rows=-1 for "give me all"
Hi Steffensen, Not sure about rows=-1, but retrieval engines are optimized to return top-N results. However, there are special commands for "give me all": https://cwiki.apache.org/confluence/display/solr/Exporting+Result+Sets Ahmet On Monday, May 23, 2016 11:38 PM, Per Steffensen wrote: Hi Back when we used 4.4.0 I believe a query with rows=-1 returned all matching documents. In 5.1.0 (the one we are using now) rows=-1 will trigger a validation exception. If I remove the code that throws that exception, it seems like rows=-1 behaves like rows=0. Has the support for rows=-1 (give me all) been reintroduced in a release after 5.1.0? If yes, which JIRA ticket? If no, any plans to reintroduce it? Any good reason for changing the rows=-1 behavior? Am I the only one that liked it? :-) Regards, Per Steffensen
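As a minimal sketch of the "give me all" alternative, the request below targets Solr's /export handler. The host, collection, and field names are hypothetical, and /export requires the sorted and returned fields to have docValues:

```python
import urllib.parse

# Build an /export request URL; unlike a rows=N search, /export streams
# every matching document rather than returning a top-N page.
base = "http://localhost:8983/solr/collection1/export"  # hypothetical host/collection
params = urllib.parse.urlencode({
    "q": "*:*",
    "sort": "id asc",   # /export requires an explicit sort on a docValues field
    "fl": "id",         # only docValues fields can be exported
})
url = base + "?" + params
print(url)
```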
Re: [VOTE] Release Lucene/Solr 6.0.1 RC2
+1 SUCCESS! [1:00:26.085469] On Wednesday, May 25, 2016 11:27 AM, Tommaso Teofiliwrote: got the same warning on the GPG key signature but could not reproduce David's issue, not sure what it could be though. I'd say if no one else can reproduce it let's go ahead with the release. +1 on my side. SUCCESS! [1:19:14.997834] Regards, Tommaso Il giorno mer 25 mag 2016 alle ore 06:48 David Smiley ha scritto: I tried to run the smoke tester directly on my machine and it failed right after unpacking. Given other's success, it must be user error. What might the problem be? > > > unpack lucene-6.0.1.tgz... >verify JAR metadata/identity/no javax.* or java.* classes... >Traceback (most recent call last): > File "dev-tools/scripts/smokeTestRelease.py", line 1412, in >main() > File "dev-tools/scripts/smokeTestRelease.py", line 1356, in main >smokeTest(c.java, c.url, c.revision, c.version, c.tmp_dir, c.is_signed, ' > '.join(c.test_args)) > File "dev-tools/scripts/smokeTestRelease.py", line 1393, in smokeTest >unpackAndVerify(java, 'lucene', tmpDir, artifact, gitRevision, version, > testArgs, baseURL) > File "dev-tools/scripts/smokeTestRelease.py", line 590, in unpackAndVerify >verifyUnpacked(java, project, artifact, unpackPath, gitRevision, version, > testArgs, tmpDir, baseURL) > File "dev-tools/scripts/smokeTestRelease.py", line 712, in verifyUnpacked >checkAllJARs(os.getcwd(), project, gitRevision, version, tmpDir, baseURL) > File "dev-tools/scripts/smokeTestRelease.py", line 270, in checkAllJARs >checkJARMetaData('JAR file "%s"' % fullPath, fullPath, gitRevision, > version) > File "dev-tools/scripts/smokeTestRelease.py", line 202, in checkJARMetaData >(desc, verify)) >RuntimeError: JAR file >"/private/tmp/smoke_lucene_6.0.1_c7510a0fdd93329ec04c853c8557f4a3f2309eaf/unpack/lucene-6.0.1/analysis/common/lucene-analyzers-common-6.0.1.jar" > is missing "X-Compile-Source-JDK: 8" inside its META-INF/MANIFEST.MF > > >Separately from the smoketest, I've downloaded this RC to use it on a 
new >project and haven't found issues yet. > >On Tue, May 24, 2016 at 1:19 PM Anshum Gupta wrote: > >Thanks for doing the release, Steve. All looks good to me but I think you >should get someone to sign you GPG key :) >> >> >> >>I see this warning while running the tests: GPG: gpg: WARNING: This key is >>not certified with a trusted signature! >> >> >>Here's my +1! >> >> >>SUCCESS! [1:05:50.755245] >> >> >> >> >> >> >> >>On Tue, May 24, 2016 at 5:24 AM, Michael McCandless >> wrote: >> >>+1 >>> >>> >>>SUCCESS! [0:31:57.451386] >>> >>> >>> >>>Mike McCandless >>> >>>http://blog.mikemccandless.com >>> >>> >>>On Tue, May 24, 2016 at 12:13 AM, Steve Rowe wrote: >>> >>>Please vote for release candidate 2 for Lucene/Solr 6.0.1. (I found a >>>couple problems in CHANGES after I committed RC1 to Subversion, so I didn’t >>>call the vote, and cut RC2 instead.) The artifacts can be downloaded from: https://dist.apache.org/repos/dist/dev/lucene/lucene-solr-6.0.1-RC2-revc7510a0fdd93329ec04c853c8557f4a3f2309eaf You can run the smoke tester directly with this command: python3 -u dev-tools/scripts/smokeTestRelease.py \ https://dist.apache.org/repos/dist/dev/lucene/lucene-solr-6.0.1-RC2-revc7510a0fdd93329ec04c853c8557f4a3f2309eaf Here’s my +1. Docs, changes and javadocs look good. SUCCESS! [0:26:34.596490] -- Steve www.lucidworks.com - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org >>> >> >> >> >> >>-- >> >>Anshum Gupta >-- > >Lucene/Solr Search Committer, Consultant, Developer, Author, Speaker >LinkedIn: http://linkedin.com/in/davidwsmiley | Book: >http://www.solrenterprisesearchserver.com - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
TestUAX29URLEmailTokenizer inconsistent adding dots and apostrophes to URLs and Emails
Hi, I extracted emails and URLs from certain TREC collections using TestUAX29URLEmailTokenizer combined with TypeTokenFilter. High-frequency terms reveal that * some e-mail addresses start with apostrophes * some e-mails or URLs end with a period. I ran a few tests, and this behaviour occurs only if the entity is the first or last term in the text. If the entity is in the middle of the text, UAXURLET strips the apostrophes and dots. For example, "Contact me at java-u...@lucene.apache.org. or dev@lucene.apache.org." will produce java-u...@lucene.apache.org. dev@lucene.apache.org Notice the first email has a trailing dot, while the second has not. Why does UAXURLET behave differently for the first/last token? Could this be a bug? It looks like dots and apostrophes are legal parts of the entities, but with this, abbreviations such as W.Va. D-W.Va. v.ye. are recognized as URLs. I created 8 test cases to get your opinions on this one before creating a Jira issue.

public void testURLEndingWithDot2() throws IOException {
  BaseTokenStreamTestCase.assertAnalyzesTo(a,
      "My Web addresses are www.apache.org. and lucene.apache.org",
      new String[] {"My", "Web", "addresses", "are", "www.apache.org", "and", "lucene.apache.org"},
      new String[] {"", "", "", "", "", "", ""});
}

public void testEMailStartingWithApostrophe2() throws IOException {
  BaseTokenStreamTestCase.assertAnalyzesTo(a,
      "'g...@usgs.gov 'cber_i...@a1.cber.fda.gov.",
      new String[] {"g...@usgs.gov", "cber_i...@a1.cber.fda.gov"},
      new String[] {"", "", "", ""});
}

P.S. I observed a somewhat similar phenomenon with the ICU tokenizer. The ICU tokenizer sets the script attribute to Latin for words that consist of numbers. But if the whole text is composed of words that consist of numbers, the script attribute is set to Common. Thanks, Ahmet
lucene 6.6.0 download link redirects to 6.5.1
Hi, The Lucene download page redirects to http://www-eu.apache.org/dist/lucene/java/6.5.1 for me. Solr's link is correct. Ahmet
Re: Welcome Ahmet Arslan as Lucene/Solr committer
Hi, Thanks to all for the warm welcome. It is such an honor to be invited by the PMC. I am an Assistant Professor in the Department of Computer Engineering at Anadolu University, Turkey. My current research interests include selective information retrieval and index term weighting. I started using Lucene during my master's studies for academic purposes. Later on, I worked on a number of commercial search projects using Apache Lucene/Solr. I am very proud of being part of this team! Thanks, Ahmet On Monday, December 18, 2017, 4:42:34 PM GMT+3, Steve Rowe <sar...@gmail.com> wrote: Congrats and welcome Ahmet! -- Steve www.lucidworks.com > On Dec 17, 2017, at 5:15 AM, Adrien Grand <jpou...@gmail.com> wrote: > > Hi all, > > Please join me in welcoming Ahmet Arslan as the latest Lucene/Solr committer. > Ahmet, it's tradition for you to introduce yourself with a brief bio. > > Congratulations and Welcome! > > Adrien
Re: [VOTE] Release Lucene/Solr 7.3.1 RC2
+1 SUCCESS! [1:19:56.690027] Ahmet On Saturday, May 12, 2018, 12:41:16 AM GMT+3, Michael McCandless wrote: +1 SUCCESS! [0:40:57.887333] Mike McCandless http://blog.mikemccandless.com On Fri, May 11, 2018 at 1:09 PM, Adrien Grand wrote: > +1 > SUCCESS! [1:33:37.370199] > > Le mer. 9 mai 2018 à 16:59, Mark Miller a écrit : >> Even before I saw that comment, I was thinking poor Alan... >> >> - Mark >> >> On Wed, May 9, 2018 at 7:31 AM Alan Woodward wrote: >>> +1 >>> SUCCESS! [3:10:43.862442] >>> >>> My internet has been really very slow today... >>> >>> On Wed, May 9, 2018 at 5:50 AM, Đạt Cao Mạnh >>> wrote: Please vote for release candidate 2 for Lucene/Solr 7.3.1 The artifact can be downloaded from: https://dist.apache.org/repos/dist/dev/lucene/lucene-solr-7.3.1-RC2-revae0705edb59eaa567fe13ed3a222fdadc7153680/ You can run the smoke tester directly with this command: python3 -u dev-tools/scripts/smokeTestRelease.py https://dist.apache.org/repos/dist/dev/lucene/lucene-solr-7.3.1-RC2-revae0705edb59eaa567fe13ed3a222fdadc7153680 Here’s my +1 SUCCESS! [0:53:47.443795] >>> >> -- >> - Mark >> about.me/markrmiller
Re: Welcome Nhat Nguyen as Lucene/Solr committer
Congratulations and Welcome! On Tuesday, June 19, 2018, 7:20:48 PM GMT+3, Jason Gerlowski wrote: Welcome Nhat! On Tue, Jun 19, 2018 at 10:10 AM, Varun Thacker wrote: > Congratulations and welcome Nhat! > > On Tue, Jun 19, 2018 at 10:16 AM, Alan Woodward wrote: >> Welcome Nhat! >> >> >>> On 18 Jun 2018, at 21:41, Adrien Grand wrote: >>> >>> Hi all, >>> >>> Please join me in welcoming Nhat Nguyen as the latest Lucene/Solr committer. >>> Nhat, it's tradition for you to introduce yourself with a brief bio. >>> >>> Congratulations and Welcome! >>> >>> Adrien >>> >> > > - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
Re: [VOTE] Release Lucene/Solr 7.3.1 RC1
+1 SUCCESS! [1:15:16.705804] Ahmet On Wednesday, May 2, 2018, 9:55:04 PM GMT+3, David Smiley wrote: +1 SUCCESS! [1:04:51.914445] On Wed, May 2, 2018 at 12:32 PM Michael McCandless wrote: > +1 > > SUCCESS! [0:49:04.927108] > > Mike McCandless > > http://blog.mikemccandless.com > > On Wed, May 2, 2018 at 6:40 AM, Đạt Cao Mạnh wrote: >> Please vote for release candidate 1 for Lucene/Solr 7.3.1 >> >> The artifacts can be downloaded from: >> https://dist.apache.org/repos/dist/dev/lucene/lucene-solr-7.3.1-RC1-rev8fa7687413558b3bc65cbbbeb722a21314187e6a >> >> You can run the smoke tester directly with this command: >> >> python3 -u dev-tools/scripts/smokeTestRelease.py \ >> https://dist.apache.org/repos/dist/dev/lucene/lucene-solr-7.3.1-RC1-rev8fa7687413558b3bc65cbbbeb722a21314187e6a >> >> Here's my +1 >> SUCCESS! [0:52:14.381028] >> > -- Lucene/Solr Search Committer, Consultant, Developer, Author, Speaker LinkedIn: http://linkedin.com/in/davidwsmiley | Book: http://www.solrenterprisesearchserver.com
Re: Welcome Dennis Gove to the PMC
Congratulations Dennis! Ahmet On Wednesday, December 27, 2017, 7:56:58 PM GMT+3, Dawid Weiss wrote: Congratulations Dennis! Dawid On Wed, Dec 27, 2017 at 5:37 PM, Anshum Gupta wrote: > Congratulations and welcome Dennis! > > On Wed, Dec 27, 2017 at 4:59 PM Steve Rowe wrote: >> >> Congrats and welcome Dennis! >> >> -- >> Steve >> www.lucidworks.com >> >> > On Dec 26, 2017, at 8:12 AM, Joel Bernstein wrote: >> > >> > I am pleased to announce that Dennis Gove has accepted the PMC's >> > invitation to join. >> > >> > Welcome Dennis!
Re: [VOTE] Release Lucene/Solr 7.2.1 RC1
+1 SUCCESS! [4:00:41.664562] Ahmet On Saturday, January 13, 2018, 7:06:47 PM GMT+3, Kevin Risdenwrote: Ishan - Try docker run -it openjdk:9-jdk. java was replaced with openjdk. java:9-jdk has version 9b149 where as openjdk:9-jdk has version 9.0.1-11. This should have been fixed before Java 9 GA. https://github.com/docker- library/openjdk/issues/101 Kevin Risden On Sat, Jan 13, 2018 at 6:09 AM, Ishan Chattopadhyaya wrote: This also happens with 7.2.0 and 7.1.0. Could be something to do with the official Java image. Nothing that stops the RC, I think. On Sat, Jan 13, 2018 at 5:11 PM, Ishan Chattopadhyaya wrote: I spun up a docker container with Java 9 (java:9-jdk) from docker hub [0]. Downloaded the Solr 7.2.1 RC1 tarball and unzipped it. Tried to start it, but it failed citing some crypto issue: https://gist.github.com/anonym ous/ed1a179b1043190b5f6fd635c6 a47f23 I'm trying out the same for 7.2.0 and earlier versions to see if this is a recent regression. [0] - docker run -it java:9-jdk On Wed, Jan 10, 2018 at 11:04 PM, Adrien Grand wrote: +1 SUCCESS! [1:29:47.999770] Le mer. 10 janv. 2018 à 18:03, Tomas Fernandez Lobbe a écrit : +1 SUCCESS! [1:04:34.912689] On Jan 10, 2018, at 8:01 AM, Alan Woodward wrote: +1 SUCCESS! [1:43:16.772919] I need to get a new test machine... On 10 Jan 2018, at 09:51, Dawid Weiss wrote: +1 SUCCESS! [1:31:30.029815] Dawid On Wed, Jan 10, 2018 at 10:46 AM, Shalin Shekhar Mangar wrote: +1 SUCCESS! [1:13:22.042124] On Wed, Jan 10, 2018 at 8:00 AM, jim ferenczi wrote: Please vote for release candidate 1 for Lucene/Solr 7.2.1 The artifacts can be downloaded from: https://dist.apache.org/repos/ dist/dev/lucene/lucene-solr-7. 2.1-RC1-revb2b6438b37073bee1fc a40374e85bf91aa457c0b You can run the smoke tester directly with this command: python3 -u dev-tools/scripts/smokeTestRel ease.py \ https://dist.apache.org/repos/ dist/dev/lucene/lucene-solr-7. 2.1-RC1-revb2b6438b37073bee1fc a40374e85bf91aa457c0b Here's my +1 SUCCESS! 
[0:38:10.689623] -- Regards, Shalin Shekhar Mangar. -- -- - To unsubscribe, e-mail: dev-unsubscribe@lucene.apache. org For additional commands, e-mail: dev-h...@lucene.apache.org -- -- - To unsubscribe, e-mail: dev-unsubscribe@lucene.apache. org For additional commands, e-mail: dev-h...@lucene.apache.org -- -- - To unsubscribe, e-mail: dev-unsubscribe@lucene.apache. org For additional commands, e-mail: dev-h...@lucene.apache.org
Re: Welcome Ignacio Vera as Lucene/Solr committer
Congratulations Ignacio! Ahmet On Thursday, January 11, 2018, 9:43:50 PM GMT+3, Martin Gainty wrote: ¡Bienvenidos Ignacio! Martín From: Erick Erickson Sent: Thursday, January 11, 2018 12:39 PM To: dev@lucene.apache.org Subject: Re: Welcome Ignacio Vera as Lucene/Solr committer Welcome Ignacio! On Thu, Jan 11, 2018 at 9:09 AM, Karl Wright wrote: > > Welcome, Ignacio! > Karl > > On Thu, Jan 11, 2018 at 11:46 AM, Steve Rowe wrote: > >> Congrats and welcome Ignacio! >> >> -- >> Steve >> www.lucidworks.com >> >>> On Jan 11, 2018, at 11:35 AM, Adrien Grand wrote: >>> >>> Hi all, >>> >>> Please join me in welcoming Ignacio Vera as the latest Lucene/Solr >>> committer. >>> Ignacio, it's tradition for you to introduce yourself with a brief bio. >>> >>> Congratulations and Welcome!
Re: Welcome Jason Gerlowski as committer
Congratulations and welcome Jason! On Friday, February 9, 2018, 11:58:06 AM GMT+3, Alan Woodward wrote: Welcome Jason! > On 8 Feb 2018, at 17:02, David Smiley wrote: > > Hello everyone, > > It's my pleasure to announce that Jason Gerlowski is our latest committer for > Lucene/Solr in recognition for his contributions to the project! Please join > me in welcoming him. Jason, it's tradition for you to introduce yourself > with a brief bio. > > Congratulations and Welcome! > -- > Lucene/Solr Search Committer, Consultant, Developer, Author, Speaker > LinkedIn: http://linkedin.com/in/davidwsmiley | Book: > http://www.solrenterprisesearchserver.com
Re: Welcome Karl Wright to the PMC
Congratulations Karl! Ahmet On Thursday, December 28, 2017, 7:32:41 PM GMT+3, Steve Rowe wrote: Congrats and welcome Karl! -- Steve www.lucidworks.com > On Dec 28, 2017, at 9:08 AM, Adrien Grand wrote: > > I am pleased to announce that Karl Wright has accepted the PMC's invitation > to join. > > Welcome Karl!
Re: Welcome Gus Heck as Lucene/Solr committer
Congratulations! On Friday, November 2, 2018, 7:13:35 PM GMT+3, Varun Thacker wrote: Congratulations and welcome Gus! On Thu, Nov 1, 2018 at 5:22 AM David Smiley wrote: Hi all, Please join me in welcoming Gus Heck as the latest Lucene/Solr committer! Congratulations and Welcome, Gus! Gus, it's traditional for you to introduce yourself with a brief bio. ~ David -- Lucene/Solr Search Committer, Consultant, Developer, Author, Speaker LinkedIn: http://linkedin.com/in/davidwsmiley | Book: http://www.solrenterprisesearchserver.com
Re: Welcome Tim Allison as a Lucene/Solr committer
Congratulations ! On Saturday, November 3, 2018, 1:43:31 AM GMT+3, Nhat Nguyen wrote: Welcome Tim! On Fri, Nov 2, 2018 at 6:33 PM Tommaso Teofili wrote: Welcome Tim!!! Tommaso Il giorno ven 2 nov 2018 alle ore 22:30 Steve Rowe ha scritto: > > Welcome Tim! > > Steve > > On Fri, Nov 2, 2018 at 12:20 PM Erick Erickson > wrote: >> >> Hi all, >> >> Please join me in welcoming Tim Allison as the latest Lucene/Solr committer! >> >> Congratulations and Welcome, Tim! >> >> It's traditional for you to introduce yourself with a brief bio. >> >> Erick >> >> - >> To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org >> For additional commands, e-mail: dev-h...@lucene.apache.org >> - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
Re: Welcome Tomoko Uchida as Lucene/Solr committer
Congratulations Tomoko! On Tuesday, April 9, 2019, 8:48:03 PM GMT+3, Robert Muir wrote: Welcome! On Mon, Apr 8, 2019 at 11:21 AM Uwe Schindler wrote: > > Hi all, > > Please join me in welcoming Tomoko Uchida as the latest Lucene/Solr committer! > > She has been working on https://issues.apache.org/jira/browse/LUCENE-2562 for > several years with awesome progress and finally we got the fantastic Luke as > a branch on ASF JIRA: > https://gitbox.apache.org/repos/asf?p=lucene-solr.git;a=shortlog;h=refs/heads/jira/lucene-2562-luke-swing-3 > Looking forward to the first release of Apache Lucene 8.1 with Luke bundled > in the distribution. I will take care of merging it to master and 8.x > branches together with her once she got the ASF account. > > Tomoko also helped with the Japanese and Korean Analyzers. > > Congratulations and Welcome, Tomoko! Tomoko, it's traditional for you to > introduce yourself with a brief bio. > > Uwe & Robert (who nominated Tomoko) > > - > Uwe Schindler > Achterdiek 19, D-28357 Bremen > https://www.thetaphi.de > eMail: u...@thetaphi.de
[jira] Commented: (SOLR-1499) SolrEntityProcessor - DIH EntityProcessor that queries an external Solr via SolrJ
[ https://issues.apache.org/jira/browse/SOLR-1499?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13007236#comment-13007236 ] Ahmet Arslan commented on SOLR-1499: Hi, Can I use this to upgrade a Solr version where the Lucene/Solr indices are not compatible? Thanks, Ahmet SolrEntityProcessor - DIH EntityProcessor that queries an external Solr via SolrJ - Key: SOLR-1499 URL: https://issues.apache.org/jira/browse/SOLR-1499 Project: Solr Issue Type: New Feature Components: contrib - DataImportHandler Reporter: Lance Norskog Assignee: Erik Hatcher Fix For: Next Attachments: SOLR-1499.patch, SOLR-1499.patch, SOLR-1499.patch, SOLR-1499.patch, SOLR-1499.patch The SolrEntityProcessor queries an external Solr instance. The Solr documents returned are unpacked and emitted as DIH fields. The SolrEntityProcessor uses the following attributes: * solr='http://localhost:8983/solr/sms' ** This gives the URL of the target Solr instance. *** Note: the connection to the target Solr uses the binary SolrJ format. * query='Jefferson&sort=id+asc' ** This gives the base query string used with Solr. It can include any standard Solr request parameter. This attribute is processed under the variable resolution rules and can be driven in an inner stage of the indexing pipeline. * rows='10' ** This gives the number of rows to fetch per request. ** The SolrEntityProcessor always fetches every document that matches the request. * fields='id,tag' ** This selects the fields to be returned from the Solr request. ** These must also be declared as field elements. ** As with all fields, template processors can be used to alter the contents to be passed downwards. * timeout='30' ** This limits the query to 30 seconds. This can be used as a fail-safe to prevent the indexing session from freezing up. By default the timeout is 5 minutes. Limitations: * Solr errors are not handled correctly. * Loop control constructs have not been tested. * Multi-valued returned fields have not been tested. The unit tests give examples of how to use it as the root entity and an inner entity. -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
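The attribute list above maps onto a DIH data-config.xml entity. A minimal sketch follows; the entity name, query, and field names are illustrative placeholders, not taken from the issue:

```xml
<!-- Hypothetical DIH configuration using the SolrEntityProcessor from the patch. -->
<!-- Attribute names (solr, query, rows, fields, timeout) follow the issue description; -->
<!-- the entity name, query, and fields are placeholders. -->
<dataConfig>
  <document>
    <entity name="sourceSolr"
            processor="SolrEntityProcessor"
            solr="http://localhost:8983/solr/sms"
            query="*:*"
            rows="10"
            fields="id,tag"
            timeout="30">
      <!-- Returned fields must also be declared as field elements. -->
      <field column="id" name="id"/>
      <field column="tag" name="tag"/>
    </entity>
  </document>
</dataConfig>
```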
[jira] Commented: (SOLR-1499) SolrEntityProcessor - DIH EntityProcessor that queries an external Solr via SolrJ
[ https://issues.apache.org/jira/browse/SOLR-1499?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13007516#comment-13007516 ] Ahmet Arslan commented on SOLR-1499: Hi Lance, I brought the patch up to the latest trunk. It required some changes, though. I pointed it at a Solr URL (version 1.4.0) to upgrade from 1.4.0 to trunk. I received: Caused by: java.lang.RuntimeException: Invalid version (expected 2, but 1) or the data in not in 'javabin' format at org.apache.solr.common.util.JavaBinCodec.unmarshal(JavaBinCodec.java:99) at org.apache.solr.client.solrj.impl.BinaryResponseParser.processResponse(BinaryResponseParser.java:41) at org.apache.solr.client.solrj.impl.CommonsHttpSolrServer.request(CommonsHttpSolrServer.java:478) at org.apache.solr.client.solrj.impl.CommonsHttpSolrServer.request(CommonsHttpSolrServer.java:245) What can be a workaround to overcome this?
[jira] Commented: (SOLR-1499) SolrEntityProcessor - DIH EntityProcessor that queries an external Solr via SolrJ
[ https://issues.apache.org/jira/browse/SOLR-1499?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13007700#comment-13007700 ] Ahmet Arslan commented on SOLR-1499: Erik, Thanks for the pointer. As you said, when I use new CommonsHttpSolrServer(new URL("http://solr1.4.0Instance:8080/solr"), null, new XMLResponseParser(), false); I was able to communicate with the Solr 1.4.0 instance using solrj-trunk. Do you recommend modifying this patch in this manner? Any performance hits? Plus, what do you think about copy-pasting JavaBinCodec.java from the source version to the destination version, and using a custom BinaryResponseParser that uses that copy-pasted class? It seems to work for 1.4.0 to trunk. Or should I stick with writing a little script to do it? P.S. I am just trying to use a feature that will be maintained by the Solr community.
[jira] Updated: (SOLR-1499) SolrEntityProcessor - DIH EntityProcessor that queries an external Solr via SolrJ
[ https://issues.apache.org/jira/browse/SOLR-1499?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ahmet Arslan updated SOLR-1499: --- Attachment: SOLR-1499.patch Brought up to trunk version 1082579. Added a (format=javabin|xml) parameter; xml is needed for a Solr upgrade where the Solr versions are not compatible. Test cases need to be updated.
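With the format parameter added by this patch, an upgrade entity might look like the following. Only the format attribute and its javabin|xml values come from the patch description; the entity name, URL, query, and field are illustrative placeholders:

```xml
<!-- Hypothetical upgrade entity: pull documents from an old Solr 1.4.0 instance -->
<!-- over XML, since its javabin version is incompatible with trunk. -->
<entity name="oldSolr"
        processor="SolrEntityProcessor"
        solr="http://old-host:8080/solr"
        query="*:*"
        format="xml">
  <field column="id" name="id"/>
</entity>
```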
[jira] [Commented] (SOLR-1499) SolrEntityProcessor - DIH EntityProcessor that queries an external Solr via SolrJ
[ https://issues.apache.org/jira/browse/SOLR-1499?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13059333#comment-13059333 ] Ahmet Arslan commented on SOLR-1499: Lance, I used it once to upgrade.
[jira] [Commented] (LUCENE-2208) Token div exceeds length of provided text sized 4114
[ https://issues.apache.org/jira/browse/LUCENE-2208?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13059836#comment-13059836 ] Ahmet Arslan commented on LUCENE-2208: -- Hello, I am using a very recent trunk, and I received the same exception (InvalidTokenOffsetsException) with PatternReplaceCharFilter. I observed that HTMLStripCharFilter sometimes causes the wrong words to get highlighted. So I was playing with PatternReplaceCharFilter to somehow remove html tags, hoping highlighting wouldn't be broken this time. I remember the tokenizer versions of htmlStrip had problems with highlighting; it seems this continues with the char filters. Hsiu Wang, do you think the reason (HTMLStripCharFilter causing wrong words to get highlighted) is the same as what you explained here? Token div exceeds length of provided text sized 4114 Key: LUCENE-2208 URL: https://issues.apache.org/jira/browse/LUCENE-2208 Project: Lucene - Java Issue Type: Bug Components: modules/highlighter Affects Versions: 3.0 Environment: diagnostics = {os.version=5.1, os=Windows XP, lucene.version=3.0.0 883080 - 2009-11-22 15:43:58, source=flush, os.arch=x86, java.version=1.6.0_12, java.vendor=Sun Microsystems Inc.} Reporter: Ramazan VARLIKLI Attachments: LUCENE-2208.patch, LUCENE-2208_test.patch I have a doc which contains html code. I want to strip the html tags and make the text clean, then apply the highlighter on the clear text. But the highlighter throws an exception if I strip out the html characters; if I don't strip them out, it works fine. It just confuses me at the moment. I copy-paste 3 things here from the console as they may contain special characters which might cause the problem. 
1 -) Here is the html text <h2>Starter</h2> <div id="tab1-content" class="tabContent selected"> <div class="head"></div> <div class="body"> <div class="subject-header">Learning path: History</div> <h3>Key question</h3> <p>Did transport fuel the industrial revolution?</p> <h3>Learning Objective</h3> <ul> <li>To categorise points as for or against an argument</li> </ul> <p> <h3>What to do?</h3> <ul> <li>Watch the clip: <em>Transport fuelled the industrial revolution.</em></li> </ul> <p>The clips claims that transport fuelled the industrial revolution. Some historians argue that the industrial revolution only happened because of developments in transport.</p> <ul> <li>Read the statements below and decide which points are <em>for</em> and which points are <em>against</em> the argument that industry expanded in the 18th and 19th centuries because of developments in transport.</li> </ul> <ol type="a"> <li>Industry expanded because of inventions and the discovery of steam power.</li> <li>Improvements in transport allowed goods to be sold all over the country and all over the world so there were more customers to develop industry for.</li> <li>Developments in transport allowed resources, such as coal from mines and cotton from America to come together to manufacture products.</li> <li>Transport only developed because industry needed it. 
It was slow to develop as money was spent on improving roads, then building canals and the replacing them with railways in order to keep up with industry.</li> </ol> <p>Now try to think of 2 more statements of your own.</p> </div> <div class="foot"></div> </div> <h2>Main activity</h2> <div id="tab2-content" class="tabContent"> <div class="head"></div> <div class="body"><div class="subject-header">Learning path: History</div> <h3>Learning Objective</h3> <ul> <li>To select evidence to support points</li> </ul> <h3>What to do?</h3> <!--<ul> <li>Watch the clip: <em>Windmill and water mill</em></li> </ul>--> <ul><li>Choose the 4 points that you think are most important - try to be balanced by having two <strong>for</strong> and two <strong>against</strong>.</li> <li>Write one in each of the point boxes of the paragraphs on the sheet <a href="lp_history_industry_transport_ws1.html" class="link-internal">Constructing a balanced argument</a>.</li></ul> <p>You might like to re write the points in your own words and use connectives
[jira] [Updated] (SOLR-1604) Wildcards, ORs etc inside Phrase Queries
[ https://issues.apache.org/jira/browse/SOLR-1604?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ahmet Arslan updated SOLR-1604: --- Attachment: ComplexPhrase.zip Update for Solr 3.3.0: * Download apache-solr-3.3.0-src.tgz * Download the latest ComplexPhrase.zip * 'mvn package' will generate 3 files under the target folder; copy them to apache-solr-3.3.0/solr/lib/ ** cp target/ComplexPhrase-* Downloads/apache-solr-3.3.0/solr/lib/ * call 'ant clean dist' to create a new apache-solr-3.3-SNAPSHOT.war file under the solr/dist folder Wildcards, ORs etc inside Phrase Queries Key: SOLR-1604 URL: https://issues.apache.org/jira/browse/SOLR-1604 Project: Solr Issue Type: Improvement Components: search Affects Versions: 1.4 Reporter: Ahmet Arslan Priority: Minor Fix For: 3.4, 4.0 Attachments: ComplexPhrase.zip, ComplexPhrase.zip, ComplexPhrase.zip, ComplexPhrase.zip, ComplexPhrase.zip, ComplexPhraseQueryParser.java, SOLR-1604.patch Solr Plugin for ComplexPhraseQueryParser (LUCENE-1486) which supports wildcards, ORs, ranges, fuzzies inside phrase queries.
[jira] [Commented] (SOLR-2649) MM ignored in edismax queries with operators
[ https://issues.apache.org/jira/browse/SOLR-2649?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13068925#comment-13068925 ] Ahmet Arslan commented on SOLR-2649: I experienced the same issue. When I added one negative clause to the query string (which has two optional clauses), mm is ignored and the default operator is used instead. q=word1 word2 -word3&mm=100%&defType=edismax and q=word1 word2 -word3&mm=100%&defType=dismax return different result sets. edismax returns documents containing either word1 or word2, although there are two optional clauses in the query and mm is set to 100%. MM ignored in edismax queries with operators Key: SOLR-2649 URL: https://issues.apache.org/jira/browse/SOLR-2649 Project: Solr Issue Type: Bug Components: search Affects Versions: 3.3 Reporter: Magnus Bergmark Priority: Minor Hypothetical scenario: 1. User searches for stocks oil gold with MM set to 50% 2. User adds -stockings to the query: stocks oil gold -stockings 3. User gets no hits since MM was ignored and all terms were AND-ed together The behavior seems to be intentional, although the reason why is never explained: // For correct lucene queries, turn off mm processing if there // were explicit operators (except for AND). boolean doMinMatched = (numOR + numNOT + numPluses + numMinuses) == 0; (lines 232-234 taken from tags/lucene_solr_3_3/solr/src/java/org/apache/solr/search/ExtendedDismaxQParserPlugin.java) This makes edismax unsuitable as a replacement for dismax; mm is one of the primary features of dismax.
[jira] Commented: (SOLR-1604) Wildcards, ORs etc inside Phrase Queries
[ https://issues.apache.org/jira/browse/SOLR-1604?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12859048#action_12859048 ] Ahmet Arslan commented on SOLR-1604: To enable ComplexPhraseQueryParser in Solr 1.4.0 - Revisited Due to the reasons revealed here [1], this plugin should be loaded using the old way [2] [1] http://search-lucene.com/m/E49gN1naPyh [2] http://wiki.apache.org/solr/SolrPlugins#The_Old_Way 1-) extract ComplexPhrase.zip and run 'mvn package' 2-) unzip apache-solr-1.4.0.zip and copy ComplexPhrase/target/ComplexPhrase-1.0.jar to the apache-solr-1.4.0/lib directory. 3-) create a new apache-solr-1.4.0\dist\apache-solr-1.4.1-dev.war (by running 'ant dist') and use it. 4-) register the query parser in solrhome/conf/solrconfig.xml by adding <queryParser name="complexphrase" class="org.apache.solr.search.ComplexPhraseQParserPlugin"/> 5-) enable it by appending &defType=complexphrase to the search URL. 6-) Alternatively you can add {!complexphrase} in front of your query string, e.g. q={!complexphrase}s* b* 7-) More permanent usage can be configured in solrconfig.xml: <requestHandler name="standard" class="solr.StandardRequestHandler" default="true"> <lst name="defaults"> <str name="defType">complexphrase</str> </lst> </requestHandler>
[jira] [Updated] (SOLR-1604) Wildcards, ORs etc inside Phrase Queries
[ https://issues.apache.org/jira/browse/SOLR-1604?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ahmet Arslan updated SOLR-1604: --- Attachment: ComplexPhrase.zip Includes README.txt that contains instructions for Solr 4.0.0.
[jira] [Commented] (SOLR-3000) Add support for ComplexPhraseQueryParser
[ https://issues.apache.org/jira/browse/SOLR-3000?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13179338#comment-13179338 ] Ahmet Arslan commented on SOLR-3000: I think this is a duplicate of [SOLR-1604|https://issues.apache.org/jira/browse/SOLR-1604]? Add support for ComplexPhraseQueryParser Key: SOLR-3000 URL: https://issues.apache.org/jira/browse/SOLR-3000 Project: Solr Issue Type: New Feature Reporter: Santiago M. Mola It would be useful to have support for queries such as "my phrse"~0.5 "queri~0.5"~2, as those provided by Lucene's ComplexPhraseQueryParser.
[jira] [Created] (SOLR-3060) add highlighter support to SurroundQParserPlugin
add highlighter support to SurroundQParserPlugin - Key: SOLR-3060 URL: https://issues.apache.org/jira/browse/SOLR-3060 Project: Solr Issue Type: Improvement Components: search Affects Versions: 4.0 Reporter: Ahmet Arslan Priority: Minor Fix For: 4.0 Highlighter does not recognize the SrndQuery family. http://search-lucene.com/m/FuDsU1sTjgM http://search-lucene.com/m/wD8c11gNTb61
[jira] [Updated] (SOLR-3060) add highlighter support to SurroundQParserPlugin
[ https://issues.apache.org/jira/browse/SOLR-3060?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ahmet Arslan updated SOLR-3060: --- Attachment: SOLR-3060.patch o.a.s.search.QParser#getHighlightQuery() method is overridden.
[jira] [Updated] (SOLR-3060) add highlighter support to SurroundQParserPlugin
[ https://issues.apache.org/jira/browse/SOLR-3060?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ahmet Arslan updated SOLR-3060: --- Attachment: SOLR-3060.patch Some tests added.
[jira] [Commented] (SOLR-3060) add highlighter support to SurroundQParserPlugin
[ https://issues.apache.org/jira/browse/SOLR-3060?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13195979#comment-13195979 ] Ahmet Arslan commented on SOLR-3060: The following commands should do it. * svn checkout http://svn.apache.org/repos/asf/lucene/dev/trunk * cd trunk * curl -O https://issues.apache.org/jira/secure/attachment/12511843/SOLR-3060.patch * patch -p0 -i SOLR-3060.patch * cd solr * ant clean dist Use the newly created trunk/solr/dist/apache-solr-4.0-SNAPSHOT.war file.
[jira] [Created] (SOLR-3074) Bug in SolrPluginUtilsTest
Bug in SolrPluginUtilsTest -- Key: SOLR-3074 URL: https://issues.apache.org/jira/browse/SOLR-3074 Project: Solr Issue Type: Bug Components: search Affects Versions: 4.0 Reporter: Ahmet Arslan Priority: Minor Fix For: 4.0 testDocListConversion() is not testing what it's supposed to test, because the added test documents are not committed. http://search-lucene.com/m/uwh9l2SHH4e
[jira] [Updated] (SOLR-3074) Bug in SolrPluginUtilsTest
[ https://issues.apache.org/jira/browse/SOLR-3074?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ahmet Arslan updated SOLR-3074: --- Attachment: SOLR-3074.patch
[jira] [Commented] (SOLR-1604) Wildcards, ORs etc inside Phrase Queries
[ https://issues.apache.org/jira/browse/SOLR-1604?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13199774#comment-13199774 ] Ahmet Arslan commented on SOLR-1604: The committed o.a.l.queryparser.complexPhrase.ComplexPhraseQueryParser does not work with non-default fields. Several Lucene users have raised this issue on the mailing lists. Mark Harwood said the following on LUCENE-1486, which is still unresolved; however, it didn't get any attention. {quote}Fixing this would require changing the package name of ComplexPhraseQueryParser or changing the visibility of field in the QueryParser base class to protected. Anyone have any strong feelings about which of these is the most acceptable?{quote} That's why the attachment on this issue does not consume the committed o.a.l.queryparser.complexPhrase.ComplexPhraseQueryParser and is released as a Solr plugin.
[jira] [Commented] (SOLR-1604) Wildcards, ORs etc inside Phrase Queries
[ https://issues.apache.org/jira/browse/SOLR-1604?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13200020#comment-13200020 ]

Ahmet Arslan commented on SOLR-1604:
------------------------------------

I imagined that LUCENE-1486 would be closed/fixed in the future, hopefully including the non-default-field patch. Are you saying that the non-default-field problem should be handled in a separate issue (other than LUCENE-1486)?

> Wildcards, ORs etc inside Phrase Queries
>                 Key: SOLR-1604
>                 URL: https://issues.apache.org/jira/browse/SOLR-1604
[jira] [Updated] (LUCENE-1486) Wildcards, ORs etc inside Phrase queries
[ https://issues.apache.org/jira/browse/LUCENE-1486?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Ahmet Arslan updated LUCENE-1486:
---------------------------------
    Attachment: LUCENE-1486.patch

Mark's and Tomas' non-default-field patches are combined.

> Wildcards, ORs etc inside Phrase queries
> ----------------------------------------
>
>                 Key: LUCENE-1486
>                 URL: https://issues.apache.org/jira/browse/LUCENE-1486
>             Project: Lucene - Java
>          Issue Type: Improvement
>          Components: core/queryparser
>    Affects Versions: 2.4
>            Reporter: Mark Harwood
>            Priority: Minor
>             Fix For: 4.0
>         Attachments: ComplexPhraseQueryParser.java, LUCENE-1486.patch, LUCENE-1486.patch, LUCENE-1486.patch, LUCENE-1486.patch, LUCENE-1486.patch, LUCENE-1486.patch, LUCENE-1486.patch, Lucene-1486 non default field.patch, TestComplexPhraseQuery.java, junit_complex_phrase_qp_07_21_2009.patch, junit_complex_phrase_qp_07_22_2009.patch
>
> An extension to the default QueryParser that overrides the parsing of PhraseQueries to allow more complex syntax, e.g. wildcards in phrase queries. The implementation feels a little hacky - this is arguably better handled in QueryParser itself. This works as a proof of concept for much of the query parser syntax. Examples from the JUnit test include:
>   checkMatches("\"j* smyth~\"", 1, 2);        // wildcards and fuzzies are OK in phrases
>   checkMatches("\"(jo* -john) smith\"", 2);   // boolean logic works
>   checkMatches("\"jo* smith\"~2", 1, 2, 3);   // position logic works
>   checkBadQuery("\"jo* id:1 smith\"");        // mixing fields in a phrase is bad
>   checkBadQuery("\"jo* \"smith\" \"");        // phrases inside phrases is bad
>   checkBadQuery("\"jo* [sma TO smZ]\" \"");   // range queries inside phrases not supported
> Code plus JUnit test to follow...
[jira] [Commented] (LUCENE-1486) Wildcards, ORs etc inside Phrase queries
[ https://issues.apache.org/jira/browse/LUCENE-1486?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13203514#comment-13203514 ]

Ahmet Arslan commented on LUCENE-1486:
--------------------------------------

Thanks for looking into this, Mark and Tomas. Do you think this issue is the right place to introduce a boolean inOrder parameter? Currently inOrder=true is always passed to SpanNearQuery's constructor.

> Wildcards, ORs etc inside Phrase queries
>                 Key: LUCENE-1486
>                 URL: https://issues.apache.org/jira/browse/LUCENE-1486
[jira] [Updated] (SOLR-1604) Wildcards, ORs etc inside Phrase Queries
[ https://issues.apache.org/jira/browse/SOLR-1604?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Ahmet Arslan updated SOLR-1604:
-------------------------------
    Attachment: SOLR-1604.patch

Patch for trunk.

> Wildcards, ORs etc inside Phrase Queries
>                 Key: SOLR-1604
>                 URL: https://issues.apache.org/jira/browse/SOLR-1604
[jira] [Updated] (LUCENE-3758) Allow the ComplexPhraseQueryParser to search order or un-order proximity queries.
[ https://issues.apache.org/jira/browse/LUCENE-3758?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Ahmet Arslan updated LUCENE-3758:
---------------------------------
    Attachment: LUCENE-3758.patch

Patch for trunk.

> Allow the ComplexPhraseQueryParser to search order or un-order proximity queries.
> ---------------------------------------------------------------------------------
>
>                 Key: LUCENE-3758
>                 URL: https://issues.apache.org/jira/browse/LUCENE-3758
>             Project: Lucene - Java
>          Issue Type: Improvement
>          Components: core/queryparser
>    Affects Versions: 4.0
>            Reporter: Tomás Fernández Löbbe
>            Priority: Minor
>             Fix For: 4.0
>         Attachments: LUCENE-3758.patch
>
> The ComplexPhraseQueryParser uses SpanNearQuery, but always sets the inOrder value hardcoded to true. This could be configurable.
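The semantics the inOrder flag controls can be sketched in plain Java (this is a hypothetical helper for illustration, not Lucene's SpanNearQuery implementation): an ordered proximity match requires the term positions to appear in query order within the slop window, while an unordered match only requires them to fall inside the window.

```java
import java.util.List;

public class ProximitySketch {
    // Returns true when term positions (given in query order) match within `slop`.
    // With inOrder=true the positions must be strictly increasing (document order
    // must follow query order); with inOrder=false any order in the window is fine.
    static boolean matches(List<Integer> positions, int slop, boolean inOrder) {
        int min = Integer.MAX_VALUE, max = Integer.MIN_VALUE;
        for (int i = 0; i < positions.size(); i++) {
            int p = positions.get(i);
            if (inOrder && i > 0 && p <= positions.get(i - 1)) {
                return false; // terms appear out of query order
            }
            min = Math.min(min, p);
            max = Math.max(max, p);
        }
        // gap left over in the window after placing the terms must fit in slop
        return (max - min) - (positions.size() - 1) <= slop;
    }

    public static void main(String[] args) {
        // adjacent, in order: matches either way with slop 0
        System.out.println(matches(List.of(3, 4), 0, true));   // true
        // reversed in the document: only the unordered variant matches
        System.out.println(matches(List.of(4, 3), 0, true));   // false
        System.out.println(matches(List.of(4, 3), 0, false));  // true
    }
}
```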
[jira] [Created] (SOLR-3193) highlighting on an unindexed field throws InvalidTokenOffsetsException
highlighting on an unindexed field throws InvalidTokenOffsetsException
----------------------------------------------------------------------

                 Key: SOLR-3193
                 URL: https://issues.apache.org/jira/browse/SOLR-3193
             Project: Solr
          Issue Type: Bug
          Components: highlighter
    Affects Versions: 3.6
            Reporter: Ahmet Arslan
            Priority: Minor

When highlighting is requested on an un-indexed field (for the second time), InvalidTokenOffsetsException is thrown. http://lucene.472066.n3.nabble.com/search-highlight-InvalidTokenOffsetsException-in-Solr-3-5-td3560997.html#a3793593
[jira] [Updated] (SOLR-3193) highlighting on an unindexed field throws InvalidTokenOffsetsException
[ https://issues.apache.org/jira/browse/SOLR-3193?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Ahmet Arslan updated SOLR-3193:
-------------------------------
    Attachment: SOLR-3193.patch

Test case that demonstrates the bug.

> highlighting on an unindexed field throws InvalidTokenOffsetsException
>                 Key: SOLR-3193
>                 URL: https://issues.apache.org/jira/browse/SOLR-3193
[jira] [Commented] (SOLR-3193) highlighting on an unindexed field throws InvalidTokenOffsetsException
[ https://issues.apache.org/jira/browse/SOLR-3193?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13221083#comment-13221083 ]

Ahmet Arslan commented on SOLR-3193:
------------------------------------

If solr.ReversedWildcardFilterFactory is removed from the index analyzer, the attached test passes.

> highlighting on an unindexed field throws InvalidTokenOffsetsException
>                 Key: SOLR-3193
>                 URL: https://issues.apache.org/jira/browse/SOLR-3193
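For context, the comment refers to a fieldType of roughly this shape (a hypothetical schema.xml sketch; the field and analyzer names are illustrative, the placement of the filter in the index analyzer is the point):

{code:xml}
<fieldType name="text_rev" class="solr.TextField">
  <analyzer type="index">
    <tokenizer class="solr.WhitespaceTokenizerFactory"/>
    <!-- removing this filter from the index analyzer makes the attached test pass -->
    <filter class="solr.ReversedWildcardFilterFactory" withOriginal="true"/>
  </analyzer>
  <analyzer type="query">
    <tokenizer class="solr.WhitespaceTokenizerFactory"/>
  </analyzer>
</fieldType>
{code}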
[jira] Updated: (SOLR-1604) Wildcards, ORs etc inside Phrase Queries
[ https://issues.apache.org/jira/browse/SOLR-1604?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Ahmet Arslan updated SOLR-1604:
-------------------------------
    Attachment: ComplexPhrase.zip

There is a need for un-ordered proximity search: http://search-lucene.com/m/3W9fj2yzNy82/

A configurable inOrder parameter is added; the default behavior is {color:blue} true {color}. The configuration below can be used to obtain the same behavior as [PhraseQuery|http://lucene.apache.org/java/3_0_2/api/all/org/apache/lucene/search/PhraseQuery.html], in which the order of terms is not important.

{code:xml}
<queryParser name="complexphrase" class="org.apache.solr.search.ComplexPhraseQParserPlugin">
  <bool name="inOrder">false</bool>
</queryParser>
{code}

> Wildcards, ORs etc inside Phrase Queries
>                 Key: SOLR-1604
>                 URL: https://issues.apache.org/jira/browse/SOLR-1604
[jira] Updated: (SOLR-1604) Wildcards, ORs etc inside Phrase Queries
[ https://issues.apache.org/jira/browse/SOLR-1604?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Ahmet Arslan updated SOLR-1604:
-------------------------------
    Attachment: ComplexPhrase.zip

A hyphen inside the phrase causes an exception, e.g. "sulfur-reducing bacteria". Terje Eggestad's [fix|https://issues.apache.org/jira/browse/LUCENE-1486?focusedCommentId=12900278&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12900278] is added.

> Wildcards, ORs etc inside Phrase Queries
>                 Key: SOLR-1604
>                 URL: https://issues.apache.org/jira/browse/SOLR-1604
[jira] [Commented] (SOLR-1604) Wildcards, ORs etc inside Phrase Queries
Ahmet Arslan commented on SOLR-1604:
------------------------------------

{quote}Could I get the grammar file (.jj file) for the ComplexPhrase one? It's not there as part of the patch/zip file.{quote}

It does not have a separate grammar file. It just extends QueryParser.

> Wildcards, ORs etc inside Phrase Queries
>                 Key: SOLR-1604
[jira] [Created] (SOLR-3759) mistakes about example-DIH
Ahmet Arslan created SOLR-3759:
-------------------------------

             Summary: mistakes about example-DIH
          Issue Type: Bug
            Assignee: Unassigned
          Components: contrib - DataImportHandler, documentation
             Created: 26/Aug/12 17:23
             Project: Solr
            Priority: Minor
            Reporter: Ahmet Arslan

Description: The mail core's solrconfig.xml lacks a lib directive for contrib/extraction/lib.
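A sketch of the kind of directive the report says is missing (the relative path is illustrative and depends on where the example-DIH cores live in the checkout):

{code:xml}
<!-- in example-DIH/solr/mail/conf/solrconfig.xml -->
<lib dir="../../../../contrib/extraction/lib" regex=".*\.jar" />
{code}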
[jira] [Updated] (SOLR-3759) mistakes about example-DIH
Ahmet Arslan updated SOLR-3759:
-------------------------------
    Attachment: SOLR-3759.patch

Change By: Ahmet Arslan (26/Aug/12 17:24)
[jira] [Updated] (SOLR-3759) mistakes about example-DIH
Ahmet Arslan updated SOLR-3759:
-------------------------------
    Attachment: SOLR-3759.patch

Also missing: AdminHandlers for the tika core, and PingRequestHandler for all cores.

Change By: Ahmet Arslan (26/Aug/12 17:49)
[jira] [Updated] (SOLR-3759) mistakes about example-DIH
[ https://issues.apache.org/jira/browse/SOLR-3759?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Ahmet Arslan updated SOLR-3759:
-------------------------------
    Fix Version/s: 4.0

> mistakes about example-DIH
> --------------------------
>
>                 Key: SOLR-3759
>                 URL: https://issues.apache.org/jira/browse/SOLR-3759
>             Project: Solr
>          Issue Type: Bug
>          Components: contrib - DataImportHandler, documentation
>            Reporter: Ahmet Arslan
>            Priority: Minor
>             Fix For: 4.0
>         Attachments: SOLR-3759.patch, SOLR-3759.patch
>
> mail core's solrconfig.xml lacks lib directive for contrib/extraction/lib.
[jira] [Created] (SOLR-3779) LineEntityProcessor processes only one document
Ahmet Arslan created SOLR-3779:
-------------------------------

             Summary: LineEntityProcessor processes only one document
                 Key: SOLR-3779
                 URL: https://issues.apache.org/jira/browse/SOLR-3779
             Project: Solr
          Issue Type: Bug
          Components: contrib - DataImportHandler
    Affects Versions: 4.0-BETA
            Reporter: Ahmet Arslan
             Fix For: 4.0

LineEntityProcessor processes only one document when combined with FileListEntityProcessor.

{code:xml}
<dataConfig>
  <dataSource type="FileDataSource" encoding="UTF-8" name="fds"/>
  <document>
    <entity name="f" processor="FileListEntityProcessor" fileName=".*txt"
            baseDir="/Volumes/data/Documents" recursive="false" rootEntity="false"
            dataSource="null" transformer="TemplateTransformer">
      <entity onError="skip" name="jc" processor="LineEntityProcessor"
              url="${f.fileAbsolutePath}" dataSource="fds" rootEntity="true"
              transformer="TemplateTransformer">
        <field column="link" template="hello${f.fileAbsolutePath},${jc.rawLine}"/>
        <field column="rawLine" name="rawLine"/>
      </entity>
    </entity>
  </document>
</dataConfig>
{code}