[Lucene.Net] [jira] [Updated] (LUCENENET-430) Contrib.ChainedFilter
[ https://issues.apache.org/jira/browse/LUCENENET-430?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Digy updated LUCENENET-430:
---------------------------
    Attachment: ChainedFilterTest.cs
                ChainedFilter.cs

Contrib.ChainedFilter
---------------------
                Key: LUCENENET-430
                URL: https://issues.apache.org/jira/browse/LUCENENET-430
            Project: Lucene.Net
         Issue Type: New Feature
   Affects Versions: Lucene.Net 2.9.4g
           Reporter: Digy
           Priority: Minor
            Fix For: Lucene.Net 2.9.4g
        Attachments: ChainedFilter.cs, ChainedFilterTest.cs

Port of Lucene Java 3.0.3's ChainedFilter and its test cases. See the StackOverflow question "How to combine multiple filters within one search?": http://stackoverflow.com/questions/6570477/multiple-filters-in-lucene-net

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira
[Lucene.Net] [jira] [Created] (LUCENENET-430) Contrib.ChainedFilter
Contrib.ChainedFilter
---------------------
                Key: LUCENENET-430
                URL: https://issues.apache.org/jira/browse/LUCENENET-430
            Project: Lucene.Net
         Issue Type: New Feature
   Affects Versions: Lucene.Net 2.9.4g
           Reporter: Digy
           Priority: Minor
            Fix For: Lucene.Net 2.9.4g
        Attachments: ChainedFilter.cs, ChainedFilterTest.cs

Port of Lucene Java 3.0.3's ChainedFilter and its test cases. See the StackOverflow question "How to combine multiple filters within one search?": http://stackoverflow.com/questions/6570477/multiple-filters-in-lucene-net
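A ChainedFilter applies an ordered sequence of filters to one search, combining their matching-document sets with boolean logic (AND, OR, ANDNOT, XOR in the Java original). The core idea can be sketched outside Lucene with plain document-id sets; everything below is illustrative stand-in code, not the Lucene.Net API:

```python
# Minimal sketch of ChainedFilter-style logic: each "filter" is a callable
# that yields the set of matching doc ids, and the chain combines results
# with a per-filter boolean operator. Plain sets stand in for DocIdSet.
AND, OR, ANDNOT = "AND", "OR", "ANDNOT"

def chain_filters(all_docs, filters, ops):
    """Apply each filter in order, combining with its operator.
    The first filter simply seeds the result set."""
    result = None
    for f, op in zip(filters, ops):
        matched = f(all_docs)
        if result is None:
            result = set(matched)
        elif op == AND:
            result &= matched
        elif op == OR:
            result |= matched
        elif op == ANDNOT:
            result -= matched
    return result

docs = set(range(10))
even = lambda d: {x for x in d if x % 2 == 0}
small = lambda d: {x for x in d if x < 5}
print(sorted(chain_filters(docs, [even, small], [AND, AND])))  # → [0, 2, 4]
```

This is also essentially the answer to the linked StackOverflow question: several independent filters reduced to one filter by set intersection or union.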
Re: [VOTE] Release PyLucene 3.3.0
Sorry, I should have included my errors 1st time around:

In file included from build/_lucene/__wrap03__.cpp:514:
build/_lucene/org/apache/lucene/search/grouping/AbstractSecondPassGroupingCollector$SearchGroupDocs.h:55: error: expected unqualified-id before '' token
build/_lucene/org/apache/lucene/search/grouping/AbstractSecondPassGroupingCollector$SearchGroupDocs.h:55: error: expected ',' or '...' before '' token
build/_lucene/__wrap03__.cpp:548: error: expected unqualified-id before '' token
build/_lucene/__wrap03__.cpp:548: error: expected ',' or '...' before '' token
build/_lucene/__wrap03__.cpp: In constructor 'org::apache::lucene::search::grouping::AbstractSecondPassGroupingCollector$SearchGroupDocs::AbstractSecondPassGroupingCollector$SearchGroupDocs()':
build/_lucene/__wrap03__.cpp:548: error: 'a0' was not declared in this scope
build/_lucene/__wrap03__.cpp:548: error: 'a1' was not declared in this scope
build/_lucene/__wrap03__.cpp:548: error: 'a2' was not declared in this scope
build/_lucene/__wrap03__.cpp: In function 'int org::apache::lucene::search::grouping::t_AbstractSecondPassGroupingCollector$SearchGroupDocs_init_(org::apache::lucene::search::grouping::t_AbstractSecondPassGroupingCollector$SearchGroupDocs*, PyObject*, PyObject*)':
build/_lucene/__wrap03__.cpp:653: error: 'AbstractSecondPassGroupingCollector' is not a member of 'org::apache::lucene::search::grouping'
build/_lucene/__wrap03__.cpp:653: error: expected `;' before 'a0'
build/_lucene/__wrap03__.cpp:660: error: 'org::apache::lucene::search::grouping::AbstractSecondPassGroupingCollector' has not been declared
build/_lucene/__wrap03__.cpp:660: error: 'a0' was not declared in this scope
build/_lucene/__wrap03__.cpp:660: error: 'org::apache::lucene::search::grouping::t_AbstractSecondPassGroupingCollector' has not been declared
error: command 'gcc-4.2' failed with exit status 1
make: *** [compile] Error 1

My env is OS X 10.6.6, Apple's build of Python (2.6.1), Java 1.6.0_22.
Mike McCandless
http://blog.mikemccandless.com

On Sun, Jul 3, 2011 at 12:17 PM, Andi Vajda va...@apache.org wrote:

Hi Mike,

On Sun, 3 Jul 2011, Michael McCandless wrote:

Re-send, this time to pylucene-dev:

Everything looks good -- I was able to compile, run all tests successfully, and run my usual smoke test (indexing, optimizing, searching on the first 100K wikipedia docs), but... I then tried to enable the grouping module (lucene/contrib/grouping), by adding a GROUPING_JAR matching all the other contrib jars, and running make. This then hit various compilation errors -- is anyone able to enable the grouping module and compile successfully?

What kind of errors? So I added the grouping module to the PyLucene branch_3x build and it just built (tm). I even committed the change to the build (rev 1142455), but I didn't check that the grouping module was functional in PyLucene, as I didn't port any unit tests or even know much about it.

Andi..

Mike McCandless
http://blog.mikemccandless.com

On Sun, Jul 3, 2011 at 10:14 AM, Michael McCandless luc...@mikemccandless.com wrote:

Everything looks good -- I was able to compile, run all tests successfully, and run my usual smoke test (indexing, optimizing, searching on the first 100K wikipedia docs), but... I then tried to enable the grouping module (lucene/contrib/grouping), by adding a GROUPING_JAR matching all the other contrib jars, and running make. This then hit various compilation errors -- is anyone able to enable the grouping module and compile successfully?

Mike McCandless
http://blog.mikemccandless.com

On Fri, Jul 1, 2011 at 8:24 AM, Andi Vajda va...@apache.org wrote:

The PyLucene 3.3.0-1 release, closely tracking the recent release of Lucene Java 3.3, is ready.
A release candidate is available from:
http://people.apache.org/~vajda/staging_area/

A list of changes in this release can be seen at:
http://svn.apache.org/repos/asf/lucene/pylucene/branches/pylucene_3_3/CHANGES

PyLucene 3.3.0 is built with JCC 2.9, included in these release artifacts.

A list of Lucene Java changes can be seen at:
http://svn.apache.org/repos/asf/lucene/dev/tags/lucene_solr_3_3/lucene/CHANGES.txt

Please vote to release these artifacts as PyLucene 3.3.0-1.

Thanks !

Andi..

ps: the KEYS file for PyLucene release signing is at:
http://svn.apache.org/repos/asf/lucene/pylucene/dist/KEYS
http://people.apache.org/~vajda/staging_area/KEYS

pps: here is my +1
Problems building JCC
Hi,

This is likely another FAQ, but I've moved to a Windows 7 machine (64-bit) and am trying to compile JCC. mingw32 compiler, JDK, JRE installed. I'm getting a "libjcc.a - No such file or directory" error. javac is available at the command prompt. Building with:

python setup.py build --compiler=mingw32

Any help highly appreciated.

/Petrus

writing build\temp.win32-2.6\Release\jcc\sources\jcc.def
C:\Program Files (x86)\pythonxy\mingw\bin\g++.exe -mno-cygwin -mdll -static --entry _DllMain@12 -Wl,--out-implib,build\lib.win32-2.6\jcc\jcc.lib --output-lib build\temp.win32-2.6\Release\jcc\sources\libjcc.a --def build\temp.win32-2.6\Release\jcc\sources\jcc.def -s build\temp.win32-2.6\Release\jcc\sources\jcc.o build\temp.win32-2.6\Release\jcc\sources\jccenv.o -LC:\Python26\libs -LC:\Python26\PCbuild -lpython26 -lmsvcr90 -o build\lib.win32-2.6\jcc.dll -LC:\Program Files (x86)\Java\jdk1.6.0_26/lib -ljvm -Wl,-S -Wl,--out-implib,jcc\jcc.lib
g++: build\temp.win32-2.6\Release\jcc\sources\libjcc.a: No such file or directory
error: command 'g++' failed with exit status 1
[jira] [Updated] (LUCENE-2979) Simplify configuration API of contrib Query Parser
[ https://issues.apache.org/jira/browse/LUCENE-2979?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Phillipe Ramalho updated LUCENE-2979:
-------------------------------------
    Attachment: LUCENE-2979_phillipe_ramalho_2.patch

As Adriano asked me, here is the first patch ready to be committed. It includes javadoc, package.html and overview.html, updated based on the changes I made to the code. I am still working on integrating the new API with the old API.

Simplify configuration API of contrib Query Parser
--------------------------------------------------
                Key: LUCENE-2979
                URL: https://issues.apache.org/jira/browse/LUCENE-2979
            Project: Lucene - Java
         Issue Type: Improvement
         Components: modules/other
   Affects Versions: 2.9, 3.0
           Reporter: Adriano Crestani
           Assignee: Adriano Crestani
             Labels: api-change, gsoc, gsoc2011, lucene-gsoc-11, mentor
            Fix For: 3.4, 4.0
        Attachments: LUCENE-2979_phillipe_ramalho_2.patch, LUCENE-2979_phillipe_reamalho.patch

The current configuration API is very complicated and inherits the concept used by the Attribute API to store token information in token streams. However, the requirements for the two (QP config and token stream) are not the same, so they shouldn't be using the same mechanism. I propose to simplify the QP config and make it less scary for people intending to use the contrib QP. The task is not difficult; it will just require a lot of code change and figuring out the best way to do it. That's why it's a good candidate for a GSoC project. I would like to hear good proposals about how to make the API more friendly and less scary :)

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org
[JENKINS] Lucene-Solr-tests-only-trunk - Build # 9300 - Failure
Build: https://builds.apache.org/job/Lucene-Solr-tests-only-trunk/9300/

1 tests failed.

REGRESSION: org.apache.solr.client.solrj.TestLBHttpSolrServer.testSimple

Error Message: expected:3 but was:2

Stack Trace:
junit.framework.AssertionFailedError: expected:3 but was:2
    at org.apache.lucene.util.LuceneTestCase$LuceneTestCaseRunner.runChild(LuceneTestCase.java:1430)
    at org.apache.lucene.util.LuceneTestCase$LuceneTestCaseRunner.runChild(LuceneTestCase.java:1348)
    at org.apache.solr.client.solrj.TestLBHttpSolrServer.testSimple(TestLBHttpSolrServer.java:127)

Build Log (for compile errors):
[...truncated 7907 lines...]
[jira] [Commented] (LUCENE-2979) Simplify configuration API of contrib Query Parser
[ https://issues.apache.org/jira/browse/LUCENE-2979?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13059329#comment-13059329 ]

Adriano Crestani commented on LUCENE-2979:
------------------------------------------

Hi Phillipe, thanks for the patch. However, since you made many changes to the javadocs, I decided to run "ant javadocs", and it fails. It seems your patch references the constants in StandardQueryConfigHandler.ConfigurationKeys many times using the @see tag, but you forgot to write javadoc for those constants, which causes the ant script to fail. Please add the missing javadocs, run "ant javadocs" on contrib/queryparser to check that it finishes successfully, and then submit a new patch. Besides that, great job; tests are running fine even after your big change :) Thanks!

Simplify configuration API of contrib Query Parser
--------------------------------------------------
                Key: LUCENE-2979
                URL: https://issues.apache.org/jira/browse/LUCENE-2979
[jira] [Commented] (SOLR-1499) SolrEntityProcessor - DIH EntityProcessor that queries an external Solr via SolrJ
[ https://issues.apache.org/jira/browse/SOLR-1499?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13059333#comment-13059333 ]

Ahmet Arslan commented on SOLR-1499:
------------------------------------

Lance, I used it once to upgrade.

SolrEntityProcessor - DIH EntityProcessor that queries an external Solr via SolrJ
---------------------------------------------------------------------------------
                Key: SOLR-1499
                URL: https://issues.apache.org/jira/browse/SOLR-1499
            Project: Solr
         Issue Type: New Feature
         Components: contrib - DataImportHandler
           Reporter: Lance Norskog
            Fix For: 3.4, 4.0
        Attachments: SOLR-1499.patch, SOLR-1499.patch, SOLR-1499.patch, SOLR-1499.patch, SOLR-1499.patch, SOLR-1499.patch

The SolrEntityProcessor queries an external Solr instance. The Solr documents returned are unpacked and emitted as DIH fields. The SolrEntityProcessor uses the following attributes:

* solr='http://localhost:8983/solr/sms'
** This gives the URL of the target Solr instance.
*** Note: the connection to the target Solr uses the binary SolrJ format.
* query='Jefferson&sort=id+asc'
** This gives the base query string used with Solr. It can include any standard Solr request parameter. This attribute is processed under the variable resolution rules and can be driven in an inner stage of the indexing pipeline.
* rows='10'
** This gives the number of rows to fetch per request.
** The SolrEntityProcessor always fetches every document that matches the request.
* fields='id,tag'
** This selects the fields to be returned from the Solr request.
** These must also be declared as field elements.
** As with all fields, template processors can be used to alter the contents to be passed downwards.
* timeout='30'
** This limits the query to 30 seconds. This can be used as a fail-safe to prevent the indexing session from freezing up. By default the timeout is 5 minutes.

Limitations:
* Solr errors are not handled correctly.
* Loop control constructs have not been tested.
* Multi-valued returned fields have not been tested.
The unit tests give examples of how to use it as the root entity and an inner entity.
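The rows behavior described above (fetch one page per request, but always drain every matching document) boils down to a paging loop over start/rows. A minimal sketch, where `search(start, rows)` is a made-up stand-in for the SolrJ request and not the actual DIH code:

```python
def fetch_all(search, rows=10):
    """Drain every matching document `rows` at a time, advancing `start`
    until the reported match count is exhausted."""
    start, docs = 0, []
    while True:
        num_found, page = search(start=start, rows=rows)
        docs.extend(page)
        start += rows
        if start >= num_found or not page:
            break
    return docs

# Toy "Solr" backed by a list: returns (numFound, slice of results).
corpus = [{"id": i, "tag": "t%d" % i} for i in range(23)]
def search(start, rows):
    return len(corpus), corpus[start:start + rows]

print(len(fetch_all(search)))  # → 23
```

With rows=10 and 23 matches, the loop issues three requests (0-9, 10-19, 20-22) and emits all 23 documents, which is the fetch-everything behavior the attribute list describes.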
[jira] [Updated] (LUCENE-3273) Convert Lucene Core tests over to a simple MockQueryParser
[ https://issues.apache.org/jira/browse/LUCENE-3273?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Chris Male updated LUCENE-3273:
-------------------------------
    Attachment: LUCENE-3273.patch

Patch with a first shot at this:

- MockQueryParser is introduced. It handles a very simple syntax consisting of boolean operators and can identify wildcard queries. It cannot handle complex BooleanQuerys, boosts or PhraseQuerys.
- QueryBuilderHelper is introduced, which provides some utility methods for building queries (currently just to create a TermQuery with a boost).
- BooleanQueryBuilder and PhraseQueryBuilder are introduced to ease the process of programmatically creating complex BooleanQuerys and PhraseQuerys.
- All core Lucene tests (apart from those in the queryparser package) have been moved away from relying on QueryParser. In extremely trivial situations, TermQuerys are now directly instantiated. In others, the MockQueryParser is used. In complex scenarios, the Builder classes are used to programmatically create the queries.
- Some tests have been split up and moved around. Tests that did both parsing assertions and search assertions have been split, so the parsing assertions go into TestQueryParser (since they are testing the QP's supported language).

The next step is to visit the contrib tests and clear those out too, so we can prevent any back dependencies on the queryparser module.

Convert Lucene Core tests over to a simple MockQueryParser
----------------------------------------------------------
                Key: LUCENE-3273
                URL: https://issues.apache.org/jira/browse/LUCENE-3273
            Project: Lucene - Java
         Issue Type: Sub-task
         Components: core/other
           Reporter: Chris Male
        Attachments: LUCENE-3273.patch

Most tests use Lucene Core's QueryParser for convenience. We want to consolidate it into a QP module, which we can't have as a dependency. We should add a simple MockQueryParser which does String.split() on the query string, analyzes the terms and builds a BooleanQuery if necessary.
Any more complex Queries (such as phrases) should be done programmatically.
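The proposed MockQueryParser (split the query string on whitespace, analyze each term, recognize wildcards, and wrap multiple terms in a BooleanQuery) can be sketched in a few lines. This is a toy model with made-up tuple-based query representations, not the attached patch:

```python
def mock_parse(query, analyze=str.lower):
    """Toy MockQueryParser: whitespace split, per-term analysis,
    wildcard detection, and a "boolean" wrapper only when needed."""
    clauses = []
    for t in query.split():
        kind = "wildcard" if ("*" in t or "?" in t) else "term"
        clauses.append((kind, analyze(t)))
    if len(clauses) == 1:
        return clauses[0]  # single term: no BooleanQuery wrapper
    return ("boolean", clauses)

print(mock_parse("Hello"))       # → ('term', 'hello')
print(mock_parse("foo ba*")[0])  # → boolean
```

Exactly as the issue description asks, anything beyond this (phrases, boosts, nested booleans) stays out of the parser and is built programmatically.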
[jira] [Commented] (SOLR-2632) Highlighting does not work for embedded boost query that boosts a dismax query
[ https://issues.apache.org/jira/browse/SOLR-2632?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13059371#comment-13059371 ]

Koji Sekiguchi commented on SOLR-2632:
--------------------------------------

{quote}
http://localhost:8983/solr/select?q=%2binStock:true%20%2b_query_:%22{!boost%20b=$dateboost%20v=$qq%20defType=dismax}%22&qq=test&qf=name&dateboost=recip%28ms%28NOW,last_modified%29,3.16e-11,1,1%29&hl=true&hl.fl=name

For this query, highlighting does not work. Specifying hl.fl or not does not influence the result. The result is:

<lst name="highlighting">
  <lst name="GB18030TEST"/>
  <lst name="UTF8TEST"/>
</lst>
{quote}

This request creates a BooleanQuery composed of TermQuery(inStock, true) and a BoostedQuery. Lucene's Highlighter knows TermQuery but doesn't know how to deal with Solr's BoostedQuery. The BoostedQuery includes the TermQuery(name, test) that you want to highlight, but since the Highlighter doesn't understand BoostedQuery, it ignores the entire BoostedQuery.

Highlighting does not work for embedded boost query that boosts a dismax query
------------------------------------------------------------------------------
                Key: SOLR-2632
                URL: https://issues.apache.org/jira/browse/SOLR-2632
            Project: Solr
         Issue Type: Bug
         Components: highlighter
   Affects Versions: 1.4.1, 3.2, 3.3
        Environment: Linux. Reproduced on different machines with different Linux distributions and different JDKs. Solr 3.3 and Lucidworks for Solr 1.4.1 and 3.2.
           Reporter: Juan Antonio Farré Basurte
           Priority: Minor
             Labels: _query_, boost, dismax, edismax, embedded, highlighting, hl.fl, query

I need to issue a dismax query with a date boost (I'd like to use the multiplicative boost provided by boost queries) and also filter on other fields with too many possible distinct values to fit in a filter query. To achieve this, I use the boost query as a nested query via the pseudo-field _query_. I also need highlighting for the fields used in the dismax query, but highlighting does not work.
If I just use the boosted dismax query without embedding it inside another query, it works correctly. If I use bf instead of a boost query and directly embed the dismax query, it works too, but hl.fl needs to be specified. It's a bit complicated to explain, so I'll give examples using the example data that comes with Solr (the problem is reproducible in the example Solr distribution, not only in my concrete project).

http://localhost:8983/solr/select?q=%2binStock:true%20%2b_query_:%22{!boost%20b=$dateboost%20v=$qq%20defType=dismax}%22&qq=test&qf=name&dateboost=recip%28ms%28NOW,last_modified%29,3.16e-11,1,1%29&hl=true&hl.fl=name

For this query, highlighting does not work. Specifying hl.fl or not does not influence the result. The result is:

<lst name="highlighting">
  <lst name="GB18030TEST"/>
  <lst name="UTF8TEST"/>
</lst>

http://localhost:8983/solr/select?q=_query_:%22{!boost%20b=$dateboost%20v=$qq%20defType=dismax}%22&qq=test&qf=name&dateboost=recip%28ms%28NOW,last_modified%29,3.16e-11,1,1%29&hl=true&hl.fl=name

This doesn't work either. Same result.

http://localhost:8983/solr/select?q={!boost b=$dateboost v=$qq defType=dismax}&qq=test&qf=name&dateboost=recip(ms(NOW,last_modified),3.16e-11,1,1)&hl=true

In this case, highlighting works correctly:

<lst name="highlighting">
  <lst name="GB18030TEST">
    <arr name="name">
      <str><em>Test</em> with some GB18030 encoded characters</str>
    </arr>
  </lst>
  <lst name="UTF8TEST">
    <arr name="name">
      <str><em>Test</em> with some UTF-8 encoded characters</str>
    </arr>
  </lst>
</lst>

http://localhost:8983/solr/select?q=%2BinStock:true%20%2B_query_:%22{!dismax%20v=$qq}%22&qq=test&qf=name&bf=recip%28ms%28NOW,last_modified%29,3.16e-11,1,1%29&hl=true&hl.fl=name

This also works. Same result as before. But in this case hl.fl is needed; without it, highlighting does not work either.

Thanks.
[jira] [Commented] (SOLR-2632) Highlighting does not work for embedded boost query that boosts a dismax query
[ https://issues.apache.org/jira/browse/SOLR-2632?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13059375#comment-13059375 ]

Juan Antonio Farré Basurte commented on SOLR-2632:
--------------------------------------------------

Sounds logical, but... if the highlighter doesn't know how to deal with BoostedQuery, then why does it work when I issue the boosted query alone, without embedding it in the boolean query? Maybe I'm wrong, but it looks to me more like a problem with embedding the boosted query into the boolean query than a problem with the boosted query itself. In fact, as you can see in my examples, if I directly embed the dismax query (without the boost query) in the boolean query, it works, but it requires specifying hl.fl, when I believe it should just use qf. My feeling is that the highlighter has problems dealing with embedded queries, and the problems get worse if you embed boosted queries.

Highlighting does not work for embedded boost query that boosts a dismax query
------------------------------------------------------------------------------
                Key: SOLR-2632
                URL: https://issues.apache.org/jira/browse/SOLR-2632
[jira] [Updated] (LUCENE-3269) Speed up Top-K sampling tests
[ https://issues.apache.org/jira/browse/LUCENE-3269?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Robert Muir updated LUCENE-3269:
--------------------------------
    Attachment: LUCENE-3269.patch

Here's a patch that speeds up the slowest ones a bit (it doesn't really solve the problem, but helps as a step).

Speed up Top-K sampling tests
-----------------------------
                Key: LUCENE-3269
                URL: https://issues.apache.org/jira/browse/LUCENE-3269
            Project: Lucene - Java
         Issue Type: Test
         Components: modules/facet
           Reporter: Robert Muir
            Fix For: 3.4, 4.0
        Attachments: LUCENE-3269.patch

Speed up the top-K sampling tests (but make sure they are still thorough on nightly etc.). Usually we would do this with atLeast(), but these tests are somewhat tricky, so maybe a different approach is needed.
[jira] [Commented] (LUCENE-2878) Allow Scorer to expose positions and payloads aka. nuke spans
[ https://issues.apache.org/jira/browse/LUCENE-2878?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13059381#comment-13059381 ]

Simon Willnauer commented on LUCENE-2878:
-----------------------------------------

Mike, it's so awesome that you are helping here. I will be back on Wednesday and post comments / suggestions then.

simon

Allow Scorer to expose positions and payloads aka. nuke spans
-------------------------------------------------------------
                Key: LUCENE-2878
                URL: https://issues.apache.org/jira/browse/LUCENE-2878
            Project: Lucene - Java
         Issue Type: Improvement
         Components: core/search
   Affects Versions: Bulk Postings branch
           Reporter: Simon Willnauer
           Assignee: Simon Willnauer
             Labels: gsoc2011, lucene-gsoc-11, mentor
        Attachments: LUCENE-2878-OR.patch, LUCENE-2878.patch, LUCENE-2878.patch, LUCENE-2878.patch, LUCENE-2878.patch, LUCENE-2878_trunk.patch, LUCENE-2878_trunk.patch, PosHighlighter.patch, PosHighlighter.patch

Currently we have two somewhat separate types of queries: those that can make use of positions (mainly spans) and payloads (spans). Yet Span*Query doesn't really do scoring comparable to what other queries do, and at the end of the day they duplicate a lot of code all over Lucene. Span*Queries are also limited to other Span*Query instances, so you cannot use a TermQuery or a BooleanQuery with SpanNear or anything like that. Besides the Span*Query limitation, other queries lack a quite interesting feature: they cannot score based on term proximity, since scorers don't expose any positional information. All those problems bugged me for a while, so I started working on this using the bulk postings API. I would have done the first cut on trunk, but TermScorer there works on a BlockReader that does not expose positions, while the one in this branch does. I started by adding a new Positions class which users can pull from a scorer; to prevent unnecessary positions enums I added ScorerContext#needsPositions, and eventually Scorer#needsPayloads, to create the corresponding enum on demand.

Yet, currently only TermQuery / TermScorer implements this API; others simply return null instead. To show that the API really works, and that our bulk postings work fine with positions too, I cut TermSpanQuery over to use a TermScorer under the hood and nuked TermSpans entirely. A nice side effect of this was that the positions bulk-reading implementation got some exercise, which now :) works with positions, while payloads for bulk reading are kind of experimental in the patch and only work with the Standard codec. So all spans now work on top of TermScorer (I truly hate spans since today), including the ones that need payloads (StandardCodec ONLY)!! I didn't bother to implement the other codecs yet, since I want to get feedback on the API and on this first cut before I go on with it. I will upload the corresponding patch in a minute. I also had to cut over SpanQuery.getSpans(IR) to SpanQuery.getSpans(AtomicReaderContext), which I should probably do on trunk first, but after that pain today I need a break first :). The patch passes all core tests (org.apache.lucene.search.highlight.HighlighterTest still fails, but I didn't look into the MemoryIndex BulkPostings API yet).
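The proximity scoring that exposed positions would enable reduces to: given the sorted position lists of two terms within a document, find their minimum distance and fold it into the score. A small, library-free illustration of that idea (this is not the patch's Positions API; the function names are invented):

```python
def min_distance(pos_a, pos_b):
    """Smallest gap between any position of term A and any position of
    term B, assuming both lists are sorted ascending (two-pointer merge)."""
    best, i, j = float("inf"), 0, 0
    while i < len(pos_a) and j < len(pos_b):
        best = min(best, abs(pos_a[i] - pos_b[j]))
        if pos_a[i] < pos_b[j]:
            i += 1
        else:
            j += 1
    return best

def proximity_boost(pos_a, pos_b):
    """Closer terms yield a larger boost, as a proximity-aware scorer might."""
    return 1.0 / (1 + min_distance(pos_a, pos_b))

print(min_distance([1, 5, 20], [7, 30]))  # → 2
```

A scorer that can hand out such position lists makes this kind of scoring possible for any query, which is exactly what Span*Query-only positions prevent today.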
[jira] [Commented] (LUCENE-3269) Speed up Top-K sampling tests
[ https://issues.apache.org/jira/browse/LUCENE-3269?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13059388#comment-13059388 ]

Shai Erera commented on LUCENE-3269:
------------------------------------

Patch looks good. One other idea I think we should try is to create the large indexes once for all the Top-K test extensions. There are several references to FacetTestBase.initIndex(), and I think the Top-K tests can create their indexes (which are the same) at @BeforeClass, perhaps one index per partition size that is tested, and then proceed with testing. I think that will cut away a large portion of the running time.

Speed up Top-K sampling tests
-----------------------------
                Key: LUCENE-3269
                URL: https://issues.apache.org/jira/browse/LUCENE-3269
[jira] [Updated] (LUCENE-3269) Speed up Top-K sampling tests
[ https://issues.apache.org/jira/browse/LUCENE-3269?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Robert Muir updated LUCENE-3269:
--------------------------------
    Attachment: LUCENE-3269.patch

Hi Shai, here is an updated patch that achieves the same thing; now the tests don't create redundant indexes.

Speed up Top-K sampling tests
-----------------------------
                Key: LUCENE-3269
                URL: https://issues.apache.org/jira/browse/LUCENE-3269
[jira] [Updated] (LUCENE-3269) Speed up Top-K sampling tests
[ https://issues.apache.org/jira/browse/LUCENE-3269?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Robert Muir updated LUCENE-3269:
--------------------------------
    Attachment: LUCENE-3269.patch

One more tweak; this one seems to help a lot. It allows subclasses to tweak the IWConfig (we use the same trick here that we use for the NRQ tests, to prevent really slow behavior for such large indexes).

Speed up Top-K sampling tests
-----------------------------
                Key: LUCENE-3269
                URL: https://issues.apache.org/jira/browse/LUCENE-3269
[jira] [Commented] (LUCENE-3268) TestScoredDocIDsUtils.testWithDeletions test failure
[ https://issues.apache.org/jira/browse/LUCENE-3268?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13059404#comment-13059404 ] Robert Muir commented on LUCENE-3268: - Hi Shai, I found another fail in this test: ant test -Dtestcase=TestScoredDocIDsUtils -Dtestmethod=testWithDeletions -Dtests.seed=-203625378244176964:-5047330594665853233 TestScoredDocIDsUtils.testWithDeletions test failure Key: LUCENE-3268 URL: https://issues.apache.org/jira/browse/LUCENE-3268 Project: Lucene - Java Issue Type: Bug Components: modules/facet Reporter: Robert Muir Assignee: Shai Erera Fix For: 3.4, 4.0 ant test -Dtestcase=TestScoredDocIDsUtils -Dtestmethod=testWithDeletions -Dtests.seed=-2216133137948616963:2693740419732273624 -Dtests.multiplier=5 In general, on both 3.x and trunk, if you run this test with -Dtests.iter=100 it tends to fail 2% of the time. -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Created] (LUCENE-3275) hang on 1.6.0u26
hang on 1.6.0u26 Key: LUCENE-3275 URL: https://issues.apache.org/jira/browse/LUCENE-3275 Project: Lucene - Java Issue Type: Bug Reporter: Robert Muir on the mac, a system update pushed out an upgrade to 1.6.0u26 basically, if i run 'ant test' from the faceting module, my jre completely hangs (0% cpu, won't even respond to kill -QUIT to print a stacktrace). This is reproducible... it always happens inside SamplingAccumulatorTest. Of course if i run this test by itself, or anything else, it doesn't want to hang... but you should be able to reproduce by running 'ant test -Dtests.threadspercpu=0' which runs all tests sequentially. Acts like http://forums.oracle.com/forums/thread.jspa?threadID=2246699 I think this JRE version (update 26) is broken. If your mac asks you to upgrade, just say no. -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Commented] (LUCENE-3268) TestScoredDocIDsUtils.testWithDeletions test failure
[ https://issues.apache.org/jira/browse/LUCENE-3268?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13059450#comment-13059450 ] Shai Erera commented on LUCENE-3268: Committed revision 1142675 (3x). Committed revision 1142676 (trunk). TestScoredDocIDsUtils.testWithDeletions test failure Key: LUCENE-3268 URL: https://issues.apache.org/jira/browse/LUCENE-3268 Project: Lucene - Java Issue Type: Bug Components: modules/facet Reporter: Robert Muir Assignee: Shai Erera Fix For: 3.4, 4.0 ant test -Dtestcase=TestScoredDocIDsUtils -Dtestmethod=testWithDeletions -Dtests.seed=-2216133137948616963:2693740419732273624 -Dtests.multiplier=5 In general, on both 3.x and trunk, if you run this test with -Dtests.iter=100 it tends to fail 2% of the time. -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Commented] (SOLR-1932) add relevancy function queries
[ https://issues.apache.org/jira/browse/SOLR-1932?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13059453#comment-13059453 ] Yonik Seeley commented on SOLR-1932: Hmm, yeah, I didn't even know about Terms.getSumTotalTermFreq! add relevancy function queries -- Key: SOLR-1932 URL: https://issues.apache.org/jira/browse/SOLR-1932 Project: Solr Issue Type: New Feature Reporter: Yonik Seeley Priority: Minor Fix For: 4.0 Attachments: SOLR-1932.patch, SOLR-1932_totaltermfreq.patch Add function queries for relevancy factors such as tf, idf, etc. -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[JENKINS-MAVEN] Lucene-Solr-Maven-trunk #171: POMs out of sync
Build: https://builds.apache.org/job/Lucene-Solr-Maven-trunk/171/ No tests ran. Build Log (for compile errors): [...truncated 7447 lines...] - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Commented] (SOLR-2632) Highlighting does not work for embedded boost query that boosts a dismax query
[ https://issues.apache.org/jira/browse/SOLR-2632?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13059462#comment-13059462 ] Koji Sekiguchi commented on SOLR-2632: -- bq. What I'm not sure is about the conclusion. Is this a bug that should be corrected? I'm not sure. If getHighlightQuery() is for providing a basic query so that Lucene's highlighter can understand what kind of query it is, it looks like a bug to me. BTW, what do you think of the idea in SOLR-1926? If it can be used, does it solve your problem? Highlighting does not work for embedded boost query that boosts a dismax query -- Key: SOLR-2632 URL: https://issues.apache.org/jira/browse/SOLR-2632 Project: Solr Issue Type: Bug Components: highlighter Affects Versions: 1.4.1, 3.2, 3.3 Environment: Linux. Reproduced in different machines with different Linux distributions and different JDK's. Solr 3.3 and Lucidworks for solr 1.4.1 and 3.2. Reporter: Juan Antonio Farré Basurte Priority: Minor Labels: _query_, boost, dismax, edismax, embedded, highlighting, hl.fl, query I need to issue a dismax query, with date boost (I'd like to use the multiplicative boost provided by boost queries) and also filtering for other fields with too many possible distinct values to fit in a filter query. To achieve this, I use the boost query as a nested query using the pseudofield _query_. I also need highlighting for the fields used in the dismax query, but highlighting does not work. If I just use the boosted dismax query without embedding it inside another query, it works correctly. If I use bf instead of a boost query, and embed directly the dismax query, it works too, but hl.fl needs to be specified. It's a bit complicated to explain, so, I'll give examples using the example data that comes with solr (the problem is reproducible in the example solr distribution, not only in my concrete project). 
http://localhost:8983/solr/select?q=%2binStock:true%20%2b_query_:%22{!boost%20b=$dateboost%20v=$qq%20defType=dismax}%22&qq=test&qf=name&dateboost=recip%28ms%28NOW,last_modified%29,3.16e-11,1,1%29&hl=true&hl.fl=name

For this query, highlighting does not work. Specifying hl.fl or not does not influence the result. The result is:

<lst name="highlighting"><lst name="GB18030TEST"/><lst name="UTF8TEST"/></lst>

http://localhost:8983/solr/select?q=_query_:%22{!boost%20b=$dateboost%20v=$qq%20defType=dismax}%22&qq=test&qf=name&dateboost=recip%28ms%28NOW,last_modified%29,3.16e-11,1,1%29&hl=true&hl.fl=name

This doesn't work either. Same result.

http://localhost:8983/solr/select?q={!boost b=$dateboost v=$qq defType=dismax}&qq=test&qf=name&dateboost=recip(ms(NOW,last_modified),3.16e-11,1,1)&hl=true

In this case, highlighting works correctly:

<lst name="highlighting"><lst name="GB18030TEST"><arr name="name"><str><em>Test</em> with some GB18030 encoded characters</str></arr></lst><lst name="UTF8TEST"><arr name="name"><str><em>Test</em> with some UTF-8 encoded characters</str></arr></lst></lst>

http://localhost:8983/solr/select?q=%2BinStock:true%20%2B_query_:%22{!dismax%20v=$qq}%22&qq=test&qf=name&bf=recip%28ms%28NOW,last_modified%29,3.16e-11,1,1%29&hl=true&hl.fl=name

This also works. Same result as before. But in this case hl.fl is needed. Without it, highlighting does not work, either. Thanks. -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Commented] (SOLR-2632) Highlighting does not work for embedded boost query that boosts a dismax query
[ https://issues.apache.org/jira/browse/SOLR-2632?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13059471#comment-13059471 ] Juan Antonio Farré Basurte commented on SOLR-2632: -- Interesting idea. For my concrete problem, it would probably provide a workaround, yes. The comment by Hoss Man also sounds quite reasonable. I can't think of a situation where having hl.q provides a clear advantage over the hl.text suggested by Hoss Man, though maybe I just haven't come up with the use case. Highlighting does not work for embedded boost query that boosts a dismax query -- Key: SOLR-2632 URL: https://issues.apache.org/jira/browse/SOLR-2632 Project: Solr Issue Type: Bug Components: highlighter Affects Versions: 1.4.1, 3.2, 3.3 Environment: Linux. Reproduced in different machines with different Linux distributions and different JDK's. Solr 3.3 and Lucidworks for solr 1.4.1 and 3.2. Reporter: Juan Antonio Farré Basurte Priority: Minor Labels: _query_, boost, dismax, edismax, embedded, highlighting, hl.fl, query I need to issue a dismax query, with date boost (I'd like to use the multiplicative boost provided by boost queries) and also filtering for other fields with too many possible distinct values to fit in a filter query. To achieve this, I use the boost query as a nested query using the pseudofield _query_. I also need highlighting for the fields used in the dismax query, but highlighting does not work. If I just use the boosted dismax query without embedding it inside another query, it works correctly. If I use bf instead of a boost query, and embed directly the dismax query, it works too, but hl.fl needs to be specified. It's a bit complicated to explain, so, I'll give examples using the example data that comes with solr (the problem is reproducible in the example solr distribution, not only in my concrete project). 
http://localhost:8983/solr/select?q=%2binStock:true%20%2b_query_:%22{!boost%20b=$dateboost%20v=$qq%20defType=dismax}%22&qq=test&qf=name&dateboost=recip%28ms%28NOW,last_modified%29,3.16e-11,1,1%29&hl=true&hl.fl=name

For this query, highlighting does not work. Specifying hl.fl or not does not influence the result. The result is:

<lst name="highlighting"><lst name="GB18030TEST"/><lst name="UTF8TEST"/></lst>

http://localhost:8983/solr/select?q=_query_:%22{!boost%20b=$dateboost%20v=$qq%20defType=dismax}%22&qq=test&qf=name&dateboost=recip%28ms%28NOW,last_modified%29,3.16e-11,1,1%29&hl=true&hl.fl=name

This doesn't work either. Same result.

http://localhost:8983/solr/select?q={!boost b=$dateboost v=$qq defType=dismax}&qq=test&qf=name&dateboost=recip(ms(NOW,last_modified),3.16e-11,1,1)&hl=true

In this case, highlighting works correctly:

<lst name="highlighting"><lst name="GB18030TEST"><arr name="name"><str><em>Test</em> with some GB18030 encoded characters</str></arr></lst><lst name="UTF8TEST"><arr name="name"><str><em>Test</em> with some UTF-8 encoded characters</str></arr></lst></lst>

http://localhost:8983/solr/select?q=%2BinStock:true%20%2B_query_:%22{!dismax%20v=$qq}%22&qq=test&qf=name&bf=recip%28ms%28NOW,last_modified%29,3.16e-11,1,1%29&hl=true&hl.fl=name

This also works. Same result as before. But in this case hl.fl is needed. Without it, highlighting does not work, either. Thanks. -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Updated] (SOLR-1932) add relevancy function queries
[ https://issues.apache.org/jira/browse/SOLR-1932?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Yonik Seeley updated SOLR-1932: --- Attachment: SOLR-1932_sumtotaltermfreq.patch Here's an update that includes sumtotaltermfreq and aliases totaltermfreq to ttf and sumtotaltermfreq to sttf. add relevancy function queries -- Key: SOLR-1932 URL: https://issues.apache.org/jira/browse/SOLR-1932 Project: Solr Issue Type: New Feature Reporter: Yonik Seeley Priority: Minor Fix For: 4.0 Attachments: SOLR-1932.patch, SOLR-1932_sumtotaltermfreq.patch, SOLR-1932_totaltermfreq.patch Add function queries for relevancy factors such as tf, idf, etc. -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
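[Editor's note: the relationship between the two function queries added above -- totaltermfreq (ttf) per term versus sumtotaltermfreq (sttf) over a whole field -- can be illustrated with a small self-contained sketch. This is plain Java with made-up sample frequencies, not Lucene or Solr code; the class and method names are mine.]

{code}
import java.util.LinkedHashMap;
import java.util.Map;

// Illustrative sketch only: sttf(field) is the sum of ttf(field, term)
// over every term in the field. Frequencies below are invented sample data.
public class TermFreqSketch {
    // total occurrences of each term across all documents in one field
    static final Map<String, Long> TOTAL_TERM_FREQS = new LinkedHashMap<>();
    static {
        TOTAL_TERM_FREQS.put("memory", 5L);
        TOTAL_TERM_FREQS.put("disk", 3L);
        TOTAL_TERM_FREQS.put("cache", 2L);
    }

    // like ttf(field, term): total occurrences of one term in the field
    static long totalTermFreq(String term) {
        return TOTAL_TERM_FREQS.getOrDefault(term, 0L);
    }

    // like sttf(field): the sum of totalTermFreq over all terms
    static long sumTotalTermFreq() {
        return TOTAL_TERM_FREQS.values().stream().mapToLong(Long::longValue).sum();
    }

    public static void main(String[] args) {
        System.out.println(totalTermFreq("memory")); // 5
        System.out.println(sumTotalTermFreq());      // 5 + 3 + 2 = 10
    }
}
{code}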
[jira] [Updated] (LUCENE-3220) Implement various ranking models as Similarities
[ https://issues.apache.org/jira/browse/LUCENE-3220?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] David Mark Nemeskey updated LUCENE-3220: Attachment: LUCENE-3220.patch Fixed a few things in MockBM25Similarity. Implement various ranking models as Similarities Key: LUCENE-3220 URL: https://issues.apache.org/jira/browse/LUCENE-3220 Project: Lucene - Java Issue Type: Sub-task Components: core/search Affects Versions: flexscoring branch Reporter: David Mark Nemeskey Assignee: David Mark Nemeskey Labels: gsoc Attachments: LUCENE-3220.patch, LUCENE-3220.patch, LUCENE-3220.patch, LUCENE-3220.patch, LUCENE-3220.patch, LUCENE-3220.patch, LUCENE-3220.patch, LUCENE-3220.patch, LUCENE-3220.patch, LUCENE-3220.patch Original Estimate: 336h Remaining Estimate: 336h With [LUCENE-3174|https://issues.apache.org/jira/browse/LUCENE-3174] done, we can finally work on implementing the standard ranking models. Currently DFR, BM25 and LM are on the menu. TODO: * {{EasyStats}}: contains all statistics that might be relevant for a ranking algorithm * {{EasySimilarity}}: the ancestor of all the other similarities. Hides the DocScorers and as much implementation detail as possible * _BM25_: the current mock implementation might be OK * _LM_ * _DFR_ Done: -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
EmbeddedSolrServer
Hi, Shouldn't the org.apache.solr.client.solrj.embedded.EmbeddedSolrServer class be located under https://svn.apache.org/repos/asf/lucene/dev/trunk/solr/src/solrj instead of https://svn.apache.org/repos/asf/lucene/dev/trunk/solr/src/webapp/src ? Thanks, Clécio - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Resolved] (SOLR-2631) PingRequestHandler can infinite loop if called with a qt that points to itself
[ https://issues.apache.org/jira/browse/SOLR-2631?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Hoss Man resolved SOLR-2631. Resolution: Fixed Uwe, sorry for my brevity -- my point was that you had fixed the infinite loop by adding a sanity check that will throw an error, but the example test configs should also be improved to demonstrate better practices when using the PingRequestHandler so people using them can never encounter the sanity checking you added. Committed revision 1142722. - trunk Committed revision 1142730. - trunk stupid mistake Committed revision 1142731. - 3x PingRequestHandler can infinite loop if called with a qt that points to itself --- Key: SOLR-2631 URL: https://issues.apache.org/jira/browse/SOLR-2631 Project: Solr Issue Type: Bug Components: search, web gui Affects Versions: 1.4, 3.1, 3.2, 3.3 Reporter: Uwe Schindler Assignee: Uwe Schindler Labels: security Fix For: 3.4, 4.0 Attachments: SOLR-2631.patch We got a security report to priv...@lucene.apache.org that Solr can infinite loop, use 100% CPU and stack overflow, if you execute the following HTTP request: - http://localhost:8983/solr/select?qt=/admin/ping - http://localhost:8983/solr/admin/ping?qt=/admin/ping The qt parameter instructs PingRequestHandler to call the given request handler. This leads to an infinite loop. This is not a security issue, but for an unprotected Solr server with an unprotected /solr/select path this makes it stop working. The fix is to prevent the infinite loop by disallowing the handler from calling itself. -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
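[Editor's note: the shape of the sanity check described in SOLR-2631 can be sketched in a few lines. This is a hypothetical, self-contained illustration -- the class and method names are mine, not Solr's -- showing the core idea: a handler must refuse to forward to itself via qt, otherwise it recurses forever.]

{code}
// Hypothetical sketch of a self-reference guard for a ping-style handler.
// A real handler would dispatch to whatever handler is registered at qt;
// here we only model the rejection of a qt that points back at ourselves.
public class PingGuardSketch {
    static String handlePing(String ownPath, String qt) {
        if (qt != null && qt.equals(ownPath)) {
            throw new IllegalArgumentException(
                "qt=" + qt + " points back to this handler (would loop forever)");
        }
        // dispatch to the handler at qt would happen here
        return "OK";
    }

    public static void main(String[] args) {
        System.out.println(handlePing("/admin/ping", "/select"));
        try {
            handlePing("/admin/ping", "/admin/ping");
        } catch (IllegalArgumentException expected) {
            System.out.println("rejected self-reference");
        }
    }
}
{code}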
failonjavadocwarning to false for ant generate-maven-artifacts
Hi, In current trunk, I had to set failonjavadocwarning to false to successfully generate the pom (via ant generate-maven-artifacts). (Invoking ant javadoc in the lucene folder also fails.) I was simply looking for the pom.xml generation, but much more was done. I'm not worried about that (just wanted to share it). Thx. -- Eric - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
Solr - MOD function
Hi, I was looking for a MOD function in SOLR, but I couldn't find it. Is there any solution that isn't directly in SOLR, or can you implement this function (and if you can, when)? It's a very important function for our project. For example, we need to search by five-year period, decade, etc. Regards Radek Majer
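[Editor's note: the arithmetic behind the request -- grouping years into five-year periods or decades with modulo -- is easy to sketch. This is plain Java, not a Solr function query; it only shows the bucketing a mod() function would enable.]

{code}
// Sketch of modulo bucketing: round a year down to the start of the
// span (five-year period, decade, ...) that contains it.
public class ModBuckets {
    static int bucket(int year, int span) {
        return year - (year % span); // first year of the span containing `year`
    }

    public static void main(String[] args) {
        System.out.println(bucket(2011, 5));  // 2010 -> the 2010-2014 period
        System.out.println(bucket(2011, 10)); // 2010 -> the 2010s decade
        System.out.println(bucket(1997, 5));  // 1995
    }
}
{code}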
Re: revisit naming for grouping/join?
: In my example the city was parent -- I raised this example to explain : that index-time joining is more general than just nested docs (ie, I : think we should keep the name join for this module... also because : we should factor out more general search-time-only join capabilities : into it). i think that may be the wrong approach to take when discussing examples, while it's great to say there are dozens of usecases that these features can all support in dozens of diff ways we should really focus on naming/defining these use cases in the ways where they really make the most sense. In other words, i don't think we should say All of these types of problems are different types of nails, and all of these modules are specialty hammers that are slightly distinct from each other in how they work, but you can use any of these hammers on any of these nails instead we should say here are some specialty hammers, you can use them for lots of types of nails, but for each hammer here is the type of nail where it really shines block-index-join as i understand it requires all the docs you want to join up to be in one contiguous range of docids in the index, so if you want to re-index one doc in a block you have to re-index the entire block -- so the city/doctor example doesn't sound like a good generic example of when/why to use this (because a doctor might change his office hours, or address -- maybe even leaving the city completely, while a city might change its population w/o the doctor being affected at all). The book and pages example seems much more appropriate, since in the real world these things change in lock step -- pages aren't added to/removed from a book; pages don't change w/o the book itself being fundamentally changed. the fields of a page document are the text of that page, and that is inherently data about the book -- the fields of a doctor document are metadata about the doctor, and that is not inherently data about the city the doctor lives in. as for the name ... 
i understand why it's called module/join and i understand why the classes are called BlockJoinQuery and BlockJoinCollector but i don't think those names really stand out and convey to end users what they do and how/why they are useful. Personally i think better names would be modules/subdocuments, ParentDocumentQuery and ChildDocumentsCollector I know mccandless isn't a fan of the name Nested Documents because this functionality *can* be used for use cases where the data being modeled is not strictly organized in a nested relationship, but that doesn't mean it's *optimal* or easy for a user to apply to other usecases, because they have to design their model (and their indexing strategy) in such a way that they think of them as nested or hierarchical documents. Naming it module/subdocuments would not only emphasize the use case where it really shines, it would more importantly draw attention to how users have to model their data in order to take advantage of it -- and using ParentDocument and ChildDocuments in the names of the Query/Collector would make it clear what they match on relative to the underlying query that they wrap/collect it would also help distinguish from more general joins like what solr does today -- it seems like that should eventually take the name module/join At a minimum we should rename what we have now to modules/block-join or modules/index-join (but the latter is confusing) and eventually add modules/query-join (yes, yes, block joins provide a query, but the difference is when you have to make a decision about how you want to join your model, at index time or at query time.) -Hoss - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Updated] (LUCENE-3233) HuperDuperSynonymsFilter™
[ https://issues.apache.org/jira/browse/LUCENE-3233?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Michael McCandless updated LUCENE-3233: --- Attachment: LUCENE-3233.patch New patch w/ current state. I think it's closer; the test has more cases now (but I'd still like to make a random test), fewer nocommits, etc. HuperDuperSynonymsFilter™ - Key: LUCENE-3233 URL: https://issues.apache.org/jira/browse/LUCENE-3233 Project: Lucene - Java Issue Type: Improvement Reporter: Robert Muir Attachments: LUCENE-3223.patch, LUCENE-3233.patch, LUCENE-3233.patch The current synonymsfilter uses a lot of ram and cpu, especially at build time. I think yesterday I heard about huge synonyms files three times. So, I think we should use an FST-based structure, sharing the inputs and outputs. And we should be more efficient with the tokenStream api, e.g. using save/restoreState instead of cloneAttributes() -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Commented] (LUCENE-2793) Directory createOutput and openInput should take an IOContext
[ https://issues.apache.org/jira/browse/LUCENE-2793?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13059527#comment-13059527 ] Michael McCandless commented on LUCENE-2793: I think BufferedIndexInput doesn't need a set/getMergeBufferSize? Ie, BII only knows its bufferSize, regardless of the context from its parent. Otherwise I think your patch is good: today on trunk we hardwire the 4 KB buffer size for merges, which is the same thing your patch is doing; the only difference is the constant MERGE_BUFFER_SIZE has moved from IW to BII, and each Dir impl now has the if. As a future improvement we can add a set/getMergeBufferSize to each Dir impl... Directory createOutput and openInput should take an IOContext - Key: LUCENE-2793 URL: https://issues.apache.org/jira/browse/LUCENE-2793 Project: Lucene - Java Issue Type: Improvement Components: core/store Reporter: Michael McCandless Assignee: Varun Thacker Labels: gsoc2011, lucene-gsoc-11, mentor Attachments: LUCENE-2793-nrt.patch, LUCENE-2793.patch, LUCENE-2793.patch, LUCENE-2793.patch, LUCENE-2793.patch, LUCENE-2793.patch, LUCENE-2793.patch, LUCENE-2793.patch, LUCENE-2793.patch, LUCENE-2793.patch, LUCENE-2793.patch, LUCENE-2793.patch, LUCENE-2793.patch, LUCENE-2793.patch, LUCENE-2793.patch, LUCENE-2793.patch, LUCENE-2793.patch, LUCENE-2793.patch, LUCENE-2793.patch, LUCENE-2793.patch, LUCENE-2793.patch Today for merging we pass down a larger readBufferSize than for searching because we get better performance. I think we should generalize this to a class (IOContext), which would hold the buffer size, but then could hold other flags like DIRECT (bypass OS's buffer cache), SEQUENTIAL, etc. Then, we can make the DirectIOLinuxDirectory fully usable because we would only use DIRECT/SEQUENTIAL during merging. This will require fixing how IW pools readers, so that a reader opened for merging is not then used for searching, and vice/versa. 
Really, it's only all the open file handles that need to be different -- we could in theory share del docs, norms, etc, if that were somehow possible. -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
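[Editor's note: the IOContext idea discussed in LUCENE-2793 -- letting the context of an open (merge vs. ordinary read) select the buffer size instead of hard-wiring a merge constant in IndexWriter -- can be sketched self-contained. The names and the 1 KB default below are illustrative only; the 4 KB merge size is the constant the comment above mentions.]

{code}
// Hypothetical sketch of context-driven buffer sizing. Each Directory
// implementation would do this "if" when opening an input, rather than
// IndexWriter passing a special readBufferSize down for merges.
public class IOContextSketch {
    enum Context { READ, MERGE }

    static final int DEFAULT_BUFFER_SIZE = 1024; // illustrative default
    static final int MERGE_BUFFER_SIZE = 4096;   // trunk hard-wires 4 KB for merges

    static int bufferSizeFor(Context ctx) {
        return ctx == Context.MERGE ? MERGE_BUFFER_SIZE : DEFAULT_BUFFER_SIZE;
    }

    public static void main(String[] args) {
        System.out.println(bufferSizeFor(Context.MERGE)); // 4096
        System.out.println(bufferSizeFor(Context.READ));  // 1024
    }
}
{code}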
[jira] [Commented] (LUCENE-3273) Convert Lucene Core tests over to a simple MockQueryParser
[ https://issues.apache.org/jira/browse/LUCENE-3273?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13059531#comment-13059531 ] Hoss Man commented on LUCENE-3273: -- I'm in favor of eliminating the QueryParser dependency, but i feel like this approach of adding things like BooleanQueryBuilder leads us down the road towards tests that are so verbose in query construction it will draw attention away from the important parts of the test -- doing something with those queries. a while back when i wrote TestExplanations, i added a bunch of convenience methods for constructing esoteric queries that i couldn't get cleanly from the QueryParser (mainly spans) -- perhaps we should move towards generalizing that approach ... either in a Utility class where they can be statically imported, or into LuceneTestCase? These days we could even use varargs for things like Phrase, Boolean, and SpanNear queries (we weren't using Java 5 when i wrote the existing ones) That way instead of things like this...

{code}
PhraseQuery q = new PhraseQuery();
// Query "this hi this is a test is"
q.add(new Term("field", "hi"), 1);
q.add(new Term("field", "test"), 5);
assertEquals("field:\"? hi ? ? ? test\"", q.toString());
{code}

...we could have ...

{code}
Query q = phraseQ("field", null, "hi", null, null, null, "test");
assertEquals("field:\"? hi ? ? ? test\"", q.toString());
{code}

And instead of this...

{code}
public void testDMQ8() throws Exception {
  DisjunctionMaxQuery q = new DisjunctionMaxQuery(0.5f);
  q.add(new BooleanQueryBuilder(FIELD)
      .addTermQuery("yy")
      .addQuery(QueryBuilderHelper.newTermQuery(FIELD, "w5", 100))
      .get());
  q.add(QueryBuilderHelper.newTermQuery(FIELD, "xx", 10));
  qtest(q, new int[] { 0, 2, 3 });
}
{code}

...we could have... 
{code}
public void testDMQ8() throws Exception {
  DisjunctionMaxQuery q = new DisjunctionMaxQuery(0.5f);
  q.add(booleanQ(opt(termQ(FIELD, "yy")), opt(termQ(FIELD, "w5", 100))));
  q.add(termQ(FIELD, "xx", 10));
  qtest(q, new int[] { 0, 2, 3 });
}
{code}

Convert Lucene Core tests over to a simple MockQueryParser -- Key: LUCENE-3273 URL: https://issues.apache.org/jira/browse/LUCENE-3273 Project: Lucene - Java Issue Type: Sub-task Components: core/other Reporter: Chris Male Attachments: LUCENE-3273.patch Most tests use Lucene Core's QueryParser for convenience. We want to consolidate it into a QP module which we can't have as a dependency. We should add a simple MockQueryParser which does String.split() on the query string, analyzes the terms and builds a BooleanQuery if necessary. Any more complex Queries (such as phrases) should be done programmatically. -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
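[Editor's note: the varargs helper Hoss proposes above can be sketched self-contained. This sketch only produces the toString() form of a gappy phrase query, with null marking a skipped position; phraseQ here is a hypothetical test utility, not part of Lucene.]

{code}
// Sketch of a varargs phrase helper: nulls become "?" placeholder positions,
// matching the "field:\"? hi ? ? ? test\"" representation in the example above.
public class PhraseHelperSketch {
    static String phraseQ(String field, String... terms) {
        StringBuilder sb = new StringBuilder(field).append(":\"");
        for (int i = 0; i < terms.length; i++) {
            if (i > 0) sb.append(' ');
            sb.append(terms[i] == null ? "?" : terms[i]);
        }
        return sb.append('"').toString();
    }

    public static void main(String[] args) {
        System.out.println(phraseQ("field", null, "hi", null, null, null, "test"));
        // -> field:"? hi ? ? ? test"
    }
}
{code}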
[jira] [Commented] (LUCENE-3273) Convert Lucene Core tests over to a simple MockQueryParser
[ https://issues.apache.org/jira/browse/LUCENE-3273?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13059532#comment-13059532 ] Robert Muir commented on LUCENE-3273: - With all due respect hoss, i'd rather have the former than the latter. The latter reminds me of solr tests which use this approach, I find them extremely painful to read. Convert Lucene Core tests over to a simple MockQueryParser -- Key: LUCENE-3273 URL: https://issues.apache.org/jira/browse/LUCENE-3273 Project: Lucene - Java Issue Type: Sub-task Components: core/other Reporter: Chris Male Attachments: LUCENE-3273.patch Most tests use Lucene Core's QueryParser for convenience. We want to consolidate it into a QP module which we can't have as a dependency. We should add a simple MockQueryParser which does String.split() on the query string, analyzers the terms and builds a BooleanQuery if necessary. Any more complex Queries (such as phrases) should be done programmatically. -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
RE: failonjavadocwarning to false for ant generate-maven-artifacts
Hi Eric, 'ant get-maven-poms' will generate the pom.xml files for you. 'ant generate-maven-artifacts' has to generate the javadoc for each module, and javadoc generation fails on warnings. When the javadoc tool fails to download the package list from Oracle, which seems to happen often, the resulting warning fails the build. Steve -Original Message- From: Eric Charles [mailto:eric.char...@u-mangate.com] Sent: Monday, July 04, 2011 5:07 AM To: dev@lucene.apache.org Subject: failonjavadocwarning to false for ant generate-maven-artifacts Hi, In current trunk, I had to set failonjavadocwarning to false to successfully generate the pom (via ant generate-maven-artifacts). (invoking ant javadoc in lucene folder also fails). I was simply looking for the pom.xml generation, but much more was done. I'm not worry about that (just willing to share it). Thx. -- Eric - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Commented] (LUCENE-3273) Convert Lucene Core tests over to a simple MockQueryParser
[ https://issues.apache.org/jira/browse/LUCENE-3273?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13059533#comment-13059533 ] Hoss Man commented on LUCENE-3273: -- to each his own i guess. I just think it makes sense for utilities that do the banal stuff that's not central to the actual methods being tested should be as short as possible and get the hell out of the way -- the code you actually want to test should be verbose and catch your eye. Convert Lucene Core tests over to a simple MockQueryParser -- Key: LUCENE-3273 URL: https://issues.apache.org/jira/browse/LUCENE-3273 Project: Lucene - Java Issue Type: Sub-task Components: core/other Reporter: Chris Male Attachments: LUCENE-3273.patch Most tests use Lucene Core's QueryParser for convenience. We want to consolidate it into a QP module which we can't have as a dependency. We should add a simple MockQueryParser which does String.split() on the query string, analyzes the terms and builds a BooleanQuery if necessary. Any more complex Queries (such as phrases) should be done programmatically. -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Commented] (LUCENE-3273) Convert Lucene Core tests over to a simple MockQueryParser
[ https://issues.apache.org/jira/browse/LUCENE-3273?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13059537#comment-13059537 ] Robert Muir commented on LUCENE-3273: - the difference here, is that I think in general the tests should use/look like the API. this makes them readable for people (e.g. new contributors) who already know lucene's API to understand what the tests do. For example in the lucene tests we added various randomization, but we tried to make it look just like the API, except deleting a space:

{noformat}
new IndexWriterConfig() -> newIndexWriterConfig()
new Directory()         -> newDirectory()
new Field()             -> newField()
...
{noformat}

in some of these tests, I think its actually *way more clear* to explicitly build the BQs and not use any builders or parsers, especially TestBoolean2 for example. I fear sometimes, people get caught up on more lines of code == bad. I think this is wrong, sometimes more lines of code is good. parsers, builder apis, and helper methods might reduce the number of lines of code, but they add additional layers and obfuscation that makes this a terrible tradeoff. Convert Lucene Core tests over to a simple MockQueryParser -- Key: LUCENE-3273 URL: https://issues.apache.org/jira/browse/LUCENE-3273 Project: Lucene - Java Issue Type: Sub-task Components: core/other Reporter: Chris Male Attachments: LUCENE-3273.patch Most tests use Lucene Core's QueryParser for convenience. We want to consolidate it into a QP module which we can't have as a dependency. We should add a simple MockQueryParser which does String.split() on the query string, analyzes the terms and builds a BooleanQuery if necessary. Any more complex Queries (such as phrases) should be done programmatically. -- This message is automatically generated by JIRA. 
[jira] [Commented] (LUCENE-3273) Convert Lucene Core tests over to a simple MockQueryParser
[ https://issues.apache.org/jira/browse/LUCENE-3273?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13059538#comment-13059538 ] Michael McCandless commented on LUCENE-3273: I would also prefer to keep tests very straightforward, even if that makes them more verbose. I.e., just use the Lucene core API, and if the core API is insufficient we should improve it. I don't think we should be adding many special test-only APIs. In fact, why even add a builder here for BQ? Can't we just make the BQ and add the clauses? In general I'm not a fan of builder APIs... I think they are over-applied these days (hammer!) and I don't think we need one here for our tests. -- This message is automatically generated by JIRA.
[jira] [Commented] (LUCENE-3167) Make lucene/solr a OSGI bundle through Ant
[ https://issues.apache.org/jira/browse/LUCENE-3167?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13059540#comment-13059540 ] Luca Stancapiano commented on LUCENE-3167: -- Here is an updated version using the correct classpath: <property name="bndclasspath" refid="classpath"/> <taskdef resource="aQute/bnd/ant/taskdef.properties"/> <bnd classpath="${bndclasspath}" eclipse="false" failok="false" exceptions="true" files="${common.dir}/lucene.bnd"/> The Ant classpath is different from the Maven classpath, so there are differences in the resulting 'Export-Package' entry in the MANIFEST.MF, but both are OK. Make lucene/solr a OSGI bundle through Ant -- Key: LUCENE-3167 URL: https://issues.apache.org/jira/browse/LUCENE-3167 Project: Lucene - Java Issue Type: New Feature Environment: bndtools Reporter: Luca Stancapiano We need to build the bundle through Ant, so the binary can be published and there is no longer any need to download the sources. Currently, to get an OSGi bundle we need to use Maven tools and build the sources. Here is the reference for the creation of the OSGi bundle through Maven: https://issues.apache.org/jira/browse/LUCENE-1344 Bndtools could be used inside Ant -- This message is automatically generated by JIRA.
[jira] [Updated] (LUCENE-3273) Convert Lucene Core tests over to a simple MockQueryParser
[ https://issues.apache.org/jira/browse/LUCENE-3273?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Robert Muir updated LUCENE-3273: Attachment: LUCENE-3273_testboolean2.patch here's my example, TestBoolean2. in my opinion building the queries like this makes the test much more readable. it adds 48 lines and deletes 29 lines of code... I think adding these 19 lines of code to this 343-line test case is worth every penny, because it's much easier to see what any given test does, e.g. just glance real quick at testQueries06 and you see it's a BQ with one MUST and two MUST_NOTs, no parsing by the brain required. -- This message is automatically generated by JIRA.
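The "build the BooleanQuery explicitly" style Robert describes can be sketched as follows. The real test uses Lucene's BooleanQuery, BooleanClause.Occur and TermQuery; the stand-in classes and the example terms below are assumptions made so the sketch compiles and runs without a Lucene dependency:

```java
import java.util.ArrayList;
import java.util.List;

// Standalone sketch of explicitly-built boolean queries. Occur, Clause
// and BooleanQuerySketch are stand-ins for Lucene's BooleanClause.Occur,
// BooleanClause and BooleanQuery (an assumption, not the real API).
public class ExplicitQuerySketch {
    enum Occur { MUST, SHOULD, MUST_NOT }

    static class Clause {
        final String field, term;
        final Occur occur;
        Clause(String field, String term, Occur occur) {
            this.field = field; this.term = term; this.occur = occur;
        }
    }

    static class BooleanQuerySketch {
        final List<Clause> clauses = new ArrayList<>();
        void add(Clause c) { clauses.add(c); }
        int count(Occur o) {
            int n = 0;
            for (Clause c : clauses) if (c.occur == o) n++;
            return n;
        }
    }

    public static void main(String[] args) {
        // Instead of parsing a query string, build the clauses explicitly:
        // one MUST and two MUST_NOTs are visible at a glance (the terms
        // here are hypothetical, not the ones from testQueries06).
        BooleanQuerySketch q = new BooleanQuerySketch();
        q.add(new Clause("field", "w1", Occur.MUST));
        q.add(new Clause("field", "w2", Occur.MUST_NOT));
        q.add(new Clause("field", "w3", Occur.MUST_NOT));
        System.out.println(q.count(Occur.MUST) + " MUST, "
                + q.count(Occur.MUST_NOT) + " MUST_NOT");
    }
}
```

The tradeoff discussed in the thread is exactly this: more lines than a query-string one-liner, but each clause and its occur flag is spelled out where the reader can see it.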
Re: revisit naming for grouping/join?
OK I'm sold! I agree: let's rename this new module according to the most likely use case, not according to its logical function, and I agree nested documents is the compelling use case here. Then fully generic joins can go to a new module/join. Maybe modules/nesteddocuments (I think that's more descriptive than subdocuments)? How about NestedDocumentQuery? And NestedDocumentCollector? See, you can use NestedDocumentQuery but collect it with any ordinary collector if you don't care about the nesting (ie, you are only interested in matches in the parent document space). The NestedDocumentCollector also collects all the nested docs matching each parent hit. You can of course still use this Query/Collector for any kind of join, as long as your app is able to do this join at indexing time and index all joined docs to a single row of the primary table as a doc block. But this will presumably be a less common use case so I agree we should just name this feature according to its common use case. Mike McCandless http://blog.mikemccandless.com On Mon, Jul 4, 2011 at 1:34 PM, Chris Hostetter hossman_luc...@fucit.org wrote: : In my example the city was parent -- I raised this example to explain : that index-time joining is more general than just nested docs (ie, I : think we should keep the name join for this module... also because : we should factor out more general search-time-only join capabilities : into it). i think that may be the wrong approach to take when discussing examples. while it's great to say there are dozens of usecases that these features can all support in dozens of diff ways, we should really focus on naming/demoing these use cases in the ways where they really make the most sense.
In other words, i don't think we should say all of these types of problems are different types of nails, and all of these modules are specialty hammers that are slightly distinct from each other in how they work, but you can use any of these hammers on any of these nails. instead we should say here are some specialty hammers, you can use them for lots of types of nails, but for each hammer here is the type of nail where it really shines. block-index-join as i understand it requires all the docs you want to join up to be in one contiguous range of docids in the index, so if you want to re-index one doc in a block you have to re-index the entire block -- so the city/doctor example doesn't sound like a good generic example of when/why to use this (because a doctor might change his office hours, or address -- maybe even leaving the city completely, while a city might change its population w/o the doctor being affected at all). The book and pages example seems much more appropriate, since in the real world these things change in lock step -- pages aren't added to/removed from a book; pages don't change w/o the book itself being fundamentally changed. the fields of a page document are the text of that page, and that is inherently data about the book -- the fields of a doctor document are metadata about the doctor, and that is not inherently data about the city the doctor lives in. as for the name ... i understand why it's called module/join and i understand why the classes are called BlockJoinQuery and BlockJoinCollector, but i don't think those names really stand out and convey to end users what they do and how/why they are useful.
Personally i think better names would be modules/subdocuments, ParentDocumentQuery and ChildDocumentsCollector. I know mccandless isn't a fan of the name Nested Documents because this functionality *can* be used for use cases where the data being modeled is not strictly organized in a nested relationship, but that doesn't mean it's *optimal* or easy for a user to apply to other use cases, because they have to design their model (and their indexing strategy) in such a way that they think of them as nested or hierarchical documents. Naming it module/subdocuments would not only emphasize the use case where it really shines, it would more importantly draw attention to how users have to model their data in order to take advantage of it -- and using ParentDocument and ChildDocuments in the names of the Query/Collector would make it clear what they match on relative to the underlying query that they wrap/collect. it would also help distinguish this from more general joins like what solr does today -- it seems like that should eventually take the name module/join. At a minimum we should rename what we have now modules/block-join or modules/index-join (but the latter is confusing) and eventually add modules/query-join (yes, yes, block joins provide a query, but the difference is when you have to make a decision about how you want to join your model: at index time or at query time). -Hoss
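Hoss's description of block-index-join (all joined docs in one contiguous range of docids, so re-indexing one doc means re-indexing the block) implies a very simple parent lookup at search time: each block is indexed as its children followed by the parent, so the parent of any child is the next docid marked in a parent bitset. A minimal sketch of that invariant, with the parent filter modeled as a plain boolean[] (an assumption; the real module uses an actual Filter/bitset):

```java
// Sketch of the block-join docid layout: children are indexed
// immediately before their parent, so parentOf() is just "scan
// forward to the next set bit in the parent bitset".
public class BlockJoinSketch {
    // docids 0..2 = children of parent 3; docids 4..5 = children of parent 6
    static final boolean[] PARENTS = {false, false, false, true, false, false, true};

    static int parentOf(int childDoc) {
        for (int d = childDoc; d < PARENTS.length; d++) {
            if (PARENTS[d]) return d;   // first parent at or after the child
        }
        return -1;                      // malformed block: no parent follows
    }

    public static void main(String[] args) {
        System.out.println(parentOf(1)); // child in the first block -> 3
        System.out.println(parentOf(5)); // child in the second block -> 6
    }
}
```

This is also why the city/doctor example fits poorly: the layout only works if parent and children are always (re)indexed together as one block.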
[jira] [Commented] (LUCENE-3273) Convert Lucene Core tests over to a simple MockQueryParser
[ https://issues.apache.org/jira/browse/LUCENE-3273?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13059576#comment-13059576 ] Hoss Man commented on LUCENE-3273: -- bq. In general I'm not a fan of builder APIs... I think they are over-applied these days (hammer!) bq. I think adding these 19 lines of code to this 343 line test case is worth every penny, because its much easier to see what any given test does, e.g. just glance real quick at testQueries06 and you see its a BQ with one MUST and two MUST_NOTs, no parsing by the brain required. i don't disagree with either of you, particularly in this test where the whole point is testing BooleanQueries -- so let's actually have the test showing the construction of a BooleanQuery. my point was more about tests where the construction of the Query object is ancillary to what the test is actually for. that said: definitely in agreement that using the core api and constructing the queries right in the test leaves no room for ambiguity -- my main point was that if we're going to have builders to simplify the tests, let's make them short and terse like the QP syntax that used to be in those tests. -- This message is automatically generated by JIRA.
[jira] [Commented] (LUCENE-3273) Convert Lucene Core tests over to a simple MockQueryParser
[ https://issues.apache.org/jira/browse/LUCENE-3273?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13059579#comment-13059579 ] Robert Muir commented on LUCENE-3273: - after reviewing the core tests, I think there really are not that many tests using the queryparser at all. in fact it seems the only 'non-trivial' queries being built are inside the explanations tests (e.g. more than just a term, boolean, or phrase or whatever). if these are too laborious to make manually, maybe we can just have whatever is needed in the base TestExplanations... but I think it would be good to build queries directly in most places in general. -- This message is automatically generated by JIRA.
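The MockQueryParser proposed in the issue description (String.split() on the query string, a single term query for one token, otherwise a BooleanQuery over the tokens) could look roughly like the sketch below. The shape is an assumption based only on the issue text, and plain strings stand in for Lucene's TermQuery/BooleanQuery so the sketch runs on its own; real code would also pass each token through the analyzer:

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch of the proposed MockQueryParser: split the query
// text on whitespace and OR the resulting terms together. Returns a
// readable string form in place of real Query objects (an assumption).
public class MockQueryParserSketch {
    static String parse(String field, String queryText) {
        String[] tokens = queryText.trim().split("\\s+");
        if (tokens.length == 1) {
            return field + ":" + tokens[0];      // single term query
        }
        List<String> clauses = new ArrayList<>();
        for (String t : tokens) {
            clauses.add(field + ":" + t);        // one SHOULD clause per token
        }
        return String.join(" OR ", clauses);     // BooleanQuery of the clauses
    }

    public static void main(String[] args) {
        System.out.println(parse("body", "hello"));       // body:hello
        System.out.println(parse("body", "hello world")); // body:hello OR body:world
    }
}
```

Anything beyond this (phrases, MUST/MUST_NOT clauses) would, per the issue description, be built programmatically in the test rather than parsed.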
Re: revisit naming for grouping/join?
: Maybe modules/nesteddocuments (I think that's more descriptive than : subdocuments)? either way ... subdocuments has the advantage of being a shorter directory name. i kinda wonder about first impressions and the etymology of nested ... it makes me think of bird nests and Russian dolls, neither of which really conveys the point: nesting in birds is about protecting/incubating and is only a single layer; while Russian nesting dolls are singular wrappers around wrappers around wrappers. subdocuments seems like it might be better because it conveys more of a hierarchical nature (to me anyway). : How about NestedDocumentQuery? And NestedDocumentCollector? : : See, you can use NestedDocumentQuery but collect it with any ordinary : collector if you don't care about the nesting (ie, you are only : interested in matches in the parent document space). The : NestedDocumentCollector also collects all the nested docs matching : each parent hit. Hmmm... My suggestion of ParentDocumentQuery was based on the understanding that the simplest usecase was... Query inner = getSomethingThatMatchesSomeChildDocs(); Filter parents = someFilterThatMatchesAllKnownParentDocs(); Query outer = new ParentDocumentQuery(inner, parents); TopDocs results = searcher.search(outer); ...and in this case results will contain the parents of the child documents that match inner. is that correct? if so, then independent of the Collector, ParentDocumentQuery (or ParentDocumentQueryWrapper) still seems like it makes the most sense. For the Collector, i realize now that i totally misunderstood its api -- for some reason i thought it would wrap another Collector and proxy to the inner collector only the parents, independently collecting/recording the groups of parent-children info which could be asked for later. ChildDocumentsCollector definitely doesn't make sense -- it's not just collecting children, it's collecting Groups made up of parents and children ... GroupCollector is obviously too general though ...
i would toss out ParentChildrenTopGroupCollector to make it clear that: a) what you can get out of it are instances of TopGroups b) the Groups consist of Parents and Children ...but that may be trying to convey too much in a classname. I certainly wouldn't complain about NestedDocumentCollector or SubDocumentCollector if people like those better. -Hoss
[jira] [Commented] (SOLR-2565) Prevent IW#close and cut over to IW#commit
[ https://issues.apache.org/jira/browse/SOLR-2565?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13059587#comment-13059587 ] Mark Miller commented on SOLR-2565: --- I've still got to put a note in CHANGES about how you should reload SolrCores after this change. Prevent IW#close and cut over to IW#commit -- Key: SOLR-2565 URL: https://issues.apache.org/jira/browse/SOLR-2565 Project: Solr Issue Type: Improvement Components: update Affects Versions: 4.0 Reporter: Simon Willnauer Assignee: Mark Miller Fix For: 4.0 Attachments: SOLR-2565.patch Spinoff from SOLR-2193. We already have a branch to work on this issue here: https://svn.apache.org/repos/asf/lucene/dev/branches/solr2193 The main goal here is to prevent Solr from closing the IW and use IW#commit instead. AFAIK the main issue is that the update handler needs an overhaul. A few goals I think we might want to look at: 1. Expose the SolrIndexWriter in the api or add the proper abstractions to get done what we now do with special casing. 2. Stop closing the IndexWriter and start using commit (still lazy IW init though). 3. Drop the iwAccess and iwCommit locks and sync mostly at the Lucene level. 4. Address the current issues we face because multiple original/'reloaded' cores can have a different IndexWriter on the same index. Eventually this is a preparation for NRT support in Solr, which I will create a followup issue for. -- This message is automatically generated by JIRA.
[jira] [Updated] (LUCENE-2308) Separately specify a field's type
[ https://issues.apache.org/jira/browse/LUCENE-2308?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Nikola Tankovic updated LUCENE-2308: Attachment: LUCENE-2308-5.patch Some tests are cut over, more to come... This fifth patch is to monitor progress and see if something is wrong or could be better. InstantiatedDocument was also cut over along the way. Separately specify a field's type - Key: LUCENE-2308 URL: https://issues.apache.org/jira/browse/LUCENE-2308 Project: Lucene - Java Issue Type: Improvement Components: core/index Reporter: Michael McCandless Assignee: Michael McCandless Labels: gsoc2011, lucene-gsoc-11, mentor Fix For: 4.0 Attachments: LUCENE-2308-2.patch, LUCENE-2308-3.patch, LUCENE-2308-4.patch, LUCENE-2308-4.patch, LUCENE-2308-5.patch, LUCENE-2308.patch, LUCENE-2308.patch This came up from discussions on IRC. I'm summarizing here... Today when you make a Field to add to a document you can set things: indexed or not, stored or not, analyzed or not, details like omitTfAP, omitNorms, index term vectors (separately controlling offsets/positions), etc. I think we should factor these out into a new class (FieldType?). Then you could re-use this FieldType instance across multiple fields. The Field instance would still hold the actual value. We could then do per-field analyzers by adding a setAnalyzer on the FieldType, instead of the separate PerFieldAnalyzerWrapper (likewise for per-field codecs (with flex), where we now have PerFieldCodecWrapper). This would NOT be a schema! It's just refactoring what we already specify today. EG it's not serialized into the index. This has been discussed before, and I know Michael Busch opened a more ambitious (I think?) issue. I think this is a good first baby step. We could consider a hierarchy of FieldType (NumericFieldType, etc.) but maybe hold off on that for starters... -- This message is automatically generated by JIRA.
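The refactoring described in the issue (per-field flags factored out into a reusable type object, with the Field holding only the value) can be sketched like this. The names FieldType/Field follow the issue's proposal, but the particular option set shown is a guessed subset, not the patch's actual API:

```java
// Sketch of the proposed split: a shareable FieldType carrying the
// indexing options, reused across many lightweight Field instances.
// The exact options (indexed/stored/tokenized/omitNorms) are an
// assumption based on the issue description.
public class FieldTypeSketch {
    static class FieldType {
        final boolean indexed, stored, tokenized, omitNorms;
        FieldType(boolean indexed, boolean stored, boolean tokenized, boolean omitNorms) {
            this.indexed = indexed; this.stored = stored;
            this.tokenized = tokenized; this.omitNorms = omitNorms;
        }
    }

    static class Field {
        final String name, value;
        final FieldType type;   // shared, not copied per field
        Field(String name, String value, FieldType type) {
            this.name = name; this.value = value; this.type = type;
        }
    }

    public static void main(String[] args) {
        // One type instance reused for every title field in the index.
        FieldType titleType = new FieldType(true, true, true, false);
        Field f1 = new Field("title", "Lucene in Action", titleType);
        Field f2 = new Field("title", "Managing Gigabytes", titleType);
        System.out.println(f1.type == f2.type); // prints true: the type is shared
    }
}
```

Sharing the type object is what makes per-field analyzers/codecs natural later: they would hang off the FieldType rather than off wrapper classes.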
[JENKINS] Lucene-Solr-tests-only-3.x - Build # 9328 - Failure
Build: https://builds.apache.org/job/Lucene-Solr-tests-only-3.x/9328/ All tests passed Build Log (for compile errors): [...truncated 17547 lines...]
[jira] [Commented] (LUCENE-2308) Separately specify a field's type
[ https://issues.apache.org/jira/browse/LUCENE-2308?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13059628#comment-13059628 ] Michael McCandless commented on LUCENE-2308: Patch looks good Nikola -- I'll commit it to the branch! I removed the 2 nocommits from oal.document2.Document -- I think they were leftover from copying from Document. -- This message is automatically generated by JIRA.
Re: EmbeddedSolrServer
it is a bit weird, but we don't want solrj to depend on solr-core (it is a client library that should not need to know anything about lucene/solr). It might make sense to put EmbeddedSolrServer in its own source tree/.jar, but given its size/complexity, it seemed easiest to just put it in the package that already had the right dependencies. ryan On Mon, Jul 4, 2011 at 12:30 PM, Clecio Varjao cleciovar...@gmail.com wrote: Hi, Shouldn't the org.apache.solr.client.solrj.embedded.EmbeddedSolrServer class be located under https://svn.apache.org/repos/asf/lucene/dev/trunk/solr/src/solrj instead of https://svn.apache.org/repos/asf/lucene/dev/trunk/solr/src/webapp/src ? Thanks, Clécio
[jira] [Updated] (LUCENE-3275) SamplingAccumulatorTest hangs on Java 1.6.0u26
[ https://issues.apache.org/jira/browse/LUCENE-3275?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Hoss Man updated LUCENE-3275: - Summary: SamplingAccumulatorTest hangs on Java 1.6.0u26 (was: hang on 1.6.0u26) Robert: did you file a new bug with oracle? If the hypothesis of the reporter for the bug you linked to is correct, then it's not likely to be the same bug -- in that case the debug info for the process suggested that it was hung getting translation info as part of a call to GraphicsEnvironment.getLocalGraphicsEnvironment().getAvailableFontFamilyNames() while running in headless mode, none of which sounds likely to happen in one of our tests. that report was also filed by someone working on proprietary code who couldn't post a reproducible test case. If you can post jvm info, os info, and a lucene svn r# that reliably reproduces it, Oracle would have a much better bug report to work with. SamplingAccumulatorTest hangs on Java 1.6.0u26 -- Key: LUCENE-3275 URL: https://issues.apache.org/jira/browse/LUCENE-3275 Project: Lucene - Java Issue Type: Bug Reporter: Robert Muir on the mac, a system update pushed out an upgrade to 1.6.0u26. basically, if i run 'ant test' from the faceting module, my jre completely hangs (0% cpu, won't even respond to kill -QUIT to print a stacktrace). This is reproducible... it always happens inside SamplingAccumulatorTest. Of course if i run this test by itself, or anything else, it doesn't want to hang... but you should be able to reproduce by running 'ant test -Dtests.threadspercpu=0' which runs all tests sequentially. Acts like http://forums.oracle.com/forums/thread.jspa?threadID=2246699 I think this JRE version (update 26) is broken. If your mac asks you to upgrade, just say no. -- This message is automatically generated by JIRA.
[jira] [Commented] (LUCENE-3273) Convert Lucene Core tests over to a simple MockQueryParser
[ https://issues.apache.org/jira/browse/LUCENE-3273?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13059642#comment-13059642 ] Chris Male commented on LUCENE-3273: In defence of builders, it's a great design pattern and I don't agree that it's over-applied. With all that said, I'll move away from them. -- This message is automatically generated by JIRA.
Re: [JENKINS] Solr-3.x - Build # 394 - Failure
Hi, I am having similar problems. When running ant javadocs on contrib/queryparser, I get the following error: javadoc: warning - Error fetching URL: http://java.sun.com/j2se/1.6/docs/api/package-list and the script fails. What should I do? Is there a way to fix it? Thanks, Phillipe Ramalho On Wed, Jun 29, 2011 at 2:05 PM, Robert Muir rcm...@gmail.com wrote: On Wed, Jun 29, 2011 at 2:00 PM, Chris Hostetter hossman_luc...@fucit.org wrote: : Failure to fetch junit's package list yet again... but Hoss is working : on this I think! I posted a straw-man patch, but i haven't really had time to seriously test it on modules/contrib ... and i think rmuir had some reservations about putting the stuff in dev-tools ... but if someone is itching go ahead and commit. (i'm a little swamped right now) right, if the javadocs target in lucene/build.xml has a hard dependency on dev-tools, then the lucene source release won't work. but we could do some other things to fix this: * make this a soft dependency (e.g. the javadocs task will use dev-tools/plists when they are available, otherwise it downloads) * move dev-tools under lucene/ so we don't worry about this stuff * put the package-lists somewhere other than dev-tools (even if it's just on hudson) -- Phillipe Ramalho
Re: [JENKINS] Solr-3.x - Build # 394 - Failure
http://download.oracle.com/javase/6/docs/api/package-list apparently works reliably. On Tue, Jul 5, 2011 at 12:08 PM, Phillipe Ramalho phillipe.rama...@gmail.com wrote: Hi, I am having similar problems. When running ant javadocs on contrib/queryparser, I get the following error: javadoc: warning - Error fetching URL: http://java.sun.com/j2se/1.6/docs/api/package-list and the script fails. What should I do? Is there a way to fix it? Thanks, Phillipe Ramalho -- Chris Male | Software Developer | JTeam BV. | www.jteam.nl
[jira] [Commented] (LUCENE-3275) SamplingAccumulatorTest hangs on Java 1.6.0u26
[ https://issues.apache.org/jira/browse/LUCENE-3275?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13059645#comment-13059645 ] Robert Muir commented on LUCENE-3275: - hoss, no I did not. this is basically just a warning for other devs not to upgrade to this broken jre on their macs. otherwise your tests hang and you must kill -9 -- This message is automatically generated by JIRA.
[jira] [Updated] (LUCENE-2979) Simplify configuration API of contrib Query Parser
[ https://issues.apache.org/jira/browse/LUCENE-2979?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Phillipe Ramalho updated LUCENE-2979: - Attachment: LUCENE-2979_phillipe_ramalho_3.patch Hi Adriano, Sorry about that, I forgot to add javadoc in those comments; they are pretty important. Anyway, the problem was not really the missing javadoc: javadoc was not understanding the link to inner classes (Class.InnerClass#Constant), so I had to reference the inner class directly (InnerClass#Constant). However, there is still a javadoc warning; I just sent an email to the mailing list (I hope you saw it already) where I report the problem. Here is the third patch with javadoc fixes. Simplify configuration API of contrib Query Parser -- Key: LUCENE-2979 URL: https://issues.apache.org/jira/browse/LUCENE-2979 Project: Lucene - Java Issue Type: Improvement Components: modules/other Affects Versions: 2.9, 3.0 Reporter: Adriano Crestani Assignee: Adriano Crestani Labels: api-change, gsoc, gsoc2011, lucene-gsoc-11, mentor Fix For: 3.4, 4.0 Attachments: LUCENE-2979_phillipe_ramalho_2.patch, LUCENE-2979_phillipe_ramalho_3.patch, LUCENE-2979_phillipe_ramalho_3.patch, LUCENE-2979_phillipe_reamalho.patch The current configuration API is very complicated and inherits the concept used by the Attribute API to store token information in token streams. However, the requirements for the two (QP config and token streams) are not the same, so they shouldn't be using the same thing. I propose to simplify the QP config and make it less scary for people intending to use the contrib QP. The task is not difficult; it will just require a lot of code change and figuring out the best way to do it. That's why it's a good candidate for a GSoC project. I would like to hear good proposals about how to make the API more friendly and less scary :) -- This message is automatically generated by JIRA.
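The inner-class linking fix Phillipe describes can be illustrated with a small javadoc fragment. All names here are hypothetical, not taken from the actual patch; the point is only the two link forms he contrasts:

```java
/**
 * Illustration of the javadoc linking issue described in the message above
 * (hypothetical names). Phillipe reports that a link written with the
 * qualified inner-class name was not understood by javadoc, while the bare
 * inner-class form was.
 */
public class ConfigExample {
    /**
     * Working form reported in the message: {@link Defaults#DEFAULT_OPERATOR}.
     * Form reported as producing a warning:
     * {@link ConfigExample.Defaults#DEFAULT_OPERATOR}.
     */
    public static class Defaults {
        /** A hypothetical constant, used only as a javadoc link target. */
        public static final String DEFAULT_OPERATOR = "OR";
    }
}
```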
[jira] [Updated] (LUCENE-2979) Simplify configuration API of contrib Query Parser
[ https://issues.apache.org/jira/browse/LUCENE-2979?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Phillipe Ramalho updated LUCENE-2979: - Attachment: LUCENE-2979_phillipe_ramalho_3.patch oops, I had forgotten to check the ASF license. Simplify configuration API of contrib Query Parser -- Key: LUCENE-2979 URL: https://issues.apache.org/jira/browse/LUCENE-2979
[jira] [Commented] (LUCENE-2979) Simplify configuration API of contrib Query Parser
[ https://issues.apache.org/jira/browse/LUCENE-2979?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13059651#comment-13059651 ] Adriano Crestani commented on LUCENE-2979: -- Hi Phillipe, thanks for the quick fix! Just committed your last patch (LUCENE-2979_phillipe_ramalho_3.patch) in revision 1142862. Simplify configuration API of contrib Query Parser -- Key: LUCENE-2979 URL: https://issues.apache.org/jira/browse/LUCENE-2979
[jira] [Commented] (LUCENE-1768) NumericRange support for new query parser
[ https://issues.apache.org/jira/browse/LUCENE-1768?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13059653#comment-13059653 ] Adriano Crestani commented on LUCENE-1768: -- Hi, I committed the first patch from LUCENE-2979 and it completely changed the query parser config API. It seems Vinicius is ahead of schedule with this project, is that correct?! Is there anything else to do after the documentation? If not, I would ask whether it's possible for you to change the way you use the config API for numerics. I think it will not require a lot of change; the API is much simpler now. I can ask Phillipe to help you and explain how the new API works ;) Otherwise, Uwe will not be able to commit your patch, since many classes will be missing now. What do you think, Uwe? NumericRange support for new query parser - Key: LUCENE-1768 URL: https://issues.apache.org/jira/browse/LUCENE-1768 Project: Lucene - Java Issue Type: New Feature Components: core/queryparser Affects Versions: 2.9 Reporter: Uwe Schindler Assignee: Adriano Crestani Labels: contrib, gsoc, gsoc2011, lucene-gsoc-11, mentor Fix For: 4.0 Attachments: week1.patch, week2.patch, week3.patch, week4.patch, week5-6.patch It would be good to specify some type of schema for the query parser in the future, to automatically create a NumericRangeQuery for the different numeric types. It would then be possible to index a numeric value (double, float, long, int) using NumericField, and the query parser would then know which type of field this is and correctly create a NumericRangeQuery for strings like [1.567..*] or (1.787..19.5]. There is currently no way to tell from the index whether a field is numeric, so the user will have to configure the FieldConfig objects in the ConfigHandler. But once this is done, it will not be that difficult to implement the rest.
The only difference from the current handling of RangeQuery is then the instantiation of the correct Query type and the conversion of the entered numeric values (a simple Number.valueOf(...) conversion of the user-entered numbers). Everything else is identical; NumericRangeQuery also supports the MTQ rewrite modes (as it is a MTQ). Another thing is a change in Date semantics: there are some strange flags in the current parser that tell it how to handle dates.
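The conversion step described above can be sketched without any Lucene dependency. This is only an illustrative helper (the class name and structure are my own, not part of the week1-6 patches): it splits a range expression such as "(1.787..19.5]" into lower and upper bounds plus inclusivity flags, using the simple Number.valueOf-style conversion the issue mentions, with "*" standing for an open end.

```java
// Hypothetical sketch, not the actual patch: parse a user-entered numeric
// range expression like "[1.567..*]" or "(1.787..19.5]" into bounds and
// inclusivity flags, the way the issue describes. A real implementation
// would then hand these values to NumericRangeQuery's factory methods.
public class NumericRangeSketch {
    public final Double min, max;              // null means open-ended ("*")
    public final boolean minInclusive, maxInclusive;

    public NumericRangeSketch(String expr) {
        minInclusive = expr.startsWith("[");   // '[' inclusive, '(' exclusive
        maxInclusive = expr.endsWith("]");     // ']' inclusive, ')' exclusive
        String body = expr.substring(1, expr.length() - 1);
        String[] parts = body.split("\\.\\."); // bounds are separated by ".."
        min = parts[0].equals("*") ? null : Double.valueOf(parts[0]);
        max = parts[1].equals("*") ? null : Double.valueOf(parts[1]);
    }

    public static void main(String[] args) {
        NumericRangeSketch r = new NumericRangeSketch("(1.787..19.5]");
        System.out.println(r.min + " " + r.max + " "
                + r.minInclusive + " " + r.maxInclusive);
        // prints: 1.787 19.5 false true
    }
}
```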
[jira] [Updated] (LUCENE-3273) Convert Lucene Core tests over to a simple MockQueryParser
[ https://issues.apache.org/jira/browse/LUCENE-3273?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chris Male updated LUCENE-3273: --- Attachment: LUCENE-3273.patch New patch, new ideas. I've moved away from introducing anything new and have converted all the core tests over to instantiating Queries programmatically. No builders / helpers are used. Everything compiles and passes again. Convert Lucene Core tests over to a simple MockQueryParser -- Key: LUCENE-3273 URL: https://issues.apache.org/jira/browse/LUCENE-3273 Project: Lucene - Java Issue Type: Sub-task Components: core/other Reporter: Chris Male Attachments: LUCENE-3273.patch, LUCENE-3273.patch, LUCENE-3273_testboolean2.patch Most tests use Lucene Core's QueryParser for convenience. We want to consolidate it into a QP module, which we can't have as a dependency. We should add a simple MockQueryParser which does a String.split() on the query string, analyzes the terms, and builds a BooleanQuery if necessary. Any more complex Queries (such as phrases) should be built programmatically.
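The MockQueryParser idea from the issue description, split the query string, analyze each term, OR the terms together, can be sketched in plain Java with no Lucene dependency. This is purely illustrative (the class name is mine, and a lowercasing step stands in for real analysis); the actual patch instead builds Queries programmatically:

```java
import java.util.Arrays;
import java.util.List;
import java.util.stream.Collectors;

// Illustrative sketch only, not the committed code: the MockQueryParser
// described in the issue would split on whitespace, analyze each term, and
// build a BooleanQuery over the results. Here the "query" is simply the
// list of analyzed terms, and lowercasing stands in for the analyzer.
public class MockQueryParserSketch {
    public static List<String> parse(String query) {
        return Arrays.stream(query.trim().split("\\s+"))
                     .map(t -> t.toLowerCase())   // stand-in for analysis
                     .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        System.out.println(parse("Foo BAR baz"));
        // prints: [foo, bar, baz]
    }
}
```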
[JENKINS] Lucene-trunk - Build # 1615 - Still Failing
Build: https://builds.apache.org/job/Lucene-trunk/1615/ No tests ran. Build Log (for compile errors): [...truncated 9395 lines...]
[JENKINS] Lucene-Solr-tests-only-3.x - Build # 9334 - Failure
Build: https://builds.apache.org/job/Lucene-Solr-tests-only-3.x/9334/ All tests passed Build Log (for compile errors): [...truncated 17541 lines...]
[jira] [Commented] (LUCENE-3274) Collapse Common module into Lucene core util
[ https://issues.apache.org/jira/browse/LUCENE-3274?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13059674#comment-13059674 ] Chris Male commented on LUCENE-3274: I'm going to commit this tomorrow. Collapse Common module into Lucene core util -- Key: LUCENE-3274 URL: https://issues.apache.org/jira/browse/LUCENE-3274 Project: Lucene - Java Issue Type: Improvement Reporter: Chris Male Attachments: LUCENE-3274.patch It was suggested by Robert in [http://markmail.org/message/wbfuzfamtn2qdvii] that we should try to limit the dependency graph between modules, and that anything 'common' should probably go into Lucene core. Given that I haven't added anything to this module except the MutableValue classes, I'm going to collapse them into the util package, remove the module, and correct the dependencies.