[jira] [Created] (LUCENE-3982) regex support in queryparser needs documented, and called out in CHANGES.txt
regex support in queryparser needs documented, and called out in CHANGES.txt

Key: LUCENE-3982
URL: https://issues.apache.org/jira/browse/LUCENE-3982
Project: Lucene - Java
Issue Type: Sub-task
Reporter: Hoss Man
Priority: Blocker
Fix For: 4.0

Spun off of LUCENE-2604 where everyone agreed this needed done, but no one has done it yet, and rmuir didn't want to leave the issue open...

{quote}
some issues were pointed out in a recent mailing list thread that definitely seem like they should be addressed before this is officially released...

* queryparsersyntax.xml doesn't mention this feature at all -- as major new syntax it should really get its own section with an example showing the syntax
* queryparsersyntax.xml's section on Escaping Special Characters needs to mention that '/' is a special character

Also: Given that Yury encountered some real world situations in which the new syntax caused problems with existing queries, it seems like we should definitely make a note about this possibility more prominent ... i'm not sure if it makes sense in MIGRATE.txt but at a minimum it seems like the existing CHANGES.txt entry should mention it, maybe something like...

{noformat}
* LUCENE-2604: Added RegexpQuery support to QueryParser. Regular expressions
  are now directly supported by the standard queryparser using the syntax...
     fieldName:/expression/ OR /expression against default field/
  Users who wish to search for literal / characters are advised to
  backslash-escape or quote those characters as needed.
  (Simon Willnauer, Robert Muir)
{noformat}
{quote}

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org
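For readers hitting the literal '/' problem described above, here is a minimal client-side sketch of the backslash-escaping the proposed CHANGES.txt entry recommends. The function name `escape_slashes` is hypothetical and is not part of Lucene or Solr; it only illustrates the idea of escaping slashes in user input before the string reaches the query parser.

```python
def escape_slashes(term: str) -> str:
    """Backslash-escape every '/' in term that is not already escaped.

    Hypothetical helper for illustration only; not a Lucene API.
    """
    out = []
    i = 0
    while i < len(term):
        ch = term[i]
        if ch == "\\" and i + 1 < len(term):
            # Keep an existing escape sequence (backslash + next char) as-is.
            out.append(term[i:i + 2])
            i += 2
            continue
        # Escape a bare slash; copy everything else unchanged.
        out.append("\\/" if ch == "/" else ch)
        i += 1
    return "".join(out)


if __name__ == "__main__":
    print(escape_slashes("path:/usr/lib"))
```

The other option mentioned in the issue, quoting the whole term, avoids the need for a helper like this at all.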
[jira] [Created] (LUCENE-3978) redo how our download redirect pages work
redo how our download redirect pages work

Key: LUCENE-3978
URL: https://issues.apache.org/jira/browse/LUCENE-3978
Project: Lucene - Java
Issue Type: Improvement
Reporter: Hoss Man
Fix For: 4.0

the download latest redirect pages are kind of a pain to change when we release a new version...

http://lucene.apache.org/core/mirrors-core-latest-redir.html
http://lucene.apache.org/solr/mirrors-solr-latest-redir.html
[jira] [Created] (SOLR-3323) fix solr javadocs to link to local lucene javadocs w/relative links when users build locally
fix solr javadocs to link to local lucene javadocs w/relative links when users build locally

Key: SOLR-3323
URL: https://issues.apache.org/jira/browse/SOLR-3323
Project: Solr
Issue Type: Task
Components: documentation
Reporter: Hoss Man
Assignee: Steven Rowe
Fix For: 3.6, 4.0

Now that solr/lucene are in lock step development, and solr releases include the entire lucene-java release, the solr ant targets for building javadocs should depend on the lucene (and module) targets for building javadocs and link directly to the local copies of those docs (using relative paths)

(currently, the links point to https://hudson.apache.org/hudson/job/Lucene-trunk/javadoc/all/)
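As a rough illustration of what "relative paths" means here, the sketch below computes the link from one javadoc output directory to another. The directory names are invented for the example and do not come from the Lucene/Solr build files.

```python
import posixpath


def relative_javadoc_link(from_dir: str, to_dir: str) -> str:
    """Return the relative path that leads from from_dir to to_dir.

    Both arguments are build-output directories; the values used in the
    example below are hypothetical.
    """
    return posixpath.relpath(to_dir, start=from_dir)


if __name__ == "__main__":
    # Assumed layout: solr and lucene docs built side by side in one checkout.
    print(relative_javadoc_link("solr/build/docs/api", "lucene/build/docs/api"))
```

A link computed this way keeps working when the whole checkout is moved or built offline, which an absolute hudson URL does not.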
[jira] [Created] (LUCENE-3945) we should include checksums for every jar ivy fetches in svn src releases to verify the jars are the ones we expect
we should include checksums for every jar ivy fetches in svn src releases to verify the jars are the ones we expect

Key: LUCENE-3945
URL: https://issues.apache.org/jira/browse/LUCENE-3945
Project: Lucene - Java
Issue Type: Task
Reporter: Hoss Man
Fix For: 3.6, 4.0

Conversation with rmuir last night got me thinking about the fact that one thing we lose by using ivy is confidence that every user of a release is compiling against (and likely using at run time) the same dependencies as every other user.

Up to 3.5, users of src and binary releases could be confident that the jars included in the release were the same jars the lucene devs vetted and tested against when voting on the release candidate, but with ivy there is now the possibility that after the source release is published, the owner of a domain where these dependencies are hosted might change the jars in some way w/o anyone knowing. Likewise: we as developers could commit an ivy.xml file pointing to a specific URL which we then use and test for months, and just prior to a release, the contents of the remote URL could change such that a JAR included in the binary artifacts might not match the ones we've vetted and tested leading up to that RC.

So i propose that we include checksum files in svn and in our source releases that can be used by users to verify that the jars they get from ivy match the jars we tested against.
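The verification step proposed above could look roughly like the following sketch. It assumes a hypothetical convention where each `foo.jar` has a committed `foo.jar.sha1` file next to it holding the expected hex digest; the issue itself does not specify the file naming or the hash algorithm.

```python
import hashlib
from pathlib import Path


def sha1_of(path: Path) -> str:
    """Return the hex SHA-1 digest of the file at path, read in chunks."""
    digest = hashlib.sha1()
    with path.open("rb") as fh:
        for chunk in iter(lambda: fh.read(65536), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_jar(jar: Path) -> bool:
    """Compare a downloaded jar against its committed checksum file.

    Assumption: the expected digest is the first token of '<jar name>.sha1'
    in the same directory as the jar.
    """
    expected_file = jar.with_name(jar.name + ".sha1")
    expected = expected_file.read_text().split()[0].lower()
    return sha1_of(jar) == expected
```

A build target could run a check like this over every jar ivy fetched and fail if any digest differs from the committed one.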
[jira] [Created] (LUCENE-3946) improve docs ivy verification output to explain classpath problems and mention --noconfig
improve docs ivy verification output to explain classpath problems and mention --noconfig

Key: LUCENE-3946
URL: https://issues.apache.org/jira/browse/LUCENE-3946
Project: Lucene - Java
Issue Type: Task
Reporter: Hoss Man

offshoot of LUCENE-3930, where shawn reported...

{quote}
I can't get either branch_3x or trunk to build now, on a system that used to build branch_3x without complaint. It says that ivy is not available, even after doing ant ivy-bootstrap to download ivy into the home directory. Specifically I am trying to build solrj from trunk, but I can't even get ant in the root directory of the checkout to work. I'm on CentOS 6 with oracle jdk7 built using the city-fan.org SRPMs. Ant (1.7.1) and junit are installed from package repositories. Building a checkout of lucene_solr_3_5 on the same machine works fine.
{quote}

The root cause is that ant's global configs can be set up to ignore the user's personal lib dir. The suggested workaround is to run ant --noconfig, but we should also try to give the user feedback in our failure about exactly what classpath ant is currently using (because apparently ${java.class.path} is not actually it)
[jira] [Created] (SOLR-3292) /browse example fails to load on 3x: no field name specified in query and no default specified via 'df' param
/browse example fails to load on 3x: no field name specified in query and no default specified via 'df' param

Key: SOLR-3292
URL: https://issues.apache.org/jira/browse/SOLR-3292
Project: Solr
Issue Type: Bug
Reporter: Hoss Man
Priority: Blocker
Fix For: 3.6

1) java -jar start.jar using solr example on 3x branch circa r1306629
2) load http://localhost:8983/solr/browse
3) browser error: 400 no field name specified in query and no default specified via 'df' param
4) error in logs...

{noformat}
INFO: [] webapp=/solr path=/browse params={} hits=0 status=400 QTime=3
Mar 28, 2012 4:05:59 PM org.apache.solr.common.SolrException log
SEVERE: org.apache.solr.common.SolrException: no field name specified in query and no default specified via 'df' param
	at org.apache.solr.search.SolrQueryParser.checkNullField(SolrQueryParser.java:158)
	at org.apache.solr.search.SolrQueryParser.getFieldQuery(SolrQueryParser.java:174)
	at org.apache.lucene.queryParser.QueryParser.Term(QueryParser.java:1429)
	at org.apache.lucene.queryParser.QueryParser.Clause(QueryParser.java:1317)
	at org.apache.lucene.queryParser.QueryParser.Query(QueryParser.java:1245)
	at org.apache.lucene.queryParser.QueryParser.TopLevelQuery(QueryParser.java:1234)
	at org.apache.lucene.queryParser.QueryParser.parse(QueryParser.java:206)
	at org.apache.solr.search.LuceneQParser.parse(LuceneQParserPlugin.java:79)
	at org.apache.solr.search.QParser.getQuery(QParser.java:143)
	at org.apache.solr.request.SimpleFacets.getFacetQueryCounts(SimpleFacets.java:233)
	at org.apache.solr.request.SimpleFacets.getFacetCounts(SimpleFacets.java:194)
	at org.apache.solr.handler.component.FacetComponent.process(FacetComponent.java:72)
	at org.apache.solr.handler.component.SearchHandler.handleRequestBody(SearchHandler.java:186)
	at org.apache.solr.handler.RequestHandlerBase.handleRequest(RequestHandlerBase.java:129)
	at org.apache.solr.core.SolrCore.execute(SolrCore.java:1376)
{noformat}
[jira] [Created] (SOLR-3287) 3x tutorial tries to demo schema features that don't work with 3x schema
3x tutorial tries to demo schema features that don't work with 3x schema

Key: SOLR-3287
URL: https://issues.apache.org/jira/browse/SOLR-3287
Project: Solr
Issue Type: Bug
Reporter: Hoss Man
Priority: Blocker
Fix For: 3.6

I just audited the tutorial on the 3x branch to ensure everything would work for the 3.6 release, and ran into two sections where things were very confusing and seemed broken to me (even as a solr expert)

https://svn.apache.org/repos/asf/lucene/dev/branches/branch_3x/solr/core/src/java/doc-files/tutorial.html

1) Text Analysis

of the 5 queries in this section, only the pixima example works (power-shot matches documents but not the ones the tutorial suggests it should, and for different reasons). The lead in para does explain that you have to edit your schema.xml in order for these links to work -- but it's confusing, and i honestly read it 3 times before i realized what it was saying (the first two times i thought it was saying that _because_ the content is in english, english specific field types are used, and you can change those to text_general if you don't use english)

Bottom line: the links are confusing since they don't work out of the box with the simple commands shown so far

{panel}
If you know your textual content is English, as is the case for the example documents in this tutorial, and you'd like to apply English-specific stemming and stop word removal, as well as split compound words, you can use the text_en_splitting fieldType instead. Go ahead and edit the schema.xml under the solr/example/solr/conf directory, and change the type for fields text and features from text_general to text_en_splitting. Restart the server and then re-post all of the documents, and then these queries will show the English-specific transformations:

* A search for power-shot matches PowerShot, and adata matches A-DATA due to the use of WordDelimiterFilter and LowerCaseFilter.
* A search for features:recharging matches Rechargeable due to stemming with the EnglishPorterFilter.
* A search for 1 gigabyte matches things with GB, and the misspelled pixima matches Pixma due to use of a SynonymFilter.
{panel}

* http://localhost:8983/solr/select/?indent=on&q=power-shot&fl=name
* http://localhost:8983/solr/select/?indent=on&q=adata&fl=name
* http://localhost:8983/solr/select/?indent=on&q=features:recharging&fl=name,features
* http://localhost:8983/solr/select/?indent=on&q=%221%20gigabyte%22&fl=name
* http://localhost:8983/solr/select/?indent=on&q=pixima&fl=name

2) Analysis Debugging

Likewise, all of the analysis.jsp example URLs attempt to show off how various features work, but the fields used don't demonstrate the analysis being discussed unless the user has edited the schema as discussed in the previous section

{panel}
This shows how Canon Power-Shot SD500 would be indexed as a value in the name field. Each row of the table shows the resulting tokens after having passed through the next TokenFilter in the analyzer for the name field. Notice how both powershot and power, shot are indexed. Tokens generated at the same position are shown in the same column, in this case shot and powershot.

Selecting verbose output will show more details, such as the name of each analyzer component in the chain, token positions, and the start and end positions of the token in the original text.

Selecting highlight matches when both index and query values are provided will take the resulting terms from the query value and highlight all matches in the index value analysis.

Here is an example of stemming and stop-words at work.
{panel}

* http://localhost:8983/solr/admin/analysis.jsp?name=name&val=Canon+Power-Shot+SD500
* http://localhost:8983/solr/admin/analysis.jsp?name=name&verbose=on&val=Canon+Power-Shot+SD500
* http://localhost:8983/solr/admin/analysis.jsp?name=name&highlight=on&val=Canon+Power-Shot+SD500&qval=Powershot%20sd-500
* http://localhost:8983/solr/admin/analysis.jsp?name=text&highlight=on&val=Four+score+and+seven+years+ago+our+fathers+brought+forth+on+this+continent+a+new+nation%2C+conceived+in+liberty+and+dedicated+to+the+proposition+that+all+men+are+created+equal.+&qval=liberties+and+equality
[jira] [Created] (SOLR-3288) audit tutorial before 4.0 release
audit tutorial before 4.0 release

Key: SOLR-3288
URL: https://issues.apache.org/jira/browse/SOLR-3288
Project: Solr
Issue Type: Task
Reporter: Hoss Man
Assignee: Hoss Man
Fix For: 4.0

Prior to the 4.0 release, audit the tutorial and verify...

* command line output looks reasonable
* analysis examples/discussion matches field types used
* links to admin UI are correct for new UI.
[jira] [Created] (SOLR-3266) Audit error messages if file permisions prevent files/directories from being read/listed
Audit error messages if file permisions prevent files/directories from being read/listed

Key: SOLR-3266
URL: https://issues.apache.org/jira/browse/SOLR-3266
Project: Solr
Issue Type: Improvement
Reporter: Hoss Man

had a question from sqwk on the #solr irc channel last night where he had some questions about weird log errors indicating that it wasn't using his solr.xml. Part of the confusion was SOLR-3264, but i couldn't make sense of the rest.

In talking with miller on IRC today, it occurred to me that file permission problems preventing solr from reading the solr.xml file could explain everything -- because unlike trunk, Solr 3.5 didn't log anything special if it couldn't find solr.xml and used the legacy singlecore mode as a fallback (an oversight i've already fixed in [r1304126|http://svn.apache.org/viewvc?rev=1304126&view=rev])

For many files Solr tries to load, we can't fail fast if the file isn't found, or isn't readable, because we support reading from the classpath (and zookeeper) as alternatives, but it would be nice to see if we can come up with a standard way to give good warning/error messages if:

* a file exists, but isn't readable (error?)
* a directory where we are looking for a file exists but isn't readable or executable (warning?)

...i suspect the hardest part of this will be having good test cases
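The two cases listed above can be told apart with ordinary permission checks. The sketch below is only an illustration of that classification; the function name `diagnose`, the message wording, and the severity levels are assumptions, not anything Solr actually logs.

```python
import os


def diagnose(path: str) -> str:
    """Classify why a config file might not load (illustrative only)."""
    parent = os.path.dirname(os.path.abspath(path))
    # A directory we cannot list or traverse hides whether the file exists.
    if os.path.isdir(parent) and not os.access(parent, os.R_OK | os.X_OK):
        return "WARN: directory exists but is not readable/executable: " + parent
    # A missing file is not fatal: it may still come from another source.
    if not os.path.exists(path):
        return "INFO: not found on disk (other sources may still apply): " + path
    # An existing but unreadable file is almost certainly a misconfiguration.
    if not os.access(path, os.R_OK):
        return "ERROR: file exists but is not readable: " + path
    return "OK: " + path
```

Testing the unreadable cases is the awkward part the issue anticipates, since permission bits behave differently for privileged users and across platforms.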
[jira] [Created] (SOLR-3264) SolrResourceLoader logging about Solr home set to is very missleading/broken in multicore
SolrResourceLoader logging about Solr home set to is very missleading/broken in multicore

Key: SOLR-3264
URL: https://issues.apache.org/jira/browse/SOLR-3264
Project: Solr
Issue Type: Bug
Reporter: Hoss Man
Assignee: Hoss Man

the SolrResourceLoader constructor has this bit of logging left over from the days before multicore...

{noformat}
log.info("Solr home set to '" + this.instanceDir + "'");
{noformat}

but this is confusing and misleading since there are N+1 SolrResourceLoaders in a given solr instance (1 for the CoreContainer, and N for the N cores) and only one of them is referring to the *true* Solr Home dir, the others are referring to the *instanceDir* of the respective cores.

For example, using the 3.5 example and running {{java -Dsolr.solr.home=multicore -jar start.jar}} you'll see...

{noformat}
Mar 21, 2012 7:02:46 PM org.apache.solr.core.SolrResourceLoader locateSolrHome
INFO: JNDI not configured for solr (NoInitialContextEx)
Mar 21, 2012 7:02:46 PM org.apache.solr.core.SolrResourceLoader locateSolrHome
INFO: using system property solr.solr.home: multicore
Mar 21, 2012 7:02:46 PM org.apache.solr.core.SolrResourceLoader <init>
INFO: Solr home set to 'multicore/'
Mar 21, 2012 7:02:46 PM org.apache.solr.servlet.SolrDispatchFilter init
INFO: SolrDispatchFilter.init()
Mar 21, 2012 7:02:46 PM org.apache.solr.core.SolrResourceLoader locateSolrHome
INFO: JNDI not configured for solr (NoInitialContextEx)
Mar 21, 2012 7:02:46 PM org.apache.solr.core.SolrResourceLoader locateSolrHome
INFO: using system property solr.solr.home: multicore
Mar 21, 2012 7:02:46 PM org.apache.solr.core.CoreContainer$Initializer initialize
INFO: looking for solr.xml: /home/hossman/lucene/lucene-3.5.0_tag/solr/example/multicore/solr.xml
Mar 21, 2012 7:02:46 PM org.apache.solr.core.SolrResourceLoader locateSolrHome
INFO: JNDI not configured for solr (NoInitialContextEx)
Mar 21, 2012 7:02:46 PM org.apache.solr.core.SolrResourceLoader locateSolrHome
INFO: using system property solr.solr.home: multicore
Mar 21, 2012 7:02:46 PM org.apache.solr.core.CoreContainer <init>
INFO: New CoreContainer: solrHome=multicore/ instance=108681753
Mar 21, 2012 7:02:46 PM org.apache.solr.core.SolrResourceLoader <init>
INFO: Solr home set to 'multicore/'
Mar 21, 2012 7:02:46 PM org.apache.solr.core.SolrResourceLoader <init>
INFO: Solr home set to 'multicore/core0/'
...lots of logs about initing core0...
INFO: registering core: core0
Mar 21, 2012 7:02:47 PM org.apache.solr.core.SolrCore registerSearcher
INFO: [core0] Registered new searcher Searcher@5dde45e2 main
Mar 21, 2012 7:02:47 PM org.apache.solr.core.SolrResourceLoader <init>
INFO: Solr home set to 'multicore/core1/'
Mar 21, 2012 7:02:47 PM org.apache.solr.core.SolrConfig <init>
...lots of logs about initing core1...
Mar 21, 2012 7:02:47 PM org.apache.solr.core.CoreContainer register
INFO: registering core: core1
...
{noformat}

we should revamp/add some of the log messages from CoreContainer and SolrResourceLoader to make it more clear what the one true solr home is, and when SolrResourceLoader is being used for an instanceDir of a single core.
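One way to picture the clearer logging asked for above is to make the message depend on whether the loader belongs to the container or to a single core. The sketch below is purely illustrative: the function `loader_message` and the exact wording of the per-core message are invented here, not taken from any Solr patch.

```python
from typing import Optional


def loader_message(directory: str, core_name: Optional[str] = None) -> str:
    """Build a log line that distinguishes Solr home from a core instanceDir.

    core_name is None for the container-level loader (hypothetical convention).
    """
    if core_name is None:
        return "Solr home set to '%s'" % directory
    return "instanceDir for core '%s' set to '%s'" % (core_name, directory)


if __name__ == "__main__":
    print(loader_message("multicore/"))
    print(loader_message("multicore/core0/", "core0"))
```

With a split like this, only one line in the startup log would ever claim to be the Solr home.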
[jira] [Created] (SOLR-3261) edismax ignores explicit operators when literal colon is found
edismax ignores explicit operators when literal colon is found

Key: SOLR-3261
URL: https://issues.apache.org/jira/browse/SOLR-3261
Project: Solr
Issue Type: Bug
Reporter: Hoss Man

Using the 3.5 example this query...

q = bogus:xxx AND text_t:yak
http://localhost:8983/solr/select/?debugQuery=true&qf=a_t+b_t&defType=edismax&mm=0&q=bogus:xxx+AND+text_t:yak

parses as...

{noformat}
+(DisjunctionMaxQuery((a_t:bogus:xxx | b_t:bogus:xxx)) DisjunctionMaxQuery((a_t:and | b_t:and)) text_t:yak)
{noformat}

(Note that AND is considered a term and is searched for in the qf fields)

But this query...

q = foo_s:xxx AND text_t:yak
http://localhost:8983/solr/select/?debugQuery=true&qf=a_t+b_t&defType=edismax&mm=0&q=foo_s:xxx+AND+text_t:yak

parses correctly treating AND as an explicit operator...

{noformat}
+(+foo_s:xxx +text_t:yak)
{noformat}

(this problem also seems to affect trunk circa 2012-03-20)
[jira] [Created] (SOLR-3217) refactor range faceting code so that the list of FieldTypes supported isn't hardcoded
refactor range faceting code so that the list of FieldTypes supported isn't hardcoded

Key: SOLR-3217
URL: https://issues.apache.org/jira/browse/SOLR-3217
Project: Solr
Issue Type: Improvement
Reporter: Hoss Man

idea that occurred to me reviewing SOLR-2202, haven't thought it through all the way to be certain it would work...

1) create a new marker interface RangeFacetable which contains a single method {{getRangeEndpointCalculator(SchemaField)}}
2) refactor SimpleFacets so that instead of the big {{if (ft instanceof ...) { ... } else if }} block there right now, we just check if the FieldType is an instance of RangeFacetable
3) use ft.getRangeEndpointCalculator to do the voodoo we currently doodoo
4) make all of the existing {{private static}} subclasses of RangeEndpointCalculator (like IntegerRangeEndpointCalculator) public top level classes so custom FieldTypes can use them
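The four steps above amount to replacing a type switch with a capability check. The following is a language-neutral sketch of that shape written in Python; the names RangeFacetable, RangeEndpointCalculator and IntegerRangeEndpointCalculator come from the issue, while the method bodies, `IntField`, `TextField` and `range_endpoints` are invented for illustration and do not mirror the real Solr classes.

```python
from abc import ABC, abstractmethod


class RangeEndpointCalculator(ABC):
    """Knows how to parse endpoints and step by a gap for one value type."""

    @abstractmethod
    def parse(self, raw: str): ...

    @abstractmethod
    def add_gap(self, value, gap: str): ...


class IntegerRangeEndpointCalculator(RangeEndpointCalculator):
    def parse(self, raw: str) -> int:
        return int(raw)

    def add_gap(self, value: int, gap: str) -> int:
        return value + int(gap)


class RangeFacetable(ABC):
    """Step 1: field types that support range faceting implement this."""

    @abstractmethod
    def get_range_endpoint_calculator(self, schema_field): ...


class FieldType:
    pass


class IntField(FieldType, RangeFacetable):  # hypothetical field type
    def get_range_endpoint_calculator(self, schema_field):
        return IntegerRangeEndpointCalculator()


class TextField(FieldType):  # hypothetical, not range facetable
    pass


def range_endpoints(ft, schema_field, start, end, gap):
    """Steps 2 and 3: one capability check instead of an if/else chain."""
    if not isinstance(ft, RangeFacetable):
        raise ValueError("field type does not support range faceting")
    calc = ft.get_range_endpoint_calculator(schema_field)
    low, high = calc.parse(start), calc.parse(end)
    edges = []
    while low < high:
        edges.append(low)
        nxt = calc.add_gap(low, gap)
        if nxt <= low:
            raise ValueError("gap must move the range forward")
        low = nxt
    return edges
```

Under this shape a custom field type only has to implement the one method and reuse a public calculator, which is the point of step 4.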
[jira] [Created] (SOLR-3220) RecoveryZkTest test failure
RecoveryZkTest test failure

Key: SOLR-3220
URL: https://issues.apache.org/jira/browse/SOLR-3220
Project: Solr
Issue Type: Bug
Reporter: Hoss Man

observed a failure in RecoveryZkTest.testDistribSearch using r1298661 that had some odd looking (to me) log info.

could not reproduce with identical seed
[jira] [Created] (SOLR-3210) 3.6 POST RELEASE TASK: update site tutorial.html to link to versioned tutorial
3.6 POST RELEASE TASK: update site tutorial.html to link to versioned tutorial

Key: SOLR-3210
URL: https://issues.apache.org/jira/browse/SOLR-3210
Project: Solr
Issue Type: Improvement
Reporter: Hoss Man
Assignee: Hoss Man
Fix For: 3.6

Unless we have an alternate strategy in place for dealing with versioned docs by the time 3.6 is released, then as a post-release task, once the 3.6 javadocs are snapshotted online (ie: http://lucene.apache.org/solr/api/) the current online copy of the tutorial (http://lucene.apache.org/solr/tutorial.html) should be pruned down so that it is just a link to the snapshot version released with 3.6
[jira] [Created] (SOLR-3183) remove legacy forrest site from solr release
remove legacy forrest site from solr release

Key: SOLR-3183
URL: https://issues.apache.org/jira/browse/SOLR-3183
Project: Solr
Issue Type: Improvement
Reporter: Hoss Man
Priority: Blocker
Fix For: 3.6, 4.0

A broader discussion is taking place on the dev list about how we want to move forward with dealing with core/solr documentation in a post-forrest world, but a more immediate concern is that while the forrest based docs on the lucene side of things are already version specific and could be released _now_, the solr forrest docs are not -- if we attempted to release 3.6 today the solr docs dir would contain a complete copy of the _old_ forrest generated website that would make no sense to users.

we could just flat out remove the solr forrest docs, but that doesn't really address the issue of the tutorial. A copy of the tutorial currently exists on the CMS powered website, but since it's not versioned it won't really help with the 3.6 vs 4x situation that will arise in the very near future.

My suggestion for the short term is that we do the following on both branches:

* svn mv solr/site/tutorial.html solr/core/src/java/doc-files and then clean it up to remove all the forrest navigation
* update solr/core/src/java/overview.html to mention and link to the tutorial
* eliminate solr/site and solr/site-src from trunk and branch-3x and adjust the build.xml as needed

...that would get us into the state where we could release 3.6 at will w/o a weird documentation cluster fuck, and the contents of https://lucene.apache.org/solr/tutorial.html could be updated to become a meta-info page about the tutorial with a link to http://lucene.apache.org/solr/api/tutorial.html (where we'd have already updated the javadocs to represent 3.6).
[jira] [Created] (SOLR-3175) simplify add test to ensure various query escape functions are in sync
simplify add test to ensure various query escape functions are in sync

Key: SOLR-3175
URL: https://issues.apache.org/jira/browse/SOLR-3175
Project: Solr
Issue Type: Improvement
Reporter: Hoss Man

We have three query syntax escape related functions (that i know of) that can't be refactored...

* QueryParser.escape
** canonical
* ClientUtils.escapeQueryChars
** part of solrj, doesn't depend directly on QueryParser so that Solr clients don't need the query parser jar locally
* SolrPluginUtils.partialEscape
** designed to be a negative subset of the full set (ie: all chars except +/-/)

...we should figure out a way to assert in our tests that these are all in agreement (or at least as much as they are meant to be)
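One possible shape for such a test is to feed every character of an alphabet through each escape function and compare the results. The sketch below uses three stand-in functions written for this example; they are not the real QueryParser.escape, ClientUtils.escapeQueryChars or SolrPluginUtils.partialEscape, and the SPECIAL and EXEMPT character sets are placeholders rather than the actual Lucene/Solr sets.

```python
import string

# Placeholder character sets, chosen only to make the sketch runnable.
SPECIAL = set('\\+-!():^[]{}~*?|&;/"')
EXEMPT = {"+", "-"}


def full_escape(s: str) -> str:
    """Stand-in for the canonical escape function."""
    return "".join("\\" + c if c in SPECIAL else c for c in s)


def client_escape(s: str) -> str:
    """Stand-in for an independently maintained client-side copy."""
    out = []
    for c in s:
        if c in SPECIAL:
            out.append("\\")
        out.append(c)
    return "".join(out)


def partial_escape(s: str) -> str:
    """Stand-in for the deliberately smaller escape set."""
    return "".join("\\" + c if c in SPECIAL - EXEMPT else c for c in s)


def disagreements(escape_a, escape_b, alphabet=string.printable):
    """Return the characters on which two escape functions differ."""
    return {ch for ch in alphabet if escape_a(ch) != escape_b(ch)}
```

A test would then assert that the two full escapers have no disagreements and that the partial escaper disagrees on exactly the intended exemptions, so any newly added special character (such as '/') that is missed in one copy shows up as a failure.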
[jira] [Created] (SOLR-3095) update processor chain should check for enable attribute on all processors
update processor chain should check for enable attribute on all processors

Key: SOLR-3095
URL: https://issues.apache.org/jira/browse/SOLR-3095
Project: Solr
Issue Type: Improvement
Reporter: Hoss Man

many types of plugins in Solr allow you to specify an enabled boolean when configuring them, so you can use system properties in the configuration file to determine at run time if they are actually used -- we should add low level support for this type of setting on the individual processor declarations in the UpdateRequestProcessorChain as well, so individual update processor factories don't have to deal with this.
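The idea above is that the chain itself, not each factory, filters out disabled declarations. A minimal sketch follows, assuming each processor declaration has already been parsed into a dict of attributes with any system property substitution applied; the attribute name `enable`, the default of "true" when it is absent, and the rule that only the string "true" counts as enabled are all assumptions for illustration.

```python
def is_enabled(attrs: dict) -> bool:
    """Treat a missing 'enable' attribute as enabled (assumed default)."""
    return str(attrs.get("enable", "true")).strip().lower() == "true"


def build_chain(declarations: list) -> list:
    """Keep only the processor classes whose declaration is enabled."""
    return [d["class"] for d in declarations if is_enabled(d)]


if __name__ == "__main__":
    # Hypothetical declarations; class names are illustrative only.
    chain = build_chain([
        {"class": "example.SignatureProcessorFactory", "enable": "false"},
        {"class": "example.LogProcessorFactory"},
        {"class": "example.RunProcessorFactory", "enable": "true"},
    ])
    print(chain)
```

Doing the check once at chain construction keeps every individual factory free of enable/disable handling.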
[jira] [Created] (SOLR-3024) JSONTestUtil.matchObj is not respecting the delta
JSONTestUtil.matchObj is not respecting the delta

Key: SOLR-3024
URL: https://issues.apache.org/jira/browse/SOLR-3024
Project: Solr
Issue Type: Bug
Affects Versions: 3.2
Reporter: David Smiley
Fix For: 3.6, 4.0

As noted in SOLR-2451, (comment - 22/Dec/11 21:33) the matchObj changes made in that issue were incomplete, and the delta is not being respected.

patch attached to SOLR-2451; opening a new issue to record changes as a new bug fix.
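For context, "respecting the delta" in a structural comparison generally means the tolerance has to be applied to every numeric leaf, including those nested inside lists and maps. The sketch below shows that general technique only; it is not the JSONTestUtil code or the SOLR-2451 patch, and the function name `match_obj` is a hypothetical stand-in.

```python
def match_obj(expected, actual, delta=0.0):
    """Compare two JSON-like values, allowing numeric leaves to differ by delta."""
    # Booleans are ints in Python, so handle them before the numeric branch.
    if isinstance(expected, bool) or isinstance(actual, bool):
        return expected is actual
    if isinstance(expected, (int, float)) and isinstance(actual, (int, float)):
        return abs(expected - actual) <= delta
    if isinstance(expected, list) and isinstance(actual, list):
        # The delta must be passed down, or nested numbers compare exactly.
        return len(expected) == len(actual) and all(
            match_obj(e, a, delta) for e, a in zip(expected, actual)
        )
    if isinstance(expected, dict) and isinstance(actual, dict):
        return expected.keys() == actual.keys() and all(
            match_obj(expected[k], actual[k], delta) for k in expected
        )
    return expected == actual
```

Forgetting to forward the tolerance in any one recursive branch is an easy way for a delta to end up silently ignored.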
[jira] [Created] (SOLR-3008) edismax pf clause makes no sense when query already has quoted subphrase
edismax pf clause makes no sense when query already has quoted subphrase

Key: SOLR-3008
URL: https://issues.apache.org/jira/browse/SOLR-3008
Project: Solr
Issue Type: Bug
Reporter: Hoss Man

As noted by ldavid2020 on the solr-user mailing list (Tue, 20 Dec 2011) the behavior of edismax when the pf param is used and the query string contains quotes around part of the query makes no sense at all...

{quote}
For the same query: 2012 "japan airlines" flight status

dismax...
[http://localhost:8983/solr/select?q=2012+"japan+airlines"+flight+status&qf=TTL&pf=TTL&debugQuery=true&defType=dismax]
outputs:

{noformat}
+((DisjunctionMaxQuery((TTL:2012)~0.1)
   DisjunctionMaxQuery((TTL:"japan airlin"~3)~0.1)
   DisjunctionMaxQuery((TTL:flight)~0.1)
   DisjunctionMaxQuery((TTL:status)~0.1)
  )~3)
DisjunctionMaxQuery((TTL:"2012 japan airlin flight status"~3)~0.1)
{noformat}

The parsedquery has DisjunctionMaxQuery((TTL:"2012 japan airlin flight status"~3)~0.1).

While edismax...
[http://localhost:8983/solr/select?q=2012+"japan+airlines"+flight+status&qf=TTL&pf=TTL&debugQuery=true&defType=edismax]
outputs:

{noformat}
+((DisjunctionMaxQuery((TTL:2012)~0.1)
   DisjunctionMaxQuery((TTL:"japan airlin"~3)~0.1)
   DisjunctionMaxQuery((TTL:flight)~0.1)
   DisjunctionMaxQuery((TTL:status)~0.1)
  )~3)
DisjunctionMaxQuery((TTL:"2012 flight status"~3)~0.1)
{noformat}

The parsedquery has DisjunctionMaxQuery((TTL:"2012 flight status"~3)~0.1).

...

So it seems edismax ignores "japan airlines" for the pf matching. This could cause some issues, in that a document containing exactly the phrase 2012 japan airlines flight status will have the same relevancy score as another one in which the two phrases japan airlines and 2012 flight status are far apart.
{quote}
[jira] [Created] (SOLR-2996) make q=* not suck in the lucene and edismax parsers
make q=* not suck in the lucene and edismax parsers

Key: SOLR-2996
URL: https://issues.apache.org/jira/browse/SOLR-2996
Project: Solr
Issue Type: Improvement
Reporter: Hoss Man

More than a few users have gotten burned by thinking that "*" is the appropriate syntax for "match all docs" when what it really does (unless I'm mistaken) is create a prefix query on the default search field using a blank string as the prefix.

Since it seems very unlikely that anyone has a genuine use case for making a prefix query with a blank prefix, we should change the default behavior of the LuceneQParser and EDismaxQParsers (and any other QParsers that respect *:*, if I'm forgetting them) to treat this situation the same as *:*.

We can offer a (local)param to force the old behavior if someone really wants it.
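The proposal can be illustrated with a minimal sketch. This is not the LuceneQParser or edismax code: `normalizeMatchAll` is a hypothetical helper, and the `legacyStar` flag is a hypothetical stand-in for the suggested (local)param that would force the old behavior.

```java
public final class MatchAllSketch {
    private MatchAllSketch() {}

    /**
     * Hypothetical helper: rewrites a bare "*" query string to "*:*" before it
     * reaches the parser, unless the caller asks for the old behavior.
     */
    public static String normalizeMatchAll(String q, boolean legacyStar) {
        if (q == null) {
            return null;
        }
        // Only a query consisting of nothing but "*" is rewritten;
        // prefix queries such as "foo*" are left untouched.
        if (!legacyStar && q.trim().equals("*")) {
            return "*:*";
        }
        return q;
    }

    public static void main(String[] args) {
        System.out.println(normalizeMatchAll("*", false));
        System.out.println(normalizeMatchAll("*", true));
        System.out.println(normalizeMatchAll("foo*", false));
    }
}
```

Doing the rewrite on the raw query string keeps the sketch independent of any parser internals; a real change would more likely live inside the parsers themselves.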
[jira] [Created] (SOLR-2988) edismax does not respect pf params using non-tokenized fields
edismax does not respect pf params using non-tokenized fields

Key: SOLR-2988
URL: https://issues.apache.org/jira/browse/SOLR-2988
Project: Solr
Issue Type: Bug
Affects Versions: 3.5
Reporter: Hoss Man

For reasons I don't fully understand, edismax ignores fields in the pf param if those fields are non-tokenized.

Consider this example *dismax* query in Solr 3.5...

{noformat}
http://localhost:8983/solr/select/?debugQuery=true&defType=dismax&qf=name^5+features^3&pf=features^2+cat^4&q=hard+drive

<str name="parsedquery">
+((DisjunctionMaxQuery((features:hard^3.0 | name:hard^5.0))
   DisjunctionMaxQuery((features:drive^3.0 | name:drive^5.0))
  )~2)
DisjunctionMaxQuery((features:"hard drive"^2.0 | cat:hard drive^4.0))
{noformat}

...compared to the equivalent *edismax* query...

{noformat}
http://localhost:8983/solr/select/?debugQuery=true&defType=edismax&qf=name^5+features^3&pf=features^2+cat^4&q=hard+drive

<str name="parsedquery">
+((DisjunctionMaxQuery((features:hard^3.0 | name:hard^5.0))
   DisjunctionMaxQuery((features:drive^3.0 | name:drive^5.0))
  )~2)
DisjunctionMaxQuery((features:"hard drive"^2.0))
{noformat}
[jira] [Created] (SOLR-2981) multiple stats.facet params duplicate stats faceting output
multiple stats.facet params duplicate stats faceting output

Key: SOLR-2981
URL: https://issues.apache.org/jira/browse/SOLR-2981
Project: Solr
Issue Type: Bug
Affects Versions: 3.5
Reporter: Hoss Man

* Load the example data
* Load a URL that uses stats.facet on multiple fields, e.g.: http://localhost:8983/solr/select?q=*:*&stats=true&stats.field=price&stats.field=popularity&stats.twopass=true&rows=00&indent=true&stats.facet=inStock&stats.facet=manu_id_s

The response will include two identical facet lists for each stats field (i.e. both facets blocks will contain the faceted stats for both of the stats.facet fields).
[jira] [Created] (SOLR-2976) TrieField.isTokenized returns true regardless of precisionStep
TrieField.isTokenized returns true regardless of precisionStep

Key: SOLR-2976
URL: https://issues.apache.org/jira/browse/SOLR-2976
Project: Solr
Issue Type: Bug
Affects Versions: 3.5
Reporter: Hoss Man

Regardless of the precisionStep used, TrieField.isTokenized is hardcoded to return true -- so even if a user has something like this in their schema...

{code}
<fieldType name="long" class="solr.TrieLongField" precisionStep="0" omitNorms="true" />
<field name="ts" type="long" indexed="true" stored="true" required="true" multiValued="false" />
{code}

...any code paths that are driven by isTokenized will think there may be multiple terms per document when in reality there is at most one.

We should consider redefining TrieField.isTokenized to be something like...

{code}
@Override
public boolean isTokenized() {
  return Integer.MAX_VALUE != precisionStep;
}
{code}
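To show how a schema value of precisionStep=0 relates to the Integer.MAX_VALUE comparison in the proposed method, here is a self-contained sketch. It is not the TrieField source: `TrieTokenizedSketch` is a hypothetical stand-in, and it assumes that a configured step of 0 or less (or 64 or more) is normalized to Integer.MAX_VALUE when the field type is initialized, so that each value indexes exactly one term.

```java
public final class TrieTokenizedSketch {
    private final int precisionStep;

    public TrieTokenizedSketch(int configuredPrecisionStep) {
        // Assumption: out-of-range steps are normalized to Integer.MAX_VALUE,
        // meaning "index a single term per value".
        if (configuredPrecisionStep <= 0 || configuredPrecisionStep >= 64) {
            this.precisionStep = Integer.MAX_VALUE;
        } else {
            this.precisionStep = configuredPrecisionStep;
        }
    }

    /** Mirrors the redefinition suggested in the issue. */
    public boolean isTokenized() {
        return Integer.MAX_VALUE != precisionStep;
    }

    public static void main(String[] args) {
        System.out.println(new TrieTokenizedSketch(0).isTokenized());
        System.out.println(new TrieTokenizedSketch(8).isTokenized());
    }
}
```

Under that assumption, a field declared with precisionStep="0" would report itself as not tokenized, while one with a step such as 8 would still report true.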
[jira] [Created] (SOLR-2935) Better docs for numeric FieldTypes
Better docs for numeric FieldTypes

Key: SOLR-2935
URL: https://issues.apache.org/jira/browse/SOLR-2935
Project: Solr
Issue Type: Improvement
Components: documentation
Reporter: Hoss Man
Assignee: Hoss Man

It was recently pointed out to me that if you don't come from a Java background, understanding the range of legal values for TrieIntField vs TrieLongField may not be obvious to you (particularly if you are used to dealing with databases that have INT, SMALLINT, TINYINT, etc... with UNSIGNED vs SIGNED modifiers).

That subsequently made me realize that to this day the javadocs for the various FieldTypes don't explain the difference between the TrieFoo, SortableFoo, and Foo field types.
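For readers without a Java background, the ranges in question can be printed directly. This sketch assumes, as the names suggest, that TrieIntField values are bounded by Java's 32-bit signed int and TrieLongField values by the 64-bit signed long; the `describe` helper is purely illustrative.

```java
public final class NumericRangesSketch {
    private NumericRangesSketch() {}

    /** Illustrative helper: formats the inclusive range of a signed type. */
    public static String describe(String name, long min, long max) {
        return name + ": " + min + " to " + max;
    }

    public static void main(String[] args) {
        // Java has no UNSIGNED modifier; both types are always signed.
        System.out.println(describe("int (32-bit signed)", Integer.MIN_VALUE, Integer.MAX_VALUE));
        System.out.println(describe("long (64-bit signed)", Long.MIN_VALUE, Long.MAX_VALUE));
    }
}
```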
[jira] [Created] (LUCENE-3610) Revamp spatial APIs that use primitives (or arrays of primitives) in their args/results so that they use strongly typed objects
Revamp spatial APIs that use primitives (or arrays of primitives) in their args/results so that they use strongly typed objects

Key: LUCENE-3610
URL: https://issues.apache.org/jira/browse/LUCENE-3610
Project: Lucene - Java
Issue Type: Improvement
Components: modules/spatial
Reporter: Hoss Man
Fix For: 4.0

My spatial awareness is pretty meek, but LUCENE-3599 seems like a prime example of the types of mistakes that are probably really easy to make with all of the Spatial related APIs that deal with arrays (or sequences) of doubles where specific indexes of those arrays (or sequences) have significant meaning: mainly latitude vs longitude.

We should probably reconsider any method that takes in double[] or multiple doubles to express latlon pairs and rewrite them to use the existing LatLng class -- or if people think that class is too heavyweight, then add a new lightweight class to handle the strong typing of a basic latlon point instead of just passing around a double[2] or two doubles called "x" and "y" ...

{code}
public static final class SimpleLatLonPointInRadians {
  public double latitude;
  public double longitude;
}
{code}

...then all those various methods that expect lat+lon pairs in radians (like DistanceUtils.haversine, DistanceUtils.normLat, DistanceUtils.normLng, DistanceUtils.pointOnBearing, DistanceUtils.latLonCorner, etc...) can start having APIs that don't make your eyes bleed when you start trying to understand what order the args go in.
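As a rough illustration of what such an API could look like, the sketch below adapts the proposed class and pairs it with a typed haversine method. It is not the DistanceUtils code: the constructor, the `fromDegrees` factory, and the `haversine(point, point, radius)` signature are all assumptions made for the example.

```java
public final class TypedLatLonSketch {
    private TypedLatLonSketch() {}

    /** Adapted from the class proposed above; constructor and factory are additions. */
    public static final class SimpleLatLonPointInRadians {
        public final double latitude;
        public final double longitude;

        public SimpleLatLonPointInRadians(double latitude, double longitude) {
            this.latitude = latitude;
            this.longitude = longitude;
        }

        /** Hypothetical factory so callers never convert units by hand. */
        public static SimpleLatLonPointInRadians fromDegrees(double latDeg, double lonDeg) {
            return new SimpleLatLonPointInRadians(Math.toRadians(latDeg), Math.toRadians(lonDeg));
        }
    }

    /**
     * Hypothetical typed variant of a haversine distance: the argument types
     * make it impossible to swap latitude and longitude at the call site.
     * The result is in the same units as radius.
     */
    public static double haversine(SimpleLatLonPointInRadians a,
                                   SimpleLatLonPointInRadians b,
                                   double radius) {
        double sinLat = Math.sin((b.latitude - a.latitude) / 2);
        double sinLon = Math.sin((b.longitude - a.longitude) / 2);
        double h = sinLat * sinLat
                 + Math.cos(a.latitude) * Math.cos(b.latitude) * sinLon * sinLon;
        // Clamp to 1.0 to guard against rounding error before asin.
        return 2 * radius * Math.asin(Math.min(1.0, Math.sqrt(h)));
    }
}
```

A call then reads as `haversine(SimpleLatLonPointInRadians.fromDegrees(lat1, lon1), SimpleLatLonPointInRadians.fromDegrees(lat2, lon2), radius)`, with the order of latitude and longitude fixed in one place.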
[jira] [Created] (LUCENE-3587) Attempting to link to Java SE JavaDocs is completely unreliable
Attempting to link to Java SE JavaDocs is completely unreliable

Key: LUCENE-3587
URL: https://issues.apache.org/jira/browse/LUCENE-3587
Project: Lucene - Java
Issue Type: Bug
Reporter: Hoss Man
Fix For: 3.6, 4.0

As noted several times since Oracle bought Sun, the canonical links to the Java SE JavaDocs have been unreliable and frequently cause warnings. Since we choose to fail the build on javadoc warnings, this is a serious problem for anyone trying to build from source if/when the package-list we reference in our common-build.xml is not available.

We should eliminate this dependency.
[jira] [Created] (SOLR-2900) TestCoreContainer and CoreAdminHandlerTest try to use solr/core/src/test-files/solr/data if it exists
TestCoreContainer and CoreAdminHandlerTest try to use solr/core/src/test-files/solr/data if it exists

Key: SOLR-2900
URL: https://issues.apache.org/jira/browse/SOLR-2900
Project: Solr
Issue Type: Bug
Reporter: Hoss Man

When the index format has changed on trunk, it has been necessary to blow away the directory solr/core/src/test-files/solr/data because it contains indexes which will be in a format the trunk doesn't understand. Not blowing away this directory will cause TestCoreContainer and CoreAdminHandlerTest to fail.

However, after deleting this directory and running those two tests, the directory will not be recreated -- but if you run *all* tests, then the directory is recreated.

This seems to suggest:
* Some (unknown) test is creating this directory and the index in it
* The two test classes mentioned are using this directory if it exists, but do not require it to function -- meaning there is non-deterministic logic in where/how they get their index.
[jira] [Created] (SOLR-2802) Toolkit of UpdateProcessors for modifying document values
Toolkit of UpdateProcessors for modifying document values

Key: SOLR-2802
URL: https://issues.apache.org/jira/browse/SOLR-2802
Project: Solr
Issue Type: New Feature
Reporter: Hoss Man

Frequently users ask questions about things where the answer is "you could do it with an UpdateProcessor", but the number of out-of-the-box UpdateProcessors is generally lacking, and there aren't even very good base classes for the common case of manipulating field values when adding documents.
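To make the "base class for manipulating field values" idea concrete, here is a sketch that deliberately avoids the Solr API. Everything in it is hypothetical: a document is modeled as a plain map of field name to values, `FieldValueMutatorSketch` plays the role of the missing base class, and `Trim` is one example subclass.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

public abstract class FieldValueMutatorSketch {
    private final Set<String> fieldNames;

    protected FieldValueMutatorSketch(String... fieldNames) {
        this.fieldNames = new HashSet<>(Arrays.asList(fieldNames));
    }

    /** Subclasses implement only the per-value transformation. */
    protected abstract Object mutateValue(Object value);

    /** Applies mutateValue to every value of each configured field, in place. */
    public void process(Map<String, List<Object>> doc) {
        for (String field : fieldNames) {
            List<Object> values = doc.get(field);
            if (values != null) {
                values.replaceAll(this::mutateValue);
            }
        }
    }

    /** Example subclass: trims leading and trailing whitespace from String values. */
    public static final class Trim extends FieldValueMutatorSketch {
        public Trim(String... fieldNames) {
            super(fieldNames);
        }

        @Override
        protected Object mutateValue(Object value) {
            return value instanceof String ? ((String) value).trim() : value;
        }
    }
}
```

The point of the design is that field selection and iteration live in the base class, so a new processor only has to say what happens to a single value.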
[jira] [Created] (SOLR-2798) Local Param parsing does not support multivalued params
Local Param parsing does not support multivalued params

Key: SOLR-2798
URL: https://issues.apache.org/jira/browse/SOLR-2798
Project: Solr
Issue Type: Bug
Reporter: Hoss Man

As noted by Demian on the solr-user mailing list, Local Param parsing seems to use a "last one wins" approach when parsing multivalued params.

In this example, the value of "111" is completely ignored...

http://localhost:8983/solr/select?debug=query&q={!dismax%20bq=111%20bq=222}foo
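The difference between "last one wins" and multivalued parsing can be sketched as follows. This is a simplified illustration, not Solr's local-params parser: `LocalParamsSketch.parse` is hypothetical, handles only whitespace-separated key=value tokens with no quoting or escaping, and treats a bare first token as shorthand for the type param.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

public final class LocalParamsSketch {
    private LocalParamsSketch() {}

    /**
     * Hypothetical parser for a "{!...}" prefix that accumulates every value
     * of a repeated key instead of keeping only the last one.
     */
    public static Map<String, List<String>> parse(String q) {
        Map<String, List<String>> params = new LinkedHashMap<>();
        if (q == null || !q.startsWith("{!")) {
            return params;
        }
        int end = q.indexOf('}');
        if (end < 0) {
            return params;
        }
        String body = q.substring(2, end).trim();
        if (body.isEmpty()) {
            return params;
        }
        for (String token : body.split("\\s+")) {
            int eq = token.indexOf('=');
            // A token without '=' (e.g. "dismax") is treated as the type.
            String key = eq < 0 ? "type" : token.substring(0, eq);
            String value = eq < 0 ? token : token.substring(eq + 1);
            // Appending to a list is what preserves both bq values.
            params.computeIfAbsent(key, k -> new ArrayList<>()).add(value);
        }
        return params;
    }
}
```

With the (URL-decoded) query from the example, `parse("{!dismax bq=111 bq=222}foo")` keeps both "111" and "222" under bq, whereas a plain `Map<String, String>` with `put` would retain only "222".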
[jira] [Created] (SOLR-2796) AddUpdateCommand.getIndexedId doesn't work with schema configured defaults - UUIDField can not be used as uniqueKey field
AddUpdateCommand.getIndexedId doesn't work with schema configured defaults - UUIDField can not be used as uniqueKey field

Key: SOLR-2796
URL: https://issues.apache.org/jira/browse/SOLR-2796
Project: Solr
Issue Type: Bug
Components: update
Affects Versions: 4.0
Reporter: Hoss Man
Fix For: 4.0

In Solr 1.4, and the HEAD of the 3x branch, the UUIDField can be used as the uniqueKey field even if documents do not specify a value, by taking advantage of the {{default=NEW}} feature of UUIDField. But something has changed in trunk to break this behavior.