[jira] [Commented] (LUCENE-929) contrib/benchmark build doesn't handle checking if content is properly extracted
[ https://issues.apache.org/jira/browse/LUCENE-929?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13038534#comment-13038534 ] Grant Ingersoll commented on LUCENE-929: Doron, that's fine to open a new issue and close this one, but it was this issue's fix that introduced the bug. > contrib/benchmark build doesn't handle checking if content is properly > extracted > > > Key: LUCENE-929 > URL: https://issues.apache.org/jira/browse/LUCENE-929 > Project: Lucene - Java > Issue Type: Bug > Components: modules/benchmark >Reporter: Grant Ingersoll >Assignee: Grant Ingersoll >Priority: Minor > Fix For: 3.1, 4.0 > > > The contrib/benchmark build does not properly handle checking to see if the > content (such as Reuters coll.) is properly extracted. It only checks to see > if the directory exists. Thus, it is possible that the directory gets > created and the extraction fails. Then, the next time it is run, it skips > the extraction part and tries to continue on running the benchmark. > The workaround is to manually delete the extraction directory. -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Reopened] (LUCENE-929) contrib/benchmark build doesn't handle checking if content is properly extracted
[ https://issues.apache.org/jira/browse/LUCENE-929?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Grant Ingersoll reopened LUCENE-929: Assignee: Grant Ingersoll Lucene Fields: (was: [New]) Note: this fix doesn't work if the output dir has a trailing slash. See MAHOUT-694. > contrib/benchmark build doesn't handle checking if content is properly > extracted > > > Key: LUCENE-929 > URL: https://issues.apache.org/jira/browse/LUCENE-929 > Project: Lucene - Java > Issue Type: Bug > Components: modules/benchmark >Reporter: Grant Ingersoll >Assignee: Grant Ingersoll >Priority: Minor > Fix For: 3.1, 4.0 > > > The contrib/benchmark build does not properly handle checking to see if the > content (such as Reuters coll.) is properly extracted. It only checks to see > if the directory exists. Thus, it is possible that the directory gets > created and the extraction fails. Then, the next time it is run, it skips > the extraction part and tries to continue on running the benchmark. > The workaround is to manually delete the extraction directory.
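One way to picture the check described in this issue is a completion marker written only after extraction succeeds, so that a directory left behind by a failed run is not mistaken for extracted content. The following is a minimal sketch under stated assumptions, not the actual contrib/benchmark build logic: the class name, the marker file name and the use of java.nio are all hypothetical. The issue does not say why a trailing slash breaks the fix, so the sketch simply normalizes the path before using it.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;

// Hypothetical helper; not part of the Lucene benchmark module.
final class ExtractionCheck {
    // Assumed marker name, written only after a successful extraction.
    static final String MARKER = ".extraction-complete";

    // Paths.get drops a trailing separator, so "out/" and "out" resolve alike.
    static Path normalize(String dir) {
        return Paths.get(dir).normalize();
    }

    // True only if the marker exists; a bare directory is not enough.
    static boolean isExtracted(String dir) {
        return Files.isRegularFile(normalize(dir).resolve(MARKER));
    }

    // Call this as the last step of a successful extraction.
    static void markExtracted(String dir) throws IOException {
        Path d = normalize(dir);
        Files.createDirectories(d);
        Path marker = d.resolve(MARKER);
        if (!Files.exists(marker)) {
            Files.createFile(marker);
        }
    }
}
```

With a marker like this, the manual workaround of deleting the extraction directory would only be needed if the marker itself were stale.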
[jira] [Commented] (SOLR-2155) Geospatial search using geohash prefixes
[ https://issues.apache.org/jira/browse/SOLR-2155?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13036760#comment-13036760 ] Grant Ingersoll commented on SOLR-2155: --- I don't think we are abandoning it; I think David and Ryan, etc. have decided to bake things in the playground for a bit and then may move it back. Lance, the code is in Google Code, search the archives. > Geospatial search using geohash prefixes > > > Key: SOLR-2155 > URL: https://issues.apache.org/jira/browse/SOLR-2155 > Project: Solr > Issue Type: Improvement >Reporter: David Smiley >Assignee: Grant Ingersoll > Attachments: GeoHashPrefixFilter.patch, GeoHashPrefixFilter.patch, > GeoHashPrefixFilter.patch, > SOLR-2155_GeoHashPrefixFilter_with_sorting_no_poly.patch, SOLR.2155.p3.patch, > SOLR.2155.p3tests.patch > > > There currently isn't a solution in Solr for doing geospatial filtering on > documents that have a variable number of points. This scenario occurs when > there is location extraction (i.e. via a "gazetteer") occurring on free text. > None, one, or many geospatial locations might be extracted from any given > document and users want to limit their search results to those occurring in a > user-specified area. > I've implemented this by furthering the GeoHash based work in Lucene/Solr > with a geohash prefix based filter. A geohash refers to a lat-lon box on the > earth. Each successive character added further subdivides the box into a 4x8 > (or 8x4 depending on the even/odd length of the geohash) grid. The first > step in this scheme is figuring out which geohash grid squares cover the > user's search query. I've added various extra methods to GeoHashUtils (and > added tests) to assist in this purpose. The next step is an actual Lucene > Filter, GeoHashPrefixFilter, that uses these geohash prefixes in > TermsEnum.seek() to skip to relevant grid squares in the index.
Once a > matching geohash grid is found, the points therein are compared against the > user's query to see if it matches. I created an abstraction GeoShape > extended by subclasses named PointDistance... and CartesianBox to support > different queried shapes so that the filter need not care about these details. > This work was presented at LuceneRevolution in Boston on October 8th. -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
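The prefix idea in the description above can be sketched without any Lucene classes. The example below is only an illustration under stated assumptions: it is not the code in the attached patches, the class and method names are hypothetical, and a sorted in-memory set stands in for the term index (tailSet plays the role that the description gives to TermsEnum.seek()). The encoder is the standard geohash scheme: bits alternate between longitude and latitude, and every 5 bits become one base-32 character, which is where the 4x8 / 8x4 subdivision comes from.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.NavigableSet;

// Hypothetical sketch; not GeoHashUtils or GeoHashPrefixFilter.
final class GeoHashSketch {
    private static final String BASE32 = "0123456789bcdefghjkmnpqrstuvwxyz";

    // Standard geohash encoding: halve the lon/lat ranges alternately,
    // emitting one base-32 character per 5 bits.
    static String encode(double lat, double lon, int length) {
        double latLo = -90, latHi = 90, lonLo = -180, lonHi = 180;
        StringBuilder sb = new StringBuilder(length);
        boolean lonBit = true; // the first bit refines longitude
        int bits = 0, value = 0;
        while (sb.length() < length) {
            if (lonBit) {
                double mid = (lonLo + lonHi) / 2;
                if (lon >= mid) { value = (value << 1) | 1; lonLo = mid; }
                else { value <<= 1; lonHi = mid; }
            } else {
                double mid = (latLo + latHi) / 2;
                if (lat >= mid) { value = (value << 1) | 1; latLo = mid; }
                else { value <<= 1; latHi = mid; }
            }
            lonBit = !lonBit;
            if (++bits == 5) {
                sb.append(BASE32.charAt(value));
                bits = 0;
                value = 0;
            }
        }
        return sb.toString();
    }

    // Seek to the first term >= prefix, then read until the prefix stops matching.
    // In sorted order, all terms sharing a prefix are contiguous.
    static List<String> termsInCell(NavigableSet<String> sortedTerms, String prefix) {
        List<String> out = new ArrayList<>();
        for (String term : sortedTerms.tailSet(prefix, true)) {
            if (!term.startsWith(prefix)) break;
            out.add(term);
        }
        return out;
    }
}
```

A real filter would still compare each candidate point against the query shape, as the description notes; the prefix scan only narrows the candidates to the covering grid squares.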
[jira] [Commented] (SOLR-2371) Add a min() function query, upgrade max() function query to take two value sources
[ https://issues.apache.org/jira/browse/SOLR-2371?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13036755#comment-13036755 ] Grant Ingersoll commented on SOLR-2371: --- I would open a new issue. A patch would be great. Should be pretty straightforward given the new string support. > Add a min() function query, upgrade max() function query to take two value > sources > -- > > Key: SOLR-2371 > URL: https://issues.apache.org/jira/browse/SOLR-2371 > Project: Solr > Issue Type: New Feature >Reporter: Grant Ingersoll >Priority: Trivial > Fix For: 4.0 > > Attachments: SOLR-2371.patch > > > There doesn't appear to be a min() function. Also, max() only allows a value > source and a constant b/c it is from before we had more flexible parsing. -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
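To make the request concrete: the difference between "a value source and a constant" and "two value sources" can be shown with plain functions from a document id to a value. This is a hedged sketch only; it does not use Solr's function query classes, and the names below are hypothetical. A constant is treated as just another source, which is what lets min() and max() take any two arguments.

```java
import java.util.function.IntToDoubleFunction;

// Hypothetical stand-in for value sources: docId -> value.
final class MinMaxFunctionSketch {
    // A constant is a source that ignores the document.
    static IntToDoubleFunction constant(double c) {
        return doc -> c;
    }

    // min over two arbitrary sources, evaluated per document.
    static IntToDoubleFunction min(IntToDoubleFunction a, IntToDoubleFunction b) {
        return doc -> Math.min(a.applyAsDouble(doc), b.applyAsDouble(doc));
    }

    // max over two arbitrary sources, evaluated per document.
    static IntToDoubleFunction max(IntToDoubleFunction a, IntToDoubleFunction b) {
        return doc -> Math.max(a.applyAsDouble(doc), b.applyAsDouble(doc));
    }
}
```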
[jira] [Commented] (LUCENE-3104) Hook up Automated Patch Checking for Lucene/Solr
[ https://issues.apache.org/jira/browse/LUCENE-3104?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13036441#comment-13036441 ] Grant Ingersoll commented on LUCENE-3104: - Here's an example of running just the test-patch.sh: {quote} ./test-patch.sh DEV ../../../patches/LUCENE-3120.patch /path/trunk-clean/build/patches /usr/bin/svn /usr/bin/grep /usr/bin/patch /path/lucene/dev/trunk-clean {quote} Note, the directory where you are applying the patch (trunk-clean) must be clear of all mods. > Hook up Automated Patch Checking for Lucene/Solr > > > Key: LUCENE-3104 > URL: https://issues.apache.org/jira/browse/LUCENE-3104 > Project: Lucene - Java > Issue Type: Task >Reporter: Grant Ingersoll > Attachments: LUCENE-3104.patch > > > It would be really great if we could get feedback to contributors sooner on > many things that are basic (tests exist, patch applies cleanly, etc.) > From Nigel Daley on builds@a.o > {quote} > I revamped the precommit testing in the fall so that it doesn't use Jira > email anymore to trigger a build. The process is controlled by > https://builds.apache.org/hudson/job/PreCommit-Admin/ > which has some documentation up at the top of the job. You can look at the > config of the job (do you have access?) to see what it's doing. Any project > could use this same admin job -- you just need to ask me to add the project > to the Jira filter used by the admin job > (https://issues.apache.org/jira/sr/jira.issueviews:searchrequest-xml/12313474/SearchRequest-12313474.xml?tempMax=100 > ) once you have the downstream job(s) setup for your specific project. For > Hadoop we have 3 downstream builds configured which also have some > documentation: > https://builds.apache.org/hudson/job/PreCommit-HADOOP-Build/ > https://builds.apache.org/hudson/job/PreCommit-MAPREDUCE-Build/ > https://builds.apache.org/hudson/job/PreCommit-HDFS-Build/ > {quote} -- This message is automatically generated by JIRA. 
[jira] [Updated] (SOLR-2371) Add a min() function query, upgrade max() function query to take two value sources
[ https://issues.apache.org/jira/browse/SOLR-2371?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Grant Ingersoll updated SOLR-2371: -- Fix Version/s: (was: 3.2) > Add a min() function query, upgrade max() function query to take two value > sources > -- > > Key: SOLR-2371 > URL: https://issues.apache.org/jira/browse/SOLR-2371 > Project: Solr > Issue Type: New Feature >Reporter: Grant Ingersoll >Priority: Trivial > Fix For: 4.0 > > Attachments: SOLR-2371.patch > > > There doesn't appear to be a min() function. Also, max() only allows a value > source and a constant b/c it is from before we had more flexible parsing. -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Resolved] (SOLR-2371) Add a min() function query, upgrade max() function query to take two value sources
[ https://issues.apache.org/jira/browse/SOLR-2371?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Grant Ingersoll resolved SOLR-2371. --- Resolution: Fixed > Add a min() function query, upgrade max() function query to take two value > sources > -- > > Key: SOLR-2371 > URL: https://issues.apache.org/jira/browse/SOLR-2371 > Project: Solr > Issue Type: New Feature >Reporter: Grant Ingersoll >Priority: Trivial > Fix For: 3.2, 4.0 > > Attachments: SOLR-2371.patch > > > There doesn't appear to be a min() function. Also, max() only allows a value > source and a constant b/c it is from before we had more flexible parsing. -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Assigned] (SOLR-1143) Return partial results when a connection to a shard is refused
[ https://issues.apache.org/jira/browse/SOLR-1143?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Grant Ingersoll reassigned SOLR-1143: - Assignee: (was: Grant Ingersoll) > Return partial results when a connection to a shard is refused > -- > > Key: SOLR-1143 > URL: https://issues.apache.org/jira/browse/SOLR-1143 > Project: Solr > Issue Type: Improvement > Components: search >Reporter: Nicolas Dessaigne > Fix For: 3.2 > > Attachments: SOLR-1143-2.patch, SOLR-1143-3.patch, SOLR-1143.patch > > > If any shard is down in a distributed search, a ConnectException is thrown. > Here's a little patch that changes this behaviour: if we can't connect to a > shard (ConnectException), we get partial results from the active shards. As > for the TimeOut parameter (https://issues.apache.org/jira/browse/SOLR-502), we > set the parameter "partialResults" to true. > This patch also addresses a problem expressed in the mailing list about a year > ago > (http://www.nabble.com/partialResults,-distributed-search---SOLR-502-td19002610.html) > We have a use case that needs this behaviour and we would like to know your > thoughts about such a behaviour. Should it be the default behaviour for > distributed search?
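The behaviour proposed in this issue can be sketched in a few lines. This is an illustration under stated assumptions, not the attached patch: the Shard interface and Response class are hypothetical, and only java.net.ConnectException and the "partialResults" name come from the description. A refused connection is caught per shard, the flag is set, and results from the shards that did answer are still returned.

```java
import java.net.ConnectException;
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch; not Solr's distributed search code.
final class PartialResultsSketch {
    // Assumed shard abstraction: returns matching doc ids or refuses the connection.
    interface Shard {
        List<String> query(String q) throws ConnectException;
    }

    static final class Response {
        final List<String> docs = new ArrayList<>();
        boolean partialResults = false;
    }

    static Response search(List<Shard> shards, String q) {
        Response response = new Response();
        for (Shard shard : shards) {
            try {
                response.docs.addAll(shard.query(q));
            } catch (ConnectException e) {
                // Shard is down: keep going and flag the response as partial.
                response.partialResults = true;
            }
        }
        return response;
    }
}
```

Whether this should be the default is the open question in the description; in a sketch like this it would amount to a switch that rethrows the exception instead of setting the flag.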
[jira] [Created] (LUCENE-3118) Tools for making explanations easier to consume/understand
Tools for making explanations easier to consume/understand -- Key: LUCENE-3118 URL: https://issues.apache.org/jira/browse/LUCENE-3118 Project: Lucene - Java Issue Type: Improvement Reporter: Grant Ingersoll Priority: Minor Reading Explanations (i.e. the breakdown of scores for a particular query and result, say via Solr's &debugQuery) is often a cryptic and difficult undertaking. I often say people suffer from "explain blindness" from staring at explanation results for too long. We could add a layer of explanation helpers above the core Explain functionality that help people understand better what is going on. The goal is to give a higher level of tools to people who aren't necessarily well versed in all the underpinnings of Lucene's scoring mechanisms but still want information about why something didn't match. For instance (brainstorming some things that might be doable): * Explain Diff Tool -- Given one or more explanations, quickly highlight the key things that differentiate the results (i.e. fieldNorm is higher, etc.) * Given a query and any document, give a friendlier reason why it ranks lower than others without the need to parse through all the pieces of the score. For instance, could you simply say, programmatically that is, something like: this document scored lower compared to your top 10 b/c it had no values in the foo Field. * Could even maybe return codes for these reasons which could then be hooked into actual user messages. I don't have anything concrete patch-wise here, but am putting this up as a way to capture the idea and potentially spur others to think about it.
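As a rough illustration of the "Explain Diff Tool" idea, assume each explanation has already been flattened into a map from factor name (tf, idf, fieldNorm and so on) to value. That flattening step, the class name and the method below are all hypothetical; nothing here uses Lucene's Explanation class. The diff keeps only the factors that differ by more than a tolerance, so a reader sees what separates two results rather than the whole score tree.

```java
import java.util.Map;
import java.util.Set;
import java.util.TreeMap;
import java.util.TreeSet;

// Hypothetical sketch of an explanation diff over flattened factor maps.
final class ExplainDiffSketch {
    // Returns factor -> (value in b minus value in a) for factors that differ.
    // A factor missing from one side is treated as 0.0.
    static Map<String, Double> diff(Map<String, Double> a, Map<String, Double> b, double tolerance) {
        Set<String> factors = new TreeSet<>(a.keySet());
        factors.addAll(b.keySet());
        Map<String, Double> out = new TreeMap<>();
        for (String factor : factors) {
            double delta = b.getOrDefault(factor, 0.0) - a.getOrDefault(factor, 0.0);
            if (Math.abs(delta) > tolerance) {
                out.put(factor, delta);
            }
        }
        return out;
    }
}
```

The reason codes mentioned in the last bullet could then be derived from the keys of such a diff, for example by mapping a missing or lower fieldNorm entry to a user-facing message.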
[jira] [Updated] (LUCENE-3104) Hook up Automated Patch Checking for Lucene/Solr
[ https://issues.apache.org/jira/browse/LUCENE-3104?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Grant Ingersoll updated LUCENE-3104: Attachment: LUCENE-3104.patch Totally non-working patch, likely located in the wrong directories, but putting it up here so people can start to get a feel for how this works. The test-patch script can be run by hand and also via Jenkins. > Hook up Automated Patch Checking for Lucene/Solr > > > Key: LUCENE-3104 > URL: https://issues.apache.org/jira/browse/LUCENE-3104 > Project: Lucene - Java > Issue Type: Task >Reporter: Grant Ingersoll > Attachments: LUCENE-3104.patch > > > It would be really great if we could get feedback to contributors sooner on > many things that are basic (tests exist, patch applies cleanly, etc.) > From Nigel Daley on builds@a.o > {quote} > I revamped the precommit testing in the fall so that it doesn't use Jira > email anymore to trigger a build. The process is controlled by > https://builds.apache.org/hudson/job/PreCommit-Admin/ > which has some documentation up at the top of the job. You can look at the > config of the job (do you have access?) to see what it's doing. Any project > could use this same admin job -- you just need to ask me to add the project > to the Jira filter used by the admin job > (https://issues.apache.org/jira/sr/jira.issueviews:searchrequest-xml/12313474/SearchRequest-12313474.xml?tempMax=100 > ) once you have the downstream job(s) setup for your specific project. For > Hadoop we have 3 downstream builds configured which also have some > documentation: > https://builds.apache.org/hudson/job/PreCommit-HADOOP-Build/ > https://builds.apache.org/hudson/job/PreCommit-MAPREDUCE-Build/ > https://builds.apache.org/hudson/job/PreCommit-HDFS-Build/ > {quote} -- This message is automatically generated by JIRA. 
[jira] [Commented] (SOLR-2193) Re-architect Update Handler
[ https://issues.apache.org/jira/browse/SOLR-2193?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13035075#comment-13035075 ] Grant Ingersoll commented on SOLR-2193: --- Crazy idea: drop the notion of commits all together (or make it an expert thing for the hard core). Default it to 1 second. I wonder how all of this plays with warming/caching, etc. Do you even need those things in this type of setup? > Re-architect Update Handler > --- > > Key: SOLR-2193 > URL: https://issues.apache.org/jira/browse/SOLR-2193 > Project: Solr > Issue Type: Improvement >Reporter: Mark Miller >Assignee: Mark Miller > Fix For: 4.0 > > Attachments: SOLR-2193.patch, SOLR-2193.patch, SOLR-2193.patch, > SOLR-2193.patch > > > The update handler needs an overhaul. > A few goals I think we might want to look at: > 1. Cleanup - drop DirectUpdateHandler(2) line - move to something like > UpdateHandler, DefaultUpdateHandler > 2. Expose the SolrIndexWriter in the api or add the proper abstractions to > get done what we now do with special casing: > if (directupdatehandler2) > success > else > failish > 3. Stop closing the IndexWriter and start using commit (still lazy IW init > though). > 4. Drop iwAccess, iwCommit locks and sync mostly at the Lucene level. > 5. Keep NRT support in mind. > 6. Keep microsharding in mind (maintain logical index as multiple physical > indexes) > 7. Address the current issues we face because multiple original/'reloaded' > cores can have a different IndexWriter on the same index. -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Commented] (LUCENE-3104) Hook up Automated Patch Checking for Lucene/Solr
[ https://issues.apache.org/jira/browse/LUCENE-3104?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13035028#comment-13035028 ] Grant Ingersoll commented on LUCENE-3104: - Allen, please ask your questions on solr-u...@lucene.apache.org and please delete the comment, as this issue is not the appropriate place for this question. Thanks, Grant > Hook up Automated Patch Checking for Lucene/Solr > > > Key: LUCENE-3104 > URL: https://issues.apache.org/jira/browse/LUCENE-3104 > Project: Lucene - Java > Issue Type: Task >Reporter: Grant Ingersoll > > It would be really great if we could get feedback to contributors sooner on > many things that are basic (tests exist, patch applies cleanly, etc.) > From Nigel Daley on builds@a.o > {quote} > I revamped the precommit testing in the fall so that it doesn't use Jira > email anymore to trigger a build. The process is controlled by > https://builds.apache.org/hudson/job/PreCommit-Admin/ > which has some documentation up at the top of the job. You can look at the > config of the job (do you have access?) to see what it's doing. Any project > could use this same admin job -- you just need to ask me to add the project > to the Jira filter used by the admin job > (https://issues.apache.org/jira/sr/jira.issueviews:searchrequest-xml/12313474/SearchRequest-12313474.xml?tempMax=100 > ) once you have the downstream job(s) setup for your specific project. For > Hadoop we have 3 downstream builds configured which also have some > documentation: > https://builds.apache.org/hudson/job/PreCommit-HADOOP-Build/ > https://builds.apache.org/hudson/job/PreCommit-MAPREDUCE-Build/ > https://builds.apache.org/hudson/job/PreCommit-HDFS-Build/ > {quote} -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Commented] (LUCENE-3104) Hook up Automated Patch Checking for Lucene/Solr
[ https://issues.apache.org/jira/browse/LUCENE-3104?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13034964#comment-13034964 ] Grant Ingersoll commented on LUCENE-3104: - General Docs started at http://wiki.apache.org/general/PreCommitBuilds > Hook up Automated Patch Checking for Lucene/Solr > > > Key: LUCENE-3104 > URL: https://issues.apache.org/jira/browse/LUCENE-3104 > Project: Lucene - Java > Issue Type: Task >Reporter: Grant Ingersoll > > It would be really great if we could get feedback to contributors sooner on > many things that are basic (tests exist, patch applies cleanly, etc.) > From Nigel Daley on builds@a.o > {quote} > I revamped the precommit testing in the fall so that it doesn't use Jira > email anymore to trigger a build. The process is controlled by > https://builds.apache.org/hudson/job/PreCommit-Admin/ > which has some documentation up at the top of the job. You can look at the > config of the job (do you have access?) to see what it's doing. Any project > could use this same admin job -- you just need to ask me to add the project > to the Jira filter used by the admin job > (https://issues.apache.org/jira/sr/jira.issueviews:searchrequest-xml/12313474/SearchRequest-12313474.xml?tempMax=100 > ) once you have the downstream job(s) setup for your specific project. For > Hadoop we have 3 downstream builds configured which also have some > documentation: > https://builds.apache.org/hudson/job/PreCommit-HADOOP-Build/ > https://builds.apache.org/hudson/job/PreCommit-MAPREDUCE-Build/ > https://builds.apache.org/hudson/job/PreCommit-HDFS-Build/ > {quote} -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Created] (LUCENE-3104) Hook up Automated Patch Checking for Lucene/Solr
Hook up Automated Patch Checking for Lucene/Solr Key: LUCENE-3104 URL: https://issues.apache.org/jira/browse/LUCENE-3104 Project: Lucene - Java Issue Type: Task Reporter: Grant Ingersoll It would be really great if we could get feedback to contributors sooner on many things that are basic (tests exist, patch applies cleanly, etc.) From Nigel Daley on builds@a.o {quote} I revamped the precommit testing in the fall so that it doesn't use Jira email anymore to trigger a build. The process is controlled by https://builds.apache.org/hudson/job/PreCommit-Admin/ which has some documentation up at the top of the job. You can look at the config of the job (do you have access?) to see what it's doing. Any project could use this same admin job -- you just need to ask me to add the project to the Jira filter used by the admin job (https://issues.apache.org/jira/sr/jira.issueviews:searchrequest-xml/12313474/SearchRequest-12313474.xml?tempMax=100 ) once you have the downstream job(s) setup for your specific project. For Hadoop we have 3 downstream builds configured which also have some documentation: https://builds.apache.org/hudson/job/PreCommit-HADOOP-Build/ https://builds.apache.org/hudson/job/PreCommit-MAPREDUCE-Build/ https://builds.apache.org/hudson/job/PreCommit-HDFS-Build/ {quote}
[jira] [Commented] (SOLR-1942) Ability to select codec per field
[ https://issues.apache.org/jira/browse/SOLR-1942?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13034051#comment-13034051 ] Grant Ingersoll commented on SOLR-1942: --- I thought I would have time last week, but that turned out to not be the case. If you have time, Robert, feel free, otherwise I might be able to get to it later in the week (pending conf. prep). From the sounds of it, it likely just needs to be updated to trunk and then it should be ready to go (we should also doc it on the wiki) > Ability to select codec per field > - > > Key: SOLR-1942 > URL: https://issues.apache.org/jira/browse/SOLR-1942 > Project: Solr > Issue Type: New Feature >Affects Versions: 4.0 >Reporter: Yonik Seeley >Assignee: Grant Ingersoll > Fix For: 4.0 > > Attachments: SOLR-1942.patch, SOLR-1942.patch, SOLR-1942.patch, > SOLR-1942.patch, SOLR-1942.patch, SOLR-1942.patch, SOLR-1942.patch > > > We should use PerFieldCodecWrapper to allow users to select the codec > per-field. -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Resolved] (SOLR-2511) Make it easier to override SolrContentHandler newDocument
[ https://issues.apache.org/jira/browse/SOLR-2511?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Grant Ingersoll resolved SOLR-2511. --- Resolution: Fixed Fix Version/s: 4.0 3.2 > Make it easier to override SolrContentHandler newDocument > - > > Key: SOLR-2511 > URL: https://issues.apache.org/jira/browse/SOLR-2511 > Project: Solr > Issue Type: Improvement >Reporter: Grant Ingersoll >Assignee: Grant Ingersoll >Priority: Minor > Fix For: 3.2, 4.0 > > Attachments: SOLR-2511.patch > > > The SolrContentHandler's newDocument method does a variety of things: adds > metadata, literals, content and captured content. We could split this out > into protected methods for each that makes it easier to override.
[jira] [Updated] (SOLR-2511) Make it easier to override SolrContentHandler newDocument
[ https://issues.apache.org/jira/browse/SOLR-2511?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Grant Ingersoll updated SOLR-2511: -- Attachment: SOLR-2511.patch Going to commit this > Make it easier to override SolrContentHandler newDocument > - > > Key: SOLR-2511 > URL: https://issues.apache.org/jira/browse/SOLR-2511 > Project: Solr > Issue Type: Improvement >Reporter: Grant Ingersoll >Assignee: Grant Ingersoll >Priority: Minor > Attachments: SOLR-2511.patch > > > The SolrContentHandler's newDocument method does a variety of things: adds > metadata, literals, content and captured content. We could split this out > into protected methods for each that makes it easier to override.
[jira] [Created] (SOLR-2511) Make it easier to override SolrContentHandler newDocument
Make it easier to override SolrContentHandler newDocument - Key: SOLR-2511 URL: https://issues.apache.org/jira/browse/SOLR-2511 Project: Solr Issue Type: Improvement Reporter: Grant Ingersoll Priority: Minor The SolrContentHandler's newDocument method does a variety of things: adds metadata, literals, content and captured content. We could split this out into protected methods for each that makes it easier to override.
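The refactoring proposed here is essentially the template method pattern. The sketch below shows the shape only and makes several assumptions: it is not the real SolrContentHandler, the method names addMetadata, addLiterals, addContent and addCapturedContent are hypothetical, and a plain map stands in for the Solr document. The point is that a subclass can replace one step without copying the rest of newDocument.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical stand-in for a content handler; a Map plays the role of the document.
class ContentHandlerSketch {
    protected final Map<String, String> doc = new LinkedHashMap<>();

    // The template method: one overridable step per responsibility.
    public Map<String, String> newDocument() {
        addMetadata();
        addLiterals();
        addContent();
        addCapturedContent();
        return doc;
    }

    protected void addMetadata() { doc.put("metadata", "author=unknown"); }
    protected void addLiterals() { doc.put("literals", "id=1"); }
    protected void addContent() { doc.put("content", "body text"); }
    protected void addCapturedContent() { doc.put("captured", "first heading"); }
}

// Overrides a single step and inherits the others unchanged.
class NoLiteralsHandler extends ContentHandlerSketch {
    @Override
    protected void addLiterals() {
        // Intentionally skip literals for this handler.
    }
}
```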
[jira] [Assigned] (SOLR-2511) Make it easier to override SolrContentHandler newDocument
[ https://issues.apache.org/jira/browse/SOLR-2511?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Grant Ingersoll reassigned SOLR-2511: - Assignee: Grant Ingersoll > Make it easier to override SolrContentHandler newDocument > - > > Key: SOLR-2511 > URL: https://issues.apache.org/jira/browse/SOLR-2511 > Project: Solr > Issue Type: Improvement >Reporter: Grant Ingersoll >Assignee: Grant Ingersoll >Priority: Minor > > The SolrContentHandler's newDocument method does a variety of things: adds > metadata, literals, content and captured content. We could split this out > into protected methods for each that makes it easier to override.
[jira] [Commented] (SOLR-2502) Add in Examples/Documentation on the new Join functionality
[ https://issues.apache.org/jira/browse/SOLR-2502?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13031305#comment-13031305 ] Grant Ingersoll commented on SOLR-2502: --- I think the data extension I've added is fairly reasonable (products are manufactured by a manufacturer). As for the overloading of the tutorial, I don't know. I'm not a UI guy, but I don't think it's too bad at this point. I'm not sure about splitting it out or not, as I think that will create a lot of redundancy in the configuration as well as in the example dir (similar to multicore, DIH, etc. which are a bit clunky now) which then becomes more confusing as it's not clear where to look for what. I think, ideally, we have a more holistic example that seamlessly ties in all the things and presents a single unified app that mirrors a real world application, such as a store. > Add in Examples/Documentation on the new Join functionality > --- > > Key: SOLR-2502 > URL: https://issues.apache.org/jira/browse/SOLR-2502 > Project: Solr > Issue Type: Improvement >Reporter: Grant Ingersoll >Priority: Minor > Attachments: SOLR-2502.patch, SOLR-2502.patch > > > As the title says, add in an example and docs on the new Join functionality > added via SOLR-2272. -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Updated] (SOLR-2502) Add in Examples/Documentation on the new Join functionality
[ https://issues.apache.org/jira/browse/SOLR-2502?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Grant Ingersoll updated SOLR-2502: -- Attachment: SOLR-2502.patch here's some progress on this. Adds in /browse capability for Join > Add in Examples/Documentation on the new Join functionality > --- > > Key: SOLR-2502 > URL: https://issues.apache.org/jira/browse/SOLR-2502 > Project: Solr > Issue Type: Improvement >Reporter: Grant Ingersoll >Priority: Minor > Attachments: SOLR-2502.patch, SOLR-2502.patch > > > As the title says, add in an example and docs on the new Join functionality > added via SOLR-2272. -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Resolved] (SOLR-2384) Velocity: Add a "toggle all fields" link
[ https://issues.apache.org/jira/browse/SOLR-2384?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Grant Ingersoll resolved SOLR-2384. --- Resolution: Fixed Fix Version/s: 4.0 3.2 Thanks, Jan! > Velocity: Add a "toggle all fields" link > > > Key: SOLR-2384 > URL: https://issues.apache.org/jira/browse/SOLR-2384 > Project: Solr > Issue Type: Improvement > Components: Response Writers >Reporter: Jan Høydahl >Assignee: Grant Ingersoll > Labels: velocity > Fix For: 3.2, 4.0 > > Attachments: SOLR-2384.patch, SOLR-2384.patch, SOLR-2384.patch > > > When in debug mode in the Velocity /browse GUI, it would be useful to be able > to show all fields for the hits. -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Assigned] (SOLR-2384) Velocity: Add a "toggle all fields" link
[ https://issues.apache.org/jira/browse/SOLR-2384?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Grant Ingersoll reassigned SOLR-2384: - Assignee: Grant Ingersoll > Velocity: Add a "toggle all fields" link > > > Key: SOLR-2384 > URL: https://issues.apache.org/jira/browse/SOLR-2384 > Project: Solr > Issue Type: Improvement > Components: Response Writers >Reporter: Jan Høydahl >Assignee: Grant Ingersoll > Labels: velocity > Attachments: SOLR-2384.patch, SOLR-2384.patch, SOLR-2384.patch > > > When in debug mode in the Velocity /browse GUI, it would be useful to be able > to show all fields for the hits. -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Closed] (SOLR-2383) Velocity: Generalize range and date facet display
[ https://issues.apache.org/jira/browse/SOLR-2383?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Grant Ingersoll closed SOLR-2383. - > Velocity: Generalize range and date facet display > - > > Key: SOLR-2383 > URL: https://issues.apache.org/jira/browse/SOLR-2383 > Project: Solr > Issue Type: Bug > Components: Response Writers >Reporter: Jan Høydahl >Assignee: Grant Ingersoll > Labels: facet, range, velocity > Fix For: 4.0 > > Attachments: SOLR-2383.patch, SOLR-2383.patch, SOLR-2383.patch, > SOLR-2383.patch, SOLR-2383.patch > > > Velocity (/browse) GUI has hardcoded price range facet and a hardcoded > manufacturedate_dt date facet. Need general solution which work for any > facet.range and facet.date. -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Assigned] (SOLR-2383) Velocity: Generalize range and date facet display
[ https://issues.apache.org/jira/browse/SOLR-2383?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Grant Ingersoll reassigned SOLR-2383: - Assignee: Grant Ingersoll > Velocity: Generalize range and date facet display > - > > Key: SOLR-2383 > URL: https://issues.apache.org/jira/browse/SOLR-2383 > Project: Solr > Issue Type: Bug > Components: Response Writers >Reporter: Jan Høydahl >Assignee: Grant Ingersoll > Labels: facet, range, velocity > Fix For: 4.0 > > Attachments: SOLR-2383.patch, SOLR-2383.patch, SOLR-2383.patch, > SOLR-2383.patch, SOLR-2383.patch > > > Velocity (/browse) GUI has hardcoded price range facet and a hardcoded > manufacturedate_dt date facet. Need general solution which work for any > facet.range and facet.date. -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Updated] (SOLR-2502) Add in Examples/Documentation on the new Join functionality
[ https://issues.apache.org/jira/browse/SOLR-2502?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Grant Ingersoll updated SOLR-2502: -- Attachment: SOLR-2502.patch Here's what I have in mind: Add in manufacturers documents and then tie the various products we have back to the manufacturers. A query then might look like: http://localhost:8983/solr/select/?q={!join%20from=manu_id_s%20to=id}name:ipod > Add in Examples/Documentation on the new Join functionality > --- > > Key: SOLR-2502 > URL: https://issues.apache.org/jira/browse/SOLR-2502 > Project: Solr > Issue Type: Improvement >Reporter: Grant Ingersoll >Priority: Minor > Attachments: SOLR-2502.patch > > > As the title says, add in an example and docs on the new Join functionality > added via SOLR-2272. -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
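A minimal sketch of how a client might build and URL-encode the join query shown in the comment above. The helper name join_query_url is hypothetical; the host, handler path, and field names come from the example query. Python's standard library is used only for illustration.

```python
from urllib.parse import urlencode, urlsplit, parse_qs


def join_query_url(base, from_field, to_field, inner_query):
    """Build a select URL whose q parameter is a {!join ...} local-params query."""
    # Local-params prefix followed by the inner query, as in the comment above.
    q = "{!join from=%s to=%s}%s" % (from_field, to_field, inner_query)
    # urlencode escapes the braces, '!', '=' and ':' and turns spaces into '+'.
    return base + "?" + urlencode({"q": q})


url = join_query_url("http://localhost:8983/solr/select/", "manu_id_s", "id", "name:ipod")

# Decoding the query string gives back the original join expression.
decoded_q = parse_qs(urlsplit(url).query)["q"][0]
```

Encoding through urlencode avoids hand-escaping the spaces as %20, which is easy to get wrong when the inner query grows.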
[jira] [Commented] (SOLR-2502) Add in Examples/Documentation on the new Join functionality
[ https://issues.apache.org/jira/browse/SOLR-2502?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13030119#comment-13030119 ] Grant Ingersoll commented on SOLR-2502: --- Docs are started at: http://wiki.apache.org/solr/Join > Add in Examples/Documentation on the new Join functionality > --- > > Key: SOLR-2502 > URL: https://issues.apache.org/jira/browse/SOLR-2502 > Project: Solr > Issue Type: Improvement >Reporter: Grant Ingersoll >Priority: Minor > > As the title says, add in an example and docs on the new Join functionality > added via SOLR-2272. -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Created] (SOLR-2502) Add in Examples/Documentation on the new Join functionality
Add in Examples/Documentation on the new Join functionality --- Key: SOLR-2502 URL: https://issues.apache.org/jira/browse/SOLR-2502 Project: Solr Issue Type: Improvement Reporter: Grant Ingersoll Priority: Minor As the title says, add in an example and docs on the new Join functionality added via SOLR-2272. -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Commented] (SOLR-2272) Join
[ https://issues.apache.org/jira/browse/SOLR-2272?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13025574#comment-13025574 ] Grant Ingersoll commented on SOLR-2272: --- I'm not saying it was right to revert, etc., but I do believe both Yonik and Robert had technical reasons for what they did, even if the solution they arrived at was too drastic. > Join > > > Key: SOLR-2272 > URL: https://issues.apache.org/jira/browse/SOLR-2272 > Project: Solr > Issue Type: New Feature > Components: search >Reporter: Yonik Seeley > Fix For: 4.0 > > Attachments: SOLR-2272.patch, SOLR-2272.patch, SOLR-2272.patch > > > Limited join functionality for Solr, mapping one set of IDs matching a query > to another set of IDs, based on the indexed tokens of the fields. > Example: > fq={!join from=parent_ptr to=parent_id}child_doc:query -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Commented] (SOLR-2272) Join
[ https://issues.apache.org/jira/browse/SOLR-2272?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13025559#comment-13025559 ] Grant Ingersoll commented on SOLR-2272: --- There are technical reasons, and they aren't necessarily bullshit, it's just that not everyone agrees on them. If you would like, we can link all the back issues. As Yonik has pointed out many times, factoring these things out makes it harder on some parts of the code, while others have pointed out it makes it better for other parts. While I believe it is a net positive despite the downsides, it isn't always cut and dried. As for the private conversations, not all of us are privy to them, so how does the PMC shut them down? Besides, people who live in glass houses shouldn't throw stones. > Join > > > Key: SOLR-2272 > URL: https://issues.apache.org/jira/browse/SOLR-2272 > Project: Solr > Issue Type: New Feature > Components: search >Reporter: Yonik Seeley > Fix For: 4.0 > > Attachments: SOLR-2272.patch, SOLR-2272.patch, SOLR-2272.patch > > > Limited join functionality for Solr, mapping one set of IDs matching a query > to another set of IDs, based on the indexed tokens of the fields. > Example: > fq={!join from=parent_ptr to=parent_id}child_doc:query -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
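To make the "mapping one set of IDs matching a query to another set of IDs" wording in the issue description concrete, here is a rough in-memory sketch. It is not Solr's implementation: the join function and the sample documents are invented for illustration, fields are treated as single-valued, and only the field names (parent_ptr, parent_id, child_doc) come from the description's example.

```python
def join(docs, from_field, to_field, matches):
    """Return docs whose to_field equals the from_field of some doc accepted by `matches`."""
    # Step 1: run the inner query and collect the from_field values of its hits.
    keys = {doc[from_field] for doc in docs if from_field in doc and matches(doc)}
    # Step 2: return every document whose to_field is one of those values.
    return [doc for doc in docs if doc.get(to_field) in keys]


# Invented sample data: two parents and two children pointing at them.
DOCS = [
    {"parent_id": "A", "title": "parent A"},
    {"parent_id": "B", "title": "parent B"},
    {"parent_ptr": "A", "child_doc": "query"},
    {"parent_ptr": "B", "child_doc": "other"},
]

# Roughly what {!join from=parent_ptr to=parent_id}child_doc:query expresses.
result = join(DOCS, "parent_ptr", "parent_id", lambda d: d.get("child_doc") == "query")
```

The result holds the parent documents, not the children that matched the inner query, which is the part of join semantics that tends to surprise people.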
[jira] [Commented] (LUCENE-2949) FastVectorHighlighter FieldTermStack could likely benefit from using TermVectorMapper
[ https://issues.apache.org/jira/browse/LUCENE-2949?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13025478#comment-13025478 ] Grant Ingersoll commented on LUCENE-2949: - I haven't looked at the patch yet, but the setExpectations is basically there in case you wish to pre allocate any structures. > FastVectorHighlighter FieldTermStack could likely benefit from using > TermVectorMapper > - > > Key: LUCENE-2949 > URL: https://issues.apache.org/jira/browse/LUCENE-2949 > Project: Lucene - Java > Issue Type: Improvement >Affects Versions: 3.0.3, 4.0 >Reporter: Grant Ingersoll >Assignee: Koji Sekiguchi >Priority: Minor > Labels: FastVectorHighlighter, Highlighter > Fix For: 3.2, 4.0 > > Attachments: LUCENE-2949.patch > > > Based on my reading of the FieldTermStack constructor that loads the vector > from disk, we could probably save a bunch of time and memory by using the > TermVectorMapper callback mechanism instead of materializing the full array > of terms into memory and then throwing most of them out. -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
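A rough sketch of the callback idea discussed in this issue: instead of materializing every term and discarding most of them, a mapper is handed terms one at a time and keeps only what it needs. The class and function names below are hypothetical Python stand-ins and do not reproduce Lucene's TermVectorMapper signatures. The set_expectations hook mirrors the comment's point that it exists so structures can be pre-sized.

```python
class QueryTermMapper:
    """Keeps only the terms of interest as they are streamed in."""

    def __init__(self, wanted_terms):
        self.wanted = set(wanted_terms)
        self.kept = []
        self.expected_terms = 0

    def set_expectations(self, field, num_terms):
        # Called once before any term arrives; a place to pre-allocate.
        self.field = field
        self.expected_terms = num_terms

    def map(self, term, positions):
        # Called once per term; unwanted terms are never stored.
        if term in self.wanted:
            self.kept.append((term, positions))


def stream_term_vector(vector, mapper, field="body"):
    """Feed a (term, positions) sequence to the mapper, one entry at a time."""
    mapper.set_expectations(field, len(vector))
    for term, positions in vector:
        mapper.map(term, positions)


# Invented sample vector for illustration.
vector = [("apache", [0]), ("lucene", [1, 5]), ("solr", [2])]
mapper = QueryTermMapper({"lucene"})
stream_term_vector(vector, mapper)
```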
[jira] [Updated] (SOLR-445) Update Handlers abort with bad documents
[ https://issues.apache.org/jira/browse/SOLR-445?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Grant Ingersoll updated SOLR-445: - Summary: Update Handlers abort with bad documents (was: XmlUpdateRequestHandler bad documents mid batch aborts rest of batch) We should fix this for all the update handlers. > Update Handlers abort with bad documents > > > Key: SOLR-445 > URL: https://issues.apache.org/jira/browse/SOLR-445 > Project: Solr > Issue Type: Bug > Components: update >Affects Versions: 1.3 >Reporter: Will Johnson >Assignee: Grant Ingersoll > Fix For: Next > > Attachments: SOLR-445-3_x.patch, SOLR-445.patch, SOLR-445.patch, > SOLR-445.patch, SOLR-445.patch, SOLR-445_3x.patch, solr-445.xml > > > Has anyone run into the problem of handling bad documents / failures mid > batch. Ie: > > > 1 > > > 2 > I_AM_A_BAD_DATE > > > 3 > > > Right now solr adds the first doc and then aborts. It would seem like it > should either fail the entire batch or log a message/return a code and then > continue on to add doc 3. Option 1 would seem to be much harder to > accomplish and possibly require more memory while Option 2 would require more > information to come back from the API. I'm about to dig into this but I > thought I'd ask to see if anyone had any suggestions, thoughts or comments. > -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Commented] (SOLR-445) XmlUpdateRequestHandler bad documents mid batch aborts rest of batch
[ https://issues.apache.org/jira/browse/SOLR-445?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13024806#comment-13024806 ] Grant Ingersoll commented on SOLR-445: -- What about addressing the other handlers? Any progress on that? > XmlUpdateRequestHandler bad documents mid batch aborts rest of batch > > > Key: SOLR-445 > URL: https://issues.apache.org/jira/browse/SOLR-445 > Project: Solr > Issue Type: Bug > Components: update >Affects Versions: 1.3 >Reporter: Will Johnson >Assignee: Grant Ingersoll > Fix For: Next > > Attachments: SOLR-445-3_x.patch, SOLR-445.patch, SOLR-445.patch, > SOLR-445.patch, SOLR-445.patch, SOLR-445_3x.patch, solr-445.xml > > > Has anyone run into the problem of handling bad documents / failures mid > batch. Ie: > > > 1 > > > 2 > I_AM_A_BAD_DATE > > > 3 > > > Right now solr adds the first doc and then aborts. It would seem like it > should either fail the entire batch or log a message/return a code and then > continue on to add doc 3. Option 1 would seem to be much harder to > accomplish and possibly require more memory while Option 2 would require more > information to come back from the API. I'm about to dig into this but I > thought I'd ask to see if anyone had any suggestions, thoughts or comments. > -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
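The second option in the description (log or return an error, then continue with the remaining documents) could look roughly like the sketch below. This is not the attached patch: add_batch, the list standing in for the index, and the date format are assumptions made for illustration. Only the three document ids and the I_AM_A_BAD_DATE value come from the report.

```python
from datetime import datetime

DATE_FORMAT = "%Y-%m-%dT%H:%M:%SZ"  # assumed date syntax for this sketch


def add_batch(docs, index):
    """Add each valid doc to `index`; collect (id, message) for the bad ones."""
    errors = []
    for doc in docs:
        try:
            # Validate the optional date field before accepting the document.
            if "date" in doc:
                datetime.strptime(doc["date"], DATE_FORMAT)
        except ValueError as exc:
            # Record the failure and move on instead of aborting the batch.
            errors.append((doc["id"], str(exc)))
            continue
        index.append(doc)
    return errors


index = []
batch = [{"id": "1"}, {"id": "2", "date": "I_AM_A_BAD_DATE"}, {"id": "3"}]
errors = add_batch(batch, index)
```

Returning the error list is what the description means by needing "more information to come back from the API": the caller has to learn which documents were skipped.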
[jira] [Resolved] (LUCENE-2952) Make license checking/maintenance easier/automated
[ https://issues.apache.org/jira/browse/LUCENE-2952?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Grant Ingersoll resolved LUCENE-2952. - Resolution: Fixed > Make license checking/maintenance easier/automated > -- > > Key: LUCENE-2952 > URL: https://issues.apache.org/jira/browse/LUCENE-2952 > Project: Lucene - Java > Issue Type: Improvement >Reporter: Grant Ingersoll >Assignee: Grant Ingersoll >Priority: Minor > Fix For: 3.2, 4.0 > > Attachments: LUCENE-2952.patch, LUCENE-2952.patch, LUCENE-2952.patch, > LUCENE-2952.patch, LUCENE-2952.patch, LUCENE-2952.patch, LUCENE-2952.patch > > > Instead of waiting until release to check licenses are valid, we should make > it a part of our build process to ensure that all dependencies have proper > licenses, etc. -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Created] (SOLR-2465) QueryElevationComponent should be reloadable w/o commit
QueryElevationComponent should be reloadable w/o commit --- Key: SOLR-2465 URL: https://issues.apache.org/jira/browse/SOLR-2465 Project: Solr Issue Type: Improvement Reporter: Grant Ingersoll Priority: Minor It would be helpful if you could reload the elevation rules without having to do a commit and reloading the core. -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Commented] (LUCENE-2939) Highlighter should try and use maxDocCharsToAnalyze in WeightedSpanTermExtractor when adding a new field to MemoryIndex as well as when using CachingTokenStream
[ https://issues.apache.org/jira/browse/LUCENE-2939?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13018319#comment-13018319 ] Grant Ingersoll commented on LUCENE-2939: - Mark, Seems like we can move forward with this now that the release is out. Do you have time or do you want me to take it? > Highlighter should try and use maxDocCharsToAnalyze in > WeightedSpanTermExtractor when adding a new field to MemoryIndex as well as > when using CachingTokenStream > > > Key: LUCENE-2939 > URL: https://issues.apache.org/jira/browse/LUCENE-2939 > Project: Lucene - Java > Issue Type: Bug > Components: contrib/highlighter >Reporter: Mark Miller >Assignee: Mark Miller >Priority: Minor > Fix For: 3.1.1, 3.2, 4.0 > > Attachments: LUCENE-2939.patch, LUCENE-2939.patch, LUCENE-2939.patch, > LUCENE-2939.patch > > > huge documents can be drastically slower than need be because the entire > field is added to the memory index > this cost can be greatly reduced in many cases if we try and respect > maxDocCharsToAnalyze > things can be improved even further by respecting this setting with > CachingTokenStream -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
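As a sketch of what respecting a character limit means here: tokenization stops as soon as a token would extend past the limit, so the rest of a huge field is never analyzed or added to the in-memory index. The function name and the regex tokenizer are hypothetical stand-ins, not the code in the attached patches. Only the setting's name (maxDocCharsToAnalyze) comes from the issue.

```python
import re


def tokens_up_to(text, max_chars):
    """Yield (token, start, end) until a token would end past max_chars."""
    for match in re.finditer(r"\w+", text):
        if match.end() > max_chars:
            # Everything beyond the limit is skipped without being analyzed.
            break
        yield match.group(), match.start(), match.end()


limited = list(tokens_up_to("huge documents can be slow", 14))
```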
[jira] [Commented] (SOLR-2155) Geospatial search using geohash prefixes
[ https://issues.apache.org/jira/browse/SOLR-2155?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13015012#comment-13015012 ] Grant Ingersoll commented on SOLR-2155: --- If the intent is to bring the "lucene-spatial-playground" into the ASF, why not just start a branch? It will make provenance so much easier. > Geospatial search using geohash prefixes > > > Key: SOLR-2155 > URL: https://issues.apache.org/jira/browse/SOLR-2155 > Project: Solr > Issue Type: Improvement >Reporter: David Smiley >Assignee: Grant Ingersoll > Attachments: GeoHashPrefixFilter.patch, GeoHashPrefixFilter.patch, > GeoHashPrefixFilter.patch, > SOLR-2155_GeoHashPrefixFilter_with_sorting_no_poly.patch, SOLR.2155.p3.patch, > SOLR.2155.p3tests.patch > > > There currently isn't a solution in Solr for doing geospatial filtering on > documents that have a variable number of points. This scenario occurs when > there is location extraction (i.e. via a "gazetteer") occurring on free text. > None, one, or many geospatial locations might be extracted from any given > document and users want to limit their search results to those occurring in a > user-specified area. > I've implemented this by furthering the GeoHash based work in Lucene/Solr > with a geohash prefix based filter. A geohash refers to a lat-lon box on the > earth. Each successive character added further subdivides the box into a 4x8 > (or 8x4 depending on the even/odd length of the geohash) grid. The first > step in this scheme is figuring out which geohash grid squares cover the > user's search query. I've added various extra methods to GeoHashUtils (and > added tests) to assist in this purpose. The next step is an actual Lucene > Filter, GeoHashPrefixFilter, that uses these geohash prefixes in > TermsEnum.seek() to skip to relevant grid squares in the index. Once a > matching geohash grid is found, the points therein are compared against the > user's query to see if they match. I created an abstraction GeoShape > extended by subclasses named PointDistance... and CartesianBox to support > different queried shapes so that the filter need not care about these details. > This work was presented at LuceneRevolution in Boston on October 8th. -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
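The prefix property the description relies on can be illustrated with a small standalone geohash encoder. This is the standard public geohash algorithm written out as a sketch, not the GeoHashUtils or GeoHashPrefixFilter code from the patches. The function names and the sample coordinates are chosen only for illustration.

```python
BASE32 = "0123456789bcdefghjkmnpqrstuvwxyz"


def geohash(lat, lon, length):
    """Encode a lat/lon point as a geohash string of the given length."""
    lat_lo, lat_hi = -90.0, 90.0
    lon_lo, lon_hi = -180.0, 180.0
    chars = []
    bits, value, use_lon = 0, 0, True
    while len(chars) < length:
        # Bits alternate between longitude and latitude, halving the box each time.
        if use_lon:
            mid = (lon_lo + lon_hi) / 2
            if lon >= mid:
                value, lon_lo = value * 2 + 1, mid
            else:
                value, lon_hi = value * 2, mid
        else:
            mid = (lat_lo + lat_hi) / 2
            if lat >= mid:
                value, lat_lo = value * 2 + 1, mid
            else:
                value, lat_hi = value * 2, mid
        use_lon = not use_lon
        bits += 1
        if bits == 5:
            # Five bits make one base-32 character, i.e. a 32-cell subdivision.
            chars.append(BASE32[value])
            bits, value = 0, 0
    return "".join(chars)


def covered_by(point_hash, cell_prefixes):
    """True if the point's geohash falls inside any of the covering grid cells."""
    return any(point_hash.startswith(prefix) for prefix in cell_prefixes)


point = geohash(57.64911, 10.40744, 8)
```

Because a longer geohash always starts with the hash of its enclosing cell, a filter can seek to the covering prefixes in a sorted term dictionary and compare only the points stored under them, which is the skipping behavior the description attributes to TermsEnum.seek().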
[jira] [Updated] (LUCENE-3006) Javadocs warnings should fail the build
[ https://issues.apache.org/jira/browse/LUCENE-3006?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Grant Ingersoll updated LUCENE-3006: Attachment: LUCENE-3006.patch Here's the patch I just committed. > Javadocs warnings should fail the build > --- > > Key: LUCENE-3006 > URL: https://issues.apache.org/jira/browse/LUCENE-3006 > Project: Lucene - Java > Issue Type: Improvement >Affects Versions: 3.2, 4.0 >Reporter: Grant Ingersoll > Attachments: LUCENE-3006-javadoc-warning-cleanup.patch, > LUCENE-3006.patch, LUCENE-3006.patch, LUCENE-3006.patch > > > We should fail the build when there are javadocs warnings, as this should not > be the Release Manager's job to fix all at once right before the release. > See > http://www.lucidimagination.com/search/document/14bd01e519f39aff/brainstorming_on_improving_the_release_process -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Commented] (LUCENE-2981) Review and potentially remove unused/unsupported Contribs
[ https://issues.apache.org/jira/browse/LUCENE-2981?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13014060#comment-13014060 ] Grant Ingersoll commented on LUCENE-2981: - +1 for 4.0 I'm fine w/ 3.2, too, FWIW. I can't remember the last time someone submitted a patch or even reported a bug on any of these or even asked about them on user@. > Review and potentially remove unused/unsupported Contribs > - > > Key: LUCENE-2981 > URL: https://issues.apache.org/jira/browse/LUCENE-2981 > Project: Lucene - Java > Issue Type: Improvement >Reporter: Grant Ingersoll > Fix For: 3.2, 4.0 > > Attachments: LUCENE-2981.patch > > > Some of our contribs appear to be lacking for development/support or are > missing tests. We should review whether they are even pertinent these days > and potentially deprecate and remove them. > One of the things we did in Mahout when bringing in Colt code was to mark all > code that didn't have tests as @deprecated and then we removed the > deprecation once tests were added. Those that didn't get tests added over > about a 6 mos. period of time were removed. > I would suggest taking a hard look at: > ant > db > lucli > swing > (spatial should be gutted to some extent and moved to modules) -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Commented] (LUCENE-3006) Javadocs warnings should fail the build
[ https://issues.apache.org/jira/browse/LUCENE-3006?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13013678#comment-13013678 ] Grant Ingersoll commented on LUCENE-3006: - I think this is ready to go, but still want to test to make sure contribs are building correctly. Also, committing it will break the build, as we have warnings on trunk! > Javadocs warnings should fail the build > --- > > Key: LUCENE-3006 > URL: https://issues.apache.org/jira/browse/LUCENE-3006 > Project: Lucene - Java > Issue Type: Improvement >Affects Versions: 3.2, 4.0 >Reporter: Grant Ingersoll > Attachments: LUCENE-3006.patch, LUCENE-3006.patch > > > We should fail the build when there are javadocs warnings, as this should not > be the Release Manager's job to fix all at once right before the release. > See > http://www.lucidimagination.com/search/document/14bd01e519f39aff/brainstorming_on_improving_the_release_process -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Updated] (LUCENE-3006) Javadocs warnings should fail the build
[ https://issues.apache.org/jira/browse/LUCENE-3006?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Grant Ingersoll updated LUCENE-3006: Attachment: LUCENE-3006.patch Adds a property to only fail if failonjavadocwarning is true (which is the default setting). > Javadocs warnings should fail the build > --- > > Key: LUCENE-3006 > URL: https://issues.apache.org/jira/browse/LUCENE-3006 > Project: Lucene - Java > Issue Type: Improvement >Affects Versions: 3.2, 4.0 >Reporter: Grant Ingersoll > Attachments: LUCENE-3006.patch, LUCENE-3006.patch > > > We should fail the build when there are javadocs warnings, as this should not > be the Release Manager's job to fix all at once right before the release. > See > http://www.lucidimagination.com/search/document/14bd01e519f39aff/brainstorming_on_improving_the_release_process -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Updated] (LUCENE-3006) Javadocs warnings should fail the build
[ https://issues.apache.org/jira/browse/LUCENE-3006?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Grant Ingersoll updated LUCENE-3006: Attachment: LUCENE-3006.patch Adds a build hook into javadoc. It doesn't fix the existing warnings, which still need to be fixed. > Javadocs warnings should fail the build > --- > > Key: LUCENE-3006 > URL: https://issues.apache.org/jira/browse/LUCENE-3006 > Project: Lucene - Java > Issue Type: Improvement >Affects Versions: 3.2, 4.0 >Reporter: Grant Ingersoll > Attachments: LUCENE-3006.patch > > > We should fail the build when there are javadocs warnings, as this should not > be the Release Manager's job to fix all at once right before the release. > See > http://www.lucidimagination.com/search/document/14bd01e519f39aff/brainstorming_on_improving_the_release_process -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Created] (LUCENE-3006) Javadocs warnings should fail the build
Javadocs warnings should fail the build --- Key: LUCENE-3006 URL: https://issues.apache.org/jira/browse/LUCENE-3006 Project: Lucene - Java Issue Type: Improvement Affects Versions: 3.2, 4.0 Reporter: Grant Ingersoll We should fail the build when there are javadocs warnings, as this should not be the Release Manager's job to fix all at once right before the release. See http://www.lucidimagination.com/search/document/14bd01e519f39aff/brainstorming_on_improving_the_release_process -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Created] (LUCENE-3005) Define Test Plan for 4.0
Define Test Plan for 4.0 Key: LUCENE-3005 URL: https://issues.apache.org/jira/browse/LUCENE-3005 Project: Lucene - Java Issue Type: Test Affects Versions: 4.0 Reporter: Grant Ingersoll Priority: Blocker Before we can release, we need a test plan that defines what a successful release candidate must do to be accepted. Test plan should be written at http://wiki.apache.org/lucene-java/TestPlans See http://www.lucidimagination.com/search/document/14bd01e519f39aff/brainstorming_on_improving_the_release_process -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Created] (LUCENE-3004) Define Test Plan for 3.2
Define Test Plan for 3.2 Key: LUCENE-3004 URL: https://issues.apache.org/jira/browse/LUCENE-3004 Project: Lucene - Java Issue Type: Test Reporter: Grant Ingersoll Priority: Blocker Before we can release, we need a test plan that defines what a successful release candidate must do to be accepted. Test plan should be written at http://wiki.apache.org/lucene-java/TestPlans See http://www.lucidimagination.com/search/document/14bd01e519f39aff/brainstorming_on_improving_the_release_process -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Commented] (LUCENE-3000) Lucene release artifacts should be named apache-lucene-*
[ https://issues.apache.org/jira/browse/LUCENE-3000?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13013519#comment-13013519 ] Grant Ingersoll commented on LUCENE-3000: - Yeah, I don't think we need to name the jars as such, but the containers (tarballs, etc.) and the directory they are unpackaged into should be. > Lucene release artifacts should be named apache-lucene-* > > > Key: LUCENE-3000 > URL: https://issues.apache.org/jira/browse/LUCENE-3000 > Project: Lucene - Java > Issue Type: Bug >Affects Versions: 4.0 >Reporter: Grant Ingersoll >Priority: Blocker > Fix For: 4.0 > > > Our artifact names should be prefixed with apache-, as in > apache-lucene-4.0-src.tar.gz (or whatever) -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Commented] (LUCENE-2995) factor out a shared spellchecking module
[ https://issues.apache.org/jira/browse/LUCENE-2995?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13012609#comment-13012609 ] Grant Ingersoll commented on LUCENE-2995: - bq. if that's in the scope for what you guys have in mind for this module, go ahead. It's in the back of my head. I've got Mahout collab. filtering hooked up through Solr already and it would be dead simple to bring in here, too, but it would fit nicely in this framework. For instance, given a set of search results, it can go do Item-Item recommendations based on doc-ids. bq. suggest +1. Simple, to the point and has room to grow. > factor out a shared spellchecking module > > > Key: LUCENE-2995 > URL: https://issues.apache.org/jira/browse/LUCENE-2995 > Project: Lucene - Java > Issue Type: Task >Reporter: Robert Muir > Fix For: 4.0 > > Attachments: LUCENE-2995.patch > > > In lucene's contrib we have spellchecking support (index-based spellchecker, > directspellchecker, etc). > we also have some things like pluggable comparators. > In solr we have auto-suggest support (with two implementations it looks > like), some good utilities like HighFrequencyDictionary, etc. > I think spellchecking is really important... google has upped the ante to > what users expect. > So I propose we combine all this stuff into a shared modules/spellchecker, > which will make it easier > to refactor and improve the quality. -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Created] (LUCENE-3000) Lucene release artifacts should be named apache-lucene-*
Lucene release artifacts should be named apache-lucene-* Key: LUCENE-3000 URL: https://issues.apache.org/jira/browse/LUCENE-3000 Project: Lucene - Java Issue Type: Bug Affects Versions: 4.0 Reporter: Grant Ingersoll Priority: Blocker Fix For: 4.0 Our artifact names should be prefixed with apache-, as in apache-lucene-4.0-src.tar.gz (or whatever) -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Commented] (LUCENE-2995) factor out a shared spellchecking module
[ https://issues.apache.org/jira/browse/LUCENE-2995?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13012524#comment-13012524 ] Grant Ingersoll commented on LUCENE-2995: - yeah, I like suggestions or suggester > factor out a shared spellchecking module > > > Key: LUCENE-2995 > URL: https://issues.apache.org/jira/browse/LUCENE-2995 > Project: Lucene - Java > Issue Type: Task >Reporter: Robert Muir > Fix For: 4.0 > > Attachments: LUCENE-2995.patch > > > In lucene's contrib we have spellchecking support (index-based spellchecker, > directspellchecker, etc). > we also have some things like pluggable comparators. > In solr we have auto-suggest support (with two implementations it looks > like), some good utilities like HighFrequencyDictionary, etc. > I think spellchecking is really important... google has upped the ante to > what users expect. > So I propose we combine all this stuff into a shared modules/spellchecker, > which will make it easier > to refactor and improve the quality. -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Created] (LUCENE-2999) Modules need release packaging
Modules need release packaging -- Key: LUCENE-2999 URL: https://issues.apache.org/jira/browse/LUCENE-2999 Project: Lucene - Java Issue Type: Bug Affects Versions: 4.0 Reporter: Grant Ingersoll Priority: Blocker Fix For: 4.0 There are no release packaging targets for the Modules area. They should be similar to the packaging targets for Lucene and Solr.
[jira] [Resolved] (LUCENE-2998) Forward Port 3.1 Ant release tasks
[ https://issues.apache.org/jira/browse/LUCENE-2998?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Grant Ingersoll resolved LUCENE-2998. - Resolution: Fixed > Forward Port 3.1 Ant release tasks > -- > > Key: LUCENE-2998 > URL: https://issues.apache.org/jira/browse/LUCENE-2998 > Project: Lucene - Java > Issue Type: Improvement >Affects Versions: 3.2, 4.0 >Reporter: Grant Ingersoll >Assignee: Grant Ingersoll >Priority: Blocker > Fix For: 3.2, 4.0 > > > In order to get the release candidate built, I put some changes on 3.1 > build.xml and haven't forward ported them to 3.2 and 4.0. > They revolve around the prepare-release and stage targets in both Lucene and > Solr
[jira] [Assigned] (LUCENE-2998) Forward Port 3.1 Ant release tasks
[ https://issues.apache.org/jira/browse/LUCENE-2998?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Grant Ingersoll reassigned LUCENE-2998: --- Assignee: Grant Ingersoll > Forward Port 3.1 Ant release tasks > -- > > Key: LUCENE-2998 > URL: https://issues.apache.org/jira/browse/LUCENE-2998 > Project: Lucene - Java > Issue Type: Improvement >Affects Versions: 3.2, 4.0 >Reporter: Grant Ingersoll >Assignee: Grant Ingersoll >Priority: Blocker > Fix For: 3.2, 4.0 > > > In order to get the release candidate built, I put some changes on 3.1 > build.xml and haven't forward ported them to 3.2 and 4.0. > They revolve around the prepare-release and stage targets in both Lucene and > Solr
[jira] [Created] (LUCENE-2998) Forward Port 3.1 Ant release tasks
Forward Port 3.1 Ant release tasks -- Key: LUCENE-2998 URL: https://issues.apache.org/jira/browse/LUCENE-2998 Project: Lucene - Java Issue Type: Improvement Affects Versions: 3.2, 4.0 Reporter: Grant Ingersoll Priority: Blocker Fix For: 3.2, 4.0 In order to get the release candidate built, I put some changes on 3.1 build.xml and haven't forward ported them to 3.2 and 4.0. They revolve around the prepare-release and stage targets in both Lucene and Solr
[jira] [Commented] (LUCENE-2995) factor out a shared spellchecking module
[ https://issues.apache.org/jira/browse/LUCENE-2995?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13012461#comment-13012461 ] Grant Ingersoll commented on LUCENE-2995: - See also SOLR-2080. Spell checking, suggestions and related searches are all part of what I would call a Suggester framework or a Discovery framework. Doesn't need to be done here, but I do think it's easy to have a common API for all of these "suggestions", especially if we can factor in user feedback into them, as right now, we only solve 1/2 of the problem. > factor out a shared spellchecking module > > > Key: LUCENE-2995 > URL: https://issues.apache.org/jira/browse/LUCENE-2995 > Project: Lucene - Java > Issue Type: Task >Reporter: Robert Muir > Fix For: 4.0 > > Attachments: LUCENE-2995.patch > > > In lucene's contrib we have spellchecking support (index-based spellchecker, > directspellchecker, etc). > we also have some things like pluggable comparators. > In solr we have auto-suggest support (with two implementations it looks > like), some good utilities like HighFrequencyDictionary, etc. > I think spellchecking is really important... google has upped the ante to > what users expect. > So I propose we combine all this stuff into a shared modules/spellchecker, > which will make it easier > to refactor and improve the quality.
[jira] [Commented] (SOLR-2155) Geospatial search using geohash prefixes
[ https://issues.apache.org/jira/browse/SOLR-2155?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13011615#comment-13011615 ] Grant Ingersoll commented on SOLR-2155: --- Yeah, I agree. I haven't looked at the patch yet. It was my understanding that Chris Male was going to move lucene/contrib/spatial to modules and gut the broken stuff in it. I think there is a separate issue open for that one. Presumably, once spatial and function queries are moved to modules, then we will have a properly working spatial package. I obviously can move it, but I don't have time to do the gutting (we really should have deprecated the tier stuff for this release). > Geospatial search using geohash prefixes > > > Key: SOLR-2155 > URL: https://issues.apache.org/jira/browse/SOLR-2155 > Project: Solr > Issue Type: Improvement >Reporter: David Smiley >Assignee: Grant Ingersoll > Attachments: GeoHashPrefixFilter.patch, GeoHashPrefixFilter.patch, > GeoHashPrefixFilter.patch, SOLR.2155.p3.patch, SOLR.2155.p3tests.patch > > > There currently isn't a solution in Solr for doing geospatial filtering on > documents that have a variable number of points. This scenario occurs when > there is location extraction (i.e. via a "gazetteer") occurring on free text. > None, one, or many geospatial locations might be extracted from any given > document and users want to limit their search results to those occurring in a > user-specified area. > I've implemented this by furthering the GeoHash based work in Lucene/Solr > with a geohash prefix based filter. A geohash refers to a lat-lon box on the > earth. Each successive character added further subdivides the box into a 4x8 > (or 8x4 depending on the even/odd length of the geohash) grid. The first > step in this scheme is figuring out which geohash grid squares cover the > user's search query. I've added various extra methods to GeoHashUtils (and > added tests) to assist in this purpose. 
The next step is an actual Lucene > Filter, GeoHashPrefixFilter, that uses these geohash prefixes in > TermsEnum.seek() to skip to relevant grid squares in the index. Once a > matching geohash grid is found, the points therein are compared against the > user's query to see if it matches. I created an abstraction GeoShape > extended by subclasses named PointDistance... and CartesianBox to support > different queried shapes so that the filter need not care about these details. > This work was presented at LuceneRevolution in Boston on October 8th.
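The subdivision scheme described in the SOLR-2155 summary can be illustrated with a small standalone sketch. It is not taken from the patch and does not use GeoHashUtils; it assumes the conventional geohash encoding (base-32 alphabet, bits alternating longitude then latitude), and the function names geohash_encode and child_grid are hypothetical.

```python
# Conventional geohash base-32 alphabet (no "a", "i", "l", "o").
BASE32 = "0123456789bcdefghjkmnpqrstuvwxyz"


def geohash_encode(lat, lon, length):
    """Encode a point as a geohash of the given length.

    Each character carries 5 bits; bits alternate between longitude
    and latitude, starting with longitude.
    """
    lat_lo, lat_hi = -90.0, 90.0
    lon_lo, lon_hi = -180.0, 180.0
    chars = []
    use_lon = True   # the first bit always refines longitude
    bits = 0         # bits collected for the current character
    value = 0        # integer value of the current character
    while len(chars) < length:
        if use_lon:
            mid = (lon_lo + lon_hi) / 2
            if lon >= mid:
                value = (value << 1) | 1
                lon_lo = mid
            else:
                value = value << 1
                lon_hi = mid
        else:
            mid = (lat_lo + lat_hi) / 2
            if lat >= mid:
                value = (value << 1) | 1
                lat_lo = mid
            else:
                value = value << 1
                lat_hi = mid
        use_lon = not use_lon
        bits += 1
        if bits == 5:
            chars.append(BASE32[value])
            bits = 0
            value = 0
    return "".join(chars)


def child_grid(prefix):
    """Return (columns, rows) of the grid that one more character adds.

    After an even number of characters the next character starts on a
    longitude bit (3 lon bits, 2 lat bits: 8 columns by 4 rows);
    after an odd number it starts on a latitude bit (4 by 8).
    """
    return (8, 4) if len(prefix) % 2 == 0 else (4, 8)
```

Because every longer geohash of a point starts with its shorter geohash, a short prefix names a coarse box and all finer boxes inside it share that prefix, which is what makes prefix-based filtering possible.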
[jira] [Commented] (SOLR-236) Field collapsing
[ https://issues.apache.org/jira/browse/SOLR-236?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13011603#comment-13011603 ] Grant Ingersoll commented on SOLR-236: -- Keep in mind an alternative approach that scales, but loses some attributes of this patch (total groups for instance) is committed on trunk and will likely be backported to 3.2. > Field collapsing > > > Key: SOLR-236 > URL: https://issues.apache.org/jira/browse/SOLR-236 > Project: Solr > Issue Type: New Feature > Components: search >Affects Versions: 1.3 >Reporter: Emmanuel Keller >Assignee: Shalin Shekhar Mangar > Fix For: Next > > Attachments: DocSetScoreCollector.java, > NonAdjacentDocumentCollapser.java, NonAdjacentDocumentCollapserTest.java, > SOLR-236-1_4_1-NPEfix.patch, SOLR-236-1_4_1-paging-totals-working.patch, > SOLR-236-1_4_1.patch, SOLR-236-FieldCollapsing.patch, > SOLR-236-FieldCollapsing.patch, SOLR-236-FieldCollapsing.patch, > SOLR-236-branch_3x.patch, SOLR-236-distinctFacet.patch, SOLR-236-trunk.patch, > SOLR-236-trunk.patch, SOLR-236-trunk.patch, SOLR-236-trunk.patch, > SOLR-236-trunk.patch, SOLR-236.patch, SOLR-236.patch, SOLR-236.patch, > SOLR-236.patch, SOLR-236.patch, SOLR-236.patch, SOLR-236.patch, > SOLR-236.patch, SOLR-236_collapsing.patch, SOLR-236_collapsing.patch, > collapsing-patch-to-1.3.0-dieter.patch, collapsing-patch-to-1.3.0-ivan.patch, > collapsing-patch-to-1.3.0-ivan_2.patch, > collapsing-patch-to-1.3.0-ivan_3.patch, field-collapse-3.patch, > field-collapse-4-with-solrj.patch, field-collapse-5.patch, > field-collapse-5.patch, field-collapse-5.patch, field-collapse-5.patch, > field-collapse-5.patch, field-collapse-5.patch, field-collapse-5.patch, > field-collapse-5.patch, field-collapse-5.patch, field-collapse-5.patch, > field-collapse-5.patch, field-collapse-5.patch, field-collapse-5.patch, > field-collapse-5.patch, field-collapse-5.patch, > field-collapse-solr-236-2.patch, field-collapse-solr-236.patch, > 
field-collapsing-extended-592129.patch, field_collapsing_1.1.0.patch, > field_collapsing_1.3.patch, field_collapsing_dsteigerwald.diff, > field_collapsing_dsteigerwald.diff, field_collapsing_dsteigerwald.diff, > quasidistributed.additional.patch, solr-236.patch > > > This patch includes a new feature called "Field collapsing". > "Used in order to collapse a group of results with similar value for a given > field to a single entry in the result set. Site collapsing is a special case > of this, where all results for a given web site is collapsed into one or two > entries in the result set, typically with an associated "more documents from > this site" link. See also Duplicate detection." > http://www.fastsearch.com/glossary.aspx?m=48&amid=299 > The implementation adds 3 new query parameters (SolrParams): > "collapse.field" to choose the field used to group results > "collapse.type" normal (default value) or adjacent > "collapse.max" to select how many continuous results are allowed before > collapsing > TODO (in progress): > - More documentation (on source code) > - Test cases > Two patches: > - "field_collapsing.patch" for current development version > - "field_collapsing_1.1.0.patch" for Solr-1.1.0 > P.S.: Feedback and misspelling corrections are welcome ;-)
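As a rough illustration of the three parameters named in the SOLR-236 description, here is a minimal sketch of collapsing over an already ranked result list. It is not the patch's implementation. The function name collapse and the dictionary-based documents are hypothetical, and the reading of collapse.max (the number of results kept per field value, or per consecutive run in adjacent mode) is an assumption.

```python
def collapse(results, field, collapse_type="normal", collapse_max=1):
    """Collapse a ranked list of documents (dicts) on one field.

    collapse_type "normal":   keep at most collapse_max documents per
                              field value across the whole list.
    collapse_type "adjacent": keep at most collapse_max documents per
                              run of consecutive equal field values.
    """
    kept = []
    counts = {}             # value -> documents seen so far (normal mode)
    prev_value = object()   # sentinel that equals no real field value
    run = 0                 # length of the current run (adjacent mode)
    for doc in results:
        value = doc[field]
        if collapse_type == "adjacent":
            run = run + 1 if value == prev_value else 1
            prev_value = value
            seen = run
        else:
            counts[value] = counts.get(value, 0) + 1
            seen = counts[value]
        if seen <= collapse_max:
            kept.append(doc)
    return kept
```

With results whose "site" values are a, a, b, a (in rank order), normal mode with collapse_max=1 keeps the first a and the b, while adjacent mode also keeps the last a because it is not next to the earlier run.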
[jira] [Resolved] (LUCENE-2992) Changes.html is not generated for an svn export of docs
[ https://issues.apache.org/jira/browse/LUCENE-2992?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Grant Ingersoll resolved LUCENE-2992. - Resolution: Fixed Fix Version/s: 4.0 3.2 3.1 > Changes.html is not generated for an svn export of docs > --- > > Key: LUCENE-2992 > URL: https://issues.apache.org/jira/browse/LUCENE-2992 > Project: Lucene - Java > Issue Type: Bug >Affects Versions: 3.1, 3.2, 4.0 >Reporter: Grant Ingersoll >Assignee: Grant Ingersoll >Priority: Minor > Fix For: 3.1, 3.2, 4.0 > > Attachments: LUCENE-2992.patch > > > When we svn-export for release, the index.html at the top level expects > Changes.html in the docs, which is generated, so we should create it.
[jira] [Updated] (LUCENE-2992) Changes.html is not generated for an svn export of docs
[ https://issues.apache.org/jira/browse/LUCENE-2992?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Grant Ingersoll updated LUCENE-2992: Attachment: LUCENE-2992.patch generates the CHANGES.html as part of svn-export. > Changes.html is not generated for an svn export of docs > --- > > Key: LUCENE-2992 > URL: https://issues.apache.org/jira/browse/LUCENE-2992 > Project: Lucene - Java > Issue Type: Bug >Affects Versions: 3.1, 3.2, 4.0 >Reporter: Grant Ingersoll >Assignee: Grant Ingersoll >Priority: Minor > Attachments: LUCENE-2992.patch > > > When we svn-export for release, the index.html at the top level expects > Changes.html in the docs, which is generated, so we should create it.
[jira] [Assigned] (LUCENE-2992) Changes.html is not generated for an svn export of docs
[ https://issues.apache.org/jira/browse/LUCENE-2992?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Grant Ingersoll reassigned LUCENE-2992: --- Assignee: Grant Ingersoll > Changes.html is not generated for an svn export of docs > --- > > Key: LUCENE-2992 > URL: https://issues.apache.org/jira/browse/LUCENE-2992 > Project: Lucene - Java > Issue Type: Bug >Affects Versions: 3.1, 3.2, 4.0 >Reporter: Grant Ingersoll >Assignee: Grant Ingersoll >Priority: Minor > > When we svn-export for release, the index.html at the top level expects > Changes.html in the docs, which is generated, so we should create it.
[jira] [Created] (LUCENE-2992) Changes.html is not generated for an svn export of docs
Changes.html is not generated for an svn export of docs --- Key: LUCENE-2992 URL: https://issues.apache.org/jira/browse/LUCENE-2992 Project: Lucene - Java Issue Type: Bug Affects Versions: 3.1, 3.2, 4.0 Reporter: Grant Ingersoll Priority: Minor When we svn-export for release, the index.html at the top level expects Changes.html in the docs, which is generated, so we should create it.
[jira] [Assigned] (SOLR-2155) Geospatial search using geohash prefixes
[ https://issues.apache.org/jira/browse/SOLR-2155?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Grant Ingersoll reassigned SOLR-2155: - Assignee: Grant Ingersoll > Geospatial search using geohash prefixes > > > Key: SOLR-2155 > URL: https://issues.apache.org/jira/browse/SOLR-2155 > Project: Solr > Issue Type: Improvement >Reporter: David Smiley >Assignee: Grant Ingersoll > Attachments: GeoHashPrefixFilter.patch, GeoHashPrefixFilter.patch, > GeoHashPrefixFilter.patch, SOLR.2155.p3.patch, SOLR.2155.p3tests.patch > > > There currently isn't a solution in Solr for doing geospatial filtering on > documents that have a variable number of points. This scenario occurs when > there is location extraction (i.e. via a "gazetteer") occurring on free text. > None, one, or many geospatial locations might be extracted from any given > document and users want to limit their search results to those occurring in a > user-specified area. > I've implemented this by furthering the GeoHash based work in Lucene/Solr > with a geohash prefix based filter. A geohash refers to a lat-lon box on the > earth. Each successive character added further subdivides the box into a 4x8 > (or 8x4 depending on the even/odd length of the geohash) grid. The first > step in this scheme is figuring out which geohash grid squares cover the > user's search query. I've added various extra methods to GeoHashUtils (and > added tests) to assist in this purpose. The next step is an actual Lucene > Filter, GeoHashPrefixFilter, that uses these geohash prefixes in > TermsEnum.seek() to skip to relevant grid squares in the index. Once a > matching geohash grid is found, the points therein are compared against the > user's query to see if it matches. I created an abstraction GeoShape > extended by subclasses named PointDistance... and CartesianBox to support > different queried shapes so that the filter need not care about these details. 
> This work was presented at LuceneRevolution in Boston on October 8th.
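The filtering step in the SOLR-2155 description (seek to each covering prefix in the term index, then compare the points found there against the query shape) can be sketched without Lucene. In this sketch a sorted Python list stands in for the term dictionary and bisect_left stands in for TermsEnum.seek(). The names seek_prefix and matching_docs, the postings layout, and the shape_contains callback are all hypothetical, not the GeoHashPrefixFilter or GeoShape API.

```python
from bisect import bisect_left


def seek_prefix(terms, prefix):
    """Return every term in the sorted list that starts with prefix.

    bisect_left jumps straight to the first candidate, so terms that
    sort before the prefix are never visited.
    """
    i = bisect_left(terms, prefix)
    found = []
    while i < len(terms) and terms[i].startswith(prefix):
        found.append(terms[i])
        i += 1
    return found


def matching_docs(terms, postings, prefixes, shape_contains):
    """Collect ids of documents with at least one point inside the shape.

    terms:          sorted list of indexed geohash strings
    postings:       dict mapping geohash -> list of (doc_id, (lat, lon))
    prefixes:       geohash prefixes covering the query area
    shape_contains: callable taking (lat, lon) and returning a bool
    """
    docs = set()
    for prefix in prefixes:
        for term in seek_prefix(terms, prefix):
            for doc_id, point in postings[term]:
                if shape_contains(point):
                    docs.add(doc_id)
    return docs
```

Keeping the shape test behind a callback mirrors the point made in the description: the filter only needs a yes/no answer per point, so it does not have to know whether the query is a distance circle or a box.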
[jira] [Commented] (LUCENE-2952) Make license checking/maintenance easier/automated
[ https://issues.apache.org/jira/browse/LUCENE-2952?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13010866#comment-13010866 ] Grant Ingersoll commented on LUCENE-2952: - I'll fix it, Doron. > Make license checking/maintenance easier/automated > -- > > Key: LUCENE-2952 > URL: https://issues.apache.org/jira/browse/LUCENE-2952 > Project: Lucene - Java > Issue Type: Improvement >Reporter: Grant Ingersoll >Assignee: Grant Ingersoll >Priority: Minor > Fix For: 3.2, 4.0 > > Attachments: LUCENE-2952.patch, LUCENE-2952.patch, LUCENE-2952.patch, > LUCENE-2952.patch, LUCENE-2952.patch, LUCENE-2952.patch, LUCENE-2952.patch > > > Instead of waiting until release to check licenses are valid, we should make > it a part of our build process to ensure that all dependencies have proper > licenses, etc.
[jira] [Created] (LUCENE-2981) Review and potentially remove unused/unsupported Contribs
Review and potentially remove unused/unsupported Contribs - Key: LUCENE-2981 URL: https://issues.apache.org/jira/browse/LUCENE-2981 Project: Lucene - Java Issue Type: Improvement Reporter: Grant Ingersoll Fix For: 3.2, 4.0 Some of our contribs appear to be lacking in development/support or are missing tests. We should review whether they are even pertinent these days and potentially deprecate and remove them. One of the things we did in Mahout when bringing in Colt code was to mark all code that didn't have tests as @deprecated and then we removed the deprecation once tests were added. Those that didn't get tests added over a period of about six months were removed. I would suggest taking a hard look at: ant, db, lucli, swing (spatial should be gutted to some extent and moved to modules)
[jira] [Commented] (LUCENE-2952) Make license checking/maintenance easier/automated
[ https://issues.apache.org/jira/browse/LUCENE-2952?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13009169#comment-13009169 ] Grant Ingersoll commented on LUCENE-2952: - Third time is the charm. I don't really care where it lives and it sounds like tools makes sense. Not sure why I didn't notice that sooner. I'll take care of it later today. > Make license checking/maintenance easier/automated > -- > > Key: LUCENE-2952 > URL: https://issues.apache.org/jira/browse/LUCENE-2952 > Project: Lucene - Java > Issue Type: Improvement >Reporter: Grant Ingersoll >Assignee: Grant Ingersoll >Priority: Minor > Fix For: 3.2, 4.0 > > Attachments: LUCENE-2952.patch, LUCENE-2952.patch, LUCENE-2952.patch, > LUCENE-2952.patch, LUCENE-2952.patch, LUCENE-2952.patch, LUCENE-2952.patch > > > Instead of waiting until release to check licenses are valid, we should make > it a part of our build process to ensure that all dependencies have proper > licenses, etc.
[jira] Closed: (SOLR-484) Solr Website changes
[ https://issues.apache.org/jira/browse/SOLR-484?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Grant Ingersoll closed SOLR-484. Resolution: Won't Fix > Solr Website changes > > > Key: SOLR-484 > URL: https://issues.apache.org/jira/browse/SOLR-484 > Project: Solr > Issue Type: Bug > Components: documentation >Reporter: Grant Ingersoll >Priority: Minor > Attachments: SOLR-484.patch > > > In looking at the Solr website it has many of the same issues that Lucene > Java did when it comes to ASF policies about nightly builds, etc. concerning > the Javadocs > See > http://lucene.markmail.org/message/a7k7kujxkhwjwfy6?q=nightly+developer+releases+list:org%2Eapache%2Elucene%2Ejava-dev+from:%22Doug+Cutting+(JIRA)%22&page=1 > and > http://lucene.markmail.org/message/vaks6omed4l6buth?q=nightly+developer+releases+list:org%2Eapache%2Elucene%2Ejava-dev+from:%22Doug+Cutting+(JIRA)%22&page=1 > This would suggest a change like Hadoop and Lucene Java did to separate out > the main site, release docs (javadocs, any other?) and developer resources. > Currently the javadocs on the main page are the nightly and should be made > less prominent.
[jira] Resolved: (LUCENE-2952) Make license checking/maintenance easier/automated
[ https://issues.apache.org/jira/browse/LUCENE-2952?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Grant Ingersoll resolved LUCENE-2952. - Resolution: Fixed Fix Version/s: 4.0 3.2 Assignee: Grant Ingersoll > Make license checking/maintenance easier/automated > -- > > Key: LUCENE-2952 > URL: https://issues.apache.org/jira/browse/LUCENE-2952 > Project: Lucene - Java > Issue Type: Improvement >Reporter: Grant Ingersoll >Assignee: Grant Ingersoll >Priority: Minor > Fix For: 3.2, 4.0 > > Attachments: LUCENE-2952.patch, LUCENE-2952.patch, LUCENE-2952.patch, > LUCENE-2952.patch, LUCENE-2952.patch, LUCENE-2952.patch, LUCENE-2952.patch > > > Instead of waiting until release to check licenses are valid, we should make > it a part of our build process to ensure that all dependencies have proper > licenses, etc.
[jira] Commented: (LUCENE-2952) Make license checking/maintenance easier/automated
[ https://issues.apache.org/jira/browse/LUCENE-2952?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13008580#comment-13008580 ] Grant Ingersoll commented on LUCENE-2952: - OK, I shuffled some things around, putting the code in test-framework and made the appropriate changes to the builds. Will now backport to 3_x (but not 3.1) > Make license checking/maintenance easier/automated > -- > > Key: LUCENE-2952 > URL: https://issues.apache.org/jira/browse/LUCENE-2952 > Project: Lucene - Java > Issue Type: Improvement >Reporter: Grant Ingersoll >Priority: Minor > Attachments: LUCENE-2952.patch, LUCENE-2952.patch, LUCENE-2952.patch, > LUCENE-2952.patch, LUCENE-2952.patch, LUCENE-2952.patch, LUCENE-2952.patch > > > Instead of waiting until release to check licenses are valid, we should make > it a part of our build process to ensure that all dependencies have proper > licenses, etc.
[jira] Commented: (LUCENE-2952) Make license checking/maintenance easier/automated
[ https://issues.apache.org/jira/browse/LUCENE-2952?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13008537#comment-13008537 ] Grant Ingersoll commented on LUCENE-2952: - I'm just going to move to the test-framework. As Robert points out, if in the future we get more sophisticated about checking the classpath/libs, it will fit well there. > Make license checking/maintenance easier/automated > -- > > Key: LUCENE-2952 > URL: https://issues.apache.org/jira/browse/LUCENE-2952 > Project: Lucene - Java > Issue Type: Improvement >Reporter: Grant Ingersoll >Priority: Minor > Attachments: LUCENE-2952.patch, LUCENE-2952.patch, LUCENE-2952.patch, > LUCENE-2952.patch, LUCENE-2952.patch, LUCENE-2952.patch, LUCENE-2952.patch > > > Instead of waiting until release to check licenses are valid, we should make > it a part of our build process to ensure that all dependencies have proper > licenses, etc.
[jira] Commented: (LUCENE-2952) Make license checking/maintenance easier/automated
[ https://issues.apache.org/jira/browse/LUCENE-2952?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13008513#comment-13008513 ] Grant Ingersoll commented on LUCENE-2952: - Actually, the more I think about it, it doesn't belong in modules either. I'm inclined to say a new top level dir called committer-tools (slightly different from dev-tools, which is redistributed; committer-tools would not be). > Make license checking/maintenance easier/automated > -- > > Key: LUCENE-2952 > URL: https://issues.apache.org/jira/browse/LUCENE-2952 > Project: Lucene - Java > Issue Type: Improvement >Reporter: Grant Ingersoll >Priority: Minor > Attachments: LUCENE-2952.patch, LUCENE-2952.patch, LUCENE-2952.patch, > LUCENE-2952.patch, LUCENE-2952.patch, LUCENE-2952.patch, LUCENE-2952.patch > > > Instead of waiting until release to check licenses are valid, we should make > it a part of our build process to ensure that all dependencies have proper > licenses, etc.
[jira] Issue Comment Edited: (LUCENE-2952) Make license checking/maintenance easier/automated
[ https://issues.apache.org/jira/browse/LUCENE-2952?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13008473#comment-13008473 ] Grant Ingersoll edited comment on LUCENE-2952 at 3/18/11 3:31 PM: -- I'm fine w/ moving it out of dev-tools. I'm not sure about test-framework, which I see more as something people building applications on Lucene/Solr use to test their applications on. How about we put it in modules? As in modules/validation? It is, after all, pertinent to both L & S. was (Author: gsingers): I'm fine w/ moving it out of dev-tools. I'm not sure about test-framework, which I see more as something people building applications on Lucene/Solr use to test their applications on. How about we put it in modules? As in modules/validation? > Make license checking/maintenance easier/automated > -- > > Key: LUCENE-2952 > URL: https://issues.apache.org/jira/browse/LUCENE-2952 > Project: Lucene - Java > Issue Type: Improvement >Reporter: Grant Ingersoll >Priority: Minor > Attachments: LUCENE-2952.patch, LUCENE-2952.patch, LUCENE-2952.patch, > LUCENE-2952.patch, LUCENE-2952.patch, LUCENE-2952.patch, LUCENE-2952.patch > > > Instead of waiting until release to check licenses are valid, we should make > it a part of our build process to ensure that all dependencies have proper > licenses, etc.
[jira] Commented: (LUCENE-2952) Make license checking/maintenance easier/automated
[ https://issues.apache.org/jira/browse/LUCENE-2952?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13008473#comment-13008473 ] Grant Ingersoll commented on LUCENE-2952: - I'm fine w/ moving it out of dev-tools. I'm not sure about test-framework, which I see more as something people building applications on Lucene/Solr use to test their applications on. How about we put it in modules? As in modules/validation? > Make license checking/maintenance easier/automated > -- > > Key: LUCENE-2952 > URL: https://issues.apache.org/jira/browse/LUCENE-2952 > Project: Lucene - Java > Issue Type: Improvement >Reporter: Grant Ingersoll >Priority: Minor > Attachments: LUCENE-2952.patch, LUCENE-2952.patch, LUCENE-2952.patch, > LUCENE-2952.patch, LUCENE-2952.patch, LUCENE-2952.patch, LUCENE-2952.patch > > > Instead of waiting until release to check licenses are valid, we should make > it a part of our build process to ensure that all dependencies have proper > licenses, etc.
[jira] Assigned: (SOLR-1942) Ability to select codec per field
[ https://issues.apache.org/jira/browse/SOLR-1942?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Grant Ingersoll reassigned SOLR-1942: - Assignee: Grant Ingersoll > Ability to select codec per field > - > > Key: SOLR-1942 > URL: https://issues.apache.org/jira/browse/SOLR-1942 > Project: Solr > Issue Type: New Feature >Affects Versions: 4.0 >Reporter: Yonik Seeley >Assignee: Grant Ingersoll > Fix For: 4.0 > > Attachments: SOLR-1942.patch, SOLR-1942.patch, SOLR-1942.patch, > SOLR-1942.patch, SOLR-1942.patch, SOLR-1942.patch, SOLR-1942.patch > > > We should use PerFieldCodecWrapper to allow users to select the codec > per-field.
[jira] Commented: (LUCENE-2971) Auto Generate our LICENSE.txt and NOTICE.txt files
[ https://issues.apache.org/jira/browse/LUCENE-2971?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13007987#comment-13007987 ] Grant Ingersoll commented on LUCENE-2971: - Thanks for the pointers, that should definitely be helpful if and when we add this. > Auto Generate our LICENSE.txt and NOTICE.txt files > -- > > Key: LUCENE-2971 > URL: https://issues.apache.org/jira/browse/LUCENE-2971 > Project: Lucene - Java > Issue Type: Improvement >Reporter: Grant Ingersoll >Priority: Minor > Fix For: 3.2, 4.0 > > > Once LUCENE-2952 is in place, we should be able to automatically generate > Lucene and Solr's LICENSE.txt and NOTICE.txt files (without massive > duplication)
[jira] Created: (LUCENE-2971) Auto Generate our LICENSE.txt and NOTICE.txt files
Auto Generate our LICENSE.txt and NOTICE.txt files -- Key: LUCENE-2971 URL: https://issues.apache.org/jira/browse/LUCENE-2971 Project: Lucene - Java Issue Type: Improvement Reporter: Grant Ingersoll Priority: Minor Once LUCENE-2952 is in place, we should be able to automatically generate Lucene and Solr's LICENSE.txt and NOTICE.txt file (without massive duplication)
[jira] Updated: (LUCENE-2971) Auto Generate our LICENSE.txt and NOTICE.txt files
[ https://issues.apache.org/jira/browse/LUCENE-2971?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Grant Ingersoll updated LUCENE-2971: Fix Version/s: 4.0 3.2 > Auto Generate our LICENSE.txt and NOTICE.txt files > -- > > Key: LUCENE-2971 > URL: https://issues.apache.org/jira/browse/LUCENE-2971 > Project: Lucene - Java > Issue Type: Improvement >Reporter: Grant Ingersoll >Priority: Minor > Fix For: 3.2, 4.0 > > > Once LUCENE-2952 is in place, we should be able to automatically generate > Lucene and Solr's LICENSE.txt and NOTICE.txt file (without massive > duplication)
[jira] Updated: (LUCENE-2952) Make license checking/maintenance easier/automated
[ https://issues.apache.org/jira/browse/LUCENE-2952?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Grant Ingersoll updated LUCENE-2952: Attachment: LUCENE-2952.patch I think this is ready to go. It checks licenses, it checks notices. It leaves room for other validation tasks (version conflicts, etc.) It is fast. It is only called for each top dir: lucene, modules, solr (there is one extra call when modules/benchmark gets called, but I can live with it). I believe all LICENSE, NOTICE files are properly set now. > Make license checking/maintenance easier/automated > -- > > Key: LUCENE-2952 > URL: https://issues.apache.org/jira/browse/LUCENE-2952 > Project: Lucene - Java > Issue Type: Improvement >Reporter: Grant Ingersoll >Priority: Minor > Attachments: LUCENE-2952.patch, LUCENE-2952.patch, LUCENE-2952.patch, > LUCENE-2952.patch, LUCENE-2952.patch, LUCENE-2952.patch, LUCENE-2952.patch > > > Instead of waiting until release to check licenses are valid, we should make > it a part of our build process to ensure that all dependencies have proper > licenses, etc.
[jira] Updated: (LUCENE-2952) Make license checking/maintenance easier/automated
[ https://issues.apache.org/jira/browse/LUCENE-2952?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Grant Ingersoll updated LUCENE-2952: Attachment: LUCENE-2952.patch latest patch > Make license checking/maintenance easier/automated > -- > > Key: LUCENE-2952 > URL: https://issues.apache.org/jira/browse/LUCENE-2952 > Project: Lucene - Java > Issue Type: Improvement >Reporter: Grant Ingersoll >Priority: Minor > Attachments: LUCENE-2952.patch, LUCENE-2952.patch, LUCENE-2952.patch, > LUCENE-2952.patch, LUCENE-2952.patch, LUCENE-2952.patch > > > Instead of waiting until release to check licenses are valid, we should make > it a part of our build process to ensure that all dependencies have proper > licenses, etc.
[jira] Commented: (LUCENE-2968) SurroundQuery doesn't support SpanNot
[ https://issues.apache.org/jira/browse/LUCENE-2968?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13007901#comment-13007901 ] Grant Ingersoll commented on LUCENE-2968: - spn works for me, or simply ! maybe. bq. This could also be an opportunity to port Surround to the new query parser in Lucene. That's up to you. > SurroundQuery doesn't support SpanNot > - > > Key: LUCENE-2968 > URL: https://issues.apache.org/jira/browse/LUCENE-2968 > Project: Lucene - Java > Issue Type: Improvement >Reporter: Grant Ingersoll >Priority: Minor > > It would be nice if we could do span not in the surround query, as they are > quite useful for keeping searches within a boundary (say a sentence)
[jira] Commented: (SOLR-1725) Script based UpdateRequestProcessorFactory
[ https://issues.apache.org/jira/browse/SOLR-1725?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13007627#comment-13007627 ] Grant Ingersoll commented on SOLR-1725: --- bq. As time passes, the case for moving to Java 6 increases. Solr trunk is on 1.6. > Script based UpdateRequestProcessorFactory > -- > > Key: SOLR-1725 > URL: https://issues.apache.org/jira/browse/SOLR-1725 > Project: Solr > Issue Type: New Feature > Components: update >Affects Versions: 1.4 >Reporter: Uri Boness > Attachments: SOLR-1725.patch, SOLR-1725.patch, SOLR-1725.patch, > SOLR-1725.patch, SOLR-1725.patch > > > A script based UpdateRequestProcessorFactory (Uses JDK6 script engine > support). The main goal of this plugin is to be able to configure/write > update processors without the need to write and package Java code. > The update request processor factory enables writing update processors in > scripts located in {{solr.solr.home}} directory. The factory accepts one > (mandatory) configuration parameter named {{scripts}} which accepts a > comma-separated list of file names. It will look for these files under the > {{conf}} directory in solr home. When multiple scripts are defined, their > execution order is defined by the lexicographical order of the script file > name (so {{scriptA.js}} will be executed before {{scriptB.js}}). > The script language is resolved based on the script file extension (that is, > a *.js file will be treated as a JavaScript script), therefore an extension > is mandatory. > Each script file is expected to have one or more methods with the same > signature as the methods in the {{UpdateRequestProcessor}} interface. It is > *not* required to define all methods, only those that are required by the > processing logic. 
> The following variables are defined as global variables for each script: > * {{req}} - The SolrQueryRequest > * {{rsp}}- The SolrQueryResponse > * {{logger}} - A logger that can be used for logging purposes in the script
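The ordering and extension rules in the SOLR-1725 description can be sketched in plain Java. This is only an illustration, not code from the attached patches: the class and method names are hypothetical, and it assumes the file names in the scripts parameter are trimmed before sorting. In a real factory the returned extension would presumably be handed to the JDK 6 script API (javax.script.ScriptEngineManager.getEngineByExtension), which is left out here so the sketch runs on any JDK.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

// Hypothetical helper, not taken from the SOLR-1725 patches.
class ScriptListSketch {

    /** Splits the comma-separated "scripts" value and returns the names in lexicographical order. */
    static List<String> executionOrder(String scriptsParam) {
        List<String> names = new ArrayList<>();
        for (String part : scriptsParam.split(",")) {
            String name = part.trim(); // assumption: surrounding whitespace is ignored
            if (!name.isEmpty()) {
                names.add(name);
            }
        }
        Collections.sort(names); // scriptA.js sorts before scriptB.js
        return names;
    }

    /** Returns the file extension (for example "js"); an extension is mandatory, so a missing one is an error. */
    static String extensionOf(String fileName) {
        int dot = fileName.lastIndexOf('.');
        if (dot < 0 || dot == fileName.length() - 1) {
            throw new IllegalArgumentException("Script file needs an extension: " + fileName);
        }
        return fileName.substring(dot + 1);
    }
}
```

For example, executionOrder("scriptB.js, scriptA.js") lists scriptA.js first, and extensionOf("scriptA.js") gives "js".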
[jira] Created: (LUCENE-2968) SurroundQuery doesn't support SpanNot
SurroundQuery doesn't support SpanNot - Key: LUCENE-2968 URL: https://issues.apache.org/jira/browse/LUCENE-2968 Project: Lucene - Java Issue Type: Improvement Reporter: Grant Ingersoll Priority: Minor It would be nice if we could do span not in the surround query, as they are quite useful for keeping searches within a boundary (say a sentence)
[jira] Updated: (LUCENE-2952) Make license checking/maintenance easier/automated
[ https://issues.apache.org/jira/browse/LUCENE-2952?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Grant Ingersoll updated LUCENE-2952: Attachment: LUCENE-2952.patch This minimizes the number of calls to validate (there is still one extra call via the benchmark module since it invokes the common lucene compile target). Also splits it out into Lucene, Solr and Modules. I'd consider it close to good enough at this point. > Make license checking/maintenance easier/automated > -- > > Key: LUCENE-2952 > URL: https://issues.apache.org/jira/browse/LUCENE-2952 > Project: Lucene - Java > Issue Type: Improvement >Reporter: Grant Ingersoll >Priority: Minor > Attachments: LUCENE-2952.patch, LUCENE-2952.patch, LUCENE-2952.patch, > LUCENE-2952.patch, LUCENE-2952.patch > > > Instead of waiting until release to check licenses are valid, we should make > it a part of our build process to ensure that all dependencies have proper > licenses, etc.
[jira] Updated: (LUCENE-2952) Make license checking/maintenance easier/automated
[ https://issues.apache.org/jira/browse/LUCENE-2952?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Grant Ingersoll updated LUCENE-2952: Attachment: LUCENE-2952.patch This hooks it into compile-core, but has the unfortunate side-effect of being called a whole bunch of times, which is not good. Need to read up on how to avoid that in ant (or if anyone has suggestions, that would be great). Otherwise, I think the baseline functionality is ready to go. > Make license checking/maintenance easier/automated > -- > > Key: LUCENE-2952 > URL: https://issues.apache.org/jira/browse/LUCENE-2952 > Project: Lucene - Java > Issue Type: Improvement >Reporter: Grant Ingersoll >Priority: Minor > Attachments: LUCENE-2952.patch, LUCENE-2952.patch, LUCENE-2952.patch, > LUCENE-2952.patch > > > Instead of waiting until release to check licenses are valid, we should make > it a part of our build process to ensure that all dependencies have proper > licenses, etc.
[jira] Updated: (LUCENE-2952) Make license checking/maintenance easier/automated
[ https://issues.apache.org/jira/browse/LUCENE-2952?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Grant Ingersoll updated LUCENE-2952: Attachment: LUCENE-2952.patch Pretty close to standalone completion. Next step is to hook it in. I'm going to commit the license naming normalization now but not the validation code yet. Also, renamed LicenseChecker to DependencyChecker as it might be useful for checking other things, such as that all jars have version numbers. > Make license checking/maintenance easier/automated > -- > > Key: LUCENE-2952 > URL: https://issues.apache.org/jira/browse/LUCENE-2952 > Project: Lucene - Java > Issue Type: Improvement >Reporter: Grant Ingersoll >Priority: Minor > Attachments: LUCENE-2952.patch, LUCENE-2952.patch, LUCENE-2952.patch > > > Instead of waiting until release to check licenses are valid, we should make > it a part of our build process to ensure that all dependencies have proper > licenses, etc.
[jira] Created: (SOLR-2427) UIMA jars are missing version numbers
UIMA jars are missing version numbers - Key: SOLR-2427 URL: https://issues.apache.org/jira/browse/SOLR-2427 Project: Solr Issue Type: Bug Reporter: Grant Ingersoll Priority: Trivial We should have version numbers on the UIMA jar files.
[jira] Commented: (LUCENE-2952) Make license checking/maintenance easier/automated
[ https://issues.apache.org/jira/browse/LUCENE-2952?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13005666#comment-13005666 ] Grant Ingersoll commented on LUCENE-2952: - Should note, I've only hooked it up for lucene/lib and solr/lib and not any of the modules or contrib. > Make license checking/maintenance easier/automated > -- > > Key: LUCENE-2952 > URL: https://issues.apache.org/jira/browse/LUCENE-2952 > Project: Lucene - Java > Issue Type: Improvement >Reporter: Grant Ingersoll >Priority: Minor > Attachments: LUCENE-2952.patch, LUCENE-2952.patch > > > Instead of waiting until release to check licenses are valid, we should make > it a part of our build process to ensure that all dependencies have proper > licenses, etc.
[jira] Updated: (LUCENE-2952) Make license checking/maintenance easier/automated
[ https://issues.apache.org/jira/browse/LUCENE-2952?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Grant Ingersoll updated LUCENE-2952: Attachment: LUCENE-2952.patch Here's some real progress on this. Works in standalone mode, but is not hooked into the build process yet. > Make license checking/maintenance easier/automated > -- > > Key: LUCENE-2952 > URL: https://issues.apache.org/jira/browse/LUCENE-2952 > Project: Lucene - Java > Issue Type: Improvement >Reporter: Grant Ingersoll >Priority: Minor > Attachments: LUCENE-2952.patch, LUCENE-2952.patch > > > Instead of waiting until release to check licenses are valid, we should make > it a part of our build process to ensure that all dependencies have proper > licenses, etc.
[jira] Updated: (LUCENE-2952) Make license checking/maintenance easier/automated
[ https://issues.apache.org/jira/browse/LUCENE-2952?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Grant Ingersoll updated LUCENE-2952: Attachment: LUCENE-2952.patch Nowhere near being ready, but putting up something to flesh this out a little bit. I don't think it even compiles yet. Idea: Add dev-tools/validation and hook code into it that does work to validate our systems for things like licenses, etc. It will then be hooked in at compile time for both Lucene and Solr. In this particular case, it will look for license files for each jar file and fail if one is missing. This requires there to be, for every JAR file, a file with the same name and the name of the license plus .txt appended to it, as in foo.jar.BSD.txt or something like that (still being worked out) > Make license checking/maintenance easier/automated > -- > > Key: LUCENE-2952 > URL: https://issues.apache.org/jira/browse/LUCENE-2952 > Project: Lucene - Java > Issue Type: Improvement >Reporter: Grant Ingersoll >Priority: Minor > Attachments: LUCENE-2952.patch > > > Instead of waiting until release to check licenses are valid, we should make > it a part of our build process to ensure that all dependencies have proper > licenses, etc.
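The naming convention floated in this comment (foo.jar paired with foo.jar.BSD.txt) can be illustrated with a small check over a list of file names. This is a rough sketch under stated assumptions, not the validation code from the LUCENE-2952 patches: the class and method names are hypothetical, it works on names only rather than scanning a lib directory, and it accepts any non-empty license name since the comment says the scheme was still being worked out.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper, not the code from the LUCENE-2952 patches.
class LicenseFileCheck {

    /** Returns every JAR name that has no companion file of the form "<jar>.<LICENSE>.txt". */
    static List<String> jarsMissingLicense(List<String> fileNames) {
        List<String> missing = new ArrayList<>();
        for (String name : fileNames) {
            if (!name.endsWith(".jar")) {
                continue; // only JAR files need a license companion
            }
            String prefix = name + ".";
            boolean found = false;
            for (String other : fileNames) {
                // e.g. foo.jar.BSD.txt: the JAR name, a non-empty license name, then ".txt"
                if (other.startsWith(prefix)
                        && other.endsWith(".txt")
                        && other.length() > prefix.length() + ".txt".length()) {
                    found = true;
                    break;
                }
            }
            if (!found) {
                missing.add(name);
            }
        }
        return missing;
    }
}
```

A build hook along the lines described would fail whenever the returned list is non-empty.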
[jira] Commented: (LUCENE-2945) Surround Query doesn't properly handle equals/hashcode
[ https://issues.apache.org/jira/browse/LUCENE-2945?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13004655#comment-13004655 ] Grant Ingersoll commented on LUCENE-2945: - The Query class already is cloneable so it needs to support what the QueryUtils is doing. I think it is the anonymous inner class (or in my case, just the inner class) that is the one that matters for all of this. It is an instance of Query and thus needs a proper equals/hashcode. I don't really care about the outer containing classes other than I think it is a misnomer to call them Query classes when they really are factory classes for creating Lucene Queries. > Surround Query doesn't properly handle equals/hashcode > -- > > Key: LUCENE-2945 > URL: https://issues.apache.org/jira/browse/LUCENE-2945 > Project: Lucene - Java > Issue Type: Bug >Affects Versions: 3.0.3, 3.1, 4.0 >Reporter: Grant Ingersoll >Assignee: Grant Ingersoll >Priority: Minor > Fix For: 3.1.1, 4.0 > > Attachments: LUCENE-2945-partial1.patch, LUCENE-2945.patch, > LUCENE-2945.patch, LUCENE-2945.patch > > > In looking at using the surround queries with Solr, I am hitting issues > caused by collisions due to equals/hashcode not being implemented on the > anonymous inner classes that are created by things like DistanceQuery (branch > 3.x, near line 76)
[jira] Created: (LUCENE-2952) Make license checking/maintenance easier/automated
Make license checking/maintenance easier/automated -- Key: LUCENE-2952 URL: https://issues.apache.org/jira/browse/LUCENE-2952 Project: Lucene - Java Issue Type: Improvement Reporter: Grant Ingersoll Priority: Minor Instead of waiting until release to check licenses are valid, we should make it a part of our build process to ensure that all dependencies have proper licenses, etc.
[jira] Updated: (LUCENE-2945) Surround Query doesn't properly handle equals/hashcode
[ https://issues.apache.org/jira/browse/LUCENE-2945?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Grant Ingersoll updated LUCENE-2945: Affects Version/s: 3.1 Fix Version/s: (was: 3.1) 3.1.1 > Surround Query doesn't properly handle equals/hashcode > -- > > Key: LUCENE-2945 > URL: https://issues.apache.org/jira/browse/LUCENE-2945 > Project: Lucene - Java > Issue Type: Bug >Affects Versions: 3.0.3, 3.1, 4.0 >Reporter: Grant Ingersoll >Assignee: Grant Ingersoll >Priority: Minor > Fix For: 3.1.1, 4.0 > > Attachments: LUCENE-2945-partial1.patch, LUCENE-2945.patch, > LUCENE-2945.patch, LUCENE-2945.patch > > > In looking at using the surround queries with Solr, I am hitting issues > caused by collisions due to equals/hashcode not being implemented on the > anonymous inner classes that are created by things like DistanceQuery (branch > 3.x, near line 76)
[jira] Updated: (LUCENE-2945) Surround Query doesn't properly handle equals/hashcode
[ https://issues.apache.org/jira/browse/LUCENE-2945?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Grant Ingersoll updated LUCENE-2945: Attachment: LUCENE-2945.patch OK, here's a patch with a test that passes. I'm not entirely thrilled about the implementation of equals/hash on the two inner classes (used to be anonymous) but I do think it works. Namely, I use the syntax of the original query as a string as part of the hash/equals, per Paul's original suggestion. It just seems awkward to have to pass that in solely for this purpose, but I didn't see what other information I had around that would make the object unique from an equals/hash standpoint. I suppose the underlying queries list on the ComposedQuery might work and I can try that if others think it makes more sense. > Surround Query doesn't properly handle equals/hashcode > -- > > Key: LUCENE-2945 > URL: https://issues.apache.org/jira/browse/LUCENE-2945 > Project: Lucene - Java > Issue Type: Bug >Affects Versions: 3.0.3, 4.0 >Reporter: Grant Ingersoll >Assignee: Grant Ingersoll >Priority: Minor > Fix For: 3.1, 4.0 > > Attachments: LUCENE-2945-partial1.patch, LUCENE-2945.patch, > LUCENE-2945.patch, LUCENE-2945.patch > > > In looking at using the surround queries with Solr, I am hitting issues > caused by collisions due to equals/hashcode not being implemented on the > anonymous inner classes that are created by things like DistanceQuery (branch > 3.x, near line 76)
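The approach described in this comment, using the original query syntax string as part of equals/hashCode on a named (formerly anonymous) class, might look roughly like the following. This is a standalone sketch and not the patch itself: the class name is hypothetical, it does not extend Lucene's Query, and the fieldName member is an assumption added only to show a second component taking part in the comparison.

```java
import java.util.Objects;

// Hypothetical stand-in for the named inner classes; not the LUCENE-2945 patch code.
final class RewrittenQuerySketch {
    private final String originalSyntax; // the surround query as originally written
    private final String fieldName;      // assumption: the field the query runs against

    RewrittenQuerySketch(String originalSyntax, String fieldName) {
        this.originalSyntax = Objects.requireNonNull(originalSyntax);
        this.fieldName = Objects.requireNonNull(fieldName);
    }

    @Override
    public boolean equals(Object obj) {
        if (this == obj) {
            return true;
        }
        if (!(obj instanceof RewrittenQuerySketch)) {
            return false;
        }
        RewrittenQuerySketch other = (RewrittenQuerySketch) obj;
        return originalSyntax.equals(other.originalSyntax)
                && fieldName.equals(other.fieldName);
    }

    @Override
    public int hashCode() {
        // Must use the same members as equals so equal objects share a hash code.
        return Objects.hash(originalSyntax, fieldName);
    }
}
```

With this in place, two instances built from the same syntax string and field compare equal and collide in a hash-based cache only when they should, which is the property the issue reports as missing.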
[jira] Updated: (LUCENE-2945) Surround Query doesn't properly handle equals/hashcode
[ https://issues.apache.org/jira/browse/LUCENE-2945?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Grant Ingersoll updated LUCENE-2945: Attachment: LUCENE-2945.patch Here's a patch that has a test using QueryUtils that fails. I don't think the getClass() approach is quite right for the base class equals. > Surround Query doesn't properly handle equals/hashcode > -- > > Key: LUCENE-2945 > URL: https://issues.apache.org/jira/browse/LUCENE-2945 > Project: Lucene - Java > Issue Type: Bug >Affects Versions: 3.0.3, 4.0 >Reporter: Grant Ingersoll >Assignee: Grant Ingersoll >Priority: Minor > Fix For: 3.1, 4.0 > > Attachments: LUCENE-2945-partial1.patch, LUCENE-2945.patch, > LUCENE-2945.patch > > > In looking at using the surround queries with Solr, I am hitting issues > caused by collisions due to equals/hashcode not being implemented on the > anonymous inner classes that are created by things like DistanceQuery (branch > 3.x, near line 76)
[jira] Updated: (SOLR-2390) Performance of usePhraseHighlighter is terrible on very large Documents, regardless of hl.maxDocCharsToAnalyze
[ https://issues.apache.org/jira/browse/SOLR-2390?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Grant Ingersoll updated SOLR-2390: -- Fix Version/s: 3.1.1 > Performance of usePhraseHighlighter is terrible on very large Documents, > regardless of hl.maxDocCharsToAnalyze > -- > > Key: SOLR-2390 > URL: https://issues.apache.org/jira/browse/SOLR-2390 > Project: Solr > Issue Type: Bug > Components: highlighter >Reporter: Mark Miller >Assignee: Mark Miller > Fix For: 3.1.1, 3.2, 4.0 > > > There is a large performance bug here.
[jira] Updated: (LUCENE-2939) Highlighter should try and use maxDocCharsToAnalyze in WeightedSpanTermExtractor when adding a new field to MemoryIndex as well as when using CachingTokenStream
[ https://issues.apache.org/jira/browse/LUCENE-2939?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Grant Ingersoll updated LUCENE-2939: Fix Version/s: 3.1.1 > Highlighter should try and use maxDocCharsToAnalyze in > WeightedSpanTermExtractor when adding a new field to MemoryIndex as well as > when using CachingTokenStream > > > Key: LUCENE-2939 > URL: https://issues.apache.org/jira/browse/LUCENE-2939 > Project: Lucene - Java > Issue Type: Bug > Components: contrib/highlighter >Reporter: Mark Miller >Assignee: Mark Miller >Priority: Minor > Fix For: 3.1.1, 3.2, 4.0 > > Attachments: LUCENE-2939.patch, LUCENE-2939.patch, LUCENE-2939.patch > > > huge documents can be drastically slower than need be because the entire > field is added to the memory index > this cost can be greatly reduced in many cases if we try and respect > maxDocCharsToAnalyze > things can be improved even further by respecting this setting with > CachingTokenStream
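The idea in the LUCENE-2939 description, stop feeding tokens into the in-memory index once the maxDocCharsToAnalyze limit is reached, can be sketched without any Lucene classes. The code below is an illustration only and not the attached patch: the class name is hypothetical, it tokenizes on whitespace instead of using an Analyzer, and it assumes a token is kept when its start offset falls before the limit.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical illustration; the real change lives in WeightedSpanTermExtractor.
class AnalyzeLimitSketch {

    /** Returns the whitespace-separated tokens whose start offset is below maxDocCharsToAnalyze. */
    static List<String> tokensWithinLimit(String text, int maxDocCharsToAnalyze) {
        List<String> tokens = new ArrayList<>();
        int pos = 0;
        while (pos < text.length()) {
            // Skip whitespace between tokens.
            while (pos < text.length() && Character.isWhitespace(text.charAt(pos))) {
                pos++;
            }
            int start = pos;
            while (pos < text.length() && !Character.isWhitespace(text.charAt(pos))) {
                pos++;
            }
            if (start == pos) {
                break; // only trailing whitespace was left
            }
            if (start >= maxDocCharsToAnalyze) {
                break; // past the limit: the rest of the field is never consumed
            }
            tokens.add(text.substring(start, pos));
        }
        return tokens;
    }
}
```

The saving comes from the early break: for a huge field, the work done is bounded by the limit rather than by the field length.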
[jira] Updated: (LUCENE-2939) Highlighter should try and use maxDocCharsToAnalyze in WeightedSpanTermExtractor when adding a new field to MemoryIndex as well as when using CachingTokenStream
[ https://issues.apache.org/jira/browse/LUCENE-2939?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Grant Ingersoll updated LUCENE-2939: Fix Version/s: (was: 3.1) 3.2 > Highlighter should try and use maxDocCharsToAnalyze in > WeightedSpanTermExtractor when adding a new field to MemoryIndex as well as > when using CachingTokenStream > > > Key: LUCENE-2939 > URL: https://issues.apache.org/jira/browse/LUCENE-2939 > Project: Lucene - Java > Issue Type: Bug > Components: contrib/highlighter >Reporter: Mark Miller >Assignee: Mark Miller >Priority: Minor > Fix For: 3.2, 4.0 > > Attachments: LUCENE-2939.patch, LUCENE-2939.patch, LUCENE-2939.patch > > > huge documents can be drastically slower than need be because the entire > field is added to the memory index > this cost can be greatly reduced in many cases if we try and respect > maxDocCharsToAnalyze > things can be improved even further by respecting this setting with > CachingTokenStream
[jira] Updated: (SOLR-2390) Performance of usePhraseHighlighter is terrible on very large Documents, regardless of hl.maxDocCharsToAnalyze
[ https://issues.apache.org/jira/browse/SOLR-2390?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Grant Ingersoll updated SOLR-2390: -- Fix Version/s: (was: 3.1) 3.2 > Performance of usePhraseHighlighter is terrible on very large Documents, > regardless of hl.maxDocCharsToAnalyze > -- > > Key: SOLR-2390 > URL: https://issues.apache.org/jira/browse/SOLR-2390 > Project: Solr > Issue Type: Bug > Components: highlighter >Reporter: Mark Miller >Assignee: Mark Miller > Fix For: 3.2, 4.0 > > > There is a large performance bug here.
[jira] Commented: (LUCENE-2939) Highlighter should try and use maxDocCharsToAnalyze in WeightedSpanTermExtractor when adding a new field to MemoryIndex as well as when using CachingTokenStream
[ https://issues.apache.org/jira/browse/LUCENE-2939?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13002604#comment-13002604 ] Grant Ingersoll commented on LUCENE-2939: - I think Robert's right, we should not have shoved this in at the last minute, even though it is a pretty big issue for those doing highlighting of larger documents. I'd say we just mark it as 3.1.1 or 3.2. > Highlighter should try and use maxDocCharsToAnalyze in > WeightedSpanTermExtractor when adding a new field to MemoryIndex as well as > when using CachingTokenStream > > > Key: LUCENE-2939 > URL: https://issues.apache.org/jira/browse/LUCENE-2939 > Project: Lucene - Java > Issue Type: Bug > Components: contrib/highlighter >Reporter: Mark Miller >Assignee: Mark Miller >Priority: Minor > Fix For: 3.1, 4.0 > > Attachments: LUCENE-2939.patch, LUCENE-2939.patch, LUCENE-2939.patch > > > huge documents can be drastically slower than need be because the entire > field is added to the memory index > this cost can be greatly reduced in many cases if we try and respect > maxDocCharsToAnalyze > things can be improved even further by respecting this setting with > CachingTokenStream
[jira] Commented: (LUCENE-2939) Highlighter should try and use maxDocCharsToAnalyze in WeightedSpanTermExtractor when adding a new field to MemoryIndex as well as when using CachingTokenStream
[ https://issues.apache.org/jira/browse/LUCENE-2939?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13002579#comment-13002579 ] Grant Ingersoll commented on LUCENE-2939: - I'm OK either way, but it does seem like a pretty big performance bug. > Highlighter should try and use maxDocCharsToAnalyze in > WeightedSpanTermExtractor when adding a new field to MemoryIndex as well as > when using CachingTokenStream > > > Key: LUCENE-2939 > URL: https://issues.apache.org/jira/browse/LUCENE-2939 > Project: Lucene - Java > Issue Type: Bug > Components: contrib/highlighter >Reporter: Mark Miller >Assignee: Mark Miller >Priority: Minor > Fix For: 3.1, 4.0 > > Attachments: LUCENE-2939.patch, LUCENE-2939.patch, LUCENE-2939.patch > > > huge documents can be drastically slower than need be because the entire > field is added to the memory index > this cost can be greatly reduced in many cases if we try and respect > maxDocCharsToAnalyze > things can be improved even further by respecting this setting with > CachingTokenStream
[jira] Updated: (SOLR-2390) Performance of usePhraseHighlighter is terrible on very large Documents, regardless of hl.maxDocCharsToAnalyze
[ https://issues.apache.org/jira/browse/SOLR-2390?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Grant Ingersoll updated SOLR-2390: -- Fix Version/s: (was: 3.2) 3.1 > Performance of usePhraseHighlighter is terrible on very large Documents, > regardless of hl.maxDocCharsToAnalyze > -- > > Key: SOLR-2390 > URL: https://issues.apache.org/jira/browse/SOLR-2390 > Project: Solr > Issue Type: Bug > Components: highlighter >Reporter: Mark Miller >Assignee: Mark Miller > Fix For: 3.1, 4.0 > > > There is a large performance bug here.
[jira] Updated: (LUCENE-2939) Highlighter should try and use maxDocCharsToAnalyze in WeightedSpanTermExtractor when adding a new field to MemoryIndex as well as when using CachingTokenStream
[ https://issues.apache.org/jira/browse/LUCENE-2939?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Grant Ingersoll updated LUCENE-2939: Lucene Fields: (was: [New]) Fix Version/s: (was: 3.2) 3.1 > Highlighter should try and use maxDocCharsToAnalyze in > WeightedSpanTermExtractor when adding a new field to MemoryIndex as well as > when using CachingTokenStream > > > Key: LUCENE-2939 > URL: https://issues.apache.org/jira/browse/LUCENE-2939 > Project: Lucene - Java > Issue Type: Bug > Components: contrib/highlighter >Reporter: Mark Miller >Assignee: Mark Miller >Priority: Minor > Fix For: 3.1, 4.0 > > Attachments: LUCENE-2939.patch, LUCENE-2939.patch, LUCENE-2939.patch > > > huge documents can be drastically slower than need be because the entire > field is added to the memory index > this cost can be greatly reduced in many cases if we try and respect > maxDocCharsToAnalyze > things can be improved even further by respecting this setting with > CachingTokenStream