I'm ok with it. Thank you, David. Will you put it somewhere on the wiki?

On Mon, Mar 2, 2020 at 10:07 AM David Smiley <david.w.smi...@gmail.com> wrote:

> I'd like us to reflect on how we categorize issues in CHANGES.txt. We
> have these categories:
> (Lucene) 'API Changes', 'New Features', 'Improvements', 'Optimizations',
> 'Bug Fixes', 'Other'
> (Solr) 'New Features', 'Improvements', 'Optimizations', 'Bug Fixes',
> 'Other Changes'
> (I lifted these from dev-tools/scripts/addVersion.py line 215)
>
> In particular, I'm often surprised at how some of us categorize New
> Features or Improvements that should better be categorized as something
> else. I think the root cause of these problems may be that we don't have
> JIRA categories that directly align. Furthermore, our dev practices will
> typically result in a CHANGES.txt being added out of band from the
> code-review process, and thus no peer-review on ideal placement.
> Furthermore the message itself is often not code reviewed but should be.
> Perhaps we can simply get in the habit of adding a JIRA comment (or GH code
> review) what we propose the category & issue summary should be.
>
> Here is my attempt at a definition for _some_ of these categories. I
> don't pretend to think we all agree 100% but it's up for discussion:
> ========
> * New Features: A user-visible new capability. Usually opt-in.
>
> * Improvements: A user-visible improvement to an existing capability that
> somehow expands its ability or that which improves the behavior. Not a
> refactoring, not an optimization.
>
> * Optimizations: Something is now more efficient. Usually automatic (not
> opt-in).
>
> * Other: Anything else: Refactorings, tests, build, docs, etc. And
> adding log statements.
> ========
>
> I recommend the following changes to Lucene 8.5:
>
> These are "Improvements" that I think are better categorized as
> "Optimizations"
> * LUCENE-9211: Add compression for Binary doc value fields. (Mark Harwood)
> * LUCENE-4702: Better compression of terms dictionaries. (Adrien Grand)
> * LUCENE-9228: Sort dvUpdates in the term order before applying if they
> all update a single field to the same value. This optimization can reduce
> the flush time by around 20% for the docValues update user cases.
> (Nhat Nguyen, Adrien Grand, Simon Willnauer)
> * LUCENE-9245: Reduce AutomatonTermsEnum memory usage. (Bruno Roustant,
> Robert Muir)
> * LUCENE-9237: Faster UniformSplit intersect TermsEnum. (Bruno Roustant)
>
> These "Improvements" I think are better categorized as "Other":
> * LUCENE-9109: Backport some changes from master (except StackWalker) to
> improve TestSecurityManager (Uwe Schindler)
> * LUCENE-9110: Backport refactored stack analysis in tests to use
> generalized LuceneTestCase methods (Uwe Schindler)
> * LUCENE-9141: Simplify LatLonShapeXQuery API by adding a new abstract
> class called LatLonGeometry. Queries are executed with input objects that
> extend such interface. (Ignacio Vera)
> * LUCENE-9194: Simplify XYShapeXQuery API by adding a new abstract class
> called XYGeometry. Queries are executed with input objects that extend
> such interface. (Ignacio Vera)
>
> Maybe this "Other" item should be "Optimization"? (not sure):
> * LUCENE-9068: FuzzyQuery builds its Automaton up-front (Alan Woodward,
> Mike Drob)
>
> Solr:
>
> "New Features" that maybe should be "Improvements":
> * SOLR-13892: New "top-level" docValues join implementation (Jason
> Gerlowski, Joel Bernstein)
> * SOLR-14242: HdfsDirectory now supports indexing geo-points, ranges or
> shapes. (Adrien Grand)
>
> "Improvements" that maybe should be "Optimizations":
> * SOLR-13808: filter in BoolQParser and {"bool":{"filter":..}} in Query
> DSL are cached by default (Mikhail Khludnev)
>
> "Improvements" that maybe should be "Other":
> * SOLR-14114: Add WARN to Solr log that embedded ZK is not supported in
> production (janhoy)
>
> Thoughts?
>
> ~ David Smiley
> Apache Lucene/Solr Search Developer
> http://www.linkedin.com/in/davidwsmiley
>

--
Sincerely yours
Mikhail Khludnev