Can we move discussion about the implementation to the JIRA issue or the PR?

I'm not a lawyer, so not playing with the GPL fire is the easiest way for
me to avoid getting burned. The regex I have is pretty straightforward, and
I do not think it is a great cause for alarm.
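
For reference, the line-join idea can be sketched roughly as below. This is
a minimal illustration and not the patch itself: the function name
read_main_attribute and the sample text are made up for the example, and it
assumes continuation lines start with a single space, as in the manifest
spec excerpt quoted further down this thread. It also ignores section
boundaries and simply returns the first matching attribute.

```python
import re


def read_main_attribute(manifest_text, name):
    """Return the value of attribute `name`, or None if it is absent.

    Hypothetical helper for illustration only; it is not the code in
    smokeTestRelease.py.
    """
    # A line break followed by one space marks a continuation line,
    # so dropping that pair re-joins a wrapped value.
    unfolded = re.sub(r'(?:\r\n|\r|\n) ', '', manifest_text)
    for line in unfolded.splitlines():
        key, sep, value = line.partition(': ')
        if sep and key == name:
            return value
    return None


# Sample taken from the wrapped lucene-core manifest earlier in the thread.
sample = (
    "Manifest-Version: 1.0\r\n"
    "Implementation-Version: 8.10.0 ecf5c747e6df418dd05a18af327c20051f0584d\r\n"
    " 7 - tjp - 2021-09-14 19:08:42\r\n"
    "Implementation-Vendor: The Apache Software Foundation\r\n"
)

version = read_main_attribute(sample, "Implementation-Version")
print(version)
```

With the value unfolded first, the existing check for the version plus the
full 40-character SHA can stay a plain prefix comparison, wherever the
wrap happens to fall.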

On Fri, Sep 17, 2021 at 4:18 PM Ishan Chattopadhyaya <
ichattopadhy...@gmail.com> wrote:

> Given that we don't ship the code or binaries that involve that python
> library, do we need to care about the license? I'm skeptical of hand rolled
> regex and would rather favour either of the libraries Jan mentioned. Just
> my two cents.
>
> On Sat, 18 Sep, 2021, 12:02 am Mike Drob, <md...@mdrob.com> wrote:
>
>> The second library you linked, Jan, is AGPL. Thank you for continuing to
>> look for alternatives.
>>
>> I have some regular expressions cooked up locally that I think will let
>> us read the split lines going forward, and will put up the patch shortly.
>>
>> On Fri, Sep 17, 2021 at 7:45 AM Yuval Paz <yuval.p...@mail.huji.ac.il>
>> wrote:
>>
>>> Not sure if this is something that can be changed easily, but if the
>>> problem is that some parsers don't know how to handle a line wrap in the
>>> middle of the hash, why not move the hash entirely onto the next line (the
>>> specification allows a line break at any point in the value)?
>>>
>>> The commit hash + date comes out to exactly 71 bytes (including the
>>> space at the start), and it should be a constant size, and by the time the
>>> version reaches 48 bytes we will all probably be dead
>>>
>>> On Fri, Sep 17, 2021, 2:47 PM Robert Muir <rcm...@gmail.com> wrote:
>>>
>>>> Sure, but that package is archived/read-only and GPLv3, with 3 watchers
>>>> and 1 star.
>>>>
>>>> On Fri, Sep 17, 2021 at 4:27 AM Jan Høydahl <jan....@cominvent.com>
>>>> wrote:
>>>> >
>>>> > Let's just follow the spec and move on.
>>>> >
>>>> > Just tested this python package, which has no problem parsing the
>>>> problematic manifest https://pypi.org/project/jarmanifest/
>>>> >
>>>> > >>> manifest.getAttributes("/tmp/lucene-manifest.mf")
>>>> > [{'implementationversion': '9.0.0-SNAPSHOT
>>>> de45b68c909815ce5ea7b6b9e1a2ce3375b6334d [snapshot build, details
>>>> omitted]'}]
>>>> >
>>>> > Jan
>>>> >
>>>> > 17. sep. 2021 kl. 09:32 skrev Dawid Weiss <dawid.we...@gmail.com>:
>>>> >
>>>> >
>>>> > We could do a few things to keep everyone happy -
>>>> >
>>>> > 1) keep abbreviated hash in the Implementation-Version and use a
>>>> separate manifest entry to store a full hash.
>>>> > 2) use a longer version for git show (abbrev=num) so that the chance
>>>> of collisions in the future is minimized. It's still not a full hash but a
>>>> > long(er) forced prefix.
>>>> >
>>>> > D.
>>>> >
>>>> > On Fri, Sep 17, 2021 at 12:21 AM Chris Hostetter <
>>>> hossman_luc...@fucit.org> wrote:
>>>> >>
>>>> >>
>>>> >> : I was referring to doing this with languages other than java.
>>>> >> :
>>>> >> : I'm also assuming that exceeding this limit is going to cause
>>>> indirect
>>>> >> : hassles for users of lucene, e.g. breaking various security /
>>>> supply
>>>> >> : chain tools. We know a lot of these are total crap but people in
>>>> the
>>>> >> : corporate world have to suffer under them.
>>>> >>
>>>> >> Just to be clear -- our 'Implementation-Version:' has been exceeding
>>>> the
>>>> >> 72 byte "single line" limit for a LOOOOONG time -- worrying about how
>>>> >> "security / supply chain" tools will handle parsing that line now
>>>> seems
>>>> >> kind of silly...
>>>> >>
>>>> >> If tools can't handle a line wrap in the 8.10 jars, then they haven't
>>>> >> been able to handle the line wrap since we switched from svn to git
>>>> (when
>>>> >> the Implementation Version values switched from being based on svn
>>>> version#
>>>> >> to git sha)
>>>> >>
>>>> >> The *ONLY* thing that's new here is where in the value the line wrap
>>>> >> happens (with 8.10.0 it happens in the middle of the SHA) and that
>>>> our
>>>> >> smoketest tool isn't smart enough to parse the values properly.
>>>> >>
>>>> >> This is not even the first time we've had a conversation about
>>>> the
>>>> >> smoke tester and Implementation Version line wraps: LUCENE-7023.
>>>> >>
>>>> >>
>>>> >> : It's super-easy to use a short hash here and avoid problems.
>>>> >>
>>>> >>
>>>> >> There is however an actual and practical downside to switching our
>>>> >> implementation version to using a "short" SHA, and that's that we
>>>> would
>>>> >> lose the ability to guarantee that the information in the
>>>> >> Implementation-Version uniquely identifies what commit a given jar
>>>> was
>>>> >> built from.  Multiple commits with the same short(end) hash are
>>>> possible
>>>> >> -- multiple commits with identical (full) hashes are not.
>>>> >>
>>>> >> Folks may think that using git tags is useful enough for figuring
>>>> this
>>>> >> out from official releases, but being able to look at the jar
>>>> metadata
>>>> >> from arbitrary builds off of arbitrary branches and sanity check
>>>> where
>>>> >> exactly they come from has been very useful to me for 10+ years.
>>>> >>
>>>> >>
>>>> >> : On Thu, Sep 16, 2021 at 3:03 AM Dawid Weiss <dawid.we...@gmail.com>
>>>> wrote:
>>>> >> : >
>>>> >> : > Jar command doesn't have it, true. But it's fairly trivial to
>>>> do, even
>>>> >> : > with an inline snippet like this one?
>>>> >> : >
>>>> >> : > public class PrintManifest {
>>>> >> : >   public static void main(String[] jars) throws IOException {
>>>> >> : >     for (var jar : jars) {
>>>> >> : >       var manifest = new
>>>> JarFile(Paths.get(jar).toFile()).getManifest();
>>>> >> : >       var attrs = manifest.getMainAttributes();
>>>> >> : >       System.out.println(jar + ": " +
>>>> attrs.getValue("Implementation-Version"));
>>>> >> : >     }
>>>> >> : >   }
>>>> >> : > }
>>>> >> : >
>>>> >> : > I have this in my lucene-core-9.0.0-SNAPSHOT.jar:
>>>> >> : >
>>>> >> : > Implementation-Version: 9.0.0-SNAPSHOT
>>>> de45b68c909815ce5ea7b6b9e1a2ce337
>>>> >> : >  5b6334d [snapshot build, details omitted]
>>>> >> : >
>>>> >> : > and running:
>>>> >> : >
>>>> >> : > java PrintManifest.java lucene-core-9.0.0-SNAPSHOT.jar
>>>> >> : >
>>>> >> : > shows:
>>>> >> : >
>>>> >> : > lucene-core-9.0.0-SNAPSHOT.jar: 9.0.0-SNAPSHOT
>>>> >> : > de45b68c909815ce5ea7b6b9e1a2ce3375b6334d [snapshot build, details
>>>> >> : > omitted]
>>>> >> : >
>>>> >> : > This seems easier to me than trying to remember and keep the
>>>> length of
>>>> >> : > that line shorter than an arbitrary limit.
>>>> >> : >
>>>> >> : > Dawid
>>>> >> : >
>>>> >> : >
>>>> >> : > On Wed, Sep 15, 2021 at 9:46 PM Robert Muir <rcm...@gmail.com>
>>>> wrote:
>>>> >> : > >
>>>> >> : > > But it's irrelevant that it is "valid" when virtually no tools
>>>> match it.
>>>> >> : > >
>>>> >> : > > In other words, I'd agree with you if the "jar" command had
>>>> some
>>>> >> : > > ability to read these manifests and print stuff to stdout,
>>>> e.g. if
>>>> >> : > > there was ANY interop at all here.
>>>> >> : > >
>>>> >> : > > But there isn't. So IMO it makes no sense to cause confusion
>>>> and chaos
>>>> >> : > > by adding an unnecessarily long git commit hash.
>>>> >> : > >
>>>> >> : > > On Wed, Sep 15, 2021 at 3:26 PM Dawid Weiss <
>>>> dawid.we...@gmail.com> wrote:
>>>> >> : > > >
>>>> >> : > > >
>>>> >> : > > > This is valid manifest line-breaking though... Can we read
>>>> the manifest properly on the smoke tester side somehow (for example, run a
>>>> Java process that reads and extracts the required attribute)? This way we
>>>> wouldn't care about the implementation details of how manifest wraps the
>>>> lines (or escapes characters).
>>>> >> : > > >
>>>> >> : > > > D.
>>>> >> : > > >
>>>> >> : > > > On Wed, Sep 15, 2021 at 8:46 PM Mike Drob <md...@mdrob.com>
>>>> wrote:
>>>> >> : > > >>
>>>> >> : > > >> The benchmark jar has the info we need… sort of. When I
>>>> built it, it has:
>>>> >> : > > >>
>>>> >> : > > >> Implementation-Version: 8.10.0
>>>> 75a5061d3715cc5d93c4cbe4f1fa62bf035eea1
>>>> >> : > > >>  1 - mdrob - 2021-09-15 11:40:36
>>>> >> : > > >>
>>>> >> : > > >>
>>>> >> : > > >> and it’s looking for Implementation-Version: 8.10.0
>>>> 75a5061d3715cc5d93c4cbe4f1fa62bf035eea11 on one line.
>>>> >> : > > >>
>>>> >> : > > >> Because 8.10 is a character longer than 8.9, we happen to
>>>> wrap the last character of the git commit sha. From the manifest spec:
>>>> >> : > > >>
>>>> >> : > > >> No line may be longer than 72 bytes (not characters), in
>>>> its UTF8-encoded form.
>>>> >> : > > >> If a value would make the initial line longer than this, it
>>>> should be continued
>>>> >> : > > >> on extra lines (each starting with a single SPACE).
>>>> >> : > > >>
>>>> >> : > > >> And we were already teetering on the edge of that limit.
>>>> We'll run into this problem again in a few years when we try to release
>>>> version 10.0.0, so solving it now has practical benefits down the line.
>>>> >> : > > >>
>>>> >> : > > >> There's a few options that I can come up with -
>>>> >> : > > >> 1. Use the short-hash when we generate the jar
>>>> >> : > > >> 2. Use the short-hash when we check the contents in the
>>>> smoke test
>>>> >> : > > >> 3. Do some line join magic in the smoke test.
>>>> >> : > > >>
>>>> >> : > > >> I'm leaning towards number 1 as I feel that would still be
>>>> unique enough for our needs, but would like to hear from others as well.
>>>> >> : > > >>
>>>> >> : > > >> On Wed, Sep 15, 2021 at 9:46 AM Timothy potter <
>>>> thelabd...@gmail.com> wrote:
>>>> >> : > > >>>
>>>> >> : > > >>> can someone also please look into that benchmark jar issue?
>>>> >> : > > >>>
>>>> >> : > > >>> Sent from my iPhone
>>>> >> : > > >>>
>>>> >> : > > >>> On Sep 15, 2021, at 9:44 AM, Nhat Nguyen <
>>>> nhat.ngu...@elastic.co.invalid> wrote:
>>>> >> : > > >>>
>>>> >> : > > >>> 
>>>> >> : > > >>> Thanks Mayya and Mike! I will backport it to the 8.10
>>>> branch.
>>>> >> : > > >>>
>>>> >> : > > >>> On Wed, Sep 15, 2021 at 10:12 AM Mike Drob <
>>>> md...@mdrob.com> wrote:
>>>> >> : > > >>>>
>>>> >> : > > >>>> I think since Tim is out on vacation, it's probably not
>>>> too late. That looks like a good fix to have, do we know how long the bug
>>>> has been present?
>>>> >> : > > >>>>
>>>> >> : > > >>>> On Wed, Sep 15, 2021 at 7:56 AM Mayya Sharipova <
>>>> mayya.sharip...@elastic.co.invalid> wrote:
>>>> >> : > > >>>>>
>>>> >> : > > >>>>> Hello everyone,
>>>> >> : > > >>>>> We have discovered and fixed a bug in Lucene sort
>>>> optimization (LUCENE-10106) and would like to merge it to Lucene 8.10 if it
>>>> is not too late.
>>>> >> : > > >>>>> I apologize for the inconvenience, the bug was
>>>> discovered just yesterday.
>>>> >> : > > >>>>>
>>>> >> : > > >>>>> On Tue, Sep 14, 2021 at 9:26 PM Timothy Potter <
>>>> thelabd...@apache.org> wrote:
>>>> >> : > > >>>>>>
>>>> >> : > > >>>>>> Ahem ... unfortunately there will not be an 8.10 RC
>>>> this week. I'm
>>>> >> : > > >>>>>> headed out on vacation tomorrow, back at keys on
>>>> Monday, Sept 20
>>>> >> : > > >>>>>> unless someone else wants to pick up the RM duties
>>>> before then?
>>>> >> : > > >>>>>>
>>>> >> : > > >>>>>> After failing the test suite at various places and
>>>> other weirdness
>>>> >> : > > >>>>>> like .asc files not getting created, I finally got to
>>>> the smoke test
>>>> >> : > > >>>>>> part, which is now failing with:
>>>> >> : > > >>>>>>
>>>> >> : > > >>>>>>   File
>>>> "/Users/tjp/.lucene-releases/8.10.0/lucene-solr/dev-tools/scripts/smokeTestRelease.py",
>>>> >> : > > >>>>>> line 176, in checkJARMetaData
>>>> >> : > > >>>>>>     raise RuntimeError('%s is missing "%s" inside its
>>>> >> : > > >>>>>> META-INF/MANIFEST.MF (wrong git revision?)' % \
>>>> >> : > > >>>>>> RuntimeError: JAR file
>>>> >> : > > >>>>>>
>>>> "/Users/tjp/.lucene-releases/8.10.0/RC1/smoketest/unpack/lucene-8.10.0/benchmark/lucene-benchmark-8.10.0.jar"
>>>> >> : > > >>>>>> is missing "Implementation-Version: 8.10.0
>>>> >> : > > >>>>>> ecf5c747e6df418dd05a18af327c20051f0584d7" inside its
>>>> >> : > > >>>>>> META-INF/MANIFEST.MF (wrong git revision?)
>>>> >> : > > >>>>>>
>>>> >> : > > >>>>>> FWIW, I verified that the other Lucene JAR files have
>>>> this line in
>>>> >> : > > >>>>>> them, such as core:
>>>> >> : > > >>>>>>
>>>> >> : > > >>>>>> Manifest-Version: 1.0
>>>> >> : > > >>>>>> Ant-Version: Apache Ant 1.9.15
>>>> >> : > > >>>>>> Created-By: 1.8.0_265-b01 (AppleJDK-8.0.265.1.1)
>>>> >> : > > >>>>>> Extension-Name: org.apache.lucene
>>>> >> : > > >>>>>> Specification-Title: Lucene Search Engine: core
>>>> >> : > > >>>>>> Specification-Version: 8.10.0
>>>> >> : > > >>>>>> Specification-Vendor: The Apache Software Foundation
>>>> >> : > > >>>>>> Implementation-Title: org.apache.lucene
>>>> >> : > > >>>>>> Implementation-Version: 8.10.0
>>>> ecf5c747e6df418dd05a18af327c20051f0584d
>>>> >> : > > >>>>>>  7 - tjp - 2021-09-14 19:08:42
>>>> >> : > > >>>>>> Implementation-Vendor: The Apache Software Foundation
>>>> >> : > > >>>>>> X-Compile-Source-JDK: 8
>>>> >> : > > >>>>>> X-Compile-Target-JDK: 8
>>>> >> : > > >>>>>> Multi-Release: true
>>>> >> : > > >>>>>>
>>>> >> : > > >>>>>> On Tue, Sep 14, 2021 at 1:21 PM Ishan Chattopadhyaya
>>>> >> : > > >>>>>> <ichattopadhy...@gmail.com> wrote:
>>>> >> : > > >>>>>> >
>>>> >> : > > >>>>>> > All the best, this is the worst step.
>>>> >> : > > >>>>>> >
>>>> >> : > > >>>>>> > On Tue, 14 Sep, 2021, 10:47 pm Timothy Potter, <
>>>> thelabd...@gmail.com> wrote:
>>>> >> : > > >>>>>> >>
>>>> >> : > > >>>>>> >> Building RC1 now ... stay tuned.
>>>> >> : > > >>>>>> >>
>>>> >> : > > >>>>>> >> On Thu, Sep 9, 2021 at 2:30 PM Timothy Potter <
>>>> thelabd...@gmail.com> wrote:
>>>> >> : > > >>>>>> >> >
>>>> >> : > > >>>>>> >> > Thanks for the update Mike!
>>>> >> : > > >>>>>> >> >
>>>> >> : > > >>>>>> >> > I'm backporting SOLR-15620 right now and am
>>>> cooking up a quick PR for
>>>> >> : > > >>>>>> >> > SOLR-15621, which looks like an easy win for the
>>>> issue Cassandra
>>>> >> : > > >>>>>> >> > reported on Slack earlier today.
>>>> >> : > > >>>>>> >> >
>>>> >> : > > >>>>>> >> > Cheers,
>>>> >> : > > >>>>>> >> > Tim
>>>> >> : > > >>>>>> >> >
>>>> >> : > > >>>>>> >> > On Thu, Sep 9, 2021 at 11:32 AM Mike Drob <
>>>> md...@apache.org> wrote:
>>>> >> : > > >>>>>> >> > >
>>>> >> : > > >>>>>> >> > > Hi Tim, I'm still working on SOLR-15555, the
>>>> code and benchmarking
>>>> >> : > > >>>>>> >> > > both look pretty good, but I've got a few last
>>>> unit tests that I need
>>>> >> : > > >>>>>> >> > > to chase down. Hopefully taken care of by today
>>>> or tomorrow, I'll be
>>>> >> : > > >>>>>> >> > > sure to keep you updated though.
>>>> >> : > > >>>>>> >> > >
>>>> >> : > > >>>>>> >> > >
>>>> >> : > > >>>>>> >> > > On Thu, Sep 9, 2021 at 11:39 AM Timothy Potter <
>>>> thelabd...@gmail.com> wrote:
>>>> >> : > > >>>>>> >> > > >
>>>> >> : > > >>>>>> >> > > > I found
>>>> https://issues.apache.org/jira/browse/SOLR-15620 while testing
>>>> >> : > > >>>>>> >> > > > the schema designer. I haven't built the RC
>>>> yet, so going to see if I
>>>> >> : > > >>>>>> >> > > > can get this in today.
>>>> >> : > > >>>>>> >> > > >
>>>> >> : > > >>>>>> >> > > > On Tue, Sep 7, 2021 at 12:36 PM Timothy Potter
>>>> <thelabd...@apache.org> wrote:
>>>> >> : > > >>>>>> >> > > > >
>>>> >> : > > >>>>>> >> > > > > NOTICE:
>>>> >> : > > >>>>>> >> > > > >
>>>> >> : > > >>>>>> >> > > > > Branch branch_8_10 has been cut and versions
>>>> updated to 8.11 on stable branch.
>>>> >> : > > >>>>>> >> > > > >
>>>> >> : > > >>>>>> >> > > > > Please observe the normal rules:
>>>> >> : > > >>>>>> >> > > > >
>>>> >> : > > >>>>>> >> > > > > * No new features may be committed to the
>>>> branch.
>>>> >> : > > >>>>>> >> > > > >
>>>> >> : > > >>>>>> >> > > > > * Documentation patches, build patches and
>>>> serious bug fixes may be
>>>> >> : > > >>>>>> >> > > > >   committed to the branch. However, you
>>>> should submit all patches you
>>>> >> : > > >>>>>> >> > > > >   want to commit to Jira first to give
>>>> others the chance to review
>>>> >> : > > >>>>>> >> > > > >   and possibly vote against the patch. Keep
>>>> in mind that it is our
>>>> >> : > > >>>>>> >> > > > >   main intention to keep the branch as
>>>> stable as possible.
>>>> >> : > > >>>>>> >> > > > >
>>>> >> : > > >>>>>> >> > > > > * All patches that are intended for the
>>>> branch should first be committed
>>>> >> : > > >>>>>> >> > > > >   to the unstable branch, merged into the
>>>> stable branch, and then into
>>>> >> : > > >>>>>> >> > > > >   the current release branch.
>>>> >> : > > >>>>>> >> > > > >
>>>> >> : > > >>>>>> >> > > > > * Normal unstable and stable branch
>>>> development may continue as usual.
>>>> >> : > > >>>>>> >> > > > >   However, if you plan to commit a big
>>>> change to the unstable branch
>>>> >> : > > >>>>>> >> > > > >   while the branch feature freeze is in
>>>> effect, think twice: can't the
>>>> >> : > > >>>>>> >> > > > >   addition wait a couple more days? Merges
>>>> of bug fixes into the branch
>>>> >> : > > >>>>>> >> > > > >   may become more difficult.
>>>> >> : > > >>>>>> >> > > > >
>>>> >> : > > >>>>>> >> > > > > * Only Jira issues with Fix version 8.10 and
>>>> priority "Blocker" will delay
>>>> >> : > > >>>>>> >> > > > >   a release candidate build.
>>>> >> : > > >>>>>> >> > > > > ----
>>>> >> : > > >>>>>> >> > > >
>>>> >> : > > >>>>>> >> > > >
>>>> ---------------------------------------------------------------------
>>>> >> : > > >>>>>> >> > > > To unsubscribe, e-mail:
>>>> dev-unsubscr...@lucene.apache.org
>>>> >> : > > >>>>>> >> > > > For additional commands, e-mail:
>>>> dev-h...@lucene.apache.org
>>>> >> : > > >>>>>> >> > > >
>>>> >> : > > >>>>>> >> > >
>>>> >> : > > >>>>>> >> > >
>>>> >> : > > >>>>>> >> > >
>>>> >> : > > >>>>>> >>
>>>> >> : > > >>>>>> >>
>>>> ---------------------------------------------------------------------
>>>> >> : > > >>>>>> >> To unsubscribe, e-mail:
>>>> dev-unsubscr...@solr.apache.org
>>>> >> : > > >>>>>> >> For additional commands, e-mail:
>>>> dev-h...@solr.apache.org
>>>> >> : > > >>>>>> >>
>>>> >> : > > >>>>>>
>>>> >> : > > >>>>>>
>>>> >> : > > >>>>>>
>>>> >> : > >
>>>> >> : > >
>>>> >> : > >
>>>> >> : >
>>>> >> : >
>>>> >> : >
>>>> >> :
>>>> >> :
>>>> >> :
>>>> >> :
>>>> >>
>>>> >> -Hoss
>>>> >> http://www.lucidworks.com/
>>>> >>
>>>> >
>>>> >
>>>>
>>>>
>>>>
