Can we move discussion about the implementation to the JIRA issue or the PR?
I'm not a lawyer, so not playing with the GPL fire is the easiest way for me to avoid getting burned. The regex I have is pretty straightforward; I do not think it is a great cause for alarm.

On Fri, Sep 17, 2021 at 4:18 PM Ishan Chattopadhyaya <ichattopadhy...@gmail.com> wrote:

> Given that we don't ship the code or binaries that involve that python
> library, do we need to care about the license? I'm skeptical of hand-rolled
> regex and would rather favour either of the libraries Jan mentioned. Just
> my two cents.
>
> On Sat, 18 Sep, 2021, 12:02 am Mike Drob, <md...@mdrob.com> wrote:
>
>> The second library you linked, Jan, is AGPL. Thank you for continuing to
>> look for alternatives.
>>
>> I have some regular expressions cooked up locally that I think will let
>> us read the split lines going forward, and will put up the patch shortly.
>>
>> On Fri, Sep 17, 2021 at 7:45 AM Yuval Paz <yuval.p...@mail.huji.ac.il>
>> wrote:
>>
>>> Not sure if this is something that can be changed easily, but if the
>>> problem is that some parsers don't know how to handle a line wrap in
>>> the middle of the hash, why not move the hash completely to the new line
>>> (the specification allows a line break at any point in the value)?
>>>
>>> The commit hash + date comes out to exactly 71 bytes (including the
>>> space at the start), and it should be a constant size, and by the time the
>>> version reaches 48 bytes we will all probably be dead.
>>>
>>> On Fri, Sep 17, 2021, 2:47 PM Robert Muir <rcm...@gmail.com> wrote:
>>>
>>>> Sure, but that package is archived/read-only, GPLv3, with 3 watchers
>>>> and 1 star.
>>>>
>>>> On Fri, Sep 17, 2021 at 4:27 AM Jan Høydahl <jan....@cominvent.com>
>>>> wrote:
>>>> >
>>>> > Let's just follow the spec and move on.
>>>> >
>>>> > Just tested this python package, which has no problem parsing the problematic manifest https://pypi.org/project/jarmanifest/
>>>> >
>>>> > >>> manifest.getAttributes("/tmp/lucene-manifest.mf")
>>>> > [{'implementationversion': '9.0.0-SNAPSHOT de45b68c909815ce5ea7b6b9e1a2ce3375b6334d [snapshot build, details omitted]'}]
>>>> >
>>>> > Jan
>>>> >
>>>> > On 17 Sep 2021, at 09:32, Dawid Weiss <dawid.we...@gmail.com> wrote:
>>>> >
>>>> >
>>>> > We could do a few things to keep everyone happy -
>>>> >
>>>> > 1) keep the abbreviated hash in the Implementation-Version and use a separate manifest entry to store the full hash.
>>>> > 2) use a longer version for git show (abbrev=num) so that the chance of collisions in the future is minimized. It's still not a full hash but a long(er) forced prefix.
>>>> >
>>>> > D.
>>>> >
>>>> > On Fri, Sep 17, 2021 at 12:21 AM Chris Hostetter <hossman_luc...@fucit.org> wrote:
>>>> >>
>>>> >>
>>>> >> : I was referring to doing this with languages other than java.
>>>> >> :
>>>> >> : I'm also assuming that exceeding this limit is going to cause indirect
>>>> >> : hassles for users of lucene, e.g. breaking various security / supply
>>>> >> : chain tools. We know a lot of these are total crap but people in the
>>>> >> : corporate world have to suffer under them.
>>>> >>
>>>> >> Just to be clear -- our 'Implementation-Version:' has been exceeding the
>>>> >> 72 byte "single line" limit for a LOOOOONG time -- worrying about how
>>>> >> "security / supply chain" tools will handle parsing that line now seems
>>>> >> kind of silly...
>>>> >>
>>>> >> If tools can't handle a line wrap in the 8.10 jars, then they haven't
>>>> >> been able to handle the line wrap since we switched from svn to git (when
>>>> >> the Implementation-Version values switched from being based on the svn version#
>>>> >> to the git sha)
>>>> >>
>>>> >> The *ONLY* thing that's new here is where in the value the line wrap
>>>> >> happens (with 8.10.0 it happens in the middle of the SHA) and that our
>>>> >> smoketest tool isn't smart enough to parse the values properly.
>>>> >>
>>>> >> This is not even the first time we've had a conversation about the
>>>> >> smoke tester and Implementation-Version line wraps: LUCENE-7023.
>>>> >>
>>>> >>
>>>> >> : It's super-easy to use a short hash here and avoid problems.
>>>> >>
>>>> >>
>>>> >> There is however an actual and practical downside to switching our
>>>> >> implementation version to using a "short" SHA, and that's that we would
>>>> >> lose the ability to guarantee that the information in the
>>>> >> Implementation-Version uniquely identifies what commit a given jar was
>>>> >> built from. Multiple commits with the same short(ened) hash are possible
>>>> >> -- multiple commits with identical (full) hashes are not.
>>>> >>
>>>> >> Folks may think that using git tags is useful enough for figuring this
>>>> >> out from official releases, but being able to look at the jar metadata
>>>> >> from arbitrary builds off of arbitrary branches and sanity check where
>>>> >> exactly they come from has been very useful to me for 10+ years.
>>>> >>
>>>> >>
>>>> >> : On Thu, Sep 16, 2021 at 3:03 AM Dawid Weiss <dawid.we...@gmail.com> wrote:
>>>> >> : >
>>>> >> : > Jar command doesn't have it, true. But it's fairly trivial to do, even
>>>> >> : > with an inline snippet like this one?
>>>> >> : >
>>>> >> : > import java.io.IOException;
>>>> >> : > import java.nio.file.Paths;
>>>> >> : > import java.util.jar.JarFile;
>>>> >> : >
>>>> >> : > public class PrintManifest {
>>>> >> : >   public static void main(String[] jars) throws IOException {
>>>> >> : >     for (var jar : jars) {
>>>> >> : >       // try-with-resources closes each jar after its manifest is read
>>>> >> : >       try (var jarFile = new JarFile(Paths.get(jar).toFile())) {
>>>> >> : >         var attrs = jarFile.getManifest().getMainAttributes();
>>>> >> : >         System.out.println(jar + ": " + attrs.getValue("Implementation-Version"));
>>>> >> : >       }
>>>> >> : >     }
>>>> >> : >   }
>>>> >> : > }
>>>> >> : >
>>>> >> : > I have this in my lucene-core-9.0.0-SNAPSHOT.jar:
>>>> >> : >
>>>> >> : > Implementation-Version: 9.0.0-SNAPSHOT de45b68c909815ce5ea7b6b9e1a2ce337
>>>> >> : >  5b6334d [snapshot build, details omitted]
>>>> >> : >
>>>> >> : > and running:
>>>> >> : >
>>>> >> : > java PrintManifest.java lucene-core-9.0.0-SNAPSHOT.jar
>>>> >> : >
>>>> >> : > shows:
>>>> >> : >
>>>> >> : > lucene-core-9.0.0-SNAPSHOT.jar: 9.0.0-SNAPSHOT
>>>> >> : > de45b68c909815ce5ea7b6b9e1a2ce3375b6334d [snapshot build, details
>>>> >> : > omitted]
>>>> >> : >
>>>> >> : > This seems easier to me than trying to remember and keep the length of
>>>> >> : > that line shorter than an arbitrary limit.
>>>> >> : >
>>>> >> : > Dawid
>>>> >> : >
>>>> >> : >
>>>> >> : > On Wed, Sep 15, 2021 at 9:46 PM Robert Muir <rcm...@gmail.com> wrote:
>>>> >> : > >
>>>> >> : > > But it's irrelevant that it is "valid" when virtually no tools match it.
>>>> >> : > >
>>>> >> : > > In other words, I'd agree with you if the "jar" command had some
>>>> >> : > > ability to read these manifests and print stuff to stdout, e.g. if
>>>> >> : > > there was ANY interop at all here.
>>>> >> : > >
>>>> >> : > > But there isn't. So IMO it makes no sense to cause confusion and chaos
>>>> >> : > > by adding an unnecessarily long git commit hash.
>>>> >> : > >
>>>> >> : > > On Wed, Sep 15, 2021 at 3:26 PM Dawid Weiss <dawid.we...@gmail.com> wrote:
>>>> >> : > > >
>>>> >> : > > >
>>>> >> : > > > This is valid manifest line-breaking though... Can we read the manifest properly on the smoke tester side somehow (for example, run a Java process that reads and extracts the required attribute)? This way we wouldn't care about the implementation details of how the manifest wraps the lines (or escapes characters).
>>>> >> : > > >
>>>> >> : > > > D.
>>>> >> : > > >
>>>> >> : > > > On Wed, Sep 15, 2021 at 8:46 PM Mike Drob <md...@mdrob.com> wrote:
>>>> >> : > > >>
>>>> >> : > > >> The benchmark jar has the info we need… sort of. When I built it, it has:
>>>> >> : > > >>
>>>> >> : > > >> Implementation-Version: 8.10.0 75a5061d3715cc5d93c4cbe4f1fa62bf035eea1
>>>> >> : > > >>  1 - mdrob - 2021-09-15 11:40:36
>>>> >> : > > >>
>>>> >> : > > >>
>>>> >> : > > >> and it's looking for Implementation-Version: 8.10.0 75a5061d3715cc5d93c4cbe4f1fa62bf035eea11 on one line.
>>>> >> : > > >>
>>>> >> : > > >> Because 8.10 is a character longer than 8.9, we happen to wrap the last character of the git commit sha. From the manifest spec:
>>>> >> : > > >>
>>>> >> : > > >> No line may be longer than 72 bytes (not characters), in its UTF8-encoded form.
>>>> >> : > > >> If a value would make the initial line longer than this, it should be continued
>>>> >> : > > >> on extra lines (each starting with a single SPACE).
>>>> >> : > > >>
>>>> >> : > > >> And we were already teetering on the edge of that limit. We'll run into this problem again in a few years when we try to release version 10.0.0, so solving it now has practical benefits down the line.
>>>> >> : > > >>
>>>> >> : > > >> There are a few options that I can come up with -
>>>> >> : > > >> 1. Use the short-hash when we generate the jar
>>>> >> : > > >> 2. Use the short-hash when we check the contents in the smoke test
>>>> >> : > > >> 3. Do some line join magic in the smoke test.
>>>> >> : > > >>
>>>> >> : > > >> I'm leaning towards number 1 as I feel that would still be unique enough for our needs, but would like to hear from others as well.
>>>> >> : > > >>
>>>> >> : > > >> On Wed, Sep 15, 2021 at 9:46 AM Timothy potter <thelabd...@gmail.com> wrote:
>>>> >> : > > >>>
>>>> >> : > > >>> can someone also please look into that benchmark jar issue?
>>>> >> : > > >>>
>>>> >> : > > >>> Sent from my iPhone
>>>> >> : > > >>>
>>>> >> : > > >>> On Sep 15, 2021, at 9:44 AM, Nhat Nguyen <nhat.ngu...@elastic.co.invalid> wrote:
>>>> >> : > > >>>
>>>> >> : > > >>>
>>>> >> : > > >>> Thanks Mayya and Mike! I will backport it to the 8.10 branch.
>>>> >> : > > >>>
>>>> >> : > > >>> On Wed, Sep 15, 2021 at 10:12 AM Mike Drob <md...@mdrob.com> wrote:
>>>> >> : > > >>>>
>>>> >> : > > >>>> I think since Tim is out on vacation, it's probably not too late. That looks like a good fix to have; do we know how long the bug has been present?
>>>> >> : > > >>>>
>>>> >> : > > >>>> On Wed, Sep 15, 2021 at 7:56 AM Mayya Sharipova <mayya.sharip...@elastic.co.invalid> wrote:
>>>> >> : > > >>>>>
>>>> >> : > > >>>>> Hello everyone,
>>>> >> : > > >>>>> We have discovered and fixed a bug in Lucene sort optimization (LUCENE-10106) and would like to merge it to Lucene 8.10 if it is not too late.
>>>> >> : > > >>>>> I apologize for the inconvenience; the bug was discovered just yesterday.
>>>> >> : > > >>>>>
>>>> >> : > > >>>>> On Tue, Sep 14, 2021 at 9:26 PM Timothy Potter <thelabd...@apache.org> wrote:
>>>> >> : > > >>>>>>
>>>> >> : > > >>>>>> Ahem ... unfortunately there will not be an 8.10 RC this week. I'm
>>>> >> : > > >>>>>> headed out on vacation tomorrow, back at keys on Monday, Sept 20
>>>> >> : > > >>>>>> unless someone else wants to pick up the RM duties before then?
>>>> >> : > > >>>>>>
>>>> >> : > > >>>>>> After failing the test suite at various places and other weirdness
>>>> >> : > > >>>>>> like .asc files not getting created, I finally got to the smoke test
>>>> >> : > > >>>>>> part, which is now failing with:
>>>> >> : > > >>>>>>
>>>> >> : > > >>>>>> File "/Users/tjp/.lucene-releases/8.10.0/lucene-solr/dev-tools/scripts/smokeTestRelease.py",
>>>> >> : > > >>>>>> line 176, in checkJARMetaData
>>>> >> : > > >>>>>> raise RuntimeError('%s is missing "%s" inside its
>>>> >> : > > >>>>>> META-INF/MANIFEST.MF (wrong git revision?)' % \
>>>> >> : > > >>>>>> RuntimeError: JAR file
>>>> >> : > > >>>>>> "/Users/tjp/.lucene-releases/8.10.0/RC1/smoketest/unpack/lucene-8.10.0/benchmark/lucene-benchmark-8.10.0.jar"
>>>> >> : > > >>>>>> is missing "Implementation-Version: 8.10.0
>>>> >> : > > >>>>>> ecf5c747e6df418dd05a18af327c20051f0584d7" inside its
>>>> >> : > > >>>>>> META-INF/MANIFEST.MF (wrong git revision?)
>>>> >> : > > >>>>>>
>>>> >> : > > >>>>>> FWIW, I verified that the other Lucene JAR files have this line in
>>>> >> : > > >>>>>> them, such as core:
>>>> >> : > > >>>>>>
>>>> >> : > > >>>>>> Manifest-Version: 1.0
>>>> >> : > > >>>>>> Ant-Version: Apache Ant 1.9.15
>>>> >> : > > >>>>>> Created-By: 1.8.0_265-b01 (AppleJDK-8.0.265.1.1)
>>>> >> : > > >>>>>> Extension-Name: org.apache.lucene
>>>> >> : > > >>>>>> Specification-Title: Lucene Search Engine: core
>>>> >> : > > >>>>>> Specification-Version: 8.10.0
>>>> >> : > > >>>>>> Specification-Vendor: The Apache Software Foundation
>>>> >> : > > >>>>>> Implementation-Title: org.apache.lucene
>>>> >> : > > >>>>>> Implementation-Version: 8.10.0 ecf5c747e6df418dd05a18af327c20051f0584d
>>>> >> : > > >>>>>>  7 - tjp - 2021-09-14 19:08:42
>>>> >> : > > >>>>>> Implementation-Vendor: The Apache Software Foundation
>>>> >> : > > >>>>>> X-Compile-Source-JDK: 8
>>>> >> : > > >>>>>> X-Compile-Target-JDK: 8
>>>> >> : > > >>>>>> Multi-Release: true
>>>> >> : > > >>>>>>
>>>> >> : > > >>>>>> On Tue, Sep 14, 2021 at 1:21 PM Ishan Chattopadhyaya
>>>> >> : > > >>>>>> <ichattopadhy...@gmail.com> wrote:
>>>> >> : > > >>>>>> >
>>>> >> : > > >>>>>> > All the best, this is the worst step.
>>>> >> : > > >>>>>> >
>>>> >> : > > >>>>>> > On Tue, 14 Sep, 2021, 10:47 pm Timothy Potter, <thelabd...@gmail.com> wrote:
>>>> >> : > > >>>>>> >>
>>>> >> : > > >>>>>> >> Building RC1 now ... stay tuned.
>>>> >> : > > >>>>>> >>
>>>> >> : > > >>>>>> >> On Thu, Sep 9, 2021 at 2:30 PM Timothy Potter <thelabd...@gmail.com> wrote:
>>>> >> : > > >>>>>> >> >
>>>> >> : > > >>>>>> >> > Thanks for the update Mike!
>>>> >> : > > >>>>>> >> >
>>>> >> : > > >>>>>> >> > I'm backporting SOLR-15620 right now and am cooking up a quick PR for
>>>> >> : > > >>>>>> >> > SOLR-15621, which looks like an easy win for the issue Cassandra
>>>> >> : > > >>>>>> >> > reported on Slack earlier today.
>>>> >> : > > >>>>>> >> >
>>>> >> : > > >>>>>> >> > Cheers,
>>>> >> : > > >>>>>> >> > Tim
>>>> >> : > > >>>>>> >> >
>>>> >> : > > >>>>>> >> > On Thu, Sep 9, 2021 at 11:32 AM Mike Drob <md...@apache.org> wrote:
>>>> >> : > > >>>>>> >> > >
>>>> >> : > > >>>>>> >> > > Hi Tim, I'm still working on SOLR-15555, the code and benchmarking
>>>> >> : > > >>>>>> >> > > both look pretty good, but I've got a few last unit tests that I need
>>>> >> : > > >>>>>> >> > > to chase down. Hopefully taken care of by today or tomorrow, I'll be
>>>> >> : > > >>>>>> >> > > sure to keep you updated though.
>>>> >> : > > >>>>>> >> > >
>>>> >> : > > >>>>>> >> > >
>>>> >> : > > >>>>>> >> > > On Thu, Sep 9, 2021 at 11:39 AM Timothy Potter <thelabd...@gmail.com> wrote:
>>>> >> : > > >>>>>> >> > > >
>>>> >> : > > >>>>>> >> > > > I found https://issues.apache.org/jira/browse/SOLR-15620 while testing
>>>> >> : > > >>>>>> >> > > > the schema designer. I haven't built the RC yet, so going to see if I
>>>> >> : > > >>>>>> >> > > > can get this in today.
>>>> >> : > > >>>>>> >> > > >
>>>> >> : > > >>>>>> >> > > > On Tue, Sep 7, 2021 at 12:36 PM Timothy Potter <thelabd...@apache.org> wrote:
>>>> >> : > > >>>>>> >> > > > >
>>>> >> : > > >>>>>> >> > > > > NOTICE:
>>>> >> : > > >>>>>> >> > > > >
>>>> >> : > > >>>>>> >> > > > > Branch branch_8_10 has been cut and versions updated to 8.11 on stable branch.
>>>> >> : > > >>>>>> >> > > > >
>>>> >> : > > >>>>>> >> > > > > Please observe the normal rules:
>>>> >> : > > >>>>>> >> > > > >
>>>> >> : > > >>>>>> >> > > > > * No new features may be committed to the branch.
>>>> >> : > > >>>>>> >> > > > >
>>>> >> : > > >>>>>> >> > > > > * Documentation patches, build patches and serious bug fixes may be
>>>> >> : > > >>>>>> >> > > > >   committed to the branch. However, you should submit all patches you
>>>> >> : > > >>>>>> >> > > > >   want to commit to Jira first to give others the chance to review
>>>> >> : > > >>>>>> >> > > > >   and possibly vote against the patch. Keep in mind that it is our
>>>> >> : > > >>>>>> >> > > > >   main intention to keep the branch as stable as possible.
>>>> >> : > > >>>>>> >> > > > >
>>>> >> : > > >>>>>> >> > > > > * All patches that are intended for the branch should first be committed
>>>> >> : > > >>>>>> >> > > > >   to the unstable branch, merged into the stable branch, and then into
>>>> >> : > > >>>>>> >> > > > >   the current release branch.
>>>> >> : > > >>>>>> >> > > > >
>>>> >> : > > >>>>>> >> > > > > * Normal unstable and stable branch development may continue as usual.
>>>> >> : > > >>>>>> >> > > > >   However, if you plan to commit a big change to the unstable branch
>>>> >> : > > >>>>>> >> > > > >   while the branch feature freeze is in effect, think twice: can't the
>>>> >> : > > >>>>>> >> > > > >   addition wait a couple more days? Merges of bug fixes into the branch
>>>> >> : > > >>>>>> >> > > > >   may become more difficult.
>>>> >> : > > >>>>>> >> > > > >
>>>> >> : > > >>>>>> >> > > > > * Only Jira issues with Fix version 8.10 and priority "Blocker" will delay
>>>> >> : > > >>>>>> >> > > > >   a release candidate build.
>>>> >> : > > >>>>>> >> > > > > ----
>>>> >> : > > >>>>>> >> > > >
>>>> >> : > > >>>>>> >> > > > ---------------------------------------------------------------------
>>>> >> : > > >>>>>> >> > > > To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
>>>> >> : > > >>>>>> >> > > > For additional commands, e-mail: dev-h...@lucene.apache.org
>>>> >>
>>>> >> -Hoss
>>>> >> http://www.lucidworks.com/
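
For anyone reading this thread later: the "line join magic" option (number 3 in the list quoted above) can be sketched in a few lines of Python. This is only an illustration of the spec rule quoted earlier (a continuation line starts with a single SPACE), not the regex from the actual patch. The function names unfold_manifest and main_attributes are made up for this example, and the sample text is the core manifest excerpt quoted above.

```python
import re


def unfold_manifest(text: str) -> str:
    """Join continuation lines (hypothetical helper, not from the patch).

    Per the manifest spec quoted in the thread, a long value is continued
    on extra lines that each start with a single space, so removing every
    "newline + one space" pair restores the original value.
    """
    normalized = text.replace("\r\n", "\n").replace("\r", "\n")
    return re.sub(r"\n ", "", normalized)


def main_attributes(text: str) -> dict:
    """Return the main-section attributes as a name -> value dict."""
    attrs = {}
    for line in unfold_manifest(text).split("\n"):
        if not line:
            break  # a blank line ends the main section
        name, sep, value = line.partition(": ")
        if sep:
            attrs[name] = value
    return attrs


# Excerpt of the lucene-core manifest quoted in this thread.
sample = (
    "Manifest-Version: 1.0\r\n"
    "Implementation-Version: 8.10.0 ecf5c747e6df418dd05a18af327c20051f0584d\r\n"
    " 7 - tjp - 2021-09-14 19:08:42\r\n"
    "Implementation-Vendor: The Apache Software Foundation\r\n"
    "\r\n"
)

version = main_attributes(sample)["Implementation-Version"]
print(version)
```

With the value unfolded first, a check for "8.10.0 ecf5c747e6df418dd05a18af327c20051f0584d7" no longer depends on where the 72-byte wrap happens to fall.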