On Tue, Jan 4, 2022 at 9:41 PM, Julian Hyde <[email protected]> wrote:
> No, I don’t think it matters in this case. But consistent use of ROOT is
> useful, because someone in future might be tracking down a bug, and if they
> see ENGLISH it’s one more hypothesis they’d have to discount.
>

That makes sense. Just let me know if you see the need for other changes to
that PR. I may look into some of the other dependencies you mentioned as
being rarely used, as I find the time. Any thoughts on this one:

> Re Janino, is there any reason for not using the compiler implementation
> coming with the JDK?

Thanks,

--Gunnar

>
> On Jan 4, 2022, at 12:31 PM, Gunnar Morling
> <[email protected]> wrote:
> >
> > On Tue, Jan 4, 2022 at 8:51 PM, Julian Hyde <
> > [email protected]> wrote:
> >
> >> If a method needs a locale, always pass Locale.ROOT.
> >>
> >
> > Ok, I've changed it accordingly. Do you think it actually matters for the
> > case at hand?
> >
> >> On Jan 4, 2022, at 9:13 AM, Gunnar Morling
> >> <[email protected]> wrote:
> >>>
> >>> On Tue, Jan 4, 2022 at 9:39 AM, Julian Hyde <
> >>> [email protected]> wrote:
> >>>
> >>>> Please log a Jira case for the commons-lang3 change.
> >>>
> >>>
> >>> Logged https://issues.apache.org/jira/browse/CALCITE-4975.
> >>>
> >>>
> >>>> It looks good. One or two places I’d create a function rather than having
> >>>> a blob of code inline.
> >>>>
> >>>
> >>> Sure; just let me know where exactly.
> >>>
> >>>> Your use of default locale in the CSV adapter looks wrong. Calcite is a
> >>>> server, so never uses default locale or time zone. In fact we use
> >>>> forbiddenApis to check, so we should add a few methods to its configuration.
> >>>
> >>>
> >>> Yeah, I had been pondering about this; I don't think it matters, the locale
> >>> should not make any difference for these specific formats, as they don't
> >>> contain any locale-specific patterns (unlike, say, "MMM"). I've changed it
> >>> to Locale.ENGLISH now, just in case.
> >>> In fact, I wanted to use the
> >>> ofPattern() method without the Locale parameter, but this failed the
> >>> forbiddenApis check as well :)
> >>>
> >>>> Julian
> >>>
> >>>
> >>> Best,
> >>>
> >>> --Gunnar
> >>>
> >>>
> >>>>
> >>>>
> >>>>> On Jan 3, 2022, at 12:30 PM, Gunnar Morling
> >>>>> <[email protected]> wrote:
> >>>>>
> >>>>> Hi,
> >>>>>
> >>>>> Thanks a lot for this, I think trimming down the dependencies of Calcite
> >>>>> will be of great help for its adoption.
> >>>>>
> >>>>>> So, the easiest way to reduce dependencies would be to make certain
> >>>>>> classes of SQL functions optional (i.e. move them out of core).
> >>>>>
> >>>>> That sounds like a good idea.
> >>>>>
> >>>>>> commons-lang3, commons-codec, commons-io are probably only used in one
> >>>>>> or two places each;
> >>>>>
> >>>>> To make some progress there, I've created PR
> >>>>> https://github.com/apache/calcite/pull/2672 which removes the dependency to
> >>>>> commons-lang3 from the entire code base. Any feedback on that PR would
> >>>>> be appreciated (I still need to log an issue, but wanted to share quickly
> >>>>> what I had). I can try and take a look at the other ones, if there's
> >>>>> interest in this.
> >>>>>
> >>>>> Re Janino, is there any reason for not using the compiler implementation
> >>>>> coming with the JDK? Alternatively, one could also consider to generate
> >>>>> byte code directly using ASM, which wouldn't be beneficial dependency-wise,
> >>>>> but it may improve the performance of this generation step (I still lack
> >>>>> insight why this is done in the first place).
> >>>>>
> >>>>> Thanks,
> >>>>>
> >>>>> --Gunnar
> >>>>>
> >>>>>> On Fri, Dec 31, 2021 at 12:56 AM, Julian Hyde <
> >>>>>> [email protected]> wrote:
> >>>>>>
> >>>>>> Regarding dependencies.
> >>>>>> Here are the runtime dependencies from
> >>>>>> core/build.gradle.kts (ignoring test and annotation libraries):
> >>>>>>
> >>>>>> * api("com.esri.geometry:esri-geometry-api")
> >>>>>> * api("com.fasterxml.jackson.core:jackson-annotations")
> >>>>>> * api("com.google.guava:guava")
> >>>>>> * api("org.apache.calcite.avatica:avatica-core")
> >>>>>> * api("org.slf4j:slf4j-api")
> >>>>>> * implementation("com.fasterxml.jackson.core:jackson-core")
> >>>>>> * implementation("com.fasterxml.jackson.core:jackson-databind")
> >>>>>> * implementation("com.fasterxml.jackson.dataformat:jackson-dataformat-yaml")
> >>>>>> * implementation("com.google.uzaygezen:uzaygezen-core")
> >>>>>> * implementation("com.jayway.jsonpath:json-path")
> >>>>>> * implementation("com.yahoo.datasketches:sketches-core")
> >>>>>> * implementation("commons-codec:commons-codec")
> >>>>>> * implementation("net.hydromatic:aggdesigner-algorithm")
> >>>>>> * implementation("org.apache.commons:commons-dbcp2")
> >>>>>> * implementation("org.apache.commons:commons-lang3")
> >>>>>> * implementation("commons-io:commons-io")
> >>>>>> * implementation("org.codehaus.janino:commons-compiler")
> >>>>>> * implementation("org.codehaus.janino:janino")
> >>>>>>
> >>>>>> A few libraries are used only for a narrow range of functionality:
> >>>>>> * esri-geometry and uzaygezen-core are used by geospatial functions;
> >>>>>> * sketches-core is used by the HLL aggregate functions;
> >>>>>> * json-path is used by some JSON functions;
> >>>>>> * jackson-core, jackson-databind, jackson-dataformat-yaml are used to
> >>>>>>   load models, and to serialize RelNodes to and from JSON;
> >>>>>> * commons-lang3, commons-codec, commons-io are probably only used in one
> >>>>>>   or two places each;
> >>>>>> * aggdesigner-algorithm is used for recommending materialized views.
> >>>>>>
> >>>>>> So, the easiest way to reduce dependencies would be to make certain
> >>>>>> classes of SQL functions optional (i.e.
> >>>>>> move them out of core).
> >>>>>>
> >>>>>> Julian
> >>>>>>
> >>>>>>
> >>>>>>
> >>>>>>> On Dec 29, 2021, at 1:30 PM, Jacques Nadeau <[email protected]> wrote:
> >>>>>>>
> >>>>>>> WRT SBOM (Julian): My general experience is that most large orgs use
> >>>>>>> scanners now (either open or closed) and they will scan whether you have a
> >>>>>>> bill of materials or not. I wouldn't worry about adding something
> >>>>>>> additional.
> >>>>>>>
> >>>>>>> WRT too many dependencies (Gunnar): I completely agree with the general
> >>>>>>> feeling of too many (and with Guava, jackson less so). I think the core
> >>>>>>> challenge (no pun intended) is that calcite-core is really a lot of
> >>>>>>> different components. For example, I have frequently wished that parser,
> >>>>>>> planner and enumerable were separate modules. And if they were, I'd guess
> >>>>>>> that each would have a narrower dependency range. I've also wished many
> >>>>>>> times that runtime compilation was an optional addon as opposed to
> >>>>>>> required/coupled in the core...
> >>>>>>>
> >>>>>>> When I've thought about how to dissect in the past, I think the big
> >>>>>>> challenge would be tests, where things are sometimes mixed together.
> >>>>>>> Breaking change possibilities could be at least somewhat mitigated by
> >>>>>>> moving classes but not packages.
> >>>>>>>
> >>>>>>> On Wed, Dec 29, 2021 at 1:51 AM Gunnar Morling
> >>>>>>> <[email protected]> wrote:
> >>>>>>>
> >>>>>>>> Hi,
> >>>>>>>>
> >>>>>>>> In a way, Calcite's build configuration as well as the published POM could
> >>>>>>>> be considered as such an SBOM? In particular when looking at the latter
> >>>>>>>> through services like mvnrepository [1], you get quite a good view on the
> >>>>>>>> dependency versions, licenses, any potential CVEs, etc. I think this should
> >>>>>>>> satisfy most user needs around this?
> >>>>>>>> Or are you referring to the notion of
> >>>>>>>> Maven BOM POMs specifically [2], i.e. the notion of publishing a POM with
> >>>>>>>> all the Calcite component versions which people can then use with Maven's
> >>>>>>>> import scope (there should be something comparable for Gradle)? If so, that
> >>>>>>>> could be useful for users working with multiple Calcite components, though
> >>>>>>>> I think the usability improvement provided by such a BOM POM wouldn't be
> >>>>>>>> huge.
> >>>>>>>>
> >>>>>>>> I wanted to bring up a related matter though. Coming to Calcite as a user
> >>>>>>>> just recently (loving the possibilities it provides!), I was surprised by
> >>>>>>>> the large number of dependencies of the project. It looks like 1.29
> >>>>>>>> improves that a little bit (no more kotlin-stdlib, no more transitive
> >>>>>>>> dependency to log4j 1.x), but the transitive hull of all dependencies of
> >>>>>>>> calcite-core still is quite big. I lack insight about what the different
> >>>>>>>> dependencies are used for; but as an application developer, Guava for
> >>>>>>>> instance is a dependency which I'd prefer to not get pushed onto the
> >>>>>>>> classpath transitively. Jackson is another heavy one; depending on how it's
> >>>>>>>> used, perhaps this could be pushed into some separate module which users
> >>>>>>>> could optionally pull in? That'd help to avoid having it around when users
> >>>>>>>> work with other JSON libs themselves and don't require JSON support in
> >>>>>>>> Calcite.
> >>>>>>>>
> >>>>>>>> From a supply chain perspective, the fewer transitive dependencies a library
> >>>>>>>> like Calcite introduces to my project, the better IMHO.
> >>>>>>>> Less potential for
> >>>>>>>> version conflicts with my own (or other transitive) dependencies, and also
> >>>>>>>> less potential for introducing CVEs to the dependency graph, as e.g. in the
> >>>>>>>> case of the Guava version currently used by Calcite; I suppose it does not
> >>>>>>>> impact the usage in Calcite, but these things tend to be tricky to reason
> >>>>>>>> about, and typical CVE reporting tooling will now create a warning for a
> >>>>>>>> project using Calcite, no matter whether that specific issue actually is a
> >>>>>>>> problem or not.
> >>>>>>>>
> >>>>>>>> Best,
> >>>>>>>>
> >>>>>>>> --Gunnar
> >>>>>>>>
> >>>>>>>> [1] https://mvnrepository.com/artifact/org.apache.calcite/calcite-core/1.29.0
> >>>>>>>> [2] https://maven.apache.org/guides/introduction/introduction-to-dependency-mechanism.html#bill-of-materials-bom-poms
> >>>>>>>>
> >>>>>>>>
> >>>>>>>> On Wed, Dec 29, 2021 at 2:27 AM, Julian Hyde <
> >>>>>>>> [email protected]> wrote:
> >>>>>>>>
> >>>>>>>>> In the wake of the log4j CVEs [1], people are asking how to improve the
> >>>>>>>>> security of open source projects, and one idea is to provide an SBOM
> >>>>>>>>> (Software Bill of Materials) [2] along with each release.
> >>>>>>>>>
> >>>>>>>>> I had not heard of SBOM until a couple of days ago. Is anyone on this list
> >>>>>>>>> familiar with SBOMs and their use? Should Calcite be providing an SBOM? Are
> >>>>>>>>> people aware of SBOM initiatives in other projects? What, in your opinion,
> >>>>>>>>> is the priority of this issue?
> >>>>>>>>>
> >>>>>>>>> Julian
> >>>>>>>>>
> >>>>>>>>> [1] https://thehackernews.com/2021/12/second-log4j-vulnerability-cve-2021.html
> >>>>>>>>>
> >>>>>>>>> [2] https://en.wikipedia.org/wiki/Software_bill_of_materials
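
A footnote on the Locale.ROOT exchange near the top of the thread. The sketch below is not code from the PR or from Calcite; it only illustrates the point made there, that a purely numeric pattern formats the same under any locale while a textual pattern does not. The class name LocaleRootSketch and the format() helper are made up for this illustration; the only JDK call involved is DateTimeFormatter.ofPattern(String, Locale).

```java
import java.time.LocalDate;
import java.time.format.DateTimeFormatter;
import java.util.Locale;

// Hypothetical illustration; not taken from Calcite or from the PR.
public class LocaleRootSketch {

  // Always passes an explicit locale, which is what a forbiddenApis-style
  // check on the one-argument ofPattern(String) overload pushes you towards.
  public static String format(String pattern, Locale locale, LocalDate date) {
    return DateTimeFormatter.ofPattern(pattern, locale).format(date);
  }

  public static void main(String[] args) {
    LocalDate date = LocalDate.of(2022, 3, 4);

    // Purely numeric pattern: the locale does not change the result.
    for (Locale locale : new Locale[] {Locale.ROOT, Locale.ENGLISH, Locale.GERMAN}) {
      System.out.println(locale.toLanguageTag() + " -> "
          + format("yyyy-MM-dd", locale, date));
    }

    // Textual pattern: the month name depends on the locale.
    System.out.println("en -> " + format("MMMM", Locale.ENGLISH, date));
    System.out.println("de -> " + format("MMMM", Locale.GERMAN, date));
  }
}
```

With a pattern like "yyyy-MM-dd" the choice between ROOT and ENGLISH has no visible effect, so using ROOT everywhere costs nothing and leaves one less thing to rule out when reading the code later.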

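
For readers unfamiliar with the "compiler implementation coming with the JDK" mentioned in the Janino question: that is the javax.tools API. The sketch below shows, under stated assumptions, what compiling a source string in memory and loading the result looks like with it. It says nothing about how Calcite uses Janino or whether javax.tools would be a suitable replacement. JdkCompilerSketch, Source, Bytecode, GeneratedAdder and demo() are names invented for this example, and it needs a full JDK at runtime, since ToolProvider.getSystemJavaCompiler() returns null when no compiler is available.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.OutputStream;
import java.io.UncheckedIOException;
import java.net.URI;
import java.util.Collections;
import java.util.HashMap;
import java.util.Map;
import javax.tools.FileObject;
import javax.tools.ForwardingJavaFileManager;
import javax.tools.JavaCompiler;
import javax.tools.JavaFileManager;
import javax.tools.JavaFileObject;
import javax.tools.SimpleJavaFileObject;
import javax.tools.StandardJavaFileManager;
import javax.tools.ToolProvider;

// Hypothetical illustration of javax.tools; not how Calcite compiles code.
public class JdkCompilerSketch {

  // Source code held in a String instead of a file on disk.
  static final class Source extends SimpleJavaFileObject {
    private final String code;
    Source(String className, String code) {
      super(URI.create("string:///" + className.replace('.', '/') + ".java"),
          JavaFileObject.Kind.SOURCE);
      this.code = code;
    }
    @Override public CharSequence getCharContent(boolean ignoreEncodingErrors) {
      return code;
    }
  }

  // Compiled bytes captured in memory instead of a .class file.
  static final class Bytecode extends SimpleJavaFileObject {
    final ByteArrayOutputStream bytes = new ByteArrayOutputStream();
    Bytecode(String className) {
      super(URI.create("mem:///" + className.replace('.', '/') + ".class"),
          JavaFileObject.Kind.CLASS);
    }
    @Override public OutputStream openOutputStream() {
      return bytes;
    }
  }

  // Compiles one top-level class from a string and loads it.
  public static Class<?> compile(String className, String source) {
    JavaCompiler compiler = ToolProvider.getSystemJavaCompiler();
    if (compiler == null) {
      throw new IllegalStateException("No system Java compiler (running on a JRE?)");
    }
    final Map<String, Bytecode> classes = new HashMap<>();
    StandardJavaFileManager standard = compiler.getStandardFileManager(null, null, null);
    try (JavaFileManager manager =
        new ForwardingJavaFileManager<StandardJavaFileManager>(standard) {
          @Override public JavaFileObject getJavaFileForOutput(
              JavaFileManager.Location location, String name,
              JavaFileObject.Kind kind, FileObject sibling) {
            Bytecode out = new Bytecode(name);
            classes.put(name, out);
            return out;
          }
        }) {
      Boolean ok = compiler.getTask(null, manager, null, null, null,
          Collections.singletonList(new Source(className, source))).call();
      if (!ok) {
        throw new IllegalStateException("Compilation failed for " + className);
      }
    } catch (IOException e) {
      throw new UncheckedIOException(e);
    }
    // Class loader that defines classes from the captured bytes.
    ClassLoader loader = new ClassLoader(JdkCompilerSketch.class.getClassLoader()) {
      @Override protected Class<?> findClass(String name) throws ClassNotFoundException {
        Bytecode b = classes.get(name);
        if (b == null) {
          throw new ClassNotFoundException(name);
        }
        byte[] raw = b.bytes.toByteArray();
        return defineClass(name, raw, 0, raw.length);
      }
    };
    try {
      return loader.loadClass(className);
    } catch (ClassNotFoundException e) {
      throw new IllegalStateException(e);
    }
  }

  // Compiles a tiny class and calls its static add(2, 3) via reflection.
  public static int demo() {
    String src = "public class GeneratedAdder {"
        + " public static int add(int a, int b) { return a + b; } }";
    try {
      return (Integer) compile("GeneratedAdder", src)
          .getMethod("add", int.class, int.class).invoke(null, 2, 3);
    } catch (ReflectiveOperationException e) {
      throw new IllegalStateException(e);
    }
  }

  public static void main(String[] args) {
    System.out.println(demo());
  }
}
```

The in-memory file manager and class loader are the parts you have to write yourself with javax.tools, which is one practical difference to keep in mind when weighing it against an embedded compiler library.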