On Tue, Jan 4, 2022 at 09:39, Julian Hyde <[email protected]> wrote:
> Please log a Jira case for the commons-lang3 change.

Logged https://issues.apache.org/jira/browse/CALCITE-4975.

> It looks good. One or two places I’d create a function rather than having
> a blob of code inline.

Sure; just let me know where exactly.

> Your use of default locale in the CSV adapter looks wrong. Calcite is a
> server, so never uses default locale or time zone. In fact we use
> forbiddenApis to check, so we should add a few methods to its configuration.

Yeah, I had been pondering this. I don't think it matters: the locale should
not make any difference for these specific formats, as they don't contain any
locale-specific patterns (unlike, say, "MMM"). I've changed it to
Locale.ENGLISH now, just in case. In fact, I wanted to use the ofPattern()
method without the Locale parameter, but this failed the forbiddenApis check
as well :)

> Julian

Best,

--Gunnar

> > On Jan 3, 2022, at 12:30 PM, Gunnar Morling <[email protected]> wrote:
> >
> > Hi,
> >
> > Thanks a lot for this, I think trimming down the dependencies of Calcite
> > will be of great help for its adoption.
> >
> >> So, the easiest way to reduce dependencies would be to make certain
> >> classes of SQL functions optional (i.e. move them out of core).
> >
> > That sounds like a good idea.
> >
> >> commons-lang3, commons-codec, commons-io are probably only used in one or
> >> two places each;
> >
> > To make some progress there, I've created PR
> > https://github.com/apache/calcite/pull/2672 which removes the dependency on
> > commons-lang3 from the entire code base. Any feedback on that PR would
> > be appreciated (I still need to log an issue, but wanted to share quickly
> > what I had). I can try and take a look at the other ones, if there's
> > interest in this.
> >
> > Re Janino, is there any reason for not using the compiler implementation
> > coming with the JDK?
> > Alternatively, one could also consider generating byte code directly
> > using ASM, which wouldn't be beneficial dependency-wise, but it may
> > improve the performance of this generation step (I still lack insight
> > why this is done in the first place).
> >
> > Thanks,
> >
> > --Gunnar
> >
> >> On Fri, Dec 31, 2021 at 00:56, Julian Hyde <[email protected]> wrote:
> >>
> >> Regarding dependencies. Here are the runtime dependencies from
> >> core/build.gradle.kts (ignoring test and annotation libraries):
> >>
> >> * api("com.esri.geometry:esri-geometry-api")
> >> * api("com.fasterxml.jackson.core:jackson-annotations")
> >> * api("com.google.guava:guava")
> >> * api("org.apache.calcite.avatica:avatica-core")
> >> * api("org.slf4j:slf4j-api")
> >> * implementation("com.fasterxml.jackson.core:jackson-core")
> >> * implementation("com.fasterxml.jackson.core:jackson-databind")
> >> * implementation("com.fasterxml.jackson.dataformat:jackson-dataformat-yaml")
> >> * implementation("com.google.uzaygezen:uzaygezen-core")
> >> * implementation("com.jayway.jsonpath:json-path")
> >> * implementation("com.yahoo.datasketches:sketches-core")
> >> * implementation("commons-codec:commons-codec")
> >> * implementation("net.hydromatic:aggdesigner-algorithm")
> >> * implementation("org.apache.commons:commons-dbcp2")
> >> * implementation("org.apache.commons:commons-lang3")
> >> * implementation("commons-io:commons-io")
> >> * implementation("org.codehaus.janino:commons-compiler")
> >> * implementation("org.codehaus.janino:janino")
> >>
> >> A few libraries are used only for a narrow range of functionality:
> >> * esri-geometry and uzaygezen-core are used by geospatial functions;
> >> * sketches-core is used by the HLL aggregate functions;
> >> * json-path is used by some JSON functions;
> >> * jackson-core, jackson-databind, jackson-dataformat-yaml are used to
> >>   load models, and to serialize RelNodes to and from JSON;
> >> * commons-lang3, commons-codec,
> >>   commons-io are probably only used in one or two places each;
> >> * aggdesigner-algorithm is used for recommending materialized views.
> >>
> >> So, the easiest way to reduce dependencies would be to make certain
> >> classes of SQL functions optional (i.e. move them out of core).
> >>
> >> Julian
> >>
> >>> On Dec 29, 2021, at 1:30 PM, Jacques Nadeau <[email protected]> wrote:
> >>>
> >>> WRT SBOM (Julian): My general experience is that most large orgs use
> >>> scanners now (either open or closed) and they will scan whether you have
> >>> a bill of materials or not. I wouldn't worry about adding something
> >>> additional.
> >>>
> >>> WRT too many dependencies (Gunnar): I completely agree with the general
> >>> feeling of too many (and with Guava, jackson less so). I think the core
> >>> challenge (no pun intended) is that calcite-core is really a lot of
> >>> different components. For example, I have frequently wished that parser,
> >>> planner and enumerable were separate modules. And if they were, I'd guess
> >>> that each would have a narrower dependency range. I've also wished many
> >>> times that runtime compilation was an optional addon as opposed to
> >>> required/coupled in the core...
> >>>
> >>> When I've thought about how to dissect in the past, I think the big
> >>> challenge would be tests, where things are sometimes mixed together.
> >>> Breaking change possibilities could be at least somewhat mitigated by
> >>> moving classes but not packages.
> >>>
> >>> On Wed, Dec 29, 2021 at 1:51 AM Gunnar Morling <[email protected]> wrote:
> >>>
> >>>> Hi,
> >>>>
> >>>> In a way, Calcite's build configuration as well as the published POM
> >>>> could be considered as such an SBOM? In particular when looking at the
> >>>> latter through services like mvnrepository [1], you get quite a good
> >>>> view on the dependency versions, licenses, any potential CVEs, etc.
> >>>> I think this should satisfy most user needs around this? Or are you
> >>>> referring to the notion of Maven BOM POMs specifically [2], i.e. the
> >>>> notion of publishing a POM with all the Calcite component versions
> >>>> which people can then use with Maven's import scope (there should be
> >>>> something comparable for Gradle)? If so, that could be useful for users
> >>>> working with multiple Calcite components, though I think the usability
> >>>> improvement provided by such a BOM POM wouldn't be huge.
> >>>>
> >>>> I wanted to bring up a related matter though. Coming to Calcite as a
> >>>> user just recently (loving the possibilities it provides!), I was
> >>>> surprised by the large number of dependencies of the project. It looks
> >>>> like 1.29 improves that a little bit (no more kotlin-stdlib, no more
> >>>> transitive dependency on log4j 1.x), but the transitive hull of all
> >>>> dependencies of calcite-core still is quite big. I lack insight about
> >>>> what the different dependencies are used for; but as an application
> >>>> developer, Guava for instance is a dependency which I'd prefer to not
> >>>> get pushed onto the classpath transitively. Jackson is another heavy
> >>>> one; depending on how it's used, perhaps this could be pushed into some
> >>>> separate module which users could optionally pull in? That'd help to
> >>>> avoid having it around when users work with other JSON libs themselves
> >>>> and don't require JSON support in Calcite.
> >>>>
> >>>> From a supply chain perspective, the fewer transitive dependencies a
> >>>> library like Calcite introduces to my project, the better IMHO. Less
> >>>> potential for version conflicts with my own (or other transitive)
> >>>> dependencies, and also less potential for introducing CVEs to the
> >>>> dependency graph, as e.g.
> >>>> in the case of the Guava version currently used by Calcite; I suppose
> >>>> it does not impact the usage in Calcite, but these things tend to be
> >>>> tricky to reason about, and typical CVE reporting tooling will now
> >>>> create a warning for a project using Calcite, no matter whether that
> >>>> specific issue actually is a problem or not.
> >>>>
> >>>> Best,
> >>>>
> >>>> --Gunnar
> >>>>
> >>>> [1] https://mvnrepository.com/artifact/org.apache.calcite/calcite-core/1.29.0
> >>>> [2] https://maven.apache.org/guides/introduction/introduction-to-dependency-mechanism.html#bill-of-materials-bom-poms
> >>>>
> >>>> On Wed, Dec 29, 2021 at 02:27, Julian Hyde <[email protected]> wrote:
> >>>>
> >>>>> In the wake of the log4j CVEs [1], people are asking how to improve
> >>>>> the security of open source projects, and one idea is to provide an
> >>>>> SBOM (Software Bill of Materials) [2] along with each release.
> >>>>>
> >>>>> I had not heard of SBOM until a couple of days ago. Is anyone on this
> >>>>> list familiar with SBOMs and their use? Should Calcite be providing an
> >>>>> SBOM? Are people aware of SBOM initiatives in other projects? What, in
> >>>>> your opinion, is the priority of this issue?
> >>>>>
> >>>>> Julian
> >>>>>
> >>>>> [1] https://thehackernews.com/2021/12/second-log4j-vulnerability-cve-2021.html
> >>>>> [2] https://en.wikipedia.org/wiki/Software_bill_of_materials
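
A note on the locale exchange at the top of the thread. The following is a minimal Java sketch, assuming nothing beyond java.time from the JDK, of why a purely numeric pattern should format identically under different locales while a textual one such as "MMM" does not. The class LocaleFormatSketch, its format helper, and the sample date are made up for illustration; none of this is taken from the Calcite CSV adapter.

```java
import java.time.LocalDate;
import java.time.format.DateTimeFormatter;
import java.util.Locale;

public class LocaleFormatSketch {
  /** Formats a date with an explicitly supplied locale, never the JVM default. */
  public static String format(LocalDate date, String pattern, Locale locale) {
    return DateTimeFormatter.ofPattern(pattern, locale).format(date);
  }

  public static void main(String[] args) {
    // Hypothetical sample date; March is chosen because its abbreviated
    // month name differs between English and German.
    LocalDate date = LocalDate.of(2022, 3, 4);

    // Numeric-only pattern: the locale has no visible effect.
    System.out.println(format(date, "yyyy-MM-dd", Locale.ENGLISH)); // 2022-03-04
    System.out.println(format(date, "yyyy-MM-dd", Locale.GERMAN));  // 2022-03-04

    // "MMM" is rendered as text, so the output depends on the locale.
    System.out.println(format(date, "dd MMM yyyy", Locale.ENGLISH)); // 04 Mar 2022
    System.out.println(format(date, "dd MMM yyyy", Locale.GERMAN));
  }
}
```

Passing the locale explicitly, as in the two-argument ofPattern() call above, keeps the result independent of the machine's default locale, which appears to be the property the forbiddenApis check is meant to enforce.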

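
For context on the Janino question in the thread: the compiler that ships with the JDK is reachable through the standard javax.tools API. The sketch below shows what compiling a source string and loading the result could look like with that API. It is not how Calcite generates code, and the names JdkCompilerSketch, StringSource, and compileAndInstantiate are hypothetical. It assumes a full JDK at runtime, since ToolProvider.getSystemJavaCompiler() returns null when no compiler is available.

```java
import java.net.URI;
import java.net.URL;
import java.net.URLClassLoader;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Arrays;
import java.util.Collections;
import javax.tools.JavaCompiler;
import javax.tools.SimpleJavaFileObject;
import javax.tools.ToolProvider;

public class JdkCompilerSketch {
  /** Source file held in memory and handed to the compiler. */
  static final class StringSource extends SimpleJavaFileObject {
    private final String code;

    StringSource(String className, String code) {
      super(URI.create("string:///" + className.replace('.', '/')
          + Kind.SOURCE.extension), Kind.SOURCE);
      this.code = code;
    }

    @Override public CharSequence getCharContent(boolean ignoreEncodingErrors) {
      return code;
    }
  }

  /** Compiles one class from a string, loads it, and calls its no-arg constructor. */
  public static Object compileAndInstantiate(String className, String code)
      throws Exception {
    JavaCompiler compiler = ToolProvider.getSystemJavaCompiler();
    if (compiler == null) {
      throw new IllegalStateException("No system Java compiler; is this a JRE?");
    }
    // Class files are written to a temporary directory via the -d option.
    Path outDir = Files.createTempDirectory("jdk-compiler-sketch");
    JavaCompiler.CompilationTask task = compiler.getTask(
        null, null, null,
        Arrays.asList("-d", outDir.toString()),
        null,
        Collections.singletonList(new StringSource(className, code)));
    if (!task.call()) {
      throw new IllegalStateException("Compilation failed for " + className);
    }
    try (URLClassLoader loader =
        new URLClassLoader(new URL[] {outDir.toUri().toURL()})) {
      return loader.loadClass(className).getDeclaredConstructor().newInstance();
    }
  }
}
```

One trade-off this makes visible: the approach only works where a JDK compiler is present and, in this simple form, writes class files to disk, whereas an embedded compiler such as Janino travels with the application as an ordinary dependency.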