Please log a Jira case for the commons-lang3 change. It looks good. In one or
two places I’d create a function rather than having a blob of code inline.

Your use of the default locale in the CSV adapter looks wrong. Calcite is a
server, so it never uses the default locale or time zone. In fact we use
forbiddenApis to check this, so we should add a few methods to its configuration.
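
To illustrate the point about locale and time zone, here is a minimal sketch. The class and method names are hypothetical and are not taken from Calcite or from the PR; the only assumption is that the code needs case conversion and timestamp formatting. The idea is to pass an explicit Locale and TimeZone instead of relying on whatever the JVM happens to be configured with:

```java
import java.text.SimpleDateFormat;
import java.util.Date;
import java.util.Locale;
import java.util.TimeZone;

/** Hypothetical helper; illustrates locale-independent code, not Calcite's API. */
final class LocaleSafeFormatting {
  private LocaleSafeFormatting() {}

  /**
   * Upper-cases using a fixed locale. The no-argument String.toUpperCase()
   * uses the JVM default locale, so "title" becomes "TİTLE" under a
   * Turkish default.
   */
  static String upper(String s) {
    return s.toUpperCase(Locale.ROOT);
  }

  /** Formats a timestamp with an explicit locale and an explicit time zone. */
  static String formatUtc(long epochMillis) {
    SimpleDateFormat format =
        new SimpleDateFormat("yyyy-MM-dd HH:mm:ss", Locale.ROOT);
    // Without this call the formatter would use TimeZone.getDefault().
    format.setTimeZone(TimeZone.getTimeZone("UTC"));
    return format.format(new Date(epochMillis));
  }
}
```

With this shape, the result does not change when the server is started with a different user.language or user.timezone setting.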

Julian 

> On Jan 3, 2022, at 12:30 PM, Gunnar Morling 
> <[email protected]> wrote:
> 
> Hi,
> 
> Thanks a lot for this, I think trimming down the dependencies of Calcite
> will be of great help for its adoption.
> 
>> So, the easiest way to reduce dependencies would be to make certain
>> classes of SQL functions optional (i.e. move them out of core).
> 
> That sounds like a good idea.
> 
>> commons-lang3, commons-codec, commons-io are probably only used in one or
>> two places each;
> 
> To make some progress there, I've created PR
> https://github.com/apache/calcite/pull/2672 which removes the dependency on
> commons-lang3 from the entire code base. Any feedback on that PR would
> be appreciated (I still need to log an issue, but wanted to share quickly
> what I had). I can try and take a look at the other ones, if there's
> interest in this.
> 
> Re Janino, is there any reason for not using the compiler implementation
> that comes with the JDK? Alternatively, one could also consider generating
> bytecode directly using ASM, which wouldn't be beneficial dependency-wise,
> but it may improve the performance of this generation step (I still lack
> insight into why this is done in the first place).
> 
> Thanks,
> 
> --Gunnar
> 
>> On Fri, Dec 31, 2021 at 00:56, Julian Hyde <
>> [email protected]> wrote:
>> 
>> Regarding dependencies. Here are the runtime dependencies from
>> core/build.gradle.kts (ignoring test and annotation libraries):
>> 
>> * api("com.esri.geometry:esri-geometry-api")
>> * api("com.fasterxml.jackson.core:jackson-annotations")
>> * api("com.google.guava:guava")
>> * api("org.apache.calcite.avatica:avatica-core")
>> * api("org.slf4j:slf4j-api")
>> * implementation("com.fasterxml.jackson.core:jackson-core")
>> * implementation("com.fasterxml.jackson.core:jackson-databind")
>> * implementation("com.fasterxml.jackson.dataformat:jackson-dataformat-yaml")
>> * implementation("com.google.uzaygezen:uzaygezen-core")
>> * implementation("com.jayway.jsonpath:json-path")
>> * implementation("com.yahoo.datasketches:sketches-core")
>> * implementation("commons-codec:commons-codec")
>> * implementation("net.hydromatic:aggdesigner-algorithm")
>> * implementation("org.apache.commons:commons-dbcp2")
>> * implementation("org.apache.commons:commons-lang3")
>> * implementation("commons-io:commons-io")
>> * implementation("org.codehaus.janino:commons-compiler")
>> * implementation("org.codehaus.janino:janino")
>> 
>> A few libraries are used only for a narrow range of functionality:
>> * esri-geometry and uzaygezen-core are used by geospatial functions;
>> * sketches-core is used by the HLL aggregate functions;
>> * json-path is used by some JSON functions;
>> * jackson-core, jackson-databind, jackson-dataformat-yaml are used to
>> load models, and to serialize RelNodes to and from JSON;
>> * commons-lang3, commons-codec, commons-io are probably only used in one
>> or two places each;
>> * aggdesigner-algorithm is used for recommending materialized views.
>> 
>> So, the easiest way to reduce dependencies would be to make certain
>> classes of SQL functions optional (i.e. move them out of core).
>> 
>> Julian
>> 
>> 
>> 
>>>> On Dec 29, 2021, at 1:30 PM, Jacques Nadeau <[email protected]> wrote:
>>> 
>>> WRT SBOM (Julian): My general experience is that most large orgs use
>>> scanners now (either open or closed) and they will scan whether you have a
>>> bill of materials or not. I wouldn't worry about adding something
>>> additional.
>>> 
>>> WRT too many dependencies (Gunnar): I completely agree with the general
>>> feeling of too many (and with Guava, jackson less so). I think the core
>>> challenge (no pun intended) is that calcite-core is really a lot of
>>> different components. For example, I have frequently wished that parser,
>>> planner and enumerable were separate modules. And if they were, I'd guess
>>> that each would have a narrower dependency range. I've also wished many
>>> times that runtime compilation was an optional addon as opposed to
>>> required/coupled in the core...
>>> 
>>> When I've thought about how to dissect in the past, I think the big
>>> challenge would be tests, where things are sometimes mixed together.
>>> Breaking change possibilities could be at least somewhat mitigated by
>>> moving classes but not packages.
>>> 
>>> On Wed, Dec 29, 2021 at 1:51 AM Gunnar Morling
>>> <[email protected]> wrote:
>>> 
>>>> Hi,
>>>> 
>>>> In a way, Calcite's build configuration as well as the published POM
>>>> could be considered as such an SBOM? In particular when looking at the
>>>> latter through services like mvnrepository [1], you get quite a good view
>>>> on the dependency versions, licenses, any potential CVEs, etc. I think
>>>> this should satisfy most user needs around this? Or are you referring to
>>>> the notion of Maven BOM POMs specifically [2], i.e. the notion of
>>>> publishing a POM with all the Calcite component versions which people can
>>>> then use with Maven's import scope (there should be something comparable
>>>> for Gradle)? If so, that could be useful for users working with multiple
>>>> Calcite components, though I think the usability improvement provided by
>>>> such BOM POM wouldn't be huge.
>>>> 
>>>> I wanted to bring up a related matter though. Coming to Calcite as a user
>>>> just recently (loving the possibilities it provides!), I was surprised by
>>>> the large number of dependencies of the project. It looks like 1.29
>>>> improves that a little bit (no more kotlin-stdlib, no more transitive
>>>> dependency on log4j 1.x), but the transitive hull of all dependencies of
>>>> calcite-core is still quite big. I lack insight into what the different
>>>> dependencies are used for; but as an application developer, Guava for
>>>> instance is a dependency which I'd prefer not to get pushed onto the
>>>> classpath transitively. Jackson is another heavy one; depending on how
>>>> it's used, perhaps this could be pushed into some separate module which
>>>> users could optionally pull in? That'd help to avoid having it around
>>>> when users work with other JSON libs themselves and don't require JSON
>>>> support in Calcite.
>>>> 
>>>> From a supply chain perspective, the fewer transitive dependencies a
>>>> library like Calcite introduces to my project, the better IMHO. There is
>>>> less potential for version conflicts with my own (or other transitive)
>>>> dependencies, and also less potential for introducing CVEs to the
>>>> dependency graph, as e.g. in the case of the Guava version currently used
>>>> by Calcite; I suppose it does not impact the usage in Calcite, but these
>>>> things tend to be tricky to reason about, and typical CVE reporting
>>>> tooling will now create a warning for a project using Calcite, no matter
>>>> whether that specific issue actually is a problem or not.
>>>> 
>>>> Best,
>>>> 
>>>> --Gunnar
>>>> 
>>>> [1] https://mvnrepository.com/artifact/org.apache.calcite/calcite-core/1.29.0
>>>> [2] https://maven.apache.org/guides/introduction/introduction-to-dependency-mechanism.html#bill-of-materials-bom-poms
>>>> 
>>>> 
>>>> On Wed, Dec 29, 2021 at 02:27, Julian Hyde <
>>>> [email protected]> wrote:
>>>> 
>>>>> In the wake of the log4j CVEs [1], people are asking how to improve the
>>>>> security of open source projects, and one idea is to provide an SBOM
>>>>> (Software Bill of Materials) [2] along with each release.
>>>>> 
>>>>> I had not heard of SBOM until a couple of days ago. Is anyone on this
>>>>> list familiar with SBOMs and their use? Should Calcite be providing an
>>>>> SBOM? Are people aware of SBOM initiatives in other projects? What, in
>>>>> your opinion, is the priority of this issue?
>>>>> 
>>>>> Julian
>>>>> 
>>>>> [1] https://thehackernews.com/2021/12/second-log4j-vulnerability-cve-2021.html
>>>>> 
>>>>> [2] https://en.wikipedia.org/wiki/Software_bill_of_materials
>>>>> 
>>>> 
>> 
>> 
