Re: Introducing a DI framework in Hive?

2023-04-20 Thread Attila Turoczy
Cool! Can't wait for the first DI-specific commit and the review :)



Re: Introducing a DI framework in Hive?

2023-04-19 Thread Stamatis Zampetakis
I think we all agree that DI can be beneficial in general.

However, it's hard to say yes or no on something before having a
concrete case to discuss; it doesn't have to be a PR but we need to
work on a specific Hive use-case and list advantages/disadvantages of
the proposal.

Best,
Stamatis



RE: Introducing a DI framework in Hive?

2023-04-17 Thread Laszlo Vegh
Hi all, 

Sorry for not answering so far; for some reason I did not receive your
answers in my Gmail account. I’m happy to see that there’s a conversation
around the topic, so let me add my opinion on your points.

First of all, introducing a DI framework does not mean a large-scale
refactoring. A suitable module, or a well-bounded set of components, can be
chosen as the first candidate. It’s also important that nobody will be forced
to use the DI container when writing features, or to redesign existing code
when it is being touched.

As for the aim: I’ve worked quite a lot with Java and .NET DI frameworks, and
my experience is that having a DI framework greatly reduces the effort needed
to write well-organised and maintainable code. While well-organised code can
be written without a DI framework too, the lack of one makes it much easier
to write poorly designed code (bad scoping, lifecycle issues, visibility
issues, etc.). By well-organised I mean:
- Design patterns: DI containers make it easier to write code using the
  well-known design patterns. For example, you can implement the factory,
  wrapper, adapter, etc. patterns simply by using the features the container
  offers, as they are meant to be used.
- Streamlined component initialisation: no more spaghetti/boilerplate
  component init methods.
- Well-defined component scopes (lifecycle): DI frameworks support various
  component scopes, which offer fine-grained control over the component
  lifecycle: singleton, one component per thread, one component per request
  from the DI container, etc.
- Organised and visible component/class dependencies: through constructor
  injection, all the dependencies of a class are visible (unlike static
  method calls). With this approach it is impossible to create circular
  dependencies, which lead to object initialisation issues and hacks. By
  requiring all dependencies at object creation, it’s much easier to detect
  or avoid unwanted dependencies. It also makes it easier to organise the
  code into packages and modules.
- Enhanced testability: I have explained this earlier.
- Well-defined component visibility: no need for “union-all” context
  objects. Instead of having context objects with references to all of the
  components which may be required during execution, each execution step can
  obtain the necessary dependencies from the DI container. Also, no more
  public static methods or class instances: to make a component accessible
  from everywhere, there’s no need to make it public and static. DI
  frameworks also offer nested/sub-contexts to limit/control visibility.
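
The constructor-injection and scope points in the list above can be sketched
in plain Java, with no framework involved. This is only an illustration under
stated assumptions: `Clock`, `QueryLogger`, `QueryRunner` and `Wiring` are
hypothetical names, not Hive classes, and the hand-written `Wiring` class
stands in for what a container such as Spring or Guice would do.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Supplier;

public class ConstructorInjectionSketch {

    /** A dependency expressed as an interface so it can be swapped (hypothetical). */
    interface Clock {
        long now();
    }

    /** Collects log lines; a single instance is shared in this sketch. */
    static final class QueryLogger {
        private final List<String> lines = new ArrayList<>();

        void log(String line) {
            lines.add(line);
        }

        List<String> lines() {
            return lines;
        }
    }

    /** Every dependency is visible in the constructor signature. */
    static final class QueryRunner {
        private final Clock clock;
        private final QueryLogger logger;

        QueryRunner(Clock clock, QueryLogger logger) {
            this.clock = clock;
            this.logger = logger;
        }

        String run(String sql) {
            String line = clock.now() + " " + sql;
            logger.log(line);
            return line;
        }
    }

    /** Hand-written wiring; a DI container would take over this role. */
    static final class Wiring {
        private final Clock clock;
        // "Singleton scope": one logger shared by every runner.
        private final QueryLogger sharedLogger = new QueryLogger();

        Wiring(Clock clock) {
            this.clock = clock;
        }

        QueryLogger logger() {
            return sharedLogger;
        }

        // "Per-request scope": each get() call builds a fresh runner.
        Supplier<QueryRunner> runnerPerRequest() {
            return () -> new QueryRunner(clock, sharedLogger);
        }
    }

    public static void main(String[] args) {
        // A fixed clock, as a test might supply.
        Wiring wiring = new Wiring(() -> 42L);
        Supplier<QueryRunner> runners = wiring.runnerPerRequest();

        QueryRunner first = runners.get();
        QueryRunner second = runners.get();
        System.out.println(first.run("SELECT 1"));
        System.out.println(second.run("SELECT 2"));
        System.out.println("distinct runners: " + (first != second));
        System.out.println("shared log size: " + wiring.logger().lines().size());
    }
}
```

The design choice worth noting is that scope is decided in one place (the
wiring), not inside the components themselves.
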
My original mail was meant as a kickoff, to start talking about DI. Before
creating a PR with an example in Hive, I would like us to agree that we want
to do this, and that there is no blocker which prevents us from doing it.
Once we have this agreement I can create a working example and demonstrate
how it will help us in the future.

Regarding the stability and performance issues: of course those must be
addressed as well, but as Stamatis pointed out, Hive is an open source
project and everybody can pursue their own initiative in parallel with the
others’.

In Java I have the most experience with Spring, so I would prefer choosing
it. It has become huge by now, but it’s modular. We are not forced to use all
of the offered features; if we want a pure DI container with some basic
extensions, we would only need spring-core, spring-beans, and spring-context.
It has several extensions and supports tons of other well-known frameworks
and/or technologies.
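
For readers who have not used one, the core of a "pure DI container" can be
sketched in a few dozen lines of plain Java. This is a toy written for
illustration, not Spring's API and not a proposal for Hive: `Container`,
`Catalog` and `Planner` are hypothetical names, only singleton scope is
modelled, and the sketch is not thread-safe.

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;
import java.util.function.Function;

public class TinyContainerSketch {

    /** Toy container: one lazily created singleton per registered type. */
    static final class Container {
        private final Map<Class<?>, Function<Container, ?>> factories = new HashMap<>();
        private final Map<Class<?>, Object> singletons = new HashMap<>();
        private final Set<Class<?>> inProgress = new HashSet<>();

        <T> void register(Class<T> type, Function<Container, ? extends T> factory) {
            factories.put(type, factory);
        }

        <T> T get(Class<T> type) {
            Object existing = singletons.get(type);
            if (existing != null) {
                return type.cast(existing);
            }
            Function<Container, ?> factory = factories.get(type);
            if (factory == null) {
                throw new IllegalStateException("No binding for " + type.getName());
            }
            // A type requested again while it is still being built means a cycle.
            if (!inProgress.add(type)) {
                throw new IllegalStateException("Circular dependency at " + type.getName());
            }
            try {
                T created = type.cast(factory.apply(this));
                singletons.put(type, created);
                return created;
            } finally {
                inProgress.remove(type);
            }
        }
    }

    // Hypothetical components; dependencies arrive through constructors only.
    static final class Catalog {
    }

    static final class Planner {
        final Catalog catalog;

        Planner(Catalog catalog) {
            this.catalog = catalog;
        }
    }

    public static void main(String[] args) {
        Container container = new Container();
        container.register(Catalog.class, c -> new Catalog());
        container.register(Planner.class, c -> new Planner(c.get(Catalog.class)));

        Planner planner = container.get(Planner.class);
        // Both lookups resolve to the same Catalog instance (singleton scope).
        System.out.println(planner.catalog == container.get(Catalog.class));
    }
}
```

Because every dependency is requested while its owner is being constructed, a
cycle between two registered types surfaces as an immediate error in this
sketch instead of a half-initialised object.
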

Best regards,
Laszlo Vegh

Re: Introducing a DI framework in Hive?

2023-04-13 Thread László Bodor
Thanks Sungwoo. Regarding the correctness issue, can you post it to the
proper thread? I guess it was not intentional to post here.

Regards,
Laszlo Bodor




Re: Introducing a DI framework in Hive?

2023-04-13 Thread Sungwoo Park
I would like to add another question to Laszlo's list.

4) When a specific DI framework is chosen, what kinds of new dependencies
will be introduced? (Do they conflict with existing dependencies of
Hive?)

Regards,

--- Sungwoo Park




Re: Introducing a DI framework in Hive?

2023-04-13 Thread Sungwoo Park
Hi Stamatis,

For the correctness issue, we wanted to solve the problem ourselves and
have made a few pull requests in [1] so far. (We would like to kindly
request Hive committers to review the pull requests.) For HIVE-27226, we
are working on a solution and will create a pull request when it is
ready. For the stability issue, we have not made much progress, but when
initial results become available, I will report them on this mailing list.

Regards,

--- Sungwoo




Re: Introducing a DI framework in Hive?

2023-04-13 Thread László Bodor
Thanks, guys, for putting DI into scope. It sounds very interesting. Just a
couple of questions to help me understand and move this forward (and maybe
involve more folks with DI experience):

1) Can we have some examples, even at the level of dummy code snippets, of
what we want to achieve? I mean, "utility classes with static methods are
bad" is not an example, even if I agree to a certain extent.
2) Yes, DI helps with testing, but the question is whether injecting will
happen only in tests or in production parts as well.
3) What's the primary thing/object in your mind when it comes to injecting
something in the scope of Hive?

TLDR: I remember an earlier experience with Spring when
it @InjectedWhateverIWantedWithAwesomeAnnotations; that's what I need to
see examples for in the case of Hive.

Regards,
Laszlo Bodor





Re: Introducing a DI framework in Hive?

2023-04-13 Thread Stamatis Zampetakis
Just to be clear, I am in favor of introducing DI frameworks in Hive
where it makes sense. As Attila said, we don't want to get stuck with
legacy code forever. When a concrete proposal comes up we can discuss
benefits vs drawbacks.

Regarding stability I agree it is a pressing issue but Hive is an open
source project and we certainly don't want to force volunteers to work
on specific things or forbid them to work on others. Contributing to
open source is supposed to be a fun and rewarding experience. I am
sure many of the people in this list have stability as a primary goal
so eventually we will get there.

Best,
Stamatis


Re: Introducing a DI framework in Hive?

2023-04-12 Thread Attila Turoczy
Hi Stamatis and Sungwoo,

I agree with several points. Hive has millions of LOC which are here and will
stay with us as they are; that is not in question. But we need to think
about the future of the project. No engineer in the world wants to use old,
legacy technologies; every engineer wants to work with cool stuff where he
or she can learn new things, patterns, and designs. If we do not improve our
codebase, it will become a legacy zombieland which won't be touched with love
and passion. *(Oh what management bullshit - you can tell :) )* But I truly
think that introducing new principles could give us speed, motivation, and
power to continue innovating. As an engineer I always want to use a modern
approach, because it gives me more excitement. I think that introducing DI
in this type of project is hard, challenging, and exciting. I want to live in
a world where Hive is a leader in adopting new principles, stable and easy to
use, and where the onboarding experience is much faster and easier.

I don't wanna live in a world 

As you wrote, DI is powerful, and Hive does not use it because it became
widely used only after Hive was started. If we / you introduce it, it does
not mean we have to refactor every module with DI. But we can try to
identify some components where we would introduce it, and we could also
write docs for others on how to use and implement it. Maybe just 1-2
components at first; others will come later as we touch them, if it makes
sense. We won't remove every static utils class, because that would not make
sense, but with baby steps we could try to introduce it, and for new
development we could introduce a loosely coupled standard, where every
dependency is more lightweight and the components are easier to test (which
could improve quality as well).


#2 The quality of 3.1.x vs 4.0.x is a somewhat different topic. I don't
think it has much connection to DI, and I think we should talk about the
root causes in a different thread. You had several good points. ALL of us
should be more careful about this type of issue. It was the same in the
past; especially when Hive 3 was introduced there were several similar
issues. It can happen when groundbreaking changes land in the repository.
Also, I think the 4.0.0 alpha label indicates that it is not rock solid yet.
But anyhow, you are right, we have to be more careful! Let's start a
different thread about it.


-Attila


Re: Introducing a DI framework in Hive?

2023-04-12 Thread Sungwoo Park
Hello,

I am not a committer, but I would like to add my opinion. At this stage of
development, I think it is quite risky to switch to a DI framework for a
couple of reasons.

1. A DI framework would have been a powerful tool if it had been
incorporated into the project from an early stage. Now, however, Hive has
well over 1 million lines of code and tens of thousands of test cases, and my
guess is that the overhead associated with introducing DI into Hive
(whether gradually or globally at once) is very likely to outweigh the
additional benefit, if any, of introducing DI, especially if we consider
the stability of its development infrastructure.

2. Implementing new features, such as DI, in Hive can be an exciting
sub-project and fun, but I think more pressing issues are to stabilize the
current Hive code, although this is certainly less motivating and more
boring. I hope that no new major features, such as DI, will be introduced
until Hive becomes, say, as stable as Hive 3.1.

For 2, I can give a few examples to substantiate my claim.

1) For the past few years, several new techniques for query compilation
have been introduced. Unfortunately they were buggy and Hive started to
return wrong results, on the assumption that Hive 3.1.2 was working
correctly. (Yes, Hive 3.1.2 also has correctness bugs, but when tested
against TPC-DS, Hive 3.1.2 returned the same results as other frameworks,
so it can be used as a basis for comparison.) From our own testing, Hive
4.0.0-SNAPSHOT returns wrong results on several queries in TPC-DS, and this
should be a major setback for Hive. If interested, please see [1] and [2].

2) Perhaps due to the same reason as in 1), Hive 4.0.0-SNAPSHOT is
noticeably slower than Hive 3.1.2 on the TPC-DS benchmark. However, this is
only from my own testing (using 10TB TPC-DS), and I hope that someone in
the Hive team will try similar experiments to confirm/refute my claim.

3) Currently many q tests are run against MapReduce (which is not
officially supported as far as I remember). However, some of these q tests
fail when run against Tez. If Tez and LLAP are the new execution engines,
these tests should be migrated as well.

Sungwoo Park

[1] https://issues.apache.org/jira/browse/HIVE-26654
[2] https://issues.apache.org/jira/browse/HIVE-27226


Re: Introducing a DI framework in Hive?

2023-04-12 Thread Stamatis Zampetakis
Hey Laszlo,

Dependency injection is a very powerful and useful tool/design pattern.

I don't think there is a particular reason why Hive does not use a
DI framework, apart maybe from the fact that we have lots of legacy
code that existed before DI became that popular.

I am open to ideas and suggestions about parts of the code that we
could improve via DI. I would probably avoid big refactorings to core
components of Hive for the sake of introducing a DI framework but I
see no big issue using such frameworks in new code. As usual when we
are about to introduce a new dependency to the project we should be
mindful of all the implications that this might have.

It's hard to make a generally applicable claim that we should use this
or that framework since I guess it has to do a lot with personal
preferences; we tend to prefer things that we have already used. I
haven't used DI frameworks that much so don't have a strong opinion on
which framework is the best so I am willing to follow the majority.

Best,
Stamatis

On Tue, Apr 4, 2023 at 1:19 PM Laszlo Vegh  wrote:
>
>
> Hi all,
>
> I would like to start a conversation about introducing some Dependency 
> Injection framework (like Spring, Guice, Weld, etc.) in Hive.
>
> IMHO the lack of such a framework makes the codebase much less organised and
> harder to maintain. Moreover, I think it has also led to the introduction of
> a huge number of static/utility methods and classes (which are highly
> discouraged when using DI frameworks). When there is no DI framework, utility
> classes with static methods often seem to be the simplest and best way to
> share code across different Hive components/classes, but these constructs
> really kill testability. For example, it is much harder to mock static method
> calls than to mock service/component instances. Poor testability is a major
> issue on its own, but having a DI framework could bring further benefits,
> like greater flexibility (modularity), better organised services, etc.
>
>
> I’m interested in whether there’s any reason why there is no DI in Hive so
> far. I know there’s no way to introduce it everywhere in a single step, but
> we could start using it where it is easy to start, and continuously expand
> its usage from class to class. If there is no strong reason not to do it, I
> would like to start an open conversation around this topic. (Possible
> benefits, drawbacks, which framework to use, where to introduce it first,
> etc.)
>
> If anybody is interested in this initiative, please join the conversation, 
> and add your thoughts, ideas, doubts, anything.
>
> Thanks,
>
> Laszlo Vegh
> veghlac...@gmail.com
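
The testability argument in the quoted message (static utility calls versus
injected instances) can be illustrated with a small plain-Java sketch. All
names here (`ConfUtils`, `ConfSource`, the greeter classes, the `sketch.user`
key) are made up for the example and do not refer to Hive code; the same
constructor is used for production-like and test wiring.

```java
import java.util.HashMap;
import java.util.Map;

public class StaticVsInjectedSketch {

    /** Static utility style: the lookup is hard-wired into every caller. */
    static final class ConfUtils {
        private ConfUtils() {
        }

        static String get(String key) {
            // Stands in for a read from real, process-wide configuration.
            return System.getProperty(key, "");
        }
    }

    static final class StaticStyleGreeter {
        String greet() {
            // Hidden dependency: nothing in the class signature reveals it.
            return "Hello, " + ConfUtils.get("sketch.user");
        }
    }

    /** Injected style: the dependency is an interface passed in by the caller. */
    interface ConfSource {
        String get(String key);
    }

    static final class InjectedGreeter {
        private final ConfSource conf;

        InjectedGreeter(ConfSource conf) {
            this.conf = conf;
        }

        String greet() {
            return "Hello, " + conf.get("sketch.user");
        }
    }

    public static void main(String[] args) {
        // Production-like wiring delegates to the same static lookup.
        InjectedGreeter production = new InjectedGreeter(ConfUtils::get);
        System.out.println(production.greet());

        // Test wiring swaps in an in-memory fake; no static mocking is needed.
        Map<String, String> fake = new HashMap<>();
        fake.put("sketch.user", "hive");
        InjectedGreeter underTest = new InjectedGreeter(fake::get);
        System.out.println(underTest.greet());
    }
}
```

Testing `StaticStyleGreeter` in isolation would require changing global state
or a static-mocking tool, while `InjectedGreeter` only needs a different
constructor argument.
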