Re: Adding flink-scala as a dependency to flink-streaming-core

2015-06-20 Thread Robert Metzger
I like option 1 the most (move to flink-core), however, it would scatter
the type extractor / type information classes accross multiple projects.
Why are we not moving the entire type extractor system into flink-core?

There were some thoughts in the past to restructure the maven module
layout: https://issues.apache.org/jira/browse/FLINK-1712 .. I think the
change would be in line with the move.


On Wed, Jun 10, 2015 at 7:52 AM, Gábor Gévay gga...@gmail.com wrote:

 I see four options to solve this without adding the dependency:
 1. Move CaseClassTypeInfo and CaseClassComparator to flink-core. Till
 said that we want to avoid mixed Scala and Java modules, which rules
 this out.
 2. Create a new toplevel maven project scala-core, and move things there.
 3. Hacky solution: What I actually need is just to determine if a
 TypeInfo instance is a CaseClassTypeInfo. This could be achieved by
 adding an isCaseClassTypeInfo() method to TupleTypeInfoBase returning
 false, and override it in CaseClassTypeInfo to return true.
 4. Reimplement CaseClassTypeInfo and CaseClassComparator in Java. I
 hope that this can be avoided.

 I think the decision boils down to whether similar problems will
 probably arise in the future. If this is a unique issue, then 3. is
 fine. Otherwise it won't be possible to avoid doing 2. later anyway.
 Maybe someone who knows Flink better than me can estimate how likely
 this is?

 Best regards,
 Gabor


 2015-06-10 15:55 GMT+02:00 Gábor Gévay gga...@gmail.com:
  it does not feel right to add an API package to a core package
 
  Yes, that makes sense. I just tried removing the flink-java dependency
  from flink-streaming to see what needs it, and it builds fine without
  it :)
 
  What do you think about the second option? (to move the Scala
  typeutils (or just CaseClassTypeInfo and CaseClassComparator) to
  flink-core, to where the Java typeutils are) This would be mixing
  scala code to an otherwise java module, but I don't know whether that
  is a problem.
 
  Cheers,
  Gabor
 
  2015-06-10 15:46 GMT+02:00 Till Rohrmann trohrm...@apache.org:
  Btw: I noticed that all streaming modules depend on flink-core,
  flink-runtime, flink-clients and flink-java. Is there a particular
 reason
  why the streaming connectors depend on flink-clients and flink-java?
 
  On Wed, Jun 10, 2015 at 3:41 PM Till Rohrmann trohrm...@apache.org
 wrote:
 
  I see the reason why you want to add flink-scala as a dependency to
  flink-streaming-core. However, it does not feel right to add an API
 package
  to a core package IMHO.
 
  But I noticed that flink-streaming-core also depends on flink-java.
 Which
  seems odd to me as well. I'm not a streaming expert and thus cannot
 tell
  much about the reasons why a core package has a dependency on an API
  package but for me this looks more like an indicator for a necessary
  restructuring of our packages. Maybe someone working on the streaming
 parts
  can chime in and shed some light on the required dependencies.
 
  Cheers,
  Till
 
 
  On Wed, Jun 10, 2015 at 2:13 PM Gábor Gévay gga...@gmail.com wrote:
 
  Hello,
 
  I would like to ask if it would be OK if I added flink-scala as a
  dependency to flink-streaming-core. An alternative would be to move
  the Scala typeutils to flink-core (to where the Java typeutils are).
  Why I need this:
 
  While I am implementing the fast median calculation for windows as
  part of my Google Summer of Code project, I am refactoring the way
  sum, min, max, etc. are accessing the user-specified field
  (https://github.com/apache/flink/pull/684). Currently both the logic
  of their aggregators are duplicated for the different kinds of types
  (tuple, pojo, array, Scala case class, simple), and also the field
  access logic is duplicated across the different aggregators. In my
  GSoC project I will implement some further methods (avg, variance,
  etc.) that take the same kind of parameters as sum, min, etc., so it
  will be neccassary to have the field access logic centralized (this is
  the FieldAccessor class in the PR). It would be convenient if this
  could also handle Scala case classes, for which CaseClassTypeInfo is
  needed which is currently in flink-scala.
 
  Best regards,
  Gabor
 
 



Re: Adding flink-scala as a dependency to flink-streaming-core

2015-06-10 Thread Aljoscha Krettek
If they can be easily converted to Java code, that would be a good solution.

On Wed, 10 Jun 2015 at 15:56 Gábor Gévay gga...@gmail.com wrote:

  it does not feel right to add an API package to a core package

 Yes, that makes sense. I just tried removing the flink-java dependency
 from flink-streaming to see what needs it, and it builds fine without
 it :)

 What do you think about the second option? (to move the Scala
 typeutils (or just CaseClassTypeInfo and CaseClassComparator) to
 flink-core, to where the Java typeutils are) This would be mixing
 scala code to an otherwise java module, but I don't know whether that
 is a problem.

 Cheers,
 Gabor

 2015-06-10 15:46 GMT+02:00 Till Rohrmann trohrm...@apache.org:
  Btw: I noticed that all streaming modules depend on flink-core,
  flink-runtime, flink-clients and flink-java. Is there a particular reason
  why the streaming connectors depend on flink-clients and flink-java?
 
  On Wed, Jun 10, 2015 at 3:41 PM Till Rohrmann trohrm...@apache.org
 wrote:
 
  I see the reason why you want to add flink-scala as a dependency to
  flink-streaming-core. However, it does not feel right to add an API
 package
  to a core package IMHO.
 
  But I noticed that flink-streaming-core also depends on flink-java.
 Which
  seems odd to me as well. I'm not a streaming expert and thus cannot tell
  much about the reasons why a core package has a dependency on an API
  package but for me this looks more like an indicator for a necessary
  restructuring of our packages. Maybe someone working on the streaming
 parts
  can chime in and shed some light on the required dependencies.
 
  Cheers,
  Till
 
 
  On Wed, Jun 10, 2015 at 2:13 PM Gábor Gévay gga...@gmail.com wrote:
 
  Hello,
 
  I would like to ask if it would be OK if I added flink-scala as a
  dependency to flink-streaming-core. An alternative would be to move
  the Scala typeutils to flink-core (to where the Java typeutils are).
  Why I need this:
 
  While I am implementing the fast median calculation for windows as
  part of my Google Summer of Code project, I am refactoring the way
  sum, min, max, etc. are accessing the user-specified field
  (https://github.com/apache/flink/pull/684). Currently both the logic
  of their aggregators are duplicated for the different kinds of types
  (tuple, pojo, array, Scala case class, simple), and also the field
  access logic is duplicated across the different aggregators. In my
  GSoC project I will implement some further methods (avg, variance,
  etc.) that take the same kind of parameters as sum, min, etc., so it
  will be neccassary to have the field access logic centralized (this is
  the FieldAccessor class in the PR). It would be convenient if this
  could also handle Scala case classes, for which CaseClassTypeInfo is
  needed which is currently in flink-scala.
 
  Best regards,
  Gabor
 
 



Re: Adding flink-scala as a dependency to flink-streaming-core

2015-06-10 Thread Till Rohrmann
Btw: I noticed that all streaming modules depend on flink-core,
flink-runtime, flink-clients and flink-java. Is there a particular reason
why the streaming connectors depend on flink-clients and flink-java?

On Wed, Jun 10, 2015 at 3:41 PM Till Rohrmann trohrm...@apache.org wrote:

 I see the reason why you want to add flink-scala as a dependency to
 flink-streaming-core. However, it does not feel right to add an API package
 to a core package IMHO.

 But I noticed that flink-streaming-core also depends on flink-java. Which
 seems odd to me as well. I'm not a streaming expert and thus cannot tell
 much about the reasons why a core package has a dependency on an API
 package but for me this looks more like an indicator for a necessary
 restructuring of our packages. Maybe someone working on the streaming parts
 can chime in and shed some light on the required dependencies.

 Cheers,
 Till


 On Wed, Jun 10, 2015 at 2:13 PM Gábor Gévay gga...@gmail.com wrote:

 Hello,

 I would like to ask if it would be OK if I added flink-scala as a
 dependency to flink-streaming-core. An alternative would be to move
 the Scala typeutils to flink-core (to where the Java typeutils are).
 Why I need this:

 While I am implementing the fast median calculation for windows as
 part of my Google Summer of Code project, I am refactoring the way
 sum, min, max, etc. are accessing the user-specified field
 (https://github.com/apache/flink/pull/684). Currently both the logic
 of their aggregators are duplicated for the different kinds of types
 (tuple, pojo, array, Scala case class, simple), and also the field
 access logic is duplicated across the different aggregators. In my
 GSoC project I will implement some further methods (avg, variance,
 etc.) that take the same kind of parameters as sum, min, etc., so it
 will be neccassary to have the field access logic centralized (this is
 the FieldAccessor class in the PR). It would be convenient if this
 could also handle Scala case classes, for which CaseClassTypeInfo is
 needed which is currently in flink-scala.

 Best regards,
 Gabor




Re: Adding flink-scala as a dependency to flink-streaming-core

2015-06-10 Thread Gábor Gévay
 it does not feel right to add an API package to a core package

Yes, that makes sense. I just tried removing the flink-java dependency
from flink-streaming to see what needs it, and it builds fine without
it :)

What do you think about the second option? (to move the Scala
typeutils (or just CaseClassTypeInfo and CaseClassComparator) to
flink-core, to where the Java typeutils are) This would be mixing
scala code to an otherwise java module, but I don't know whether that
is a problem.

Cheers,
Gabor

2015-06-10 15:46 GMT+02:00 Till Rohrmann trohrm...@apache.org:
 Btw: I noticed that all streaming modules depend on flink-core,
 flink-runtime, flink-clients and flink-java. Is there a particular reason
 why the streaming connectors depend on flink-clients and flink-java?

 On Wed, Jun 10, 2015 at 3:41 PM Till Rohrmann trohrm...@apache.org wrote:

 I see the reason why you want to add flink-scala as a dependency to
 flink-streaming-core. However, it does not feel right to add an API package
 to a core package IMHO.

 But I noticed that flink-streaming-core also depends on flink-java. Which
 seems odd to me as well. I'm not a streaming expert and thus cannot tell
 much about the reasons why a core package has a dependency on an API
 package but for me this looks more like an indicator for a necessary
 restructuring of our packages. Maybe someone working on the streaming parts
 can chime in and shed some light on the required dependencies.

 Cheers,
 Till


 On Wed, Jun 10, 2015 at 2:13 PM Gábor Gévay gga...@gmail.com wrote:

 Hello,

 I would like to ask if it would be OK if I added flink-scala as a
 dependency to flink-streaming-core. An alternative would be to move
 the Scala typeutils to flink-core (to where the Java typeutils are).
 Why I need this:

 While I am implementing the fast median calculation for windows as
 part of my Google Summer of Code project, I am refactoring the way
 sum, min, max, etc. are accessing the user-specified field
 (https://github.com/apache/flink/pull/684). Currently both the logic
 of their aggregators are duplicated for the different kinds of types
 (tuple, pojo, array, Scala case class, simple), and also the field
 access logic is duplicated across the different aggregators. In my
 GSoC project I will implement some further methods (avg, variance,
 etc.) that take the same kind of parameters as sum, min, etc., so it
 will be neccassary to have the field access logic centralized (this is
 the FieldAccessor class in the PR). It would be convenient if this
 could also handle Scala case classes, for which CaseClassTypeInfo is
 needed which is currently in flink-scala.

 Best regards,
 Gabor




Re: Adding flink-scala as a dependency to flink-streaming-core

2015-06-10 Thread Gábor Gévay
I see four options to solve this without adding the dependency:
1. Move CaseClassTypeInfo and CaseClassComparator to flink-core. Till
said that we want to avoid mixed Scala and Java modules, which rules
this out.
2. Create a new toplevel maven project scala-core, and move things there.
3. Hacky solution: What I actually need is just to determine if a
TypeInfo instance is a CaseClassTypeInfo. This could be achieved by
adding an isCaseClassTypeInfo() method to TupleTypeInfoBase returning
false, and override it in CaseClassTypeInfo to return true.
4. Reimplement CaseClassTypeInfo and CaseClassComparator in Java. I
hope that this can be avoided.

I think the decision boils down to whether similar problems will
probably arise in the future. If this is a unique issue, then 3. is
fine. Otherwise it won't be possible to avoid doing 2. later anyway.
Maybe someone who knows Flink better than me can estimate how likely
this is?

Best regards,
Gabor


2015-06-10 15:55 GMT+02:00 Gábor Gévay gga...@gmail.com:
 it does not feel right to add an API package to a core package

 Yes, that makes sense. I just tried removing the flink-java dependency
 from flink-streaming to see what needs it, and it builds fine without
 it :)

 What do you think about the second option? (to move the Scala
 typeutils (or just CaseClassTypeInfo and CaseClassComparator) to
 flink-core, to where the Java typeutils are) This would be mixing
 scala code to an otherwise java module, but I don't know whether that
 is a problem.

 Cheers,
 Gabor

 2015-06-10 15:46 GMT+02:00 Till Rohrmann trohrm...@apache.org:
 Btw: I noticed that all streaming modules depend on flink-core,
 flink-runtime, flink-clients and flink-java. Is there a particular reason
 why the streaming connectors depend on flink-clients and flink-java?

 On Wed, Jun 10, 2015 at 3:41 PM Till Rohrmann trohrm...@apache.org wrote:

 I see the reason why you want to add flink-scala as a dependency to
 flink-streaming-core. However, it does not feel right to add an API package
 to a core package IMHO.

 But I noticed that flink-streaming-core also depends on flink-java. Which
 seems odd to me as well. I'm not a streaming expert and thus cannot tell
 much about the reasons why a core package has a dependency on an API
 package but for me this looks more like an indicator for a necessary
 restructuring of our packages. Maybe someone working on the streaming parts
 can chime in and shed some light on the required dependencies.

 Cheers,
 Till


 On Wed, Jun 10, 2015 at 2:13 PM Gábor Gévay gga...@gmail.com wrote:

 Hello,

 I would like to ask if it would be OK if I added flink-scala as a
 dependency to flink-streaming-core. An alternative would be to move
 the Scala typeutils to flink-core (to where the Java typeutils are).
 Why I need this:

 While I am implementing the fast median calculation for windows as
 part of my Google Summer of Code project, I am refactoring the way
 sum, min, max, etc. are accessing the user-specified field
 (https://github.com/apache/flink/pull/684). Currently both the logic
 of their aggregators are duplicated for the different kinds of types
 (tuple, pojo, array, Scala case class, simple), and also the field
 access logic is duplicated across the different aggregators. In my
 GSoC project I will implement some further methods (avg, variance,
 etc.) that take the same kind of parameters as sum, min, etc., so it
 will be neccassary to have the field access logic centralized (this is
 the FieldAccessor class in the PR). It would be convenient if this
 could also handle Scala case classes, for which CaseClassTypeInfo is
 needed which is currently in flink-scala.

 Best regards,
 Gabor