Re: Adding flink-scala as a dependency to flink-streaming-core
I like option 1 the most (move to flink-core), however, it would scatter the type extractor / type information classes accross multiple projects. Why are we not moving the entire type extractor system into flink-core? There were some thoughts in the past to restructure the maven module layout: https://issues.apache.org/jira/browse/FLINK-1712 .. I think the change would be in line with the move. On Wed, Jun 10, 2015 at 7:52 AM, Gábor Gévay gga...@gmail.com wrote: I see four options to solve this without adding the dependency: 1. Move CaseClassTypeInfo and CaseClassComparator to flink-core. Till said that we want to avoid mixed Scala and Java modules, which rules this out. 2. Create a new toplevel maven project scala-core, and move things there. 3. Hacky solution: What I actually need is just to determine if a TypeInfo instance is a CaseClassTypeInfo. This could be achieved by adding an isCaseClassTypeInfo() method to TupleTypeInfoBase returning false, and override it in CaseClassTypeInfo to return true. 4. Reimplement CaseClassTypeInfo and CaseClassComparator in Java. I hope that this can be avoided. I think the decision boils down to whether similar problems will probably arise in the future. If this is a unique issue, then 3. is fine. Otherwise it won't be possible to avoid doing 2. later anyway. Maybe someone who knows Flink better than me can estimate how likely this is? Best regards, Gabor 2015-06-10 15:55 GMT+02:00 Gábor Gévay gga...@gmail.com: it does not feel right to add an API package to a core package Yes, that makes sense. I just tried removing the flink-java dependency from flink-streaming to see what needs it, and it builds fine without it :) What do you think about the second option? (to move the Scala typeutils (or just CaseClassTypeInfo and CaseClassComparator) to flink-core, to where the Java typeutils are) This would be mixing scala code to an otherwise java module, but I don't know whether that is a problem. Cheers, Gabor 2015-06-10 15:46 GMT+02:00 Till Rohrmann trohrm...@apache.org: Btw: I noticed that all streaming modules depend on flink-core, flink-runtime, flink-clients and flink-java. Is there a particular reason why the streaming connectors depend on flink-clients and flink-java? On Wed, Jun 10, 2015 at 3:41 PM Till Rohrmann trohrm...@apache.org wrote: I see the reason why you want to add flink-scala as a dependency to flink-streaming-core. However, it does not feel right to add an API package to a core package IMHO. But I noticed that flink-streaming-core also depends on flink-java. Which seems odd to me as well. I'm not a streaming expert and thus cannot tell much about the reasons why a core package has a dependency on an API package but for me this looks more like an indicator for a necessary restructuring of our packages. Maybe someone working on the streaming parts can chime in and shed some light on the required dependencies. Cheers, Till On Wed, Jun 10, 2015 at 2:13 PM Gábor Gévay gga...@gmail.com wrote: Hello, I would like to ask if it would be OK if I added flink-scala as a dependency to flink-streaming-core. An alternative would be to move the Scala typeutils to flink-core (to where the Java typeutils are). Why I need this: While I am implementing the fast median calculation for windows as part of my Google Summer of Code project, I am refactoring the way sum, min, max, etc. are accessing the user-specified field (https://github.com/apache/flink/pull/684). Currently both the logic of their aggregators are duplicated for the different kinds of types (tuple, pojo, array, Scala case class, simple), and also the field access logic is duplicated across the different aggregators. In my GSoC project I will implement some further methods (avg, variance, etc.) that take the same kind of parameters as sum, min, etc., so it will be neccassary to have the field access logic centralized (this is the FieldAccessor class in the PR). It would be convenient if this could also handle Scala case classes, for which CaseClassTypeInfo is needed which is currently in flink-scala. Best regards, Gabor
Re: Adding flink-scala as a dependency to flink-streaming-core
If they can be easily converted to Java code, that would be a good solution. On Wed, 10 Jun 2015 at 15:56 Gábor Gévay gga...@gmail.com wrote: it does not feel right to add an API package to a core package Yes, that makes sense. I just tried removing the flink-java dependency from flink-streaming to see what needs it, and it builds fine without it :) What do you think about the second option? (to move the Scala typeutils (or just CaseClassTypeInfo and CaseClassComparator) to flink-core, to where the Java typeutils are) This would be mixing scala code to an otherwise java module, but I don't know whether that is a problem. Cheers, Gabor 2015-06-10 15:46 GMT+02:00 Till Rohrmann trohrm...@apache.org: Btw: I noticed that all streaming modules depend on flink-core, flink-runtime, flink-clients and flink-java. Is there a particular reason why the streaming connectors depend on flink-clients and flink-java? On Wed, Jun 10, 2015 at 3:41 PM Till Rohrmann trohrm...@apache.org wrote: I see the reason why you want to add flink-scala as a dependency to flink-streaming-core. However, it does not feel right to add an API package to a core package IMHO. But I noticed that flink-streaming-core also depends on flink-java. Which seems odd to me as well. I'm not a streaming expert and thus cannot tell much about the reasons why a core package has a dependency on an API package but for me this looks more like an indicator for a necessary restructuring of our packages. Maybe someone working on the streaming parts can chime in and shed some light on the required dependencies. Cheers, Till On Wed, Jun 10, 2015 at 2:13 PM Gábor Gévay gga...@gmail.com wrote: Hello, I would like to ask if it would be OK if I added flink-scala as a dependency to flink-streaming-core. An alternative would be to move the Scala typeutils to flink-core (to where the Java typeutils are). Why I need this: While I am implementing the fast median calculation for windows as part of my Google Summer of Code project, I am refactoring the way sum, min, max, etc. are accessing the user-specified field (https://github.com/apache/flink/pull/684). Currently both the logic of their aggregators are duplicated for the different kinds of types (tuple, pojo, array, Scala case class, simple), and also the field access logic is duplicated across the different aggregators. In my GSoC project I will implement some further methods (avg, variance, etc.) that take the same kind of parameters as sum, min, etc., so it will be neccassary to have the field access logic centralized (this is the FieldAccessor class in the PR). It would be convenient if this could also handle Scala case classes, for which CaseClassTypeInfo is needed which is currently in flink-scala. Best regards, Gabor
Re: Adding flink-scala as a dependency to flink-streaming-core
Btw: I noticed that all streaming modules depend on flink-core, flink-runtime, flink-clients and flink-java. Is there a particular reason why the streaming connectors depend on flink-clients and flink-java? On Wed, Jun 10, 2015 at 3:41 PM Till Rohrmann trohrm...@apache.org wrote: I see the reason why you want to add flink-scala as a dependency to flink-streaming-core. However, it does not feel right to add an API package to a core package IMHO. But I noticed that flink-streaming-core also depends on flink-java. Which seems odd to me as well. I'm not a streaming expert and thus cannot tell much about the reasons why a core package has a dependency on an API package but for me this looks more like an indicator for a necessary restructuring of our packages. Maybe someone working on the streaming parts can chime in and shed some light on the required dependencies. Cheers, Till On Wed, Jun 10, 2015 at 2:13 PM Gábor Gévay gga...@gmail.com wrote: Hello, I would like to ask if it would be OK if I added flink-scala as a dependency to flink-streaming-core. An alternative would be to move the Scala typeutils to flink-core (to where the Java typeutils are). Why I need this: While I am implementing the fast median calculation for windows as part of my Google Summer of Code project, I am refactoring the way sum, min, max, etc. are accessing the user-specified field (https://github.com/apache/flink/pull/684). Currently both the logic of their aggregators are duplicated for the different kinds of types (tuple, pojo, array, Scala case class, simple), and also the field access logic is duplicated across the different aggregators. In my GSoC project I will implement some further methods (avg, variance, etc.) that take the same kind of parameters as sum, min, etc., so it will be neccassary to have the field access logic centralized (this is the FieldAccessor class in the PR). It would be convenient if this could also handle Scala case classes, for which CaseClassTypeInfo is needed which is currently in flink-scala. Best regards, Gabor
Re: Adding flink-scala as a dependency to flink-streaming-core
it does not feel right to add an API package to a core package Yes, that makes sense. I just tried removing the flink-java dependency from flink-streaming to see what needs it, and it builds fine without it :) What do you think about the second option? (to move the Scala typeutils (or just CaseClassTypeInfo and CaseClassComparator) to flink-core, to where the Java typeutils are) This would be mixing scala code to an otherwise java module, but I don't know whether that is a problem. Cheers, Gabor 2015-06-10 15:46 GMT+02:00 Till Rohrmann trohrm...@apache.org: Btw: I noticed that all streaming modules depend on flink-core, flink-runtime, flink-clients and flink-java. Is there a particular reason why the streaming connectors depend on flink-clients and flink-java? On Wed, Jun 10, 2015 at 3:41 PM Till Rohrmann trohrm...@apache.org wrote: I see the reason why you want to add flink-scala as a dependency to flink-streaming-core. However, it does not feel right to add an API package to a core package IMHO. But I noticed that flink-streaming-core also depends on flink-java. Which seems odd to me as well. I'm not a streaming expert and thus cannot tell much about the reasons why a core package has a dependency on an API package but for me this looks more like an indicator for a necessary restructuring of our packages. Maybe someone working on the streaming parts can chime in and shed some light on the required dependencies. Cheers, Till On Wed, Jun 10, 2015 at 2:13 PM Gábor Gévay gga...@gmail.com wrote: Hello, I would like to ask if it would be OK if I added flink-scala as a dependency to flink-streaming-core. An alternative would be to move the Scala typeutils to flink-core (to where the Java typeutils are). Why I need this: While I am implementing the fast median calculation for windows as part of my Google Summer of Code project, I am refactoring the way sum, min, max, etc. are accessing the user-specified field (https://github.com/apache/flink/pull/684). Currently both the logic of their aggregators are duplicated for the different kinds of types (tuple, pojo, array, Scala case class, simple), and also the field access logic is duplicated across the different aggregators. In my GSoC project I will implement some further methods (avg, variance, etc.) that take the same kind of parameters as sum, min, etc., so it will be neccassary to have the field access logic centralized (this is the FieldAccessor class in the PR). It would be convenient if this could also handle Scala case classes, for which CaseClassTypeInfo is needed which is currently in flink-scala. Best regards, Gabor
Re: Adding flink-scala as a dependency to flink-streaming-core
I see four options to solve this without adding the dependency: 1. Move CaseClassTypeInfo and CaseClassComparator to flink-core. Till said that we want to avoid mixed Scala and Java modules, which rules this out. 2. Create a new toplevel maven project scala-core, and move things there. 3. Hacky solution: What I actually need is just to determine if a TypeInfo instance is a CaseClassTypeInfo. This could be achieved by adding an isCaseClassTypeInfo() method to TupleTypeInfoBase returning false, and override it in CaseClassTypeInfo to return true. 4. Reimplement CaseClassTypeInfo and CaseClassComparator in Java. I hope that this can be avoided. I think the decision boils down to whether similar problems will probably arise in the future. If this is a unique issue, then 3. is fine. Otherwise it won't be possible to avoid doing 2. later anyway. Maybe someone who knows Flink better than me can estimate how likely this is? Best regards, Gabor 2015-06-10 15:55 GMT+02:00 Gábor Gévay gga...@gmail.com: it does not feel right to add an API package to a core package Yes, that makes sense. I just tried removing the flink-java dependency from flink-streaming to see what needs it, and it builds fine without it :) What do you think about the second option? (to move the Scala typeutils (or just CaseClassTypeInfo and CaseClassComparator) to flink-core, to where the Java typeutils are) This would be mixing scala code to an otherwise java module, but I don't know whether that is a problem. Cheers, Gabor 2015-06-10 15:46 GMT+02:00 Till Rohrmann trohrm...@apache.org: Btw: I noticed that all streaming modules depend on flink-core, flink-runtime, flink-clients and flink-java. Is there a particular reason why the streaming connectors depend on flink-clients and flink-java? On Wed, Jun 10, 2015 at 3:41 PM Till Rohrmann trohrm...@apache.org wrote: I see the reason why you want to add flink-scala as a dependency to flink-streaming-core. However, it does not feel right to add an API package to a core package IMHO. But I noticed that flink-streaming-core also depends on flink-java. Which seems odd to me as well. I'm not a streaming expert and thus cannot tell much about the reasons why a core package has a dependency on an API package but for me this looks more like an indicator for a necessary restructuring of our packages. Maybe someone working on the streaming parts can chime in and shed some light on the required dependencies. Cheers, Till On Wed, Jun 10, 2015 at 2:13 PM Gábor Gévay gga...@gmail.com wrote: Hello, I would like to ask if it would be OK if I added flink-scala as a dependency to flink-streaming-core. An alternative would be to move the Scala typeutils to flink-core (to where the Java typeutils are). Why I need this: While I am implementing the fast median calculation for windows as part of my Google Summer of Code project, I am refactoring the way sum, min, max, etc. are accessing the user-specified field (https://github.com/apache/flink/pull/684). Currently both the logic of their aggregators are duplicated for the different kinds of types (tuple, pojo, array, Scala case class, simple), and also the field access logic is duplicated across the different aggregators. In my GSoC project I will implement some further methods (avg, variance, etc.) that take the same kind of parameters as sum, min, etc., so it will be neccassary to have the field access logic centralized (this is the FieldAccessor class in the PR). It would be convenient if this could also handle Scala case classes, for which CaseClassTypeInfo is needed which is currently in flink-scala. Best regards, Gabor