Re: Flink ML linear regression issue

2015-09-18 Thread Till Rohrmann
Hi Alexey and Hanan,

One of FlinkML’s features is its flexible pipelining mechanism. It allows
you to chain multiple transformers with a trailing predictor to form a data
analysis pipeline. In order to support multiple input types, the actual
program logic (matching on the type) is assembled at compile time by the
Scala compiler using implicits. That is also why, in Java, you see the
additional fitOperation parameter when calling
multipleLinearRegression.fit(); in Scala it is an implicit parameter. In
theory, it is possible to construct the pipelines yourself in Java by
explicitly assembling the respective implicit operations. However, I would
refrain from doing so, because it is error-prone and laborious.
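
To illustrate why the extra parameter appears, here is a minimal,
self-contained Java sketch. It does not use FlinkML: FitOperation, MeanModel
and FIT_DOUBLES are hypothetical stand-ins that only mimic the shape of the
API. The assumption is that the type-specific logic lives in a separate
operation object, which the Scala compiler would supply implicitly and which
a Java caller has to pass by hand:

```java
import java.util.Arrays;
import java.util.List;

/**
 * Toy model (hypothetical names, not FlinkML classes) of how a Scala implicit
 * parameter shows up as an ordinary trailing argument for a Java caller.
 */
final class ExplicitFitOperationSketch {

    /** Stands in for the type-specific logic that Scala would resolve implicitly. */
    interface FitOperation<M, T> {
        void fit(M model, List<T> training);
    }

    /** A trivial "estimator" that only stores the mean of its training values. */
    static final class MeanModel {
        double mean = Double.NaN;

        // Java has no implicit resolution, so the operation is a visible argument.
        <T> MeanModel fit(List<T> training, FitOperation<MeanModel, T> op) {
            op.fit(this, training);
            return this;
        }
    }

    /** The operation for Double inputs; in Scala this would be an implicit value. */
    static final FitOperation<MeanModel, Double> FIT_DOUBLES = (model, training) ->
        model.mean = training.stream()
            .mapToDouble(Double::doubleValue)
            .average()
            .orElse(Double.NaN);

    public static void main(String[] args) {
        // The caller must pick and pass the matching operation explicitly.
        MeanModel model = new MeanModel().fit(Arrays.asList(1.0, 2.0, 3.0), FIT_DOUBLES);
        System.out.println(model.mean);
    }
}
```

In this toy there is a single operation to choose from. With real pipelines,
each stage needs an operation matching its input type, which is where the
manual assembly becomes laborious.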

At the moment, I don’t really see an easy way to port the pipelining
mechanism to Java (8), because Java lacks implicits. However, what we could
do is provide fit, predict and transform methods that can be used without
the chaining support. You then lose the pipelining, but you can do it
manually by calling the methods (e.g. fit and transform) for each stage. We
could add a thin Java layer which calls the Scala methods with the correctly
instantiated operations.
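
A rough sketch of what calling each stage manually could look like, again
with hypothetical toy classes (Centerer, SlopePredictor) rather than
FlinkML's actual transformers and predictors:

```java
import java.util.Arrays;

/** Toy two-stage "pipeline" wired by hand; none of these classes exist in FlinkML. */
final class ManualPipelineSketch {

    /** Stage 1: a transformer that centers values around the mean seen during fit. */
    static final class Centerer {
        private double mean;

        void fit(double[] xs) {
            mean = Arrays.stream(xs).average().orElse(0.0);
        }

        double[] transform(double[] xs) {
            return Arrays.stream(xs).map(x -> x - mean).toArray();
        }
    }

    /** Stage 2: a predictor fitting y = w * x by least squares (no intercept). */
    static final class SlopePredictor {
        private double w;

        void fit(double[] xs, double[] ys) {
            double xy = 0.0;
            double xx = 0.0;
            for (int i = 0; i < xs.length; i++) {
                xy += xs[i] * ys[i];
                xx += xs[i] * xs[i];
            }
            w = (xx == 0.0) ? 0.0 : xy / xx;
        }

        double predict(double x) {
            return w * x;
        }
    }

    /** Without chaining support, each stage is fitted and applied explicitly, in order. */
    static double fitAndPredict(double[] xs, double[] ys, double query) {
        Centerer centerer = new Centerer();
        centerer.fit(xs);
        double[] centered = centerer.transform(xs);

        SlopePredictor predictor = new SlopePredictor();
        predictor.fit(centered, ys);

        // New data must pass through the same fitted transformer before prediction.
        double centeredQuery = centerer.transform(new double[] {query})[0];
        return predictor.predict(centeredQuery);
    }

    public static void main(String[] args) {
        double[] xs = {1.0, 2.0, 3.0};
        double[] ys = {-2.0, 0.0, 2.0};
        System.out.println(fitAndPredict(xs, ys, 4.0));
    }
}
```

The bookkeeping that the chained pipeline would otherwise do (fit a stage,
transform the data, hand the result to the next stage) is written out by the
caller here.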

Cheers,
Till

On Thu, Sep 17, 2015 at 7:05 PM, Alexey Sapozhnikov wrote:

> Hello everyone.
>
> Do you have a Java sample showing how to implement the Flink
> MultipleLinearRegression example?
> Scala is great; however, we would like to see an exact example that we
> could invoke from Java, if possible.
> Thanks, and sorry for the interruption.
>
>
>
> On Thu, Sep 17, 2015 at 4:27 PM, Hanan Meyer  wrote:
>
> > Hi
> >
> > I'm using Flink ML 9.2.1 to perform a multiple linear regression
> > on a CSV data file.
> >
> > The Scala sample code for it is pretty straightforward:
> > val mlr = MultipleLinearRegression()
> >
> > val parameters = ParameterMap()
> >
> > parameters.add(MultipleLinearRegression.Stepsize, 2.0)
> > parameters.add(MultipleLinearRegression.Iterations, 10)
> > parameters.add(MultipleLinearRegression.ConvergenceThreshold, 0.001)
> > val inputDS = env.fromCollection(data)
> >
> > mlr.fit(inputDS, parameters)
> >
> > When I'm using Java (8), the fit method takes 3 parameters:
> > 1. dataset
> > 2. parameters
> > 3. an object which implements the fitOperation interface
> >
> > multipleLinearRegression.fit(regressionDS, parameters, fitOperation);
> >
> > Is there a need to implement the fitOperation interface, which has
> > already been implemented in Flink's ML source code?
> >
> > Another option is using the MultipleLinearRegression.fitMLR() method, but I
> > haven't found a way to pass the training dataset to it as a parameter or by
> > setter.
> >
> > I'd be more than happy if you could guide me on how to implement it in Java.
> >
> > Thanks
> >
> > Hanan Meyer
> >
> >
> >
> >
> >
>
>
> --
>
> *Regards*
>
> *Alexey Sapozhnikov*
> CTO& Co-Founder
> Scalabillit Inc
> Aba Even 10-C, Herzelia, Israel
> M : +972-52-2363823
> E : ale...@scalabill.it
> W : http://www.scalabill.it
> YT - https://youtu.be/9Rj309PTOFA
> Map:http://mapta.gs/Scalabillit
> Revolutionizing Proof-of-Concept
>


Re: Flink ML linear regression issue

2015-09-18 Thread Theodore Vasiloudis
+1. Convenient creation of pipelines for Java is more of a long-term
project, but we should make it possible to create pipelines manually
in Java.



RE: Flink ML linear regression issue

2015-09-18 Thread alexey
Thank you very much for the clarifications.




Re: Flink ML linear regression issue

2015-09-17 Thread Alexey Sapozhnikov
Hello everyone.

Do you have a Java sample showing how to implement the Flink
MultipleLinearRegression example?
Scala is great; however, we would like to see an exact example that we
could invoke from Java, if possible.
Thanks, and sorry for the interruption.





-- 

*Regards*

*Alexey Sapozhnikov*
CTO& Co-Founder
Scalabillit Inc
Aba Even 10-C, Herzelia, Israel
M : +972-52-2363823
E : ale...@scalabill.it
W : http://www.scalabill.it
YT - https://youtu.be/9Rj309PTOFA
Map:http://mapta.gs/Scalabillit
Revolutionizing Proof-of-Concept