Re: Scala Breeze Dependencies not resolving when adding flink-ml on build.sbt

2015-10-28 Thread Anwar Rizal
Yes.

I had a similar problem with Kafka in Spark Streaming. I worked around it by
excluding Kafka from the connector dependency and then adding the library back
explicitly.

Maybe you can try something like:

libraryDependencies ++= Seq(
  "org.apache.flink" % "flink-scala" % "0.9.1",
  "org.apache.flink" % "flink-clients" % "0.9.1",
  // The excluded name is the literal, unsubstituted string from the flink-ml POM.
  ("org.apache.flink" % "flink-ml" % "0.9.1")
    .exclude("org.scalanlp", "breeze_${scala.binary.version}")
)

libraryDependencies += "org.scalanlp" % "breeze_2.10" % "0.11.2"

Anwar.




On Wed, Oct 28, 2015 at 11:29 AM, Frederick Ayala 
wrote:

> I tried adding libraryDependencies += "org.scalanlp" % "breeze_2.10" %
> "0.11.2", but the problem persists.
>
> I also tried as explained in the Breeze documentation:
>
> libraryDependencies  ++= Seq(
>   "org.scalanlp" %% "breeze" % "0.11.2",
>   "org.scalanlp" %% "breeze-natives" % "0.11.2",
>   "org.scalanlp" %% "breeze-viz" % "0.11.2"
> )
>
> resolvers ++= Seq("Sonatype Releases" at
>   "https://oss.sonatype.org/content/repositories/releases/")
>
> But it doesn't work.
>
> The message is still "unresolved dependency:
> org.scalanlp#breeze_${scala.binary.version};0.11.2: not found"
>
> Could the problem be on flink-ml/pom.xml?
>
> <dependency>
>   <groupId>org.scalanlp</groupId>
>   <artifactId>breeze_${scala.binary.version}</artifactId>
>   <version>0.11.2</version>
> </dependency>
>
> The property scala.binary.version is not being replaced by the value 2.10
>
> Thanks,
>
> Frederick Ayala
>
> On Wed, Oct 28, 2015 at 10:59 AM, DEVAN M.S.  wrote:
>
>> Can you add libraryDependencies += "org.scalanlp" % "breeze_2.10" %
>> "0.11.2" also ?
>>
>>
>>
>> Devan M.S. | Technical Lead | Cyber Security | AMRITA VISHWA VIDYAPEETHAM
>> | Amritapuri | Cell +919946535290 |
>>
>>
>> On Wed, Oct 28, 2015 at 3:04 PM, Frederick Ayala <
>> frederickay...@gmail.com> wrote:
>>
>>> Hi,
>>>
>>> I am getting an error when adding flink-ml to the libraryDependencies on
>>> my build.sbt file:
>>>
>>> [error] (*:update) sbt.ResolveException: unresolved dependency:
>>> org.scalanlp#breeze_${scala.binary.version};0.11.2: not found
>>>
>>> My libraryDependencies is:
>>>
>>> libraryDependencies ++= Seq("org.apache.flink" % "flink-scala" %
>>> "0.9.1", "org.apache.flink" % "flink-streaming-scala" % "0.9.1",
>>> "org.apache.flink" % "flink-clients" % "0.9.1",
>>> "org.apache.flink" % "flink-ml" % "0.9.1")
>>>
>>> I am using scalaVersion := "2.10.6"
>>>
>>> If I remove flink-ml all the other dependencies are resolved.
>>>
>>> Could you help me to figure out a solution for this?
>>>
>>> Thanks!
>>>
>>> Frederick Ayala
>>>
>>
>>
>
>
> --
> Frederick Ayala
>


Re: Scala Breeze Dependencies not resolving when adding flink-ml on build.sbt

2015-10-28 Thread Theodore Vasiloudis
This sounds similar to this problem:
http://apache-flink-user-mailing-list-archive.2336050.n4.nabble.com/Flink-ML-as-Dependency-td1582.html

The reason is (quoting Till, replace gradle with sbt here):

> the flink-ml pom contains as a dependency an artifact with artifactId
> breeze_${scala.binary.version}. The variable scala.binary.version is
> defined in the parent pom and not substituted when flink-ml is installed.
> Therefore gradle tries to find a dependency with the name
> breeze_${scala.binary.version}
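
To make the quoted explanation a bit more concrete: according to it, the placeholder reaches the resolver as an ordinary, unexpanded string, so the lookup is done for that literal name. The plain Scala sketch below only illustrates the difference between the two names. The object and value names are made up for this example, and it does not call any sbt or Maven API.

```scala
object ArtifactNames {
  // Assumed for illustration: the Scala binary version of the build.
  val scalaBinaryVersion: String = "2.10"

  // Plain string literal (no `s` prefix): the placeholder is kept verbatim,
  // like the artifactId reported in the sbt error message.
  val unsubstituted: String = "breeze_${scala.binary.version}"

  // The artifact name once the version is filled in; this is the name
  // that sbt's %% operator asks for with Scala 2.10.
  val substituted: String = s"breeze_$scalaBinaryVersion"

  def main(args: Array[String]): Unit = {
    println(unsubstituted)
    println(substituted)
  }
}
```

The first name does not exist in any repository, which would explain the "not found" error; the second is the artifact that actually gets published.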


Anwar's solution should work. I just tested it on a basic Flink build, although
I haven't tried running anything yet.
The resolution error does go away, though. So your build.sbt should include
something like:

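val flinkVersion = "0.9.1"

libraryDependencies ++= Seq(
  "org.apache.flink" % "flink-scala" % flinkVersion,
  "org.apache.flink" % "flink-clients" % flinkVersion,
  // Exclude the dependency whose name still contains the unsubstituted placeholder.
  ("org.apache.flink" % "flink-ml" % flinkVersion)
    .exclude("org.scalanlp", "breeze_${scala.binary.version}"),
  // Add Breeze back with the proper Scala binary version suffix.
  "org.scalanlp" %% "breeze" % "0.11.2"
)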

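An alternative that avoids spelling out the unsubstituted placeholder might be to exclude by organization, using sbt's `excludeAll` with an `ExclusionRule`. This is an untested sketch. It assumes sbt 0.13 or later, Scala 2.10, and that flink-ml needs nothing else from `org.scalanlp` beyond the Breeze artifact added back explicitly; the version numbers are the ones mentioned in this thread.

```scala
// build.sbt (sketch, untested)
scalaVersion := "2.10.6"

val flinkVersion = "0.9.1"

libraryDependencies ++= Seq(
  "org.apache.flink" % "flink-scala"   % flinkVersion,
  "org.apache.flink" % "flink-clients" % flinkVersion,
  // Drop every transitive org.scalanlp artifact that flink-ml declares,
  // including the one whose name still contains the placeholder.
  ("org.apache.flink" % "flink-ml" % flinkVersion)
    .excludeAll(ExclusionRule(organization = "org.scalanlp")),
  // Add Breeze back explicitly; %% appends the Scala binary version (_2.10).
  "org.scalanlp" %% "breeze" % "0.11.2"
)
```

Excluding by organization is coarser than excluding one artifact by name, so it is worth checking the resolved dependency list afterwards.
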


On Wed, Oct 28, 2015 at 10:34 AM, Frederick Ayala 
wrote:

> Hi,
>
> I am getting an error when adding flink-ml to the libraryDependencies on
> my build.sbt file:
>
> [error] (*:update) sbt.ResolveException: unresolved dependency:
> org.scalanlp#breeze_${scala.binary.version};0.11.2: not found
>
> My libraryDependencies is:
>
> libraryDependencies ++= Seq("org.apache.flink" % "flink-scala" % "0.9.1",
> "org.apache.flink" % "flink-streaming-scala" % "0.9.1", "org.apache.flink"
> % "flink-clients" % "0.9.1",
> "org.apache.flink" % "flink-ml" % "0.9.1")
>
> I am using scalaVersion := "2.10.6"
>
> If I remove flink-ml all the other dependencies are resolved.
>
> Could you help me to figure out a solution for this?
>
> Thanks!
>
> Frederick Ayala
>


Re: Scala Breeze Dependencies not resolving when adding flink-ml on build.sbt

2015-10-28 Thread Frederick Ayala
Thank you Anwar! That did the trick :)

On Wed, Oct 28, 2015 at 1:30 PM, Anwar Rizal  wrote:

> Yeah
>
> I had similar problems with kafka in spark streaming. I worked around the
> problem by excluding kafka from connector and then adding the library back.
>
> Maybe you can try something like:
>
> libraryDependencies ++= Seq("org.apache.flink" % "flink-scala" % "0.9.1",
> "org.apache.flink" % "flink-clients" % "0.9.1" ,"org.apache.flink" %
> "flink-ml" % "0.9.1"  exclude("org.scalanlp",
> "breeze_${scala.binary.version}"))
>
> libraryDependencies += "org.scalanlp" % "breeze_2.10" % "0.11.2"
>
> Anwar.
>
>
>
>
> On Wed, Oct 28, 2015 at 11:29 AM, Frederick Ayala <
> frederickay...@gmail.com> wrote:
>
>> I tried adding libraryDependencies += "org.scalanlp" % "breeze_2.10" %
>> "0.11.2", but the problem persists.
>>
>> I also tried as explained in the Breeze documentation:
>>
>> libraryDependencies  ++= Seq(
>>   "org.scalanlp" %% "breeze" % "0.11.2",
>>   "org.scalanlp" %% "breeze-natives" % "0.11.2",
>>   "org.scalanlp" %% "breeze-viz" % "0.11.2"
>> )
>>
>> resolvers ++= Seq("Sonatype Releases" at
>>   "https://oss.sonatype.org/content/repositories/releases/")
>>
>> But it doesn't work.
>>
>> The message is still "unresolved dependency:
>> org.scalanlp#breeze_${scala.binary.version};0.11.2: not found"
>>
>> Could the problem be on flink-ml/pom.xml?
>>
>> <dependency>
>>   <groupId>org.scalanlp</groupId>
>>   <artifactId>breeze_${scala.binary.version}</artifactId>
>>   <version>0.11.2</version>
>> </dependency>
>>
>> The property scala.binary.version is not being replaced by the value 2.10
>>
>> Thanks,
>>
>> Frederick Ayala
>>
>> On Wed, Oct 28, 2015 at 10:59 AM, DEVAN M.S.  wrote:
>>
>>> Can you add libraryDependencies += "org.scalanlp" % "breeze_2.10" %
>>> "0.11.2" also ?
>>>
>>>
>>>
>>> Devan M.S. | Technical Lead | Cyber Security | AMRITA VISHWA
>>> VIDYAPEETHAM | Amritapuri | Cell +919946535290 |
>>>
>>>
>>
>>
>> --
>> Frederick Ayala
>>
>
>


-- 
Frederick Ayala


Re: Scala Breeze Dependencies not resolving when adding flink-ml on build.sbt

2015-10-28 Thread Frederick Ayala
Thanks Theodore. I can confirm that Anwar's solution worked. My build.sbt
now looks like this:

libraryDependencies  ++= Seq(
  "org.scalanlp" %% "breeze" % "0.11.2",
  "org.scalanlp" %% "breeze-natives" % "0.11.2",
  "org.scalanlp" %% "breeze-viz" % "0.11.2"
)

libraryDependencies ++= Seq(
  "org.apache.flink" % "flink-scala" % "0.9.1",
  "org.apache.flink" % "flink-streaming-scala" % "0.9.1",
  "org.apache.flink" % "flink-clients" % "0.9.1",
  ("org.apache.flink" % "flink-ml" % "0.9.1")
    .exclude("org.scalanlp", "breeze_${scala.binary.version}")
)

resolvers ++= Seq("Sonatype Releases" at
  "https://oss.sonatype.org/content/repositories/releases/")



On Wed, Oct 28, 2015 at 1:41 PM, Theodore Vasiloudis <
theodoros.vasilou...@gmail.com> wrote:

> This sounds similar to this problem:
> http://apache-flink-user-mailing-list-archive.2336050.n4.nabble.com/Flink-ML-as-Dependency-td1582.html
>
> The reason is (quoting Till, replace gradle with sbt here):
>
>> the flink-ml pom contains as a dependency an artifact with artifactId
>> breeze_${scala.binary.version}. The variable scala.binary.version is
>> defined in the parent pom and not substituted when flink-ml is installed.
>> Therefore gradle tries to find a dependency with the name
>> breeze_${scala.binary.version}
>
>
> Anwar's solution should work, I just tested it on a basic Flink build, but
> I haven't tried running anything yet.
> The resolution error does go away though. So your sbt should include
> something like:
>
> libraryDependencies ++= Seq(
>   "org.apache.flink" % "flink-scala" % flinkVersion,
>   "org.apache.flink" % "flink-clients" % flinkVersion,
>   ("org.apache.flink" % "flink-ml" % flinkVersion)
> .exclude("org.scalanlp", "breeze_${scala.binary.version}"),
>   "org.scalanlp" %% "breeze" % "0.11.2")
>
>
>
> On Wed, Oct 28, 2015 at 10:34 AM, Frederick Ayala <
> frederickay...@gmail.com> wrote:
>
>> Hi,
>>
>> I am getting an error when adding flink-ml to the libraryDependencies on
>> my build.sbt file:
>>
>> [error] (*:update) sbt.ResolveException: unresolved dependency:
>> org.scalanlp#breeze_${scala.binary.version};0.11.2: not found
>>
>> My libraryDependencies is:
>>
>> libraryDependencies ++= Seq("org.apache.flink" % "flink-scala" % "0.9.1",
>> "org.apache.flink" % "flink-streaming-scala" % "0.9.1", "org.apache.flink"
>> % "flink-clients" % "0.9.1",
>> "org.apache.flink" % "flink-ml" % "0.9.1")
>>
>> I am using scalaVersion := "2.10.6"
>>
>> If I remove flink-ml all the other dependencies are resolved.
>>
>> Could you help me to figure out a solution for this?
>>
>> Thanks!
>>
>> Frederick Ayala
>>
>
>


-- 
Frederick Ayala


Re: compile flink-gelly-scala using sbt

2015-10-28 Thread Theodore Vasiloudis
Your build.sbt seems correct.
It might be that you are missing some basic imports.

In your code have you imported

import org.apache.flink.api.scala._

?


On Tue, Oct 27, 2015 at 8:45 PM, Vasiliki Kalavri  wrote:

> Hi Do,
>
> I don't really have experience with sbt, but one thing that might cause
> problems is that your dependencies point to Flink 0.9.1 and gelly-scala
> wasn't part of that release. You can either try to use the 0.10-SNAPSHOT or
> wait a few days for the 0.10 release.
>
> Cheers,
> -Vasia.
>
> On 27 October 2015 at 18:05, Le Quoc Do  wrote:
>
>> Hi,
>>
>> I am trying to compile flink-gelly-scala using sbt. However, I get the
>> following error:
>>
>> [error]
>> /home/ubuntu/git/flink-learning/flink-gelly-scala/src/main/scala/org/apache/flink/graph/scala/Graph.scala:42:
>> value getJavaEnv is not a member of
>> org.apache.flink.api.scala.ExecutionEnvironment
>> [error] wrapGraph(jg.Graph.fromDataSet[K, VV, EV](vertices.javaSet,
>> edges.javaSet, env.getJavaEnv))
>> [error]
>> ^
>> [error]
>> /home/ubuntu/git/flink-learning/flink-gelly-scala/src/main/scala/org/apache/flink/graph/scala/Graph.scala:51:
>> value getJavaEnv is not a member of
>> org.apache.flink.api.scala.ExecutionEnvironment
>> [error] wrapGraph(jg.Graph.fromDataSet[K, EV](edges.javaSet,
>> env.getJavaEnv))
>>
>> The content of the build.sbt file:
>>
>> name := "flink-graph-metrics"
>>
>> version := "1.0"
>>
>> scalaVersion := "2.11.6"
>>
>> libraryDependencies ++= Seq("org.apache.flink" % "flink-scala" %
>> "0.9.1", "org.apache.flink" % "flink-clients" % "0.9.1", "org.apache.flink"
>> % "flink-gelly"  % "0.9.1")
>>
>> fork in run := true
>>
>>
>> Do you know how to fix this problem?
>>
>> Thanks,
>> Do
>>
>> ==
>> Le Quoc Do
>> Dresden University of Technology
>> Faculty of Computer Science
>> Institute for System Architecture
>> Systems Engineering Group
>> 01062 Dresden
>> E-Mail: d...@se.inf.tu-dresden.de
>>
>
>


Re: Scala Breeze Dependencies not resolving when adding flink-ml on build.sbt

2015-10-28 Thread DEVAN M.S.
Can you also add libraryDependencies += "org.scalanlp" % "breeze_2.10" %
"0.11.2"?



Devan M.S. | Technical Lead | Cyber Security | AMRITA VISHWA VIDYAPEETHAM |
Amritapuri | Cell +919946535290 |



On Wed, Oct 28, 2015 at 3:04 PM, Frederick Ayala 
wrote:

> Hi,
>
> I am getting an error when adding flink-ml to the libraryDependencies on
> my build.sbt file:
>
> [error] (*:update) sbt.ResolveException: unresolved dependency:
> org.scalanlp#breeze_${scala.binary.version};0.11.2: not found
>
> My libraryDependencies is:
>
> libraryDependencies ++= Seq("org.apache.flink" % "flink-scala" % "0.9.1",
> "org.apache.flink" % "flink-streaming-scala" % "0.9.1", "org.apache.flink"
> % "flink-clients" % "0.9.1",
> "org.apache.flink" % "flink-ml" % "0.9.1")
>
> I am using scalaVersion := "2.10.6"
>
> If I remove flink-ml all the other dependencies are resolved.
>
> Could you help me to figure out a solution for this?
>
> Thanks!
>
> Frederick Ayala
>


Re: compile flink-gelly-scala using sbt

2015-10-28 Thread Theodore Vasiloudis
Could you share a minimal code example where this happens?

On Wed, Oct 28, 2015 at 4:22 PM, Le Quoc Do  wrote:

> Hi Theodore and Vasia,
>
> Thanks for your reply.
>
> I can compile my code by adding the dependency jars manually.
>
> Yes, my code already imports the Flink Scala API (import
> org.apache.flink.api.scala._).
> However, when I run my code, I get the following error:
>
> java.lang.NoSuchMethodError:
> org.apache.flink.api.scala.ExecutionEnvironment.getJavaEnv()Lorg/apache/flink/api/java/ExecutionEnvironment;
> at org.apache.flink.graph.scala.Graph$.fromDataSet(Graph.scala:53)
>
> Any suggestions?
>
> Thanks,
> Do
>
> ==
> Le Quoc Do
> Dresden University of Technology
> Faculty of Computer Science
> Institute for System Architecture
> Systems Engineering Group
> 01062 Dresden
> E-Mail: d...@se.inf.tu-dresden.de
>
> On Wed, Oct 28, 2015 at 3:50 PM, Theodore Vasiloudis <
> theodoros.vasilou...@gmail.com> wrote:
>
>> Your build.sbt seems correct.
>> It might be that you are missing some basic imports.
>>
>> In your code have you imported
>>
>> import org.apache.flink.api.scala._
>>
>> ?
>>
>>
>> On Tue, Oct 27, 2015 at 8:45 PM, Vasiliki Kalavri <
>> vasilikikala...@gmail.com> wrote:
>>
>>> Hi Do,
>>>
>>> I don't really have experience with sbt, but one thing that might cause
>>> problems is that your dependencies point to Flink 0.9.1 and gelly-scala
>>> wasn't part of that release. You can either try to use the 0.10-SNAPSHOT or
>>> wait a few days for the 0.10 release.
>>>
>>> Cheers,
>>> -Vasia.
>>>
>>>
>>>
>>
>


Re: compile flink-gelly-scala using sbt

2015-10-28 Thread Vasiliki Kalavri
Are you using 0.10-SNAPSHOT for this? Because in 0.9.1 this method
(getJavaEnv()) indeed doesn't exist ;)
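
If the goal is to build against a Flink version that does contain this method, one option could be to move every Flink module to the same version in build.sbt. The following is only a sketch. It assumes that 0.10-SNAPSHOT artifacts are available from the Apache snapshots repository, that the Scala Gelly API is published as `flink-gelly-scala`, and that the published artifacts match the Scala version of the build; none of this was confirmed in this thread.

```scala
// build.sbt (sketch, untested)
val flinkVersion = "0.10-SNAPSHOT" // assumption: a version that includes gelly-scala

// Assumption: snapshot builds are served from the Apache snapshots repository.
resolvers += "Apache Snapshots" at "https://repository.apache.org/content/repositories/snapshots/"

// Using one value for all modules avoids mixing 0.9.1 and 0.10 classes,
// which is one possible cause of a NoSuchMethodError at runtime.
libraryDependencies ++= Seq(
  "org.apache.flink" % "flink-scala"       % flinkVersion,
  "org.apache.flink" % "flink-clients"     % flinkVersion,
  "org.apache.flink" % "flink-gelly"       % flinkVersion,
  "org.apache.flink" % "flink-gelly-scala" % flinkVersion
)
```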

On 28 October 2015 at 16:51, Le Quoc Do  wrote:

> From Graph.scala:
>
>   def fromDataSet[K: TypeInformation : ClassTag, EV: TypeInformation : ClassTag]
>   (edges: DataSet[Edge[K, EV]], env: ExecutionEnvironment): Graph[K, NullValue, EV] = {
>     wrapGraph(jg.Graph.fromDataSet[K, EV](edges.javaSet, env.getJavaEnv))
>   }
>
>
> ==
> Le Quoc Do
> Dresden University of Technology
> Faculty of Computer Science
> Institute for System Architecture
> Systems Engineering Group
> 01062 Dresden
> E-Mail: d...@se.inf.tu-dresden.de
>
> On Wed, Oct 28, 2015 at 4:41 PM, Theodore Vasiloudis <
> theodoros.vasilou...@gmail.com> wrote:
>
>> Could you share a minimal code example where this happens?
>>
>> On Wed, Oct 28, 2015 at 4:22 PM, Le Quoc Do  wrote:
>>
>>> Hi Theodore and Vasia,
>>>
>>> Thanks for your reply.
>>>
>>> I can compile my code by add dependency jars manually.
>>>
>>> Yes, in my code, I already import Flink-scala (import 
>>> org.apache.flink.api.scala._).
>>> However when I run my code,
>>> I get the following error:
>>>
>>> java.lang.NoSuchMethodError:
>>> org.apache.flink.api.scala.ExecutionEnvironment.getJavaEnv()Lorg/apache/flink/api/java/ExecutionEnvironment;
>>> at org.apache.flink.graph.scala.Graph$.fromDataSet(Graph.scala:53)
>>>
>>> any suggestions?
>>>
>>> Thanks,
>>> Do
>>>
>>> ==
>>> Le Quoc Do
>>> Dresden University of Technology
>>> Faculty of Computer Science
>>> Institute for System Architecture
>>> Systems Engineering Group
>>> 01062 Dresden
>>> E-Mail: d...@se.inf.tu-dresden.de
>>>
>>>
>>
>


How best to deal with wide, structured tuples?

2015-10-28 Thread Johann Kovacs
Hi all,

I am currently evaluating a use case in which I have to deal with wide,
structured data from CSV files (about 50-60 columns, definitely more than
the 25 supported by the Tuple types). The schema is potentially generated
dynamically at runtime, or inferred automatically from the CSV file.
SparkSQL works very well for this case, because I can generate or
infer the schema dynamically at runtime, access fields in UDFs via
index or name (via the Row API), generate new schemata for UDF results
on the fly, and use those schemata to read and write from/to CSV.
Obviously Spark and SparkSQL have other quirks and I'd like to find a
good solution to do this with Flink.

The main limitation seems to be that I can't have DataSets of
arbitrary-length, arbitrary-type tuples (i.e. tuples whose shape is unknown
at compile time). The Record API/type looks like it was meant to provide
something like that, but it appears to be headed for deprecation and is not
well supported by the DataSet APIs (e.g. I can't do a join on Records
by field index, nor does the CsvReader API support Records), and it
has no concept of field names, either.
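
To illustrate the kind of row abstraction meant here (fields addressable by index or by name, with a schema that is only known at runtime), a minimal plain Scala sketch follows. `Schema`, `Row` and `WideRowDemo` are hypothetical names for this example; they are not Flink or Spark types, and the sketch says nothing about how such rows would be serialized or joined inside Flink.

```scala
// Hypothetical helper types for illustration; not part of Flink or Spark.
final case class Schema(fieldNames: Vector[String]) {
  // Name -> position lookup, built once per schema.
  private val positions: Map[String, Int] = fieldNames.zipWithIndex.toMap

  def indexOf(name: String): Int =
    positions.getOrElse(name, throw new NoSuchElementException(s"unknown field: $name"))
}

final case class Row(schema: Schema, values: Vector[Any]) {
  require(values.length == schema.fieldNames.length, "row width must match schema width")

  def apply(index: Int): Any = values(index)                  // access by position
  def apply(name: String): Any = values(schema.indexOf(name)) // access by field name
}

object WideRowDemo {
  def main(args: Array[String]): Unit = {
    // Schema taken from a CSV header line at runtime.
    val schema = Schema("id,name,score".split(",").toVector)
    val row = Row(schema, "7,alice,0.5".split(",").toVector)
    println(row(0))
    println(row("score"))
  }
}
```

The values are untyped (`Any`) on purpose, which mirrors the trade-off described above: flexibility at runtime in exchange for compile-time type safety.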

I thought about generating Java classes for my schemata at runtime (e.g.
via Javassist), but that seems like a hack, and I'd probably have to
do this for each intermediate schema as well (e.g. when a map
operation alters the schema). I haven't tried this avenue yet, so I'm
not certain it would actually work, and even less certain that it would be
a nice and maintainable solution.

Can anyone suggest a nice way to deal with this kind of use case? I
can prepare an example if that would make it more clear.

Thanks,
Johann


Fwd: Scala Breeze Dependencies not resolving when adding flink-ml on build.sbt

2015-10-28 Thread Frederick Ayala
Hi,

I am getting an error when adding flink-ml to the libraryDependencies on my
build.sbt file:

[error] (*:update) sbt.ResolveException: unresolved dependency:
org.scalanlp#breeze_${scala.binary.version};0.11.2: not found

My libraryDependencies is:

libraryDependencies ++= Seq("org.apache.flink" % "flink-scala" % "0.9.1",
"org.apache.flink" % "flink-streaming-scala" % "0.9.1", "org.apache.flink"
% "flink-clients" % "0.9.1",
"org.apache.flink" % "flink-ml" % "0.9.1")

I am using scalaVersion := "2.10.6"

If I remove flink-ml all the other dependencies are resolved.

Could you help me to figure out a solution for this?

Thanks!

Frederick Ayala