It would be nice to have a join variant that directly returns the value rather than an Option. Why not have both (they are wrapped as flatJoins anyway below, right?)
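
To make the suggestion concrete, here is a rough sketch in plain Scala of what I mean. It does not use the real API: JoinVariants, FlatJoin, fromValue, fromOption and the toy in-memory join are all hypothetical names made up for illustration. I also gave the two wrappers different names on purpose, because two overloads taking (L, R) => O and (L, R) => Option[O] would clash after type erasure, as mentioned further down the thread.

```scala
// Hypothetical sketch only: none of these names exist in the Flink Scala API.
object JoinVariants {
  // The "flat" form both variants are lowered to: zero or more results per matching pair.
  type FlatJoin[L, R, O] = (L, R) => TraversableOnce[O]

  // Variant 1: the user function always returns exactly one value.
  def fromValue[L, R, O](f: (L, R) => O): FlatJoin[L, R, O] =
    (l, r) => Iterator.single(f(l, r))

  // Variant 2: the user function may drop a pair by returning None.
  def fromOption[L, R, O](f: (L, R) => Option[O]): FlatJoin[L, R, O] =
    (l, r) => f(l, r).iterator

  // Toy in-memory equi-join (nested loop), standing in for the real operator.
  def join[L, R, K, O](left: Seq[L], right: Seq[R])(lk: L => K, rk: R => K)(
      f: FlatJoin[L, R, O]): Seq[O] =
    for {
      l <- left
      r <- right if lk(l) == rk(r)
      o <- f(l, r).toSeq
    } yield o

  // Made-up sample data, keyed on the first tuple field.
  val left: Seq[(Int, String)] = Seq((1, "a"), (2, "b"))
  val right: Seq[(Int, Int)] = Seq((1, 10), (2, 20), (2, 30))

  // Value-returning variant: one result per matching pair.
  val demoValue: Seq[String] =
    join(left, right)(_._1, _._1)(
      fromValue((l: (Int, String), r: (Int, Int)) => s"${l._2}${r._2}"))

  // Option-returning variant: pairs can be filtered out.
  val demoOption: Seq[String] =
    join(left, right)(_._1, _._1)(
      fromOption((l: (Int, String), r: (Int, Int)) =>
        if (r._2 > 15) Some(l._2) else None))
}
```

With something along these lines the value-returning variant pays for one single-element Iterator per pair, and nobody is forced to write Some(...) when the join always produces a value.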

On Fri, Sep 12, 2014 at 11:50 AM, Fabian Hueske wrote:
> Sweet! I'm lovin' this :-)
>
> 2014-09-12 11:46 GMT+02:00 Aljoscha Krettek:
>> Also, you can use CaseClasses directly as the type for CSV input. So instead of reading it as tuples and then having a mapper that maps to your case classes you can use:
>>
>> env.readCsv[Edge](...)
>>
>> On Fri, Sep 12, 2014 at 11:43 AM, Aljoscha Krettek wrote:
>>> I added support for specifying keys by name for CaseClasses. Check out the PageRank and TriangleEnumeration examples to see it in action.
>>>
>>> @Kostas: I think you could use them for the TPC-H examples.
>>>
>>> On Fri, Sep 12, 2014 at 7:23 AM, Aljoscha Krettek wrote:
>>>> Yes, that would allow list comprehensions. It would be possible to have the Collection signature for join (and coGroup), i.e.:
>>>>
>>>> apply[R]((T, O) => TraversableOnce[R]): DataSet[R]
>>>>
>>>> (T and O are the left and right input type, R is result type)
>>>>
>>>> Then you can return collections and still return an option, as in:
>>>>
>>>> a.join(b).where(0).equalTo(0) { (l, r) => if (r > ...) Some(l) else None }
>>>>
>>>> Because there is an implicit conversion from Options to a Collection. This will always wrap the return value in a List with only one value. I'm not sure we want the overhead here. I'm also not sure whether we want the overhead of always having to use an Option even though the join always returns a value.
>>>>
>>>> What do you think?
>>>>
>>>> On Thu, Sep 11, 2014 at 11:22 PM, Fabian Hueske wrote:
>>>>> Hmmm, tricky question...
>>>>> How about the Option for Join as this is a tuple-wise operation and the Collection for Cogroup which is group-wise?
>>>>> Could we in that case use list comprehensions in Cogroup functions?
>>>>>
>>>>> Or is that too much mixing?
>>>>>
>>>>> 2014-09-11 23:00 GMT+02:00 Aljoscha Krettek:
>>>>>> I didn't look at the example either.
>>>>>>
>>>>>> Adding collections is easy, it's just that we can either have Collections or the Option, not both.
>>>>>>
>>>>>> For the coding style I followed this: https://cwiki.apache.org/confluence/display/SPARK/Spark+Code+Style+Guide, which itself is based on this: http://docs.scala-lang.org/style/. It is different from the Java Code Guidelines we have in place, yes.
>>>>>>
>>>>>> On Thu, Sep 11, 2014 at 10:10 PM, Fabian Hueske wrote:
>>>>>>> I haven't looked at the LineRank example in detail, but if you think that it adds something new to the examples collection, we can certainly port it also to Java.
>>>>>>> I think the Option and Collector return types are sufficient right now but if Collections are easy to add, go for it. ;-)
>>>>>>>
>>>>>>> Great that the Scala primitives are working! Also thanks for adding genSequence and adapting my examples.
>>>>>>> Btw. does the codestyle not apply for Scala files or do we have a different one there?
>>>>>>>
>>>>>>> 2014-09-11 17:55 GMT+02:00 Aljoscha Krettek:
>>>>>>>> What about the LineRank example? We had that in Scala but never had a Java Example.
>>>>>>>>
>>>>>>>> On Thu, Sep 11, 2014 at 5:51 PM, Aljoscha Krettek wrote:
>>>>>>>>> Yes, I like that. For the ITCases I always just copied the Java ITCase.
>>>>>>>>>
>>>>>>>>> The only examples that are missing now are LinearRegression and the relational stuff.
>>>>>>>>>
>>>>>>>>> On Thu, Sep 11, 2014 at 5:48 PM, Fabian Hueske wrote:
>>>>>>>>>> I just removed the old CountEdgeDegrees example.
>>>>>>>>>> That was a preprocessing step for the TriangleEnumeration, and is now part of the new TriangleEnumerationOpt example.
>>>>>>>>>> So I guess, we don't need to port that one. As I said before, I'd prefer to keep Java and Scala examples in sync.
>>>>>>>>>>
>>>>>>>>>> Cheers, Fabian
>>>>>>>>>>
>>>>>>>>>> 2014-09-11 17:40 GMT+02:00 Aljoscha Krettek:
>>>>>>>>>>> I added the PageRank example, thanks again fabian. :D
>>>>>>>>>>>
>>>>>>>>>>> Regarding the other stuff:
>>>>>>>>>>> - There is a comment in DataSet.scala about including org.apache.flink.api.scala._ because of the TypeInformation.
>>>>>>>>>>> - I added generateSequence to ExecutionEnvironment.
>>>>>>>>>>> - It is possible to use Scala Primitives in Array, I noticed it while writing the tests, you probably had an older version of the code.
>>>>>>>>>>> - Yes, using List and other Interfaces is not possible, this is also a restriction in the Java API.
>>>>>>>>>>>
>>>>>>>>>>> What do you think about the interface of join and coGroup? Right now, you can either use a lambda that returns an Option or the lambda with the Collector. Originally I wanted to also have a lambda that returns a Collection, but due to type erasure this has the same type as the lambda with the Option so I couldn't use it. There is an implicit conversion from Option to a Collection, so I could change it without breaking the examples we have now. What do you think?
>>>>>>>>>>>
>>>>>>>>>>> So far we have ported: WordCount, KMeans, ConnectedComponents, WebLogAnalysis, TransitiveClosureNaive, TriangleEnumerationNaive/Opt, PageRank
>>>>>>>>>>>
>>>>>>>>>>> These are the examples people called dibs on:
>>>>>>>>>>> - BatchGradientDescent (Márton) (Should be a port of LinearRegression Example from Java)
>>>>>>>>>>> - ComputeEdgeDegrees (Hermann)
>>>>>>>>>>>
>>>>>>>>>>> Those are unclaimed (if I'm not mistaken):
>>>>>>>>>>> - The relational Stuff
>>>>>>>>>>>
>>>>>>>>>>> On Thu, Sep 11, 2014 at 3:06 PM, Stephan Ewen wrote:
>>>>>>>>>>>> +1 for removing RelationQuery
>>>>>>>>>>>>
>>>>>>>>>>>> On Thu, Sep 11, 2014 at 3:04 PM, Aljoscha Krettek wrote:
>>>>>>>>>>>>> By the way, what was called BatchGradientDescent in the Scala examples should be replaced by a port of the LinearRegression Example from Java. I had them as two separate examples earlier.
>>>>>>>>>>>>>
>>>>>>>>>>>>> What about RelationalQuery and TPC-H-Q3. Any thoughts about removing RelationalQuery?
>>>>>>>>>>>>>
>>>>>>>>>>>>> On Thu, Sep 11, 2014 at 11:43 AM, Aljoscha Krettek wrote:
>>>>>>>>>>>>>> I added the Triangle Enumeration Examples, thanks Fabian.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> So far we have ported: WordCount, KMeans, ConnectedComponents, WebLogAnalysis, TransitiveClosureNaive, TriangleEnumerationNaive/Opt
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> These are the examples people called dibs on:
>>>>>>>>>>>>>> - PageRank (Fabian)
>>>>>>>>>>>>>> - BatchGradientDescent (Márton)
>>>>>>>>>>>>>> - ComputeEdgeDegrees (Hermann)
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> Those are unclaimed (if I'm not mistaken):
>>>>>>>>>>>>>> - The relational Stuff
>>>>>>>>>>>>>> - LinearRegression
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> On Wed, Sep 10, 2014 at 6:04 PM, Aljoscha Krettek wrote:
>>>>>>>>>>>>>>> Thanks, I added it. I'll keep a running list of ported/unported examples in my mails. I'll rename the java example package to examples once the Scala API merge is done.
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> I think the termination criterion is fine as it is. Just because Scala enables functional programming doesn't mean it's always the best choice.
>>>>>>>>>>>>>>> :D
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> So far we have ported: WordCount, KMeans, ConnectedComponents, WebLogAnalysis, TransitiveClosureNaive
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> These are the examples people called dibs on:
>>>>>>>>>>>>>>> - TriangleEnumeration and PageRank (Fabian)
>>>>>>>>>>>>>>> - BatchGradientDescent (Márton)
>>>>>>>>>>>>>>> - ComputeEdgeDegrees (Hermann)
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> Those are unclaimed (if I'm not mistaken):
>>>>>>>>>>>>>>> - The relational Stuff
>>>>>>>>>>>>>>> - LinearRegression
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> Cheers,
>>>>>>>>>>>>>>> Aljoscha
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> On Wed, Sep 10, 2014 at 4:23 PM, Kostas Tzoumas wrote:
>>>>>>>>>>>>>>>> Transitive closure here, I also added a termination criterion in the Java version:
>>>>>>>>>>>>>>>> https://github.com/ktzoumas/incubator-flink/tree/tc-scala-example
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> Perhaps you can make the termination criterion in Scala more functional?
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> I noticed that the examples package name is example.java but examples.scala
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> Kostas
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> On Tue, Sep 9, 2014 at 6:12 PM, Kostas Tzoumas wrote:
>>>>>>>>>>>>>>>>> I'll take TransitiveClosure and PiEstimation (was not on your list).
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> If nobody volunteers for the relational stuff I can take those as well.
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> How about removing the "RelationalQuery" from both Scala and Java? It seems to be a proper subset of TPC-H Q3. Does it add some teaching value on top of TPC-H Q3?
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> Kostas
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> On Tue, Sep 9, 2014 at 5:57 PM, Aljoscha Krettek wrote:
>>>>>>>>>>>>>>>>>> Thanks, I added it, along with an ITCase.
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> So far we have ported: WordCount, KMeans, ConnectedComponents, WebLogAnalysis
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> These are the examples people called dibs on:
>>>>>>>>>>>>>>>>>> - TriangleEnumeration and PageRank (Fabian)
>>>>>>>>>>>>>>>>>> - BatchGradientDescent (Márton)
>>>>>>>>>>>>>>>>>> - ComputeEdgeDegrees (Hermann)
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> Those are unclaimed (if I'm not mistaken):
>>>>>>>>>>>>>>>>>> - TransitiveClosure
>>>>>>>>>>>>>>>>>> - The relational Stuff
>>>>>>>>>>>>>>>>>> - LinearRegression
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> Cheers,
>>>>>>>>>>>>>>>>>> Aljoscha
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> On Tue, Sep 9, 2014 at 5:21 PM, Kostas Tzoumas wrote:
>>>>>>>>>>>>>>>>>>> WebLog here:
>>>>>>>>>>>>>>>>>>> https://github.com/ktzoumas/incubator-flink/tree/webloganalysis-example-scala
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>> Do you need any more done?
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>> On Tue, Sep 9, 2014 at 3:08 PM, Aljoscha Krettek wrote:
>>>>>>>>>>>>>>>>>>>> I added the ConnectedComponents Example from Vasia.
>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>> Keep 'em coming, people. :D
>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>> On Mon, Sep 8, 2014 at 6:07 PM, Fabian Hueske wrote:
>>>>>>>>>>>>>>>>>>>>> Alright, will do.
>>>>>>>>>>>>>>>>>>>>> Thanks!
>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>> 2014-09-08 17:48 GMT+02:00 Aljoscha Krettek:
>>>>>>>>>>>>>>>>>>>>>> Ok people, executive decision. :D
>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>> Please look at KMeansData.java and KMeans.scala. I'm storing the data in multi-dimensional object arrays and then converting it to the required Java or Scala objects.
>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>> Also, I changed isEqualTo to equalTo to make it consistent with the Java API.
>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>> Regarding Join (and coGroup).
>>>>>>>>>>>>>>>>>>>>>> There is no need for a keyword, you can just write:
>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>> left.join(right).where(0).equalTo(1) { (le, re) => new MyResult(le, re) }
>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>> On Mon, Sep 8, 2014 at 2:07 PM, Fabian Hueske wrote:
>>>>>>>>>>>>>>>>>>>>>>> Aside from the DataSet issue, I also found an inconsistency with the Java API. In Java join is done as:
>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>> ds1.join(ds2).where(...).equalTo(...)
>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>> where in the current Scala this is:
>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>> ds1.join(ds2).where(...).isEqualTo(...)
>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>> isEqualTo() should be renamed to equalTo(), IMO.
>>>>>>>>>>>>>>>>>>>>>>> Also, join (+cross and coGroup?) lacks the with() method because "with" is a keyword in Scala. Should we offer something similar for Scala or go with map() on Tuple2(left, right)?
>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>> 2014-09-08 13:51 GMT+02:00 Stephan Ewen:
>>>>>>>>>>>>>>>>>>>>>>>> Instead of Strings, Object[][] would work as well. That is a generic representation of a Tuple.
>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>> Alternatively, they could be stored as Java or Scala Tuples, with a generic utility method to convert between the two.
>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>> On Mon, Sep 8, 2014 at 10:55 AM, Fabian Hueske wrote:
>>>>>>>>>>>>>>>>>>>>>>>>> Yeah, I ran into the same problem...
>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>> +1 for using Strings and parsing them, but using the CSVFormat won't work because this is based on a FileInputFormat.
>>>>>>>>>>>>>>>>>>>>>>>>> So we would need to parse the Strings manually...
>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>> 2014-09-08 10:35 GMT+02:00 Aljoscha Krettek:
>>>>>>>>>>>>>>>>>>>>>>>>>> Hi,
>>>>>>>>>>>>>>>>>>>>>>>>>> on second thought. Maybe we should just change all the example input data to strings and use CSV input formats in all the examples. What do you think?
>>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>> Cheers,
>>>>>>>>>>>>>>>>>>>>>>>>>> Aljoscha
>>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>> On Mon, Sep 8, 2014 at 7:46 AM, Aljoscha Krettek wrote:
>>>>>>>>>>>>>>>>>>>>>>>>>>> Hi,
>>>>>>>>>>>>>>>>>>>>>>>>>>> yes it's unfortunate that the data types are incompatible. I'm afraid you have to do what you proposed: move the data to a static field and convert it in the getDefaultEdgeDataSet() method in Scala. It's not nice, but copying would duplicate the data and make it easier for it to go out of sync in the Java and Scala versions.
>>>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>> What do the others think?
This will > > probably > > >>>> >> occur > > >>>> >> >>> in > > >>>> >> >>> >> all > > >>>> >> >>> >> >>>>> >> >> >> > > > the > > >>>> >> >>> >> >>>>> >> >> >> examples. > > >>>> >> >>> >> >>>>> >> >> >> > > > > > >>>> >> >>> >> >>>>> >> >> >> > > > Cheers, > > >>>> >> >>> >> >>>>> >> >> >> > > > Aljoscha > > >>>> >> >>> >> >>>>> >> >> >> > > > > > >>>> >> >>> >> >>>>> >> >> >> > > > On Sun, Sep 7, 2014 at 10:04 PM, > > Vasiliki > > >>>> >> Kalavri > > >>>> >> >>> >> >>>>> >> >> >> > > > <[email protected]> wrote: > > >>>> >> >>> >> >>>>> >> >> >> > > >> Hey, > > >>>> >> >>> >> >>>>> >> >> >> > > >> > > >>>> >> >>> >> >>>>> >> >> >> > > >> I have ported the Connected > Components > > >>>> >> example, > > >>>> >> >>> but > > >>>> >> >>> >> I am > > >>>> >> >>> >> >>>>> >> >> >> > > >> not > > >>>> >> >>> >> >>>>> >> sure > > >>>> >> >>> >> >>>>> >> >> >> how > > >>>> >> >>> >> >>>>> >> >> >> > to > > >>>> >> >>> >> >>>>> >> >> >> > > >> reuse the example input data from > > >>>> >> java-examples. > > >>>> >> >>> >> >>>>> >> >> >> > > >> In the ConnectedComponentsData > class, > > the > > >>>> >> vertices > > >>>> >> >>> >> and > > >>>> >> >>> >> >>>>> >> >> >> > > >> edges > > >>>> >> >>> >> >>>>> >> data > > >>>> >> >>> >> >>>>> >> >> >> are > > >>>> >> >>> >> >>>>> >> >> >> > > >> produced by the methods > > >>>> >> getDefaultVertexDataSet() > > >>>> >> >>> >> >>>>> >> >> >> > > >> and getDefaultEdgeDataSet(), which > > take > > >>>> >> >>> >> >>>>> >> >> >> > > >> an > > >>>> >> org.apache.flink.api.java.ExecutionEnvironment > > >>>> >> >>> as > > >>>> >> >>> >> >>>>> >> parameter. 
> > >>>> >> >>> >> >>>>> >> >> >> > > >> > > >>>> >> >>> >> >>>>> >> >> >> > > >> One way is to provide public static > > fields > > >>>> >> (like > > >>>> >> >>> in > > >>>> >> >>> >> the > > >>>> >> >>> >> >>>>> >> >> >> WordCountData > > >>>> >> >>> >> >>>>> >> >> >> > > >> class), but this introduces a > > conversion > > >>>> >> >>> >> >>>>> >> >> >> > > >> from > > >>>> org.apache.flink.api.java.tuple.Tuple2 to > > >>>> >> >>> Scala > > >>>> >> >>> >> >>>>> >> >> >> > > >> tuple and > > >>>> >> >>> >> >>>>> >> >> from > > >>>> >> >>> >> >>>>> >> >> >> > > >> java.lang.Long to scala.Long and I > > guess > > >>>> this > > >>>> >> is > > >>>> >> >>> an > > >>>> >> >>> >> >>>>> >> unnecessary > > >>>> >> >>> >> >>>>> >> >> >> > > complexity > > >>>> >> >>> >> >>>>> >> >> >> > > >> for an example (?). > > >>>> >> >>> >> >>>>> >> >> >> > > >> Another way is, of course, to copy > the > > >>>> example > > >>>> >> >>> data > > >>>> >> >>> >> in > > >>>> >> >>> >> >>>>> >> >> >> > > >> the > > >>>> >> >>> >> >>>>> >> Scala > > >>>> >> >>> >> >>>>> >> >> >> > > example. > > >>>> >> >>> >> >>>>> >> >> >> > > >> > > >>>> >> >>> >> >>>>> >> >> >> > > >> Am I missing something here? > > >>>> >> >>> >> >>>>> >> >> >> > > >> > > >>>> >> >>> >> >>>>> >> >> >> > > >> Thanks! > > >>>> >> >>> >> >>>>> >> >> >> > > >> > > >>>> >> >>> >> >>>>> >> >> >> > > >> Cheers, > > >>>> >> >>> >> >>>>> >> >> >> > > >> V. 
> > >>>> >> >>> >> >>>>> >> >> >> > > >> > > >>>> >> >>> >> >>>>> >> >> >> > > >> > > >>>> >> >>> >> >>>>> >> >> >> > > >> On 5 September 2014 15:52, Aljoscha > > >>>> Krettek < > > >>>> >> >>> >> >>>>> >> [email protected] > > >>>> >> >>> >> >>>>> >> >> > > > >>>> >> >>> >> >>>>> >> >> >> > > wrote: > > >>>> >> >>> >> >>>>> >> >> >> > > >> > > >>>> >> >>> >> >>>>> >> >> >> > > >>> Alright, I updated my repo: > > >>>> >> >>> >> >>>>> >> >> >> > > >>> > > >>>> >> >>> >> >>>>> >> >> > > >>>> >> >>> >> > > https://github.com/aljoscha/incubator-flink/commits/scala-rework > > >>>> >> >>> >> >>>>> >> >> >> > > >>> > > >>>> >> >>> >> >>>>> >> >> >> > > >>> This now has a working WordCount > > example. > > >>>> >> It's > > >>>> >> >>> >> pretty > > >>>> >> >>> >> >>>>> >> >> >> > > >>> much a > > >>>> >> >>> >> >>>>> >> >> copy > > >>>> >> >>> >> >>>>> >> >> >> of > > >>>> >> >>> >> >>>>> >> >> >> > > >>> the Java example with some fixups > > for the > > >>>> >> syntax > > >>>> >> >>> and > > >>>> >> >>> >> >>>>> >> >> >> > > >>> lambda > > >>>> >> >>> >> >>>>> >> >> >> > functions. > > >>>> >> >>> >> >>>>> >> >> >> > > >>> You'll also notice that I added the > > >>>> >> java-examples > > >>>> >> >>> >> as a > > >>>> >> >>> >> >>>>> >> >> dependency > > >>>> >> >>> >> >>>>> >> >> >> for > > >>>> >> >>> >> >>>>> >> >> >> > > >>> the scala-examples. I did this to > > reuse > > >>>> the > > >>>> >> >>> example > > >>>> >> >>> >> >>>>> >> >> >> > > >>> input > > >>>> >> >>> >> >>>>> >> data. > > >>>> >> >>> >> >>>>> >> >> >> > > >>> > > >>>> >> >>> >> >>>>> >> >> >> > > >>> When you ported a program you can > do > > a > > >>>> pull > > >>>> >> >>> request > > >>>> >> >>> >> >>>>> >> >> >> > > >>> against > > >>>> >> >>> >> >>>>> >> my > > >>>> >> >>> >> >>>>> >> >> repo > > >>>> >> >>> >> >>>>> >> >> >> > > >>> and I will collect the examples. > > >>>> >> >>> >> >>>>> >> >> >> > > >>> > > >>>> >> >>> >> >>>>> >> >> >> > > >>> Happy coding. 
>>>>>>>>>>>>>>>>>>>>>>>>>>>>> :D
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>>>> On Fri, Sep 5, 2014 at 12:19 PM, Hermann Gábor wrote:
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> +1
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> ComputeEdgeDegrees for me!
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> On Fri, Sep 5, 2014 at 11:44 AM, Márton Balassi wrote:
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> +1
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> BatchGradientDescent for me :)
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> On Fri, Sep 5, 2014 at 11:15 AM, Kostas Tzoumas wrote:
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> +1
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> I go for WebLogAnalysis.
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> My experience with Scala consists of going through a tutorial so this will be a good stress test both for me and the new API :-)
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> On Thu, Sep 4, 2014 at 9:09 PM, Vasiliki Kalavri wrote:
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> +1 for having other people implement the examples!
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> Connected Components and Kmeans for me :)
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> -V.
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> On 4 September 2014 21:03, Fabian Hueske wrote:
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> I go for TriangleEnumeration and PageRank.
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> Let's also do the examples similar to the Java examples:
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> - running out-of-the-box without parameters
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> - parameters for external data
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> - follow a similar code structure
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> 2014-09-04 20:56 GMT+02:00 Aljoscha Krettek:
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> Will do, then people can reserve their favourite examples here.
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> On Thu, Sep 4, 2014 at 8:55 PM, Fabian Hueske wrote:
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> Hi,
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> I think having examples implemented by different people proved to be valuable in the past.
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> I'd help with two or three examples.
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> It might be helpful if you'd port a simple first one such as WordCount.
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> Fabian
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> 2014-09-04 18:47 GMT+02:00 Aljoscha Krettek:
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> Hi,
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> I have a working rewrite of the Scala API here:
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> https://github.com/aljoscha/incubator-flink/commits/scala-rework
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> I'm hoping that I'll only have to write the tests and port the examples.
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> Do you think it makes sense to let other people port the examples, so that someone else uses it and maybe notices some quirks in the API?
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> Cheers,
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> Aljoscha
