[jira] [Commented] (FLINK-2627) Make Scala Data Set utils easier to access

2017-06-28 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-2627?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16066596#comment-16066596
 ] 

ASF GitHub Bot commented on FLINK-2627:
---

Github user coveralls commented on the issue:

https://github.com/apache/flink/pull/1099
  

[![Coverage Status](https://coveralls.io/builds/12169841/badge)](https://coveralls.io/builds/12169841)

Changes Unknown when pulling **784cbc1a7901c65719f92919a2f584b5636105bf on sachingoel0101:scala_utils_fix** into ** on apache:master**.



> Make Scala Data Set utils easier to access
> --
>
> Key: FLINK-2627
> URL: https://issues.apache.org/jira/browse/FLINK-2627
> Project: Flink
>  Issue Type: Improvement
>  Components: Scala API
>Reporter: Sachin Goel
>Assignee: Sachin Goel
>Priority: Trivial
> Fix For: 0.10.0
>
>
> Currently, to use the Scala Data Set utility functions, one needs to write
> {{import org.apache.flink.api.scala.DataSetUtils.utilsToDataSet}}
> This is counter-intuitive and overly complicated, and it should be more in sync with
> how the Java utils are imported. I propose a package object that allows
> importing the utils like
> {{import org.apache.flink.api.scala.utils._}}
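
The proposal above can be sketched roughly as follows. This is a minimal illustration, not the code from the pull request: `MiniDataSet` and `MiniDataSetUtils` are hypothetical stand-ins so the snippet runs without Flink, and a plain `object utils` stands in for the package object so that everything fits in a single script.

```scala
// Hypothetical stand-in for Flink's DataSet, so the sketch runs without Flink.
final case class MiniDataSet[T](elements: Seq[T])

// In a real code base this would be `package object utils` inside
// org.apache.flink.api.scala; a plain object keeps the sketch to one script.
object utils {
  // The implicit class adds utility methods to every MiniDataSet
  // wherever `utils._` has been imported.
  implicit class MiniDataSetUtils[T](val self: MiniDataSet[T]) {
    // Pairs each element with its position, index first.
    def zipWithIndex(): MiniDataSet[(Long, T)] =
      MiniDataSet(self.elements.zipWithIndex.map { case (e, i) => (i.toLong, e) })
  }
}

// A single wildcard import brings all utilities into scope.
import utils._

val indexed = MiniDataSet(Seq("a", "b", "c")).zipWithIndex()
```

With this layout the caller only needs the wildcard import and never has to name the conversion itself.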



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Commented] (FLINK-2627) Make Scala Data Set utils easier to access

2015-09-18 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-2627?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14847300#comment-14847300
 ] 

ASF GitHub Bot commented on FLINK-2627:
---

Github user asfgit closed the pull request at:

https://github.com/apache/flink/pull/1099




[jira] [Commented] (FLINK-2627) Make Scala Data Set utils easier to access

2015-09-18 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-2627?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14805312#comment-14805312
 ] 

ASF GitHub Bot commented on FLINK-2627:
---

Github user sachingoel0101 commented on the pull request:

https://github.com/apache/flink/pull/1099#issuecomment-141404362
  
I've already removed the line break. :)




[jira] [Commented] (FLINK-2627) Make Scala Data Set utils easier to access

2015-09-18 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-2627?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14805310#comment-14805310
 ] 

ASF GitHub Bot commented on FLINK-2627:
---

Github user StephanEwen commented on the pull request:

https://github.com/apache/flink/pull/1099#issuecomment-141403984
  
Will address Till's comment and merge this...




[jira] [Commented] (FLINK-2627) Make Scala Data Set utils easier to access

2015-09-18 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-2627?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14805304#comment-14805304
 ] 

ASF GitHub Bot commented on FLINK-2627:
---

Github user tillrohrmann commented on the pull request:

https://github.com/apache/flink/pull/1099#issuecomment-141402933
  
LGTM.

+1 for merging :-)




[jira] [Commented] (FLINK-2627) Make Scala Data Set utils easier to access

2015-09-18 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-2627?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14805300#comment-14805300
 ] 

ASF GitHub Bot commented on FLINK-2627:
---

Github user tillrohrmann commented on a diff in the pull request:

https://github.com/apache/flink/pull/1099#discussion_r39838739
  
--- Diff: 
flink-scala/src/main/scala/org/apache/flink/api/scala/utils/package.scala ---
@@ -0,0 +1,109 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.flink.api.scala
+
+import org.apache.flink.api.common.typeinfo.{BasicTypeInfo, TypeInformation}
+import org.apache.flink.api.java.Utils
+import org.apache.flink.api.java.utils.{DataSetUtils => jutils}
+
+import _root_.scala.language.implicitConversions
+import _root_.scala.reflect.ClassTag
+
+package object utils {
+
+  /**
+   * This class provides simple utility methods for zipping elements in a data set with an index
+   * or with a unique identifier, sampling elements from a data set.
+   *
+   * @param self Data Set
+   */
+
--- End diff --

line break




[jira] [Commented] (FLINK-2627) Make Scala Data Set utils easier to access

2015-09-18 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-2627?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14805277#comment-14805277
 ] 

ASF GitHub Bot commented on FLINK-2627:
---

Github user StephanEwen commented on the pull request:

https://github.com/apache/flink/pull/1099#issuecomment-141395162
  
Looks good to me.

+1 to merge




[jira] [Commented] (FLINK-2627) Make Scala Data Set utils easier to access

2015-09-18 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-2627?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14805266#comment-14805266
 ] 

ASF GitHub Bot commented on FLINK-2627:
---

Github user sachingoel0101 commented on the pull request:

https://github.com/apache/flink/pull/1099#issuecomment-141391475
  
The failures are unrelated. JIRAs are already filed for them: 2700 and 2612.




[jira] [Commented] (FLINK-2627) Make Scala Data Set utils easier to access

2015-09-17 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-2627?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14803222#comment-14803222
 ] 

ASF GitHub Bot commented on FLINK-2627:
---

Github user sachingoel0101 commented on the pull request:

https://github.com/apache/flink/pull/1099#issuecomment-141148749
  
Hey @StephanEwen, apologies for being too eager but is it possible to get 
this in soon?




[jira] [Commented] (FLINK-2627) Make Scala Data Set utils easier to access

2015-09-10 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-2627?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14738315#comment-14738315
 ] 

ASF GitHub Bot commented on FLINK-2627:
---

Github user sachingoel0101 commented on the pull request:

https://github.com/apache/flink/pull/1099#issuecomment-139137139
  
@StephanEwen , can you look this over again?




[jira] [Commented] (FLINK-2627) Make Scala Data Set utils easier to access

2015-09-08 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-2627?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14734690#comment-14734690
 ] 

ASF GitHub Bot commented on FLINK-2627:
---

Github user sachingoel0101 commented on the pull request:

https://github.com/apache/flink/pull/1099#issuecomment-138530591
  
Travis passes successfully. I've squashed the commits.
This should be mergeable now.




[jira] [Commented] (FLINK-2627) Make Scala Data Set utils easier to access

2015-09-08 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-2627?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14734563#comment-14734563
 ] 

ASF GitHub Bot commented on FLINK-2627:
---

Github user sachingoel0101 commented on the pull request:

https://github.com/apache/flink/pull/1099#issuecomment-138504819
  
Ah. Thank you @aljoscha. Travis should pass. I've already pushed a fix.




[jira] [Commented] (FLINK-2627) Make Scala Data Set utils easier to access

2015-09-08 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-2627?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14734557#comment-14734557
 ] 

ASF GitHub Bot commented on FLINK-2627:
---

Github user aljoscha commented on the pull request:

https://github.com/apache/flink/pull/1099#issuecomment-138503770
  
I took a stab at removing the implicit parameters here: 
https://github.com/aljoscha/flink/commit/e197ea4aba4005400bc80a5693f17ec2617bfae5
The tests are still running on Travis but I think it should work.




[jira] [Commented] (FLINK-2627) Make Scala Data Set utils easier to access

2015-09-08 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-2627?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14734553#comment-14734553
 ] 

ASF GitHub Bot commented on FLINK-2627:
---

Github user sachingoel0101 commented on the pull request:

https://github.com/apache/flink/pull/1099#issuecomment-138502471
  
I think I agree with that. I wasn't too happy about using implicit 
arguments here; we're constructing the type information explicitly anyway. 
Will push a commit in a while to change this.




[jira] [Commented] (FLINK-2627) Make Scala Data Set utils easier to access

2015-09-08 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-2627?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14734467#comment-14734467
 ] 

ASF GitHub Bot commented on FLINK-2627:
---

Github user StephanEwen commented on the pull request:

https://github.com/apache/flink/pull/1099#issuecomment-138488251
  
I think this looks good.

I personally like explicit arguments, even if it means sometimes explicitly 
summoning an implicit bound. It just makes it clearer where the type infos 
flow, which is not that easy to figure out ;-)

This is subject to debate, though. @tillrohrmann and @aljoscha may have a
different opinion here...
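
The style described above, summoning a context bound once and then passing the values by hand, might look like the following. This is a sketch under assumptions: `TypeInfo`, `tuple2Info` and `indexedInfo` are hypothetical names standing in for Flink's `TypeInformation` machinery, so the snippet runs on its own.

```scala
// Hypothetical stand-in for Flink's TypeInformation[T].
final case class TypeInfo[T](name: String)

implicit val longInfo: TypeInfo[Long] = TypeInfo[Long]("Long")
implicit val stringInfo: TypeInfo[String] = TypeInfo[String]("String")

// Explicit parameters: the call site shows exactly which infos are combined.
def tuple2Info[A, B](a: TypeInfo[A], b: TypeInfo[B]): TypeInfo[(A, B)] =
  TypeInfo[(A, B)](s"(${a.name}, ${b.name})")

// The context bound is summoned once, visibly, and then passed along by hand.
def indexedInfo[T: TypeInfo]: TypeInfo[(Long, T)] = {
  val tInfo = implicitly[TypeInfo[T]]
  tuple2Info(longInfo, tInfo)
}

val info = indexedInfo[String]
```

The trade-off is a little more typing at each call site in exchange for being able to read off where every type info comes from.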




[jira] [Commented] (FLINK-2627) Make Scala Data Set utils easier to access

2015-09-08 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-2627?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14734461#comment-14734461
 ] 

ASF GitHub Bot commented on FLINK-2627:
---

Github user StephanEwen commented on a diff in the pull request:

https://github.com/apache/flink/pull/1099#discussion_r38903902
  
--- Diff: 
flink-scala/src/main/scala/org/apache/flink/api/scala/UnfinishedCoGroupOperation.scala
 ---
@@ -61,33 +59,14 @@ class UnfinishedCoGroupOperation[L: ClassTag, R: ClassTag](
 
 // We have to use this hack, for some reason classOf[Array[T]] does not work.
 // Maybe because ObjectArrayTypeInfo does not accept the Scala Array as an array class.
-val leftArrayType =
+implicit val leftArrayType =
--- End diff --

These values should be provided explicitly, not implicitly. That makes the
code much easier to understand.




[jira] [Commented] (FLINK-2627) Make Scala Data Set utils easier to access

2015-09-08 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-2627?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14734458#comment-14734458
 ] 

ASF GitHub Bot commented on FLINK-2627:
---

Github user StephanEwen commented on a diff in the pull request:

https://github.com/apache/flink/pull/1099#discussion_r38903737
  
--- Diff: 
flink-scala/src/main/scala/org/apache/flink/api/scala/package.scala ---
@@ -70,4 +73,27 @@ package object scala {
 }
 st(depth).toString
   }
+
+  def createTuple2TypeInformation[T1, T2]
+  (implicit t1: TypeInformation[T1], t2: TypeInformation[T2])
--- End diff --

Making this implicit seems dangerous; it should be provided explicitly.




[jira] [Commented] (FLINK-2627) Make Scala Data Set utils easier to access

2015-09-08 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-2627?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14734455#comment-14734455
 ] 

ASF GitHub Bot commented on FLINK-2627:
---

Github user StephanEwen commented on the pull request:

https://github.com/apache/flink/pull/1099#issuecomment-138485125
  
The type information generation is implemented as a macro that operates on the
abstract syntax tree. That is why it cannot be applied to its own code, or to
any other code in the same project, which is compiled in the same compilation run.




[jira] [Commented] (FLINK-2627) Make Scala Data Set utils easier to access

2015-09-07 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-2627?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14734101#comment-14734101
 ] 

ASF GitHub Bot commented on FLINK-2627:
---

Github user sachingoel0101 commented on the pull request:

https://github.com/apache/flink/pull/1099#issuecomment-138401226
  
@StephanEwen, I have created a separate function to create type
information for 2-tuples.

One question though: why is there a need to generate type information
explicitly here? The `TypeInformationGen` class does have a case analysis for
`Product` types. I may be very wrong, but I think the `createTypeInformation`
macro cannot be used anywhere inside the module itself, only after the module
has been compiled. This is perhaps why `createTypeInformation` works in, say,
`flink-ml`.




[jira] [Commented] (FLINK-2627) Make Scala Data Set utils easier to access

2015-09-07 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-2627?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14733866#comment-14733866
 ] 

ASF GitHub Bot commented on FLINK-2627:
---

Github user StephanEwen commented on the pull request:

https://github.com/apache/flink/pull/1099#issuecomment-138329755
  
I think the way to go is not to put the type info into the class, but into 
the methods, and create it as follows:

```scala
def zipWithIndex[T : TypeInformation : ClassTag](): DataSet[(Long, T)] = {
  // Summon the element type information from the context bound
  val tInfo = implicitly[TypeInformation[T]]

  // Build the tuple type information explicitly from its two field types
  implicit val tupleTypeInformation = new CaseClassTypeInfo[(Long, T)](
    classOf[(Long, T)],
    Array(BasicTypeInfo.LONG_TYPE_INFO, tInfo),
    Seq(BasicTypeInfo.LONG_TYPE_INFO, tInfo),
    Array("_1", "_2"))

  wrap(jutils.zipWithIndex(self.javaSet)).map { t => (t.f0.toLong, t.f1) }
}
```

All the methods in the utils class should have parentheses; they are not
side-effect-free getters, after all.

Also, some tooling around creating Scala Tuple type information would be
nice. I can see that there are more places where one would do that.
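
For readers unfamiliar with the convention behind the remark on parentheses, here is a minimal sketch in plain Scala. The `Counter` class is a hypothetical example and not part of Flink: pure accessors are declared without a parameter list, while methods that do work or have side effects are declared with `()`.

```scala
final class Counter {
  private var count = 0

  // Pure accessor: no parameter list, reads like a field.
  def current: Int = count

  // Has a side effect: declared and called with parentheses.
  def increment(): Int = {
    count += 1
    count
  }
}

val counter = new Counter
counter.increment()
counter.increment()
val seen = counter.current
```

Following the same convention, a utility that builds a new operation on a data set would be declared as `zipWithIndex()` rather than `zipWithIndex`.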






[jira] [Commented] (FLINK-2627) Make Scala Data Set utils easier to access

2015-09-07 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-2627?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14733840#comment-14733840
 ] 

ASF GitHub Bot commented on FLINK-2627:
---

Github user sachingoel0101 commented on a diff in the pull request:

https://github.com/apache/flink/pull/1099#discussion_r38869305
  
--- Diff: 
flink-scala/src/main/scala/org/apache/flink/api/scala/utils/package.scala ---
@@ -0,0 +1,124 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.flink.api.scala
+
+import org.apache.flink.api.common.ExecutionConfig
+import org.apache.flink.api.common.typeinfo.{BasicTypeInfo, TypeInformation}
+import org.apache.flink.api.common.typeutils.TypeSerializer
+import org.apache.flink.api.java.Utils
+import org.apache.flink.api.java.utils.{DataSetUtils => jutils}
+import org.apache.flink.api.scala.typeutils.{CaseClassSerializer, CaseClassTypeInfo}
+
+import _root_.scala.language.implicitConversions
+import _root_.scala.reflect.ClassTag
+
+package object utils {
+
+  /**
+   * This class provides simple utility methods for zipping elements in a data set with an index
+   * or with a unique identifier, sampling elements from a data set.
+   *
+   * @param self Data Set
+   */
+
+  implicit class DataSetUtils[T: TypeInformation: ClassTag](val self: DataSet[T]) {
+
+implicit val tupleTypeInformation = new CaseClassTypeInfo[(Long, T)](
+  classOf[(Long, T)],
--- End diff --

No. It always led to the error I mentioned above.




[jira] [Commented] (FLINK-2627) Make Scala Data Set utils easier to access

2015-09-07 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-2627?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14733836#comment-14733836
 ] 

ASF GitHub Bot commented on FLINK-2627:
---

Github user tillrohrmann commented on a diff in the pull request:

https://github.com/apache/flink/pull/1099#discussion_r38869114
  
--- Diff: 
flink-scala/src/main/scala/org/apache/flink/api/scala/utils/package.scala ---
@@ -0,0 +1,124 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.flink.api.scala
+
+import org.apache.flink.api.common.ExecutionConfig
+import org.apache.flink.api.common.typeinfo.{BasicTypeInfo, TypeInformation}
+import org.apache.flink.api.common.typeutils.TypeSerializer
+import org.apache.flink.api.java.Utils
+import org.apache.flink.api.java.utils.{DataSetUtils => jutils}
+import org.apache.flink.api.scala.typeutils.{CaseClassSerializer, CaseClassTypeInfo}
+
+import _root_.scala.language.implicitConversions
+import _root_.scala.reflect.ClassTag
+
+package object utils {
+
+  /**
+   * This class provides simple utility methods for zipping elements in a data set with an index
+   * or with a unique identifier, sampling elements from a data set.
+   *
+   * @param self Data Set
+   */
+
+  implicit class DataSetUtils[T: TypeInformation: ClassTag](val self: DataSet[T]) {
+
+implicit val tupleTypeInformation = new CaseClassTypeInfo[(Long, T)](
+  classOf[(Long, T)],
--- End diff --

Didn't `createTypeInformation[(Long, T)]` work?




[jira] [Commented] (FLINK-2627) Make Scala Data Set utils easier to access

2015-09-07 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-2627?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14733827#comment-14733827
 ] 

ASF GitHub Bot commented on FLINK-2627:
---

Github user sachingoel0101 commented on a diff in the pull request:

https://github.com/apache/flink/pull/1099#discussion_r38868511
  
--- Diff: 
flink-scala/src/main/scala/org/apache/flink/api/scala/utils/package.scala ---
@@ -0,0 +1,124 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.flink.api.scala
+
+import org.apache.flink.api.common.ExecutionConfig
+import org.apache.flink.api.common.typeinfo.{BasicTypeInfo, TypeInformation}
+import org.apache.flink.api.common.typeutils.TypeSerializer
+import org.apache.flink.api.java.Utils
+import org.apache.flink.api.java.utils.{DataSetUtils => jutils}
+import org.apache.flink.api.scala.typeutils.{CaseClassSerializer, CaseClassTypeInfo}
+
+import _root_.scala.language.implicitConversions
+import _root_.scala.reflect.ClassTag
+
+package object utils {
+
+  /**
+   * This class provides simple utility methods for zipping elements in a data set with an index
+   * or with a unique identifier, sampling elements from a data set.
+   *
+   * @param self Data Set
+   */
+
+  implicit class DataSetUtils[T: TypeInformation: ClassTag](val self: DataSet[T]) {
+
+implicit val tupleTypeInformation = new CaseClassTypeInfo[(Long, T)](
+  classOf[(Long, T)],
--- End diff --

@StephanEwen, is this what you had in mind?
Thanks a lot. Figuring this out cleared up a lot of things for me. :)




[jira] [Commented] (FLINK-2627) Make Scala Data Set utils easier to access

2015-09-07 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-2627?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14733725#comment-14733725
 ] 

ASF GitHub Bot commented on FLINK-2627:
---

Github user tillrohrmann commented on the pull request:

https://github.com/apache/flink/pull/1099#issuecomment-138303761
  
To be precise, directly before the `map` call. And you have to declare
the value as an implicit value. Otherwise, the `map` call won't find it.




[jira] [Commented] (FLINK-2627) Make Scala Data Set utils easier to access

2015-09-07 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-2627?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14733722#comment-14733722
 ] 

ASF GitHub Bot commented on FLINK-2627:
---

Github user StephanEwen commented on the pull request:

https://github.com/apache/flink/pull/1099#issuecomment-138302904
  
Ah, sorry, my bad, you should take the TypeInformation for `T` from the
call site.

You may need to manually create the type info for the tuple, from the `T` 
type info, by creating a case class type info for `Tuple2` with `Long` and `T`.
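
A minimal sketch of that idea, assuming a hypothetical stand-in type class `TypeInfo` rather than Flink's actual `TypeInformation` and `CaseClassTypeInfo` (all names in the block are illustrative). It shows the evidence for `(Long, T)` being assembled by hand from the evidence for `T`, then declared as a local implicit value so that a later call can find it:

```scala
object TupleInfoSketch {
  // Hypothetical stand-in for a type class such as TypeInformation.
  trait TypeInfo[T] { def name: String }

  object TypeInfo {
    implicit val longInfo: TypeInfo[Long] =
      new TypeInfo[Long] { val name = "Long" }
    implicit val stringInfo: TypeInfo[String] =
      new TypeInfo[String] { val name = "String" }
  }

  // Builds the evidence for the pair from the evidence for its two fields.
  // Not implicit on purpose: it has to be called by hand.
  def tupleInfo[T](implicit first: TypeInfo[Long],
                   second: TypeInfo[T]): TypeInfo[(Long, T)] =
    new TypeInfo[(Long, T)] {
      val name = s"(${first.name}, ${second.name})"
    }

  // Stands in for an operation that requires evidence for the pair type.
  def needsPairInfo[T](implicit pair: TypeInfo[(Long, T)]): String = pair.name

  // T's evidence comes from the call site through the context bound.
  def zipInfo[T: TypeInfo]: String = {
    // Declared implicit and placed directly before the call that needs it.
    implicit val pairInfo: TypeInfo[(Long, T)] = tupleInfo[T]
    needsPairInfo[T]
  }
}
```

Without the `implicit` modifier on `pairInfo`, the call to `needsPairInfo[T]` would not compile in this sketch, because no evidence for `(Long, T)` would be in implicit scope.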




[jira] [Commented] (FLINK-2627) Make Scala Data Set utils easier to access

2015-09-07 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-2627?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14733719#comment-14733719
 ] 

ASF GitHub Bot commented on FLINK-2627:
---

Github user StephanEwen commented on the pull request:

https://github.com/apache/flink/pull/1099#issuecomment-138302608
  
I think for this, the type information should be passed from the call site, 
you should not need to create it explicitly.




[jira] [Commented] (FLINK-2627) Make Scala Data Set utils easier to access

2015-09-07 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-2627?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14733575#comment-14733575
 ] 

ASF GitHub Bot commented on FLINK-2627:
---

Github user sachingoel0101 commented on the pull request:

https://github.com/apache/flink/pull/1099#issuecomment-138277563
  
I get the following error on using that: 
`macro implementation not found: createTypeInformation (the most common 
reason is that you cannot use macro implementations in the same compilation run 
that defines them)`
The correct place to use that would be just before the `wrap` call though, 
right?




[jira] [Commented] (FLINK-2627) Make Scala Data Set utils easier to access

2015-09-07 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-2627?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14733541#comment-14733541
 ] 

ASF GitHub Bot commented on FLINK-2627:
---

Github user tillrohrmann commented on the pull request:

https://github.com/apache/flink/pull/1099#issuecomment-138269532
  
Have you tried constructing the type information explicitly 
`createTypeInformation[(Long, T)]`?




[jira] [Commented] (FLINK-2627) Make Scala Data Set utils easier to access

2015-09-07 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-2627?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14733501#comment-14733501
 ] 

ASF GitHub Bot commented on FLINK-2627:
---

Github user sachingoel0101 commented on the pull request:

https://github.com/apache/flink/pull/1099#issuecomment-138258862
  
I am unable to get rid of the implicit type information for the `zip` 
functions, presumably because the type information for `(Long,T)` isn't found.




[jira] [Commented] (FLINK-2627) Make Scala Data Set utils easier to access

2015-09-07 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-2627?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14733394#comment-14733394
 ] 

ASF GitHub Bot commented on FLINK-2627:
---

Github user tillrohrmann commented on a diff in the pull request:

https://github.com/apache/flink/pull/1099#discussion_r38842611
  
--- Diff: 
flink-scala/src/main/scala/org/apache/flink/api/scala/utils/package.scala ---
@@ -0,0 +1,99 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.flink.api.scala
+
+import org.apache.flink.api.common.typeinfo.TypeInformation
+import org.apache.flink.api.java.Utils
+import org.apache.flink.api.java.utils.{DataSetUtils => jutils}
+
+import _root_.scala.language.implicitConversions
+import _root_.scala.reflect.ClassTag
+
+package object utils {
+
+  /**
+   * This class provides simple utility methods for zipping elements in a data set with an index
+   * or with a unique identifier, sampling elements from a data set.
+   *
+   * @param self Data Set
+   */
+
+  implicit class DataSetUtils[T](val self: DataSet[T]) {
+
+/**
+ * Method that takes a set of subtask index, total number of elements mappings
+ * and assigns ids to all the elements from the input data set.
+ *
+ * @return a data set of tuple 2 consisting of consecutive ids and initial values.
+ */
+def zipWithIndex(implicit ti: TypeInformation[(Long, T)],
--- End diff --

Stephan suggested removing the implicit parameter lists from all methods
and writing `implicit class DataSetUtils[T: TypeInformation: ClassTag](val self: DataSet[T])`
instead. +1 for his suggestion.
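
A rough sketch of the shape being suggested, using a plain `Seq` and a hypothetical `TypeInfo` type class in place of Flink's `DataSet` and `TypeInformation` (every name below is an assumption made for illustration). The point is that the context bounds sit on the implicit class, so the individual methods no longer need their own implicit parameter lists:

```scala
import scala.reflect.ClassTag

object ImplicitClassSketch {
  // Hypothetical stand-in for a type class such as TypeInformation.
  trait TypeInfo[T] { def name: String }

  object TypeInfo {
    implicit val stringInfo: TypeInfo[String] =
      new TypeInfo[String] { val name = "String" }
  }

  // Both context bounds are resolved once, when the wrapper is created.
  implicit class SeqUtils[T: TypeInfo: ClassTag](val self: Seq[T]) {
    // Pairs each element with a Long index; no implicit parameter list here.
    def zipWithLongIndex: Seq[(Long, T)] =
      self.zipWithIndex.map { case (elem, idx) => (idx.toLong, elem) }

    // Uses the TypeInfo evidence captured by the class.
    def elementTypeName: String = implicitly[TypeInfo[T]].name

    // Uses the ClassTag evidence, which array creation requires.
    def toTypedArray: Array[T] = self.toArray
  }

  def demo: Seq[(Long, String)] = Seq("a", "b").zipWithLongIndex
  def demoTypeName: String = Seq("a", "b").elementTypeName
}
```

Outside the enclosing object, `import ImplicitClassSketch._` brings the wrapper into scope, in the same spirit as the `import org.apache.flink.api.scala.utils._` proposed in the issue.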




[jira] [Commented] (FLINK-2627) Make Scala Data Set utils easier to access

2015-09-07 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-2627?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14733385#comment-14733385
 ] 

ASF GitHub Bot commented on FLINK-2627:
---

Github user sachingoel0101 commented on a diff in the pull request:

https://github.com/apache/flink/pull/1099#discussion_r38842188
  
--- Diff: 
flink-scala/src/main/scala/org/apache/flink/api/scala/utils/package.scala ---
@@ -0,0 +1,99 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.flink.api.scala
+
+import org.apache.flink.api.common.typeinfo.TypeInformation
+import org.apache.flink.api.java.Utils
+import org.apache.flink.api.java.utils.{DataSetUtils => jutils}
+
+import _root_.scala.language.implicitConversions
+import _root_.scala.reflect.ClassTag
+
+package object utils {
+
+  /**
+   * This class provides simple utility methods for zipping elements in a data set with an index
+   * or with a unique identifier, sampling elements from a data set.
+   *
+   * @param self Data Set
+   */
+
+  implicit class DataSetUtils[T](val self: DataSet[T]) {
+
+/**
+ * Method that takes a set of subtask index, total number of elements mappings
+ * and assigns ids to all the elements from the input data set.
+ *
+ * @return a data set of tuple 2 consisting of consecutive ids and initial values.
+ */
+def zipWithIndex(implicit ti: TypeInformation[(Long, T)],
--- End diff --

I'm not sure I understand. I'm not familiar with implicit values or with the
type information systems of Scala and Flink.




[jira] [Commented] (FLINK-2627) Make Scala Data Set utils easier to access

2015-09-07 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-2627?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14733374#comment-14733374
 ] 

ASF GitHub Bot commented on FLINK-2627:
---

Github user StephanEwen commented on a diff in the pull request:

https://github.com/apache/flink/pull/1099#discussion_r38841668
  
--- Diff: 
flink-scala/src/main/scala/org/apache/flink/api/scala/utils/package.scala ---
@@ -0,0 +1,99 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.flink.api.scala
+
+import org.apache.flink.api.common.typeinfo.TypeInformation
+import org.apache.flink.api.java.Utils
+import org.apache.flink.api.java.utils.{DataSetUtils => jutils}
+
+import _root_.scala.language.implicitConversions
+import _root_.scala.reflect.ClassTag
+
+package object utils {
+
+  /**
+   * This class provides simple utility methods for zipping elements in a data set with an index
+   * or with a unique identifier, sampling elements from a data set.
+   *
+   * @param self Data Set
+   */
+
+  implicit class DataSetUtils[T](val self: DataSet[T]) {
+
+/**
+ * Method that takes a set of subtask index, total number of elements mappings
+ * and assigns ids to all the elements from the input data set.
+ *
+ * @return a data set of tuple 2 consisting of consecutive ids and initial values.
+ */
+def zipWithIndex(implicit ti: TypeInformation[(Long, T)],
--- End diff --

Quick question: In most other parts of the Scala API, the TypeInformation 
is passed via context bounds. Even though that de-sugars to an implicit 
parameter, why not keep the style consistent over all functions?
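
For readers unfamiliar with the de-sugaring mentioned here, a small sketch may help. It uses a hypothetical stand-in type class `TypeInfo` (not Flink's `TypeInformation`), and the method names are made up for illustration. The two methods are equivalent: the context-bound form is rewritten by the compiler into an extra implicit parameter list like the one written out in the second method:

```scala
object ContextBoundSketch {
  // Hypothetical stand-in for a type class such as TypeInformation.
  trait TypeInfo[T] { def name: String }

  object TypeInfo {
    implicit val intInfo: TypeInfo[Int] =
      new TypeInfo[Int] { val name = "Int" }
    implicit val stringInfo: TypeInfo[String] =
      new TypeInfo[String] { val name = "String" }
  }

  // Context-bound style: the evidence has no name and is looked up
  // with `implicitly`.
  def describeSugared[T: TypeInfo](value: T): String =
    implicitly[TypeInfo[T]].name

  // Explicit style: the same evidence spelled out as an implicit parameter.
  def describeExplicit[T](value: T)(implicit ti: TypeInfo[T]): String =
    ti.name
}
```

Both calls resolve the same implicit instance, so the choice between the two forms is mostly one of style and of whether the method body needs a name for the evidence.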




[jira] [Commented] (FLINK-2627) Make Scala Data Set utils easier to access

2015-09-07 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-2627?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=1478#comment-1478
 ] 

ASF GitHub Bot commented on FLINK-2627:
---

GitHub user sachingoel0101 opened a pull request:

https://github.com/apache/flink/pull/1099

[FLINK-2627][utils] Make Scala Data Set utils easier to access

Introduces a package object for Scala data set utils to simplify usage.
New usage: `import org.apache.flink.api.scala.utils._`

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/sachingoel0101/flink scala_utils_fix

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/flink/pull/1099.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #1099


commit 7df1766278111583eac24c375ec7996f1854705b
Author: Sachin Goel 
Date:   2015-09-05T13:12:38Z

Makes scala utils access easier, in sync with Java utils accessor



