Re: [ANNOUNCE] New committer: Theodore Vasiloudis

2017-03-21 Thread Chiwan Park

Congratulations, Theo!

Regards,
Chiwan Park

On 03/22/2017 03:06 AM, Ted Yu wrote:

Congratulations !


On Tue, Mar 21, 2017 at 11:00 AM, Matthias J. Sax <mj...@apache.org> wrote:



Congrats!

On 3/21/17 8:59 AM, Greg Hogan wrote:

Welcome, Theo, and great to have you onboard with Flink and ML!



On Mar 21, 2017, at 4:35 AM, Robert Metzger <rmetz...@apache.org>
wrote:

Hi everybody,

On behalf of the PMC I am delighted to announce Theodore
Vasiloudis as a new Flink committer!

Theo has been a community member for a very long time and he is
one of the main drivers of the currently ongoing ML discussions
in Flink.


Welcome, Theo, and congratulations again on becoming a Flink
committer!


Regards, Robert








Re: Restructuring Javadoc and Scaladoc for libraries

2016-07-15 Thread Chiwan Park
Hi Robert,

Thanks for clarifying! I’ve filed this [1].

Regards,
Chiwan Park

[1]: https://issues.apache.org/jira/browse/FLINK-4223

> On Jul 14, 2016, at 9:56 PM, Robert Metzger <rmetz...@apache.org> wrote:
> 
> Hi Chiwan,
> 
> I think that's something we need to address. Probably the scaladoc plugin
> is not configured correctly everywhere.
> 
> On Thu, Jul 14, 2016 at 3:59 AM, Chiwan Park <chiwanp...@apache.org> wrote:
> 
>> Hi all,
>> 
>> I just noticed some scaladocs (Gelly Scala API, Streaming Scala API, and
>> FlinkML) are missing in scaladoc page but found in javadoc page, even
>> though the APIs are for Scala. Is this intentional?
>> 
>> I think we have to move some documentation to scaladoc.
>> 
>> Regards,
>> Chiwan Park
>> 
>> 



[jira] [Created] (FLINK-4223) Rearrange scaladoc and javadoc for Scala API

2016-07-15 Thread Chiwan Park (JIRA)
Chiwan Park created FLINK-4223:
--

 Summary: Rearrange scaladoc and javadoc for Scala API
 Key: FLINK-4223
 URL: https://issues.apache.org/jira/browse/FLINK-4223
 Project: Flink
  Issue Type: Improvement
  Components: Documentation
Reporter: Chiwan Park
Priority: Minor


Currently, the documentation for some Scala APIs (Gelly Scala API, FlinkML, Streaming 
Scala API) is published in the javadoc rather than in the scaladoc.





Restructuring Javadoc and Scaladoc for libraries

2016-07-13 Thread Chiwan Park
Hi all,

I just noticed that the scaladocs for some modules (Gelly Scala API, Streaming Scala 
API, and FlinkML) are missing from the scaladoc page but appear on the javadoc page, 
even though these APIs are written for Scala. Is this intentional?

I think we have to move some documentation to scaladoc.

Regards,
Chiwan Park



Re: [ANNOUNCE] Build Issues Solved

2016-05-31 Thread Chiwan Park
Hi Stephan,

Yes, right. But KNNITSuite calls ExecutionEnvironment.getExecutionEnvironment 
only once [1]. I’m testing a change that moves the getExecutionEnvironment call 
into each test case (a rough sketch follows below).

[1]: 
https://github.com/apache/flink/blob/master/flink-libraries/flink-ml/src/test/scala/org/apache/flink/ml/nn/KNNITSuite.scala#L45
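
For illustration, here is a minimal ScalaTest-style sketch (a hypothetical suite, not 
the actual KNNITSuite) of what acquiring a fresh environment inside each test case 
could look like:

```scala
import org.apache.flink.api.scala._
import org.scalatest.{FlatSpec, Matchers}

// Hypothetical suite: each test case asks for its own ExecutionEnvironment
// instead of sharing one environment created once for the whole suite.
class PerTestEnvironmentSuite extends FlatSpec with Matchers {

  "the first test" should "run on its own environment" in {
    val env = ExecutionEnvironment.getExecutionEnvironment
    env.fromElements(1, 2, 3).map(_ * 2).collect() should contain theSameElementsAs Seq(2, 4, 6)
  }

  "the second test" should "also run on its own environment" in {
    val env = ExecutionEnvironment.getExecutionEnvironment
    env.fromElements(1, 2, 3).reduce(_ + _).collect().head shouldBe 6
  }
}
```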

Regards,
Chiwan Park

> On May 31, 2016, at 7:09 PM, Stephan Ewen <se...@apache.org> wrote:
> 
> Hi Chiwan!
> 
> I think the Execution environment is not shared, because what the
> TestEnvironment sets is a Context Environment Factory. Every time you call
> "ExecutionEnvironment.getExecutionEnvironment()", you get a new environment.
> 
> Stephan
> 
> 
> On Tue, May 31, 2016 at 11:53 AM, Chiwan Park <chiwanp...@apache.org> wrote:
> 
>> I’ve created a JIRA issue [1] related to KNN test cases. I will send a PR
>> for it.
>> 
>> From my investigation [2], cluster for ML tests have only one taskmanager
>> with 4 slots. Is 2048 insufficient for total number of network numbers? I
>> still think the problem is sharing ExecutionEnvironment between test cases.
>> 
>> [1]: https://issues.apache.org/jira/browse/FLINK-3994
>> [2]:
>> https://github.com/apache/flink/blob/master/flink-test-utils/src/test/scala/org/apache/flink/test/util/FlinkTestBase.scala#L56
>> 
>> Regards,
>> Chiwan Park
>> 
>>> On May 31, 2016, at 6:05 PM, Maximilian Michels <m...@apache.org> wrote:
>>> 
>>> Thanks Stephan for the synopsis of our last weeks test instability
>>> madness. It's sad to see the shortcomings of Maven test plugins but
>>> another lesson learned is that our testing infrastructure should get a
>>> bit more attention. We have reached a point several times where our
>>> tests where inherently instable. Now we saw that even more problems
>>> were hidden in the dark. I would like to see more maintenance
>>> dedicated to testing.
>>> 
>>> @Chiwan: Please, no hotfix! Please open a JIRA issue and a pull
>>> request with a systematic fix. Those things are too crucial to be
>>> fixed on the go. The problems is that Travis reports the number of
>>> processors to be "32" (which is used for the number of task slots in
>>> local execution). The network buffers are not adjusted accordingly. We
>>> should set them correctly in the MiniCluster. Also, we could define an
>>> upper limit to the number of task slots for tests.
>>> 
>>> On Tue, May 31, 2016 at 10:59 AM, Chiwan Park <chiwanp...@apache.org>
>> wrote:
>>>> I think that the tests fail because of sharing ExecutionEnvironment
>> between test cases. I’m not sure why it is problem, but it is only
>> difference between other ML tests.
>>>> 
>>>> I created a hotfix and pushed it to my repository. When it seems fixed
>> [1], I’ll merge the hotfix to master branch.
>>>> 
>>>> [1]: https://travis-ci.org/chiwanpark/flink/builds/134104491
>>>> 
>>>> Regards,
>>>> Chiwan Park
>>>> 
>>>>> On May 31, 2016, at 5:43 PM, Chiwan Park <chiwanp...@apache.org>
>> wrote:
>>>>> 
>>>>> Maybe it seems about KNN test case which is merged into yesterday.
>> I’ll look into ML test.
>>>>> 
>>>>> Regards,
>>>>> Chiwan Park
>>>>> 
>>>>>> On May 31, 2016, at 5:38 PM, Ufuk Celebi <u...@apache.org> wrote:
>>>>>> 
>>>>>> Currently, an ML test is reliably failing and occasionally some HA
>>>>>> tests. Is someone looking into the ML test?
>>>>>> 
>>>>>> For HA, I will revert a commit, which might cause the HA
>>>>>> instabilities. Till is working on a proper fix as far as I know.
>>>>>> 
>>>>>> On Tue, May 31, 2016 at 3:50 AM, Chiwan Park <chiwanp...@apache.org>
>> wrote:
>>>>>>> Thanks for the great work! :-)
>>>>>>> 
>>>>>>> Regards,
>>>>>>> Chiwan Park
>>>>>>> 
>>>>>>>> On May 31, 2016, at 7:47 AM, Flavio Pompermaier <
>> pomperma...@okkam.it> wrote:
>>>>>>>> 
>>>>>>>> Awesome work guys!
>>>>>>>> And even more thanks for the detailed report...This troubleshooting
>> summary
>>>>>>>> will be undoubtedly useful for all our maven projects!
>>>>>>>> 
>>>>

Re: [ANNOUNCE] Build Issues Solved

2016-05-31 Thread Chiwan Park
I’ve created a JIRA issue [1] related to the KNN test cases. I will send a PR for 
it.

From my investigation [2], the cluster for the ML tests has only one taskmanager with 
4 slots. Are 2048 network buffers insufficient in total? I still think the problem is 
sharing the ExecutionEnvironment between test cases (see the configuration sketch below).

[1]: https://issues.apache.org/jira/browse/FLINK-3994
[2]: 
https://github.com/apache/flink/blob/master/flink-test-utils/src/test/scala/org/apache/flink/test/util/FlinkTestBase.scala#L56
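
As a side note, here is a minimal sketch (a hypothetical helper, not the actual 
FlinkTestBase code) of how a test cluster configuration could raise the slot and 
network buffer counts; the key names are the ones from the error message:

```scala
import org.apache.flink.configuration.Configuration

// Hypothetical helper: builds a configuration for a local test cluster with a
// higher number of network buffers than the default used by the ML tests.
object TestClusterConfig {
  def withMoreNetworkBuffers(slots: Int, buffers: Int): Configuration = {
    val config = new Configuration()
    config.setInteger("taskmanager.numberOfTaskSlots", slots)
    config.setInteger("taskmanager.network.numberOfBuffers", buffers)
    config
  }
}
```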

Regards,
Chiwan Park

> On May 31, 2016, at 6:05 PM, Maximilian Michels <m...@apache.org> wrote:
> 
> Thanks Stephan for the synopsis of our last weeks test instability
> madness. It's sad to see the shortcomings of Maven test plugins but
> another lesson learned is that our testing infrastructure should get a
> bit more attention. We have reached a point several times where our
> tests where inherently instable. Now we saw that even more problems
> were hidden in the dark. I would like to see more maintenance
> dedicated to testing.
> 
> @Chiwan: Please, no hotfix! Please open a JIRA issue and a pull
> request with a systematic fix. Those things are too crucial to be
> fixed on the go. The problems is that Travis reports the number of
> processors to be "32" (which is used for the number of task slots in
> local execution). The network buffers are not adjusted accordingly. We
> should set them correctly in the MiniCluster. Also, we could define an
> upper limit to the number of task slots for tests.
> 
> On Tue, May 31, 2016 at 10:59 AM, Chiwan Park <chiwanp...@apache.org> wrote:
>> I think that the tests fail because of sharing ExecutionEnvironment between 
>> test cases. I’m not sure why it is problem, but it is only difference 
>> between other ML tests.
>> 
>> I created a hotfix and pushed it to my repository. When it seems fixed [1], 
>> I’ll merge the hotfix to master branch.
>> 
>> [1]: https://travis-ci.org/chiwanpark/flink/builds/134104491
>> 
>> Regards,
>> Chiwan Park
>> 
>>> On May 31, 2016, at 5:43 PM, Chiwan Park <chiwanp...@apache.org> wrote:
>>> 
>>> Maybe it seems about KNN test case which is merged into yesterday. I’ll 
>>> look into ML test.
>>> 
>>> Regards,
>>> Chiwan Park
>>> 
>>>> On May 31, 2016, at 5:38 PM, Ufuk Celebi <u...@apache.org> wrote:
>>>> 
>>>> Currently, an ML test is reliably failing and occasionally some HA
>>>> tests. Is someone looking into the ML test?
>>>> 
>>>> For HA, I will revert a commit, which might cause the HA
>>>> instabilities. Till is working on a proper fix as far as I know.
>>>> 
>>>> On Tue, May 31, 2016 at 3:50 AM, Chiwan Park <chiwanp...@apache.org> wrote:
>>>>> Thanks for the great work! :-)
>>>>> 
>>>>> Regards,
>>>>> Chiwan Park
>>>>> 
>>>>>> On May 31, 2016, at 7:47 AM, Flavio Pompermaier <pomperma...@okkam.it> 
>>>>>> wrote:
>>>>>> 
>>>>>> Awesome work guys!
>>>>>> And even more thanks for the detailed report...This troubleshooting 
>>>>>> summary
>>>>>> will be undoubtedly useful for all our maven projects!
>>>>>> 
>>>>>> Best,
>>>>>> Flavio
>>>>>> On 30 May 2016 23:47, "Ufuk Celebi" <u...@apache.org> wrote:
>>>>>> 
>>>>>>> Thanks for the effort, Max and Stephan! Happy to see the green light 
>>>>>>> again.
>>>>>>> 
>>>>>>> On Mon, May 30, 2016 at 11:03 PM, Stephan Ewen <se...@apache.org> wrote:
>>>>>>>> Hi all!
>>>>>>>> 
>>>>>>>> After a few weeks of terrible build issues, I am happy to announce that
>>>>>>> the
>>>>>>>> build works again properly, and we actually get meaningful CI results.
>>>>>>>> 
>>>>>>>> Here is a story in many acts, from builds deep red to bright green joy.
>>>>>>>> Kudos to Max, who did most of this troubleshooting. This evening, Max 
>>>>>>>> and
>>>>>>>> me debugged the final issue and got the build back on track.
>>>>>>>> 
>>>>>>>> --
>>>>>>>> The Journey
>>>>>>>> --
>>>>>>>> 

[jira] [Created] (FLINK-3994) Instable KNNITSuite

2016-05-31 Thread Chiwan Park (JIRA)
Chiwan Park created FLINK-3994:
--

 Summary: Instable KNNITSuite
 Key: FLINK-3994
 URL: https://issues.apache.org/jira/browse/FLINK-3994
 Project: Flink
  Issue Type: Bug
  Components: Machine Learning Library, Tests
Affects Versions: 1.1.0
Reporter: Chiwan Park
Priority: Critical


KNNITSuite fails on Travis CI with the following error:

{code}
org.apache.flink.runtime.client.JobExecutionException: Job execution failed.
  at 
org.apache.flink.runtime.jobmanager.JobManager$$anonfun$handleMessage$1$$anonfun$applyOrElse$7.apply$mcV$sp(JobManager.scala:806)
  at 
org.apache.flink.runtime.jobmanager.JobManager$$anonfun$handleMessage$1$$anonfun$applyOrElse$7.apply(JobManager.scala:752)
  at 
org.apache.flink.runtime.jobmanager.JobManager$$anonfun$handleMessage$1$$anonfun$applyOrElse$7.apply(JobManager.scala:752)
  at 
scala.concurrent.impl.Future$PromiseCompletingRunnable.liftedTree1$1(Future.scala:24)
  at scala.concurrent.impl.Future$PromiseCompletingRunnable.run(Future.scala:24)
  at akka.dispatch.TaskInvocation.run(AbstractDispatcher.scala:41)
  at 
akka.dispatch.ForkJoinExecutorConfigurator$AkkaForkJoinTask.exec(AbstractDispatcher.scala:401)
  at scala.concurrent.forkjoin.ForkJoinTask.doExec(ForkJoinTask.java:260)
  at 
scala.concurrent.forkjoin.ForkJoinPool$WorkQueue.pollAndExecAll(ForkJoinPool.java:1253)
  at 
scala.concurrent.forkjoin.ForkJoinPool$WorkQueue.runTask(ForkJoinPool.java:1346)
  ...
  Cause: java.io.IOException: Insufficient number of network buffers: required 
32, but only 4 available. The total number of network buffers is currently set 
to 2048. You can increase this number by setting the configuration key 
'taskmanager.network.numberOfBuffers'.
  at 
org.apache.flink.runtime.io.network.buffer.NetworkBufferPool.createBufferPool(NetworkBufferPool.java:196)
  at 
org.apache.flink.runtime.io.network.NetworkEnvironment.registerTask(NetworkEnvironment.java:327)
  at org.apache.flink.runtime.taskmanager.Task.run(Task.java:497)
  at java.lang.Thread.run(Thread.java:745)
  ...
{code}

https://s3.amazonaws.com/archive.travis-ci.org/jobs/134064237/log.txt
https://s3.amazonaws.com/archive.travis-ci.org/jobs/134064236/log.txt
https://s3.amazonaws.com/archive.travis-ci.org/jobs/134064235/log.txt
https://s3.amazonaws.com/archive.travis-ci.org/jobs/134052961/log.txt





Re: [ANNOUNCE] Build Issues Solved

2016-05-31 Thread Chiwan Park
I think that the tests fail because the ExecutionEnvironment is shared between 
test cases. I’m not sure why that is a problem, but it is the only difference from 
the other ML tests.

I created a hotfix and pushed it to my repository. Once it looks fixed [1], 
I’ll merge the hotfix into the master branch.

[1]: https://travis-ci.org/chiwanpark/flink/builds/134104491
 
Regards,
Chiwan Park

> On May 31, 2016, at 5:43 PM, Chiwan Park <chiwanp...@apache.org> wrote:
> 
> Maybe it seems about KNN test case which is merged into yesterday. I’ll look 
> into ML test.
> 
> Regards,
> Chiwan Park
> 
>> On May 31, 2016, at 5:38 PM, Ufuk Celebi <u...@apache.org> wrote:
>> 
>> Currently, an ML test is reliably failing and occasionally some HA
>> tests. Is someone looking into the ML test?
>> 
>> For HA, I will revert a commit, which might cause the HA
>> instabilities. Till is working on a proper fix as far as I know.
>> 
>> On Tue, May 31, 2016 at 3:50 AM, Chiwan Park <chiwanp...@apache.org> wrote:
>>> Thanks for the great work! :-)
>>> 
>>> Regards,
>>> Chiwan Park
>>> 
>>>> On May 31, 2016, at 7:47 AM, Flavio Pompermaier <pomperma...@okkam.it> 
>>>> wrote:
>>>> 
>>>> Awesome work guys!
>>>> And even more thanks for the detailed report...This troubleshooting summary
>>>> will be undoubtedly useful for all our maven projects!
>>>> 
>>>> Best,
>>>> Flavio
>>>> On 30 May 2016 23:47, "Ufuk Celebi" <u...@apache.org> wrote:
>>>> 
>>>>> Thanks for the effort, Max and Stephan! Happy to see the green light 
>>>>> again.
>>>>> 
>>>>> On Mon, May 30, 2016 at 11:03 PM, Stephan Ewen <se...@apache.org> wrote:
>>>>>> Hi all!
>>>>>> 
>>>>>> After a few weeks of terrible build issues, I am happy to announce that
>>>>> the
>>>>>> build works again properly, and we actually get meaningful CI results.
>>>>>> 
>>>>>> Here is a story in many acts, from builds deep red to bright green joy.
>>>>>> Kudos to Max, who did most of this troubleshooting. This evening, Max and
>>>>>> me debugged the final issue and got the build back on track.
>>>>>> 
>>>>>> --
>>>>>> The Journey
>>>>>> --
>>>>>> 
>>>>>> (1) Failsafe Plugin
>>>>>> 
>>>>>> The Maven Failsafe Build Plugin had a critical bug due to which failed
>>>>>> tests did not result in a failed build.
>>>>>> 
>>>>>> That is a pretty bad bug for a plugin whose only task is to run tests and
>>>>>> fail the build if a test fails.
>>>>>> 
>>>>>> After we recognized that, we upgraded the Failsafe Plugin.
>>>>>> 
>>>>>> 
>>>>>> (2) Failsafe Plugin Dependency Issues
>>>>>> 
>>>>>> After the upgrade, the Failsafe Plugin behaved differently and did not
>>>>>> interoperate with Dependency Shading any more.
>>>>>> 
>>>>>> Because of that, we switched to the Surefire Plugin.
>>>>>> 
>>>>>> 
>>>>>> (3) Fixing all the issues introduced in the meantime
>>>>>> 
>>>>>> Naturally, a number of test instabilities had been introduced, which
>>>>> needed
>>>>>> to be fixed.
>>>>>> 
>>>>>> 
>>>>>> (4) Yarn Tests and Test Scope Refactoring
>>>>>> 
>>>>>> In the meantime, a Pull Request was merged that moved the Yarn Tests to
>>>>> the
>>>>>> test scope.
>>>>>> Because the configuration searched for tests in the "main" scope, no Yarn
>>>>>> tests were executed for a while, until the scope was fixed.
>>>>>> 
>>>>>> 
>>>>>> (5) Yarn Tests and JMX Metrics
>>>>>> 
>>>>>> After the Yarn Tests were re-activated, we saw them fail due to warnings
>>>>>> created by the newly introduced metrics code. We could fix that by
>>>>> updating
>>>>>> the metrics code and temporarily not registering JMX beans for all
>>>>> metrics.
>>>>>> 

Re: [ANNOUNCE] Build Issues Solved

2016-05-30 Thread Chiwan Park
Thanks for the great work! :-)

Regards,
Chiwan Park

> On May 31, 2016, at 7:47 AM, Flavio Pompermaier <pomperma...@okkam.it> wrote:
> 
> Awesome work guys!
> And even more thanks for the detailed report...This troubleshooting summary
> will be undoubtedly useful for all our maven projects!
> 
> Best,
> Flavio
> On 30 May 2016 23:47, "Ufuk Celebi" <u...@apache.org> wrote:
> 
>> Thanks for the effort, Max and Stephan! Happy to see the green light again.
>> 
>> On Mon, May 30, 2016 at 11:03 PM, Stephan Ewen <se...@apache.org> wrote:
>>> Hi all!
>>> 
>>> After a few weeks of terrible build issues, I am happy to announce that
>> the
>>> build works again properly, and we actually get meaningful CI results.
>>> 
>>> Here is a story in many acts, from builds deep red to bright green joy.
>>> Kudos to Max, who did most of this troubleshooting. This evening, Max and
>>> me debugged the final issue and got the build back on track.
>>> 
>>> --
>>> The Journey
>>> --
>>> 
>>> (1) Failsafe Plugin
>>> 
>>> The Maven Failsafe Build Plugin had a critical bug due to which failed
>>> tests did not result in a failed build.
>>> 
>>> That is a pretty bad bug for a plugin whose only task is to run tests and
>>> fail the build if a test fails.
>>> 
>>> After we recognized that, we upgraded the Failsafe Plugin.
>>> 
>>> 
>>> (2) Failsafe Plugin Dependency Issues
>>> 
>>> After the upgrade, the Failsafe Plugin behaved differently and did not
>>> interoperate with Dependency Shading any more.
>>> 
>>> Because of that, we switched to the Surefire Plugin.
>>> 
>>> 
>>> (3) Fixing all the issues introduced in the meantime
>>> 
>>> Naturally, a number of test instabilities had been introduced, which
>> needed
>>> to be fixed.
>>> 
>>> 
>>> (4) Yarn Tests and Test Scope Refactoring
>>> 
>>> In the meantime, a Pull Request was merged that moved the Yarn Tests to
>> the
>>> test scope.
>>> Because the configuration searched for tests in the "main" scope, no Yarn
>>> tests were executed for a while, until the scope was fixed.
>>> 
>>> 
>>> (5) Yarn Tests and JMX Metrics
>>> 
>>> After the Yarn Tests were re-activated, we saw them fail due to warnings
>>> created by the newly introduced metrics code. We could fix that by
>> updating
>>> the metrics code and temporarily not registering JMX beans for all
>> metrics.
>>> 
>>> 
>>> (6) Yarn / Surefire Deadlock
>>> 
>>> Finally, some Yarn tests failed reliably in Maven (though not in the
>> IDE).
>>> It turned out that those test a command line interface that interacts
>> with
>>> the standard input stream.
>>> 
>>> The newly deployed Surefire Plugin uses standard input as well, for
>>> communication with forked JVMs. Since Surefire internally locks the
>>> standard input stream, the Yarn CLI cannot poll the standard input stream
>>> without locking up and stalling the tests.
>>> 
>>> We adjusted the tests and now the build happily builds again.
>>> 
>>> -
>>> Conclusions:
>>> -
>>> 
>>>  - CI is terribly crucial It took us weeks with the fallout of having a
>>> period of unreliably CI.
>>> 
>>>  - Maven could do a better job. A bug as crucial as the one that started
>>> our problem should not occur in a test plugin like surefire. Also, the
>>> constant change of semantics and dependency scopes is annoying. The
>>> semantic changes are subtle, but for a build as complex as Flink, they
>> make
>>> a difference.
>>> 
>>>  - File-based communication is rarely a good idea. The bug in the
>> failsafe
>>> plugin was caused by improper file-based communication, and some of our
>>> discovered instabilities as well.
>>> 
>>> Greetings,
>>> Stephan
>>> 
>>> 
>>> PS: Some issues and mysteries remain for us to solve: When we allow our
>>> metrics subsystem to register JMX beans, we see some tests failing due to
>>> spontaneous JVM process kills. Whoever has a pointer there, please ping
>> us!
>> 



Re: Intellij code style

2016-05-12 Thread Chiwan Park
Please create a JIRA issue for this and send the PR with the JIRA issue number.

Regards,
Chiwan Park

> On May 12, 2016, at 7:15 PM, Flavio Pompermaier <pomperma...@okkam.it> wrote:
> 
> Do I need to open also a Jira or just the PR?
> 
> On Thu, May 12, 2016 at 12:03 PM, Stephan Ewen <se...@apache.org> wrote:
> 
>> Yes, please open a pull request for that.
>> 
>> On Thu, May 12, 2016 at 11:40 AM, Flavio Pompermaier <pomperma...@okkam.it
>>> 
>> wrote:
>> 
>>> If you're interested to I created an Eclipse version that should follows
>>> Flink coding rules..should I create a new JIRA for it?
>>> 
>>> On Thu, May 5, 2016 at 6:02 PM, Dawid Wysakowicz <
>>> wysakowicz.da...@gmail.com
>>>> wrote:
>>> 
>>>> I opened JIRA: https://issues.apache.org/jira/browse/FLINK-3870. and
>>>> created PR both to flink and flink-web.
>>>> 
>>>> https://github.com/apache/flink/pull/1963
>>>> https://github.com/apache/flink-web/pull/20
>>>> 
>>>> I would be thankful for a review.
>>>> 
>>>> 2016-05-04 11:00 GMT+02:00 Fabian Hueske <fhue...@gmail.com>:
>>>> 
>>>>> Yes, please open a JIRA. Thanks!
>>>>> 
>>>>> 2016-05-04 10:16 GMT+02:00 Dawid Wysakowicz <
>>> wysakowicz.da...@gmail.com
>>>>> :
>>>>> 
>>>>>> Sure, Will open PR shortly. Shall I create any JIRA issue?
>>>>>> 
>>>>>> 2016-05-04 9:28 GMT+02:00 Fabian Hueske <fhue...@gmail.com>:
>>>>>> 
>>>>>>> +1 for adding a template to the tools folder and linking it from
>>> the
>>>>>> coding
>>>>>>> guide lines!
>>>>>>> 
>>>>>>> 2016-05-04 6:08 GMT+02:00 Henry Saputra <henry.sapu...@gmail.com
>>> :
>>>>>>> 
>>>>>>>> We could actually put this in the tools directory of the source
>>> and
>>>>>> repo
>>>>>>>> and refer it from contribution guide.
>>>>>>>> 
>>>>>>>> @Dawid want to try to send Pull request for it?
>>>>>>>> 
>>>>>>>> On Thursday, April 28, 2016, Theodore Vasiloudis <
>>>>>>>> theodoros.vasilou...@gmail.com> wrote:
>>>>>>>> 
>>>>>>>>> Do we plan to include something like this in the contribution
>>>> guide
>>>>>> as
>>>>>>>>> well?
>>>>>>>>> 
>>>>>>>>> On Thu, Apr 28, 2016 at 3:16 PM, Stefano Baghino <
>>>>>>>>> stefano.bagh...@radicalbit.io <javascript:;>> wrote:
>>>>>>>>> 
>>>>>>>>>> Awesome Dawid! Thanks for taking the time to do this. :)
>>>>>>>>>> 
>>>>>>>>>> On Thu, Apr 28, 2016 at 1:45 PM, Dawid Wysakowicz <
>>>>>>>>>> wysakowicz.da...@gmail.com <javascript:;>> wrote:
>>>>>>>>>> 
>>>>>>>>>>> Hi,
>>>>>>>>>>> 
>>>>>>>>>>> I tried to create a code style that would follow Flink
>>>>>> code-style.
>>>>>>> It
>>>>>>>>> may
>>>>>>>>>>> be not "production" ready, but I think it can be a good
>>>> start.
>>>>>>>>>>> Hope it will be useful for someone. Also I will be glad
>> for
>>>> any
>>>>>>>>> comments
>>>>>>>>>>> on that.
>>>>>>>>>>> 
>>>>>>>>>>> 2016-04-10 13:59 GMT+02:00 Stephan Ewen <
>> se...@apache.org
>>>>>>>>> <javascript:;>>:
>>>>>>>>>>> 
>>>>>>>>>>>> I don't know how close Phoenix' code style is to Flink's
>>>>>> de-facto
>>>>>>>> code
>>>>>>>>>>>> style.
>>>>>>>>>>>> I would create one that reflects Flink's de-facto code
>>>> style,
>>>>> so
>>>>>>>> that
>>>>>>>>>> the
>>>>>>>>>>>> formatt

Re: [VOTE] Release Apache Flink 1.0.2 (RC3)

2016-04-20 Thread Chiwan Park
AFAIK, FLINK-3701 is about Flink 1.1-SNAPSHOT, not Flink 1.0. We can go forward.

Regards,
Chiwan Park

> On Apr 20, 2016, at 9:33 PM, Trevor Grant <trevor.d.gr...@gmail.com> wrote:
> 
> -1
> 
> Not a PMC so my down vote doesn't mean anything but...
> 
> https://github.com/apache/flink/pull/1913
> 
> https://issues.apache.org/jira/browse/FLINK-3701
> 
> A busted scala shell is a blocker imho.
> 
> 
> Trevor Grant
> Data Scientist
> https://github.com/rawkintrevo
> http://stackexchange.com/users/3002022/rawkintrevo
> http://trevorgrant.org
> 
> *"Fortunate is he, who is able to know the causes of things."  -Virgil*
> 
> 
> On Wed, Apr 20, 2016 at 6:51 AM, Aljoscha Krettek <aljos...@apache.org>
> wrote:
> 
>> +1
>> 
>> I eyeballed the changes and nothing looks suspicious.
>> 
>> On Wed, 20 Apr 2016 at 13:21 Fabian Hueske <fhue...@gmail.com> wrote:
>> 
>>> Thanks Ufuk for preparing the RC.
>>> 
>>> - Checked the diff against release 1.0.1. No dependencies were added or
>>> modified.
>>> - Checked signatures and hashes of all release artifacts.
>>> 
>>> +1 to release this RC.
>>> 
>>> Thanks, Fabian
>>> 
>>> 2016-04-19 21:34 GMT+02:00 Robert Metzger <rmetz...@apache.org>:
>>> 
>>>> Thank you for creating another bugfix release of the 1.0 release Ufuk!
>>>> 
>>>> +1 for releasing this proposed RC.
>>>> 
>>>> 
>>>> - Checked some flink-dist jars for correctly shaded guava classes
>>>> - Started Flink in local mode and ran some examples
>>>> - Checked the staging repository
>>>>  - Checked the quickstarts for the scala 2.10 and the right hadoop
>>>> profiles.
>>>> 
>>>> 
>>>> 
>>>> On Mon, Apr 18, 2016 at 5:29 PM, Ufuk Celebi <u...@apache.org> wrote:
>>>> 
>>>>> Dear Flink community,
>>>>> 
>>>>> Please vote on releasing the following candidate as Apache Flink
>>> version
>>>>> 1.0.2.
>>>>> 
>>>>> The commit to be voted on:
>>>>> d39af152a166ddafaa2466cdae82695880893f3e
>>>>> 
>>>>> Branch:
>>>>> release-1.0.2-rc3 (see
>>>>> 
>>>>> 
>>>> 
>>> 
>> https://git1-us-west.apache.org/repos/asf/flink/?p=flink.git;a=shortlog;h=refs/heads/release-1.0.2-rc3
>>>>> )
>>>>> 
>>>>> The release artifacts to be voted on can be found at:
>>>>> http://home.apache.org/~uce/flink-1.0.2-rc3/
>>>>> 
>>>>> The release artifacts are signed with the key with fingerprint
>>> 9D403309:
>>>>> http://www.apache.org/dist/flink/KEYS
>>>>> 
>>>>> The staging repository for this release can be found at:
>>>>> 
>> https://repository.apache.org/content/repositories/orgapacheflink-1092
>>>>> 
>>>>> -
>>>>> 
>>>>> The vote is open for the next 72 hours and passes if a majority of at
>>>> least
>>>>> three +1 PMC votes are cast.
>>>>> 
>>>>> The vote ends on Thursday April 21, 2016.
>>>>> 
>>>>> [ ] +1 Release this package as Apache Flink 1.0.2
>>>>> [ ] -1 Do not release this package because ...
>>>>> 
>>>>> ===
>>>>> 
>>>>> The following commits have been added since the 1.0.1 release
>>> (excluding
>>>>> docs),
>>>>> most notably a performance optimization for the RocksDB state backend
>>> and
>>>>> a fix
>>>>> for proper passing of dynamic YARN properties to the Client.
>>>>> 
>>>>> * 5987eb6 - [FLINK-3657] [dataSet] Change access of
>>>>> DataSetUtils.countElements() to 'public' (5 hours ago) 
>>>>> * b4b08ca - [FLINK-3762] [core] Enable Kryo reference tracking (3
>> days
>>>>> ago) 
>>>>> * 5b69dd8 - [FLINK-3732] [core] Fix potential null deference in
>>>>> ExecutionConfig#equals() (3 days ago) 
>>>>> * aadc5fa - [FLINK-3760] Fix StateDescriptor.readObject (19 hours
>> ago)
>>>>> 
>>>>> * ea50ed3 - [FLINK-3716] [kafka consumer] Decreasing socket timeout
>> so
>>>>> testFailOnNoBroker() will pass before JUnit timeout (29 hours ago)
>>>>> 
>>>>> * ff38202 - [FLINK-3730] Fix RocksDB Local Directory Initialization
>>>>> (29 hours ago) 
>>>>> * 4f9c198 - [FLINK-3712] Make all dynamic properties available to the
>>>>> CLI frontend (3 days ago) 
>>>>> * 1554c9b - [FLINK-3688] WindowOperator.trigger() does not emit
>>>>> Watermark anymore (6 days ago) 
>>>>> * 43093e3 - [FLINK-3697] Properly access type information for nested
>>>>> POJO key selection (6 days ago) 
>>>>> * 17909aa - [FLINK-3654] Disable Write-Ahead-Log in RocksDB State (7
>>>>> days ago) 
>>>>> * e0dc5c1 - [FLINK-3595] [runtime] Eagerly destroy buffer pools on
>>>>> cancelling (10 days ago) 
>>>>> 
>>>> 
>>> 
>> 



Re: GSoC Project Proposal Draft: Code Generation in Serializers

2016-04-18 Thread Chiwan Park
Yes, I know Janino is a pure Java project. I meant that if we add Scala code to 
flink-core, we would have to add a Scala dependency to flink-core, which could be 
confusing.

Regards,
Chiwan Park

> On Apr 18, 2016, at 2:49 PM, Márton Balassi <balassi.mar...@gmail.com> wrote:
> 
> Chiwan, just to clarify Janino is a Java project. [1]
> 
> [1] https://github.com/aunkrig/janino
> 
> On Mon, Apr 18, 2016 at 3:40 AM, Chiwan Park <chiwanp...@apache.org> wrote:
> 
>> I prefer to avoid Scala dependencies in flink-core. If flink-core includes
>> Scala dependencies, Scala version suffix (_2.10 or _2.11) should be added.
>> I think that users could be confused.
>> 
>> Regards,
>> Chiwan Park
>> 
>>> On Apr 17, 2016, at 3:49 PM, Márton Balassi <balassi.mar...@gmail.com>
>> wrote:
>>> 
>>> Hi Gábor,
>>> 
>>> I think that adding the Janino dep to flink-core should be fine, as it
>> has
>>> quite slim dependencies [1,2] which are generally orthogonal to Flink's
>>> main dependency line (also it is already used elsewhere).
>>> 
>>> As for mixing Scala code that is used from the Java parts of the same
>> maven
>>> module I am skeptical. We have seen IDE compilation issues with projects
>>> using this setup and have decided that the community-wide potential IDE
>>> setup pain outweighs the individual implementation convenience with
>> Scala.
>>> 
>>> [1]
>>> 
>> https://repo1.maven.org/maven2/org/codehaus/janino/janino-parent/2.7.8/janino-parent-2.7.8.pom
>>> [2]
>>> 
>> https://repo1.maven.org/maven2/org/codehaus/janino/janino/2.7.8/janino-2.7.8.pom
>>> 
>>> On Sat, Apr 16, 2016 at 5:51 PM, Gábor Horváth <xazax@gmail.com>
>> wrote:
>>> 
>>>> Hi!
>>>> 
>>>> Table API already uses code generation and the Janino compiler [1]. Is
>> it a
>>>> dependency that is ok to add to flink-core? In case it is ok, I think I
>>>> will use the same in order to be consistent with the other code
>> generation
>>>> efforts.
>>>> 
>>>> I started to look at the Table API code generation [2] and it uses Scala
>>>> extensively. There are several Scala features that can make Java code
>>>> generation easier such as pattern matching and string interpolation. I
>> did
>>>> not see any Scala code in flink-core yet. Is it ok to implement the code
>>>> generation inside the flink-core using Scala?
>>>> 
>>>> Regards,
>>>> Gábor
>>>> 
>>>> [1] http://unkrig.de/w/Janino
>>>> [2]
>>>> 
>>>> 
>> https://github.com/apache/flink/blob/master/flink-libraries/flink-table/src/main/scala/org/apache/flink/api/table/codegen/CodeGenerator.scala
>>>> 
>>>> On 18 March 2016 at 19:37, Gábor Horváth <xazax@gmail.com> wrote:
>>>> 
>>>>> Thank you! I finalized the project.
>>>>> 
>>>>> 
>>>>> On 18 March 2016 at 10:29, Márton Balassi <balassi.mar...@gmail.com>
>>>>> wrote:
>>>>> 
>>>>>> Thanks Gábor, now I also see it on the internal GSoC interface. I have
>>>>>> indicated that I wish to mentor your project, I think you can hit
>>>> finalize
>>>>>> on your project there.
>>>>>> 
>>>>>> On Mon, Mar 14, 2016 at 11:16 AM, Gábor Horváth <xazax@gmail.com>
>>>>>> wrote:
>>>>>> 
>>>>>>> Hi,
>>>>>>> 
>>>>>>> I have updated this draft to include preliminary benchmarks,
>> mentioned
>>>>>> the
>>>>>>> interaction of annotations with savepoints, extended it with a
>>>> timeline,
>>>>>>> and some notes about scala case classes.
>>>>>>> 
>>>>>>> Regards,
>>>>>>> Gábor
>>>>>>> 
>>>>>>> On 9 March 2016 at 16:12, Gábor Horváth <xazax@gmail.com> wrote:
>>>>>>> 
>>>>>>>> Hi!
>>>>>>>> 
>>>>>>>> As far as I can see the formatting was not correct in my previous
>>>>>> mail. A
>>>>>>>> better formatted version is available here:
>>>>>>>> 
>>>>>>> 
>>>>>> 
>>>> 
>> https://docs.google.com

Re: returns method in scala api

2016-04-04 Thread Chiwan Park
Note that you should use the `createTypeInformation[T]` method from the 
`org.apache.flink.api.scala` package object to create a `TypeInformation` for 
Scala-specific types such as case classes or tuples.
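
For illustration, a minimal sketch (hypothetical case class and job, assuming the 
Scala DataSet API) of creating such a `TypeInformation` explicitly:

```scala
import org.apache.flink.api.common.typeinfo.TypeInformation
import org.apache.flink.api.scala._

object ScalaTypeInfoSketch {
  // hypothetical case class standing in for the user's data type
  case class Word(text: String, count: Int)

  def main(args: Array[String]): Unit = {
    val env = ExecutionEnvironment.getExecutionEnvironment

    // createTypeInformation is the macro from the org.apache.flink.api.scala
    // package object; it analyzes Scala-specific types such as case classes
    val wordInfo: TypeInformation[Word] = createTypeInformation[Word]
    println(wordInfo)

    // in normal Scala API usage the same TypeInformation is supplied implicitly
    env.fromElements(Word("flink", 1), Word("flink", 2))
      .groupBy("text")
      .reduce((a, b) => Word(a.text, a.count + b.count))
      .print()
  }
}
```

As Stephan noted above, the Scala API passes this `TypeInformation` as an implicit 
parameter, which can also be supplied explicitly when needed.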

Regards,
Chiwan Park

> On Apr 5, 2016, at 1:53 AM, Stephan Ewen <se...@apache.org> wrote:
> 
> Hi!
> 
> In Scala, the return TypeInformation is passed as an implicit parameter.
> You can always explicitly pass an implicit parameter, that would be the way
> to pass the return type explicitly in Scala.
> 
> To create the TypeInformation, use the TypeInformation.of(...) methods.
> 
> Hope that helps!
> 
> Stephan
> 
> 
> 
> On Mon, Apr 4, 2016 at 3:08 PM, Judit Fehér <feh...@gmail.com> wrote:
> 
>> Hi,
>> 
>> I'm writing a custom serializer so I need to call the
>> returns(TypeInformation) method after a reduce method, but it isn't
>> available in the scala api, only in the java api.
>> Do you have any suggestions how to go around this and still call the
>> correct deserializer?
>> Thanks!
>> 
>> Judit
>> 



Re: a typical ML algorithm flow

2016-03-27 Thread Chiwan Park
Hi Dmitriy,

I think you can implement it with the iteration API and a custom convergence 
criterion. You can express the convergence criterion in two ways: one is using a 
convergence criterion data set [1][2], and the other is registering an aggregator 
with a custom implementation of the `ConvergenceCriterion` interface [3].

Here is an example using a convergence criterion data set in the Scala API:

```
package flink.sample

import org.apache.flink.api.scala._

import scala.util.Random

object SampleApp extends App {
  val env = ExecutionEnvironment.getExecutionEnvironment

  val data = env.fromElements[Double](1, 2, 3, 4, 5, 6, 7, 8, 9, 10)

  val result = data.iterateWithTermination(5000) { prev =>
    // calculate sub solution
    val rand = Random.nextDouble()
    val subSolution = prev.map(_ * rand)

    // calculate convergent condition
    val convergence = subSolution.reduce(_ + _).map(_ / 10).filter(_ > 8)

    (subSolution, convergence)
  }

  result.print()
}
```

Regards,
Chiwan Park

[1]: 
https://ci.apache.org/projects/flink/flink-docs-release-1.0/api/java/org/apache/flink/api/java/operators/IterativeDataSet.html#closeWith%28org.apache.flink.api.java.DataSet,%20org.apache.flink.api.java.DataSet%29
[2]: iterateWithTermination method in 
https://ci.apache.org/projects/flink/flink-docs-release-1.0/api/scala/index.html#org.apache.flink.api.scala.DataSet
[3]: 
https://ci.apache.org/projects/flink/flink-docs-release-1.0/api/java/org/apache/flink/api/java/operators/IterativeDataSet.html#registerAggregationConvergenceCriterion%28java.lang.String,%20org.apache.flink.api.common.aggregators.Aggregator,%20org.apache.flink.api.common.aggregators.ConvergenceCriterion%29

> On Mar 26, 2016, at 2:51 AM, Dmitriy Lyubimov <dlie...@gmail.com> wrote:
> 
> Thank you, all :)
> 
> yes, that's my question. How do we construct such a loop with a concrete
> example?
> 
> Let's take something nonsensical yet specific.
> 
> Say, in samsara terms we do something like that :
> 
> var avg = Double.PositiveInfinity
> var drmA = ... (construct elsewhere)
> 
> 
> 
> do {
>   avg = drmA.colMeans.mean // average of col-wise means
>   drmA = drmA - avg // elementwise subtract of average
> 
> } while (avg > 1e-10)
> 
> (which probably does not converge in reality).
> 
> How would we implement that with native iterations in flink?
> 
> 
> 
> On Wed, Mar 23, 2016 at 2:50 AM, Till Rohrmann <trohrm...@apache.org> wrote:
> 
>> Hi Dmitriy,
>> 
>> I’m not sure whether I’ve understood your question correctly, so please
>> correct me if I’m wrong.
>> 
>> So you’re asking whether it is a problem that
>> 
>> stat1 = A.map.reduce
>> A = A.update.map(stat1)
>> 
>> are executed on the same input data set A and whether we have to cache A
>> for that, right? I assume you’re worried that A is calculated twice.
>> 
>> Since you don’t have a API call which triggers eager execution of the data
>> flow, the map.reduce and map(stat1) call will only construct the data flow
>> of your program. Both operators will depend on the result of A which is
>> only once calculated (when execute, collect or count is called) and then
>> sent to the map.reduce and map(stat1) operator.
>> 
>> However, it is not recommended using an explicit loop to do iterative
>> computations with Flink. The problem here is that you will basically unroll
>> the loop and construct a long pipeline with the operations of each
>> iterations. Once you execute this long pipeline you will face considerable
>> memory fragmentation, because every operator will get a proportional
>> fraction of the available memory assigned. Even worse, if you trigger the
>> execution of your data flow to evaluate the convergence criterion, you will
>> execute for each iteration the complete pipeline which has been built up so
>> far. Thus, you’ll end up with a quadratic complexity in the number of
>> iterations. Therefore, I would highly recommend using Flink’s built in
>> support for native iterations which won’t suffer from this problem or to
>> materialize at least for every n iterations the intermediate result. At the
>> moment this would mean to write the data to some sink and then reading it
>> from there again.
>> 
>> I hope this answers your question. If not, then don’t hesitate to ask me
>> again.
>> 
>> Cheers,
>> Till
>> ​
>> 
>> On Wed, Mar 23, 2016 at 10:19 AM, Theodore Vasiloudis <
>> theodoros.vasilou...@gmail.com> wrote:
>> 
>>> Hello Dmitriy,
>>> 
>>> If I understood correctly what you are basically talking about modifying
>> a
>>> DataSet as you iterate over it.
>>> 
>

Re: [DISCUSS] Release 1.0.1 Bugfix release

2016-03-23 Thread Chiwan Park
+1

Regards,
Chiwan Park

> On Mar 23, 2016, at 11:24 PM, Robert Metzger <rmetz...@apache.org> wrote:
> 
> +1
> 
> I just went through the master and release-1.0 branch, and most important
> fixes are already in the release-1.0 branch.
> I would also move this commit into the release branch:
> "[FLINK-3636] Add ThrottledIterator to WindowJoin jar"
> https://github.com/apache/flink/commit/f09d68a05efb4afeb7b8498d35201e8324d6c096
> 
> From the pull requests I would include:
> "[FLINK-3651] Fix faulty RollingSink Restore"
> https://github.com/apache/flink/pull/1830
> 
> Anything else?
> 
> I would like to manage the release, but I'm on vacation next week, so I
> think somebody else should do it ;)
> 
> 
> On Wed, Mar 23, 2016 at 2:18 PM, Maximilian Michels <m...@apache.org> wrote:
> 
>> +1
>> 
>> On Wed, Mar 23, 2016 at 12:19 PM, Till Rohrmann <trohrm...@apache.org>
>> wrote:
>>> +1
>>> 
>>> On Wed, Mar 23, 2016 at 11:24 AM, Stephan Ewen <se...@apache.org> wrote:
>>> 
>>>> Yes, there is also the Rich Scala Window Functions, and the tests that
>> used
>>>> to address wrong JAR directories.
>>>> 
>>>> On Wed, Mar 23, 2016 at 11:15 AM, Ufuk Celebi <u...@apache.org> wrote:
>>>> 
>>>>> Big +1, let's get this rolling... ;)
>>>>> 
>>>>> On Wed, Mar 23, 2016 at 11:14 AM, Aljoscha Krettek <
>> aljos...@apache.org>
>>>>> wrote:
>>>>>> Hi,
>>>>>> I’m aware of one critical fix and one somewhat critical fix since
>>>> 1.0.0.
>>>>> One concerns data loss in the RollingSink, the other is a bug in a
>> window
>>>>> trigger. I would like to release a bugfix release since some people
>> are
>>>>> restricted to using released versions and are also depending on the
>>>>> RollingSink. For them, now having a bugfix release would be a blocker.
>>>>>> 
>>>>>> What do you think? Are there any other critical bugs/fixes that we
>> are
>>>>> aware of?
>>>>>> 
>>>>>> Best,
>>>>>> Aljoscha
>>>>> 
>>>> 
>> 



[jira] [Created] (FLINK-3645) HDFSCopyUtilitiesTest fails in a Hadoop cluster

2016-03-22 Thread Chiwan Park (JIRA)
Chiwan Park created FLINK-3645:
--

 Summary: HDFSCopyUtilitiesTest fails in a Hadoop cluster
 Key: FLINK-3645
 URL: https://issues.apache.org/jira/browse/FLINK-3645
 Project: Flink
  Issue Type: Bug
  Components: Tests
Affects Versions: 1.1.0
Reporter: Chiwan Park
Assignee: Chiwan Park
Priority: Minor


The {{HDFSCopyUtilitiesTest}} class tests the {{HDFSCopyFromLocal.copyFromLocal}} and 
{{HDFSCopyToLocal.copyToLocal}} methods. This test fails when run on a machine 
where Hadoop is installed.





Re: YARN/Flink Job

2016-03-19 Thread Chiwan Park
Hi Vijay,

Yes, you are right. The Flink services (JM & TM) are stopped (not killed) 
immediately after the job execution completes.

Regards,
Chiwan Park

> On Mar 18, 2016, at 7:57 AM, Vijay Srinivasaraghavan 
> <vijikar...@yahoo.com.INVALID> wrote:
> 
> If I start a Flink job on YARN with the below option, does the Flink (JM & TM) 
> service get killed after the job execution is complete? In other words, what 
> is the lifetime of the Flink service after the job is complete?
> 
> Run a single Flink job on YARN
> The documentation above describes how to start a Flink cluster within a 
> Hadoop YARN environment. It is also possible to launch Flink within YARN only 
> for executing a single job. Please note that the client then expects the -yn 
> value to be set (number of TaskManagers).
> Example: ./bin/flink run -m yarn-cluster -yn 2 ./examples/batch/WordCount.jar
> 
> Regards,
> Vijay



Re: [VOTE] Release Apache Flink 1.0.0 (RC5)

2016-03-04 Thread Chiwan Park
AFAIK, you should run `tools/change-scala-version.sh 2.11` before running `mvn 
clean install -DskipTests -Dscala-2.11`.

Regards,
Chiwan Park

> On Mar 4, 2016, at 7:20 PM, Stephan Ewen <se...@apache.org> wrote:
> 
> Sorry, the flag is "-Dscala-2.11"
> 
> On Fri, Mar 4, 2016 at 11:19 AM, Stephan Ewen <se...@apache.org> wrote:
> 
>> Hi!
>> 
>> To compile with Scala 2.11, please use the "-Dscala.version=2.11" flag.
>> Otherwise the 2.11 specific build profiles will not get properly activated.
>> 
>> Can you try that again?
>> 
>> Thanks,
>> Stephan
>> 
>> 
>> On Fri, Mar 4, 2016 at 11:17 AM, Stefano Baghino <
>> stefano.bagh...@radicalbit.io> wrote:
>> 
>>> I won't cast a vote as I'm not entirely sure this is just a local problem
>>> (and from the document the Scala 2.11 build has been checked), however
>>> I've
>>> checked out the `release-1.0-rc5` branch and ran `mvn clean install
>>> -DskipTests -Dscala.version=2.11.7`, with a failure on `flink-runtime`:
>>> 
>>> [ERROR]
>>> 
>>> /Users/Stefano/Projects/flink/flink-runtime/src/test/scala/org/apache/flink/runtime/jobmanager/JobManagerITCase.scala:703:
>>> error: can't expand macros compiled by previous versions of Scala
>>> [ERROR]   assert(cachedGraph2.isArchived)
>>> [ERROR]   ^
>>> [ERROR] one error found
>>> 
>>> Is the 2.11 build still compiling successfully according to your latest
>>> tests?
>>> I've tried running a clean and re-running without skipping the tests but
>>> the issue persists.
>>> 
>>> On Fri, Mar 4, 2016 at 10:38 AM, Stephan Ewen <se...@apache.org> wrote:
>>> 
>>>> +1
>>>> 
>>>> Checked LICENSE and NOTICE files
>>>> Built against Hadoop 2.6, Scala 2.10, all tests are good
>>>> Run local pseudo cluster with examples
>>>> Log files look good, no exceptions
>>>> Tested File State Backend
>>>> Ran Storm Compatibility Examples
>>>>   -> minor issue, one example fails (no release blocker in my opinion)
>>>> 
>>>> 
>>>> On Thu, Mar 3, 2016 at 5:41 PM, Till Rohrmann <trohrm...@apache.org>
>>>> wrote:
>>>> 
>>>>> +1
>>>>> 
>>>>> Checked that the sources don't contain binaries
>>>>> Tested cluster execution with flink/run and web client job submission
>>>>> Run all examples via FliRTT
>>>>> Tested Kafka 0.9
>>>>> Verified that quickstarts work with Eclipse and IntelliJ
>>>>> Run example with RemoteEnvironment
>>>>> Verified SBT quickstarts
>>>>> 
>>>>> On Thu, Mar 3, 2016 at 3:43 PM, Aljoscha Krettek <aljos...@apache.org
>>>> 
>>>>> wrote:
>>>>> 
>>>>>> +1
>>>>>> 
>>>>>> I think we have a winner. :D
>>>>>> 
>>>>>> The “boring” tests from the checklist should still hold for this RC
>>>> and I
>>>>>> now ran a custom windowing job with state on RocksDB on Hadoop 2.7
>>> with
>>>>>> Scala 2.11. I used the Yarn HA mode and shot down both JobManagers
>>> and
>>>>>> TaskManagers and the job restarted successfully. I also verified
>>> that
>>>>>> savepoints work in this setup.
>>>>>> 
>>>>>>> On 03 Mar 2016, at 14:08, Robert Metzger <rmetz...@apache.org>
>>>> wrote:
>>>>>>> 
>>>>>>> Apparently I was not careful enough when writing the email.
>>>>>>> The release branch is "release-1.0.0-rc5" and its the fifth RC.
>>>>>>> 
>>>>>>> On Thu, Mar 3, 2016 at 2:01 PM, Robert Metzger <
>>> rmetz...@apache.org>
>>>>>> wrote:
>>>>>>> 
>>>>>>>> Dear Flink community,
>>>>>>>> 
>>>>>>>> Please vote on releasing the following candidate as Apache Flink
>>>>> version
>>>>>>>> 1.0.0.
>>>>>>>> 
>>>>>>>> This is the fourth RC.
>>>>>>>> Here is a document to report on the testing and release
>>>> verification:
>>>>>>>> 
>>>>>> 
>>>>> 
>

Re: Opening a discussion on FlinkML

2016-02-12 Thread Chiwan Park
Hi,

I agree with what Theo said. Currently, only a few committers spend time reviewing PRs 
about FlinkML. But I also agree with Fabian’s opinion: I would like to keep FlinkML 
under the main Flink repository. I hope new committers will spend time on FlinkML.

Regarding Simone’s opinion: yes, FlinkML is still an immature ML library. Many useful 
features are missing, and some of them are pending in pull requests.

Integration with other libraries such as Mahout, H2O, or Weka would also be good. There 
have already been some attempts to use Flink or another distributed data processing 
framework as the backend of another library [1] [2] [3]. But as you can see from those 
links, we would still have to re-implement many algorithms even if we integrated 
another library with Flink. I doubt the integration would bring a big development 
advantage.

[1]: https://issues.apache.org/jira/browse/MAHOUT-1570
[2]: http://mahout.apache.org/users/basics/algorithms.html
[3]: https://github.com/ariskk/distributedWekaSpark

Regards,
Chiwan Park

> On Feb 12, 2016, at 7:04 PM, Fabian Hueske <fhue...@gmail.com> wrote:
> 
> Hi Theo,
> 
> thanks for starting this discussion. You are certainly right that the
> development of FlinkML is stalling. On the other hand, we regularly see
> people on the mailing list asking for feature.
> 
> Regarding your proposed ways to proceed:
> 
> 1) I am not sure how much it would help to move FlinkML to a separate
> repository.
> We have discussed to move connectors (and libraries) to separate
> repositories before but the thread fall asleep [1].
> We would still need committers to spend time with reviewing, merging, and
> contributing.
> So IMO, this is orthogonal to having more committer involvement.
> 
> 2) Having committers (current /  new ones) spending time on FlinkML is the
> requirement for keep it alive within the Flink project.
> Adding new committers is kind of a bootstrap problem here because it is
> hard for contributors to get involved with FlinkML if very little committer
> time is spend on code reviews and merging. Nonetheless, I see this as the
> best option.
> 
> 3) Forking of a project on Github is certainly possible (even without the
> endorsement of the Flink community). However, merging changes back into
> Flink would again require a committer to review and merge (probably a much
> larger chunk of code) and also require the permission of all contributors.
> 
> Best,
> Fabian
> 
> [1]
> https://mail-archives.apache.org/mod_mbox/flink-dev/201512.mbox/%3CCAGco--aZhZhrrSzzPROwXwmtYmD5CkoGKe7xNCWG1Vw7V-D%2BaA%40mail.gmail.com%3E
> 
> 2016-02-12 10:23 GMT+01:00 Theodore Vasiloudis <
> theodoros.vasilou...@gmail.com>:
> 
>> Hello all,
>> 
>> I would like to get a conversation started on how we plan to move forward
>> with FlinkML.
>> 
>> Development on the library currently has been mostly dormant for the past 6
>> months,
>> 
>> mainly I believe because of the lack of available committers to review PRs.
>> 
>> Last month we got together with Till and Marton and talked about how we
>> could try to
>> 
>> solve this and ensure continued development of the library.
>> 
>> We see 3 possible paths we could take:
>> 
>>   1.
>> 
>>   Externalize the library, creating a new repository under the Apache
>>   Flink project. This decouples the development of FlinkML from the Flink
>>   release cycle, allowing us to move faster and incorporate new features
>> as
>>   they become available. As FlinkML is a library under development tying
>> it
>>   to specific versions does not make much sense anyway. The library would
>>   depend on the latest snapshot version of Flink. It would then be
>> possible
>>   for the Flink distribution to cherry-pick parts of the library to be
>>   included with the core distribution.
>>   2.
>> 
>>   Keep the development under the main Flink project but bring in new
>>   committers. This would mean that the development remains as is and is
>> tied
>>   to core Flink releases, but new worked should get merged at much more
>>   regular intervals through the help of committers other than Till. Marton
>>   Balassi has volunteered for that role and I hope that more might take up
>>   that role.
>>   3. A third option is to fork FlinkML on a repository on which we are
>>   able to commit freely (again through PRs and reviews of course) and
>> merge
>>   good parts back into the main repo once in a while. This allows for
>> faster
>>   progress and more experimental work but obviously creates fragmentation.
>> 
>> 
>> I would like to hear your thoughts on these three options, as well as
>> discuss other
>> 
>> alternatives that could help move FlinkML forward.
>> 
>> Cheers,
>> Theodore
>> 



Re: Want Flink startup issues :-)

2016-02-06 Thread Chiwan Park
Hi Dongwon,

Yes, the things to do are: pick an issue (by assigning it to yourself or 
commenting on it), make your changes, and send a pull request for it.

Welcome! :)

Regards,
Chiwan Park

> On Feb 6, 2016, at 3:31 PM, Dongwon Kim <eastcirc...@postech.ac.kr> wrote:
> 
> Hi Fabian, Matthias, Robert!
> 
> Thank you for welcoming me to the community :-)
> I'm taking a look at JIRA and "How to contribute" as you guys suggested.
> One trivial question is whether I just need to make a pull request
> after figuring out issues?
> Then I'll pick up any issue, figure it out, and then make a pull
> request by myself ;-)
> 
> Meanwhile, I also read the roadmap and I find few plans capturing my interest.
> - Making YARN resource dynamic
> - DataSet API Enhancements
> - Expose more runtime metrics
> Would any of you informs me of new or existing issues regarding the above?
> 
> Thanks!
> 
> Dongwon
> 
> 2016-02-06 4:55 GMT+09:00 Fabian Hueske <fhue...@gmail.com>:
>> Hi Dongwon,
>> 
>> welcome to the Flink mailing list!
>> What kind of issues are you interested in?
>> 
>> - API / library features: DataSet API, DataStream API, SQL, StreamSQL,
>> Graphs (Gelly)
>> - Processing runtime: Batch, Streaming
>> - Connectors to other systems: Stream sources/sinks
>> - Web dashboard
>> - Compatibility: Storm, Hadoop
>> 
>> You can also have a look into Flink's issue tracker JIRA [1]. Right now, we
>> have about 600 issues listed with any kind of difficulty and effort.
>> If you find an issue that sounds interesting, just drop a note and we can
>> give you some details about if you want to learn more.
>> 
>> Best, Fabian
>> 
>> [1]
>> https://issues.apache.org/jira/issues/?jql=project%20%3D%20FLINK%20AND%20resolution%20%3D%20Unresolved
>> 
>> 2016-02-05 17:14 GMT+01:00 Dongwon Kim <eastcirc...@postech.ac.kr>:
>> 
>>> Hello,
>>> 
>>> I'm Dongwon Kim and I want to get involved in Flink community.
>>> Can anyone guide me through contributing to Flink with some startup issues?
>>> Although my research interest lie in big data systems including Flink,
>>> Spark, MapReduce, and Tez, I've never participated in open source
>>> communities.
>>> 
>>> FYI, I've done the following things for past few years:
>>> - I've studied Apache Hadoop (MRv1, MRv2, and YARN), Apache Tez, and
>>> Apache Spark through the source code.
>>> - My doctoral thesis is about improving the performance of MRv1 by
>>> making network pipelines between mappers and reducers like what Flink
>>> does.
>>> - I've used Ganglia to monitor the cluster performance and I've been
>>> interested in metrics and counters in big data systems.
>>> - I gave a talk named "a comparative performance evaluation of Flink"
>>> at last Flink Forward.
>>> 
>>> I would be very appreciated if someone can help me get involved in the
>>> most promising ASF project :-)
>>> 
>>> Greetings,
>>> Dongwon Kim
>>> 



[jira] [Created] (FLINK-3330) Add SparseVector support to BLAS library in FlinkML

2016-02-03 Thread Chiwan Park (JIRA)
Chiwan Park created FLINK-3330:
--

 Summary: Add SparseVector support to BLAS library in FlinkML
 Key: FLINK-3330
 URL: https://issues.apache.org/jira/browse/FLINK-3330
 Project: Flink
  Issue Type: Improvement
  Components: Machine Learning Library
Affects Versions: 1.0.0
Reporter: Chiwan Park
Assignee: Chiwan Park


A user reported a problem when using the {{GradientDescent}} algorithm with 
{{SparseVector}}. 
(http://mail-archives.apache.org/mod_mbox/flink-user/201602.mbox/%3CCAMJxVsiNRy_B349tuRpC%2BY%2BfyW7j2SHcyVfhqnz3BGOwEHXHpg%40mail.gmail.com%3E)

It seems to be caused by the missing {{SparseVector}} support in {{BLAS.axpy}}.
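
For illustration, a minimal sketch (not the actual FlinkML {{BLAS}} code; the 
{{SparseVector}}/{{DenseVector}} constructors are assumed from {{org.apache.flink.ml.math}}) 
of an axpy that visits only the non-zero entries of a sparse x:

{code}
import org.apache.flink.ml.math.{DenseVector, SparseVector}

object SparseAxpySketch {
  // y += a * x, touching only the non-zero entries of the sparse x
  def axpy(a: Double, x: SparseVector, y: DenseVector): Unit = {
    require(x.size == y.size, "vector sizes must match")
    var i = 0
    while (i < x.indices.length) {
      y.data(x.indices(i)) += a * x.data(i)
      i += 1
    }
  }

  def main(args: Array[String]): Unit = {
    val x = SparseVector(4, Array(1, 3), Array(2.0, 5.0))
    val y = DenseVector(Array(1.0, 1.0, 1.0, 1.0))
    axpy(0.5, x, y)
    println(y.data.mkString(", ")) // 1.0, 2.0, 1.0, 3.5
  }
}
{code}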





Re: [ANNOUNCE] Chengxiang Li added as committer

2016-01-19 Thread Chiwan Park
Congrats! Welcome Chengxiang Li!

> On Jan 19, 2016, at 7:13 PM, Vasiliki Kalavri <vasilikikala...@gmail.com> 
> wrote:
> 
> Congratulations! Welcome Chengxiang Li!
> 
> On 19 January 2016 at 11:02, Fabian Hueske <fhue...@gmail.com> wrote:
> 
>> Hi everybody,
>> 
>> I'd like to announce that Chengxiang Li accepted the PMC's offer to become
>> a committer of the Apache Flink project.
>> 
>> Please join me in welcoming Chengxiang Li!
>> 
>> Best, Fabian
>> 

Regards,
Chiwan Park



Re: Flink ML Vector and DenseVector

2016-01-18 Thread Chiwan Park
Hi Hilmi,

In NLP, which types are used for vector values? I think we can cover the typical 
case using double values.

> On Jan 18, 2016, at 9:19 PM, Hilmi Yildirim <hilmi.yildi...@dfki.de> wrote:
> 
> Hi,
> the Vector and DenseVector implementations of Flink ML only allow Double 
> values. But there are cases where the values are not Doubles, e.g. in NLP. 
> Does it make sense to make the implementations generic, i.e. Vector[T] and 
> DenseVector[T]?
> 
> Best Regards,
> Hilmi
> 
> -- 
> ==
> Hilmi Yildirim, M.Sc.
> Researcher
> 
> DFKI GmbH
> Intelligente Analytik für Massendaten
> DFKI Projektbüro Berlin
> Alt-Moabit 91c
> D-10559 Berlin
> Phone: +49 30 23895 1814
> 
> E-Mail: hilmi.yildi...@dfki.de
> 
> -
> Deutsches Forschungszentrum fuer Kuenstliche Intelligenz GmbH
> Firmensitz: Trippstadter Strasse 122, D-67663 Kaiserslautern
> 
> Geschaeftsfuehrung:
> Prof. Dr. Dr. h.c. mult. Wolfgang Wahlster (Vorsitzender)
> Dr. Walter Olthoff
> 
> Vorsitzender des Aufsichtsrats:
> Prof. Dr. h.c. Hans A. Aukes
> 
> Amtsgericht Kaiserslautern, HRB 2313
> -
> 

Regards,
Chiwan Park



Re: Flink ML Vector and DenseVector

2016-01-18 Thread Chiwan Park
How about mapping each string to a number? Maybe you can do that with a custom 
Transformer; a rough sketch follows below.
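
A minimal sketch (hypothetical code, not an actual FlinkML Transformer) of building 
such a string-to-index mapping with the DataSet API:

```scala
import org.apache.flink.api.scala._

object StringIndexerSketch {
  def main(args: Array[String]): Unit = {
    val env = ExecutionEnvironment.getExecutionEnvironment

    // hypothetical POS tags standing in for the string features/labels
    val tokens = env.fromElements("NN", "VB", "NN", "DT", "VB")

    // build a dictionary: every distinct string gets a numeric id
    val dictionary: Map[String, Double] =
      tokens.distinct().collect().zipWithIndex.map { case (s, i) => s -> i.toDouble }.toMap

    // replace the strings by their ids so they fit into a vector of doubles
    val indexed = tokens.map(s => dictionary(s))
    indexed.print()
  }
}
```

In a real pipeline the dictionary would typically be broadcast to the workers rather 
than collected on the client, but that detail is omitted in this sketch.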

> On Jan 19, 2016, at 12:02 AM, Hilmi Yildirim <hilmi.yildi...@dfki.de> wrote:
> 
> Ok. In this case I will use an Array instead.
> 
> Am 18.01.2016 um 14:56 schrieb Theodore Vasiloudis:
>> I agree with Till, the data types are different here so you need a custom
>> string vector.
>> 
>> The Vector abstraction in FlinkML is designed with numerical vectors in
>> mind.
>> 
>> On Mon, Jan 18, 2016 at 2:33 PM, Till Rohrmann <trohrm...@apache.org> wrote:
>> 
>>> Hi Hilmi,
>>> 
>>> I think in your case it makes sense to define a custom vector of strings.
>>> The easiest implementation could be an Array[String] or List[String].
>>> 
>>> The reason why it does not make so much sense to make Vector and
>>> DenseVector
>>> generic is that these types are algebraic data types. How would you define
>>> algebraic operations such as scalar product, outer product, multiplication,
>>> etc. on a vector of strings? Then you would have to provide different
>>> implementations for the different type parameters.
>>> 
>>> Cheers,
>>> Till
>>> ​
>>> 
>>> On Mon, Jan 18, 2016 at 1:40 PM, Hilmi Yildirim <hilmi.yildi...@dfki.de>
>>> wrote:
>>> 
>>>> Hi,
>>>> how I explained it in a previous E-Mail, I need a LabeledVector where the
>>>> label is also a vector. After we discussed this issue, I created a new
>>>> class named LabeledSequenceVector with the labels as a Vector. In my use
>>>> case, I want to train a POS-Tagger system, so the "vector" is a vector of
>>>> strings and the "labels" is also a vector of strings. If I use the Flink
>>>> Vector/DenseVector implementation then the vector does only have double
>>>> values but I need String values.
>>>> 
>>>> Best Regards,
>>>> Hilmi
>>>> 
>>>> 
>>>> Am 18.01.2016 um 13:33 schrieb Chiwan Park:
>>>> 
>>>>> Hi Hilmi,
>>>>> 
>>>>> In NLP, which types are used for vector values? I think we can cover
>>>>> typical case using double values.
>>>>> 
>>>>> On Jan 18, 2016, at 9:19 PM, Hilmi Yildirim <hilmi.yildi...@dfki.de>
>>>>>> wrote:
>>>>>> 
>>>>>> Hi,
>>>>>> the Vector and DenseVector implementations of Flink ML only allow
>>> Double
>>>>>> values. But there are cases where the values are not Doubles, e.g. in
>>> NLP.
>>>>>> Does it make sense to make the implementations generic, i.e. Vector[T]
>>> and
>>>>>> DenseVector[T]?
>>>>>> 
>>>>>> Best Regards,
>>>>>> Hilmi
>>>>>> 
>>>>>> --
>>>>>> ==
>>>>>> Hilmi Yildirim, M.Sc.
>>>>>> Researcher
>>>>>> 
>>>>>> DFKI GmbH
>>>>>> Intelligente Analytik für Massendaten
>>>>>> DFKI Projektbüro Berlin
>>>>>> Alt-Moabit 91c
>>>>>> D-10559 Berlin
>>>>>> Phone: +49 30 23895 1814
>>>>>> 
>>>>>> E-Mail: hilmi.yildi...@dfki.de
>>>>>> 
>>>>>> -
>>>>>> Deutsches Forschungszentrum fuer Kuenstliche Intelligenz GmbH
>>>>>> Firmensitz: Trippstadter Strasse 122, D-67663 Kaiserslautern
>>>>>> 
>>>>>> Geschaeftsfuehrung:
>>>>>> Prof. Dr. Dr. h.c. mult. Wolfgang Wahlster (Vorsitzender)
>>>>>> Dr. Walter Olthoff
>>>>>> 
>>>>>> Vorsitzender des Aufsichtsrats:
>>>>>> Prof. Dr. h.c. Hans A. Aukes
>>>>>> 
>>>>>> Amtsgericht Kaiserslautern, HRB 2313
>>>>>> -
>>>>>> 
>>>>>> Regards,
>>>>> Chiwan Park
>>>>> 
>>>>> 
> 
> 
> -- 
> ==
> Hilmi Yildirim, M.Sc.
> Researcher
> 
> DFKI GmbH
> Intelligente Analytik für Massendaten
> DFKI Projektbüro Berlin
> Alt-Moabit 91c
> D-10559 Berlin
> Phone: +49 30 23895 1814
> 
> E-Mail: hilmi.yildi...@dfki.de
> 
> -
> Deutsches Forschungszentrum fuer Kuenstliche Intelligenz GmbH
> Firmensitz: Trippstadter Strasse 122, D-67663 Kaiserslautern
> 
> Geschaeftsfuehrung:
> Prof. Dr. Dr. h.c. mult. Wolfgang Wahlster (Vorsitzender)
> Dr. Walter Olthoff
> 
> Vorsitzender des Aufsichtsrats:
> Prof. Dr. h.c. Hans A. Aukes
> 
> Amtsgericht Kaiserslautern, HRB 2313
> -
> 

Regards,
Chiwan Park




Re: [DISCUSS] Remove Combinable Annotation from DataSet API

2016-01-14 Thread Chiwan Park
e program. In worst case, the program
>>> might
>>>> silently produce wrong results (or crash) if the combiner implementation
>>>> was faulty. In best case, the program executes faster.
>>>> 
>>>> 
>>>> 
>>>> Approach 2:
>>>> - As Approach 1
>>>> - In addition extend both combine interfaces with a deprecated marker
>>>> method. This will ensure that all functions that implement a combinable
>>>> interface do not compile anymore and need to be fixed. This could prevent
>>>> the silent failure as in Approach 1, but would also cause an additional
>>> API
>>>> breaking change once the deprecated marker method is removed again.
>>>> 
>>>> 
>>>> 
>>>> Approach 3:
>>>> - Mark Combinable annotation deprecated
>>>> - Mark combine() method in RichGroupReduceFunction as deprecated
>>>> - Effect:
>>>>   - There'll be a couple of deprecation warnings.
>>>>   - We face the same problem with silent failures as in Approach 1.
>>>>   - We have to check if RichGroupReduceFunction's override combine or
>>> not
>>>> (can be done with reflection). If the method is not overridden we do not
>>>> execute it (unless there is a Combinable annotation) and we are fine. If
>>> it
>>>> is overridden and no Combinable annotation has been defined, we have the
>>>> same problem with silent failures as before.
>>>>   - After we remove the deprecated annotation and method, we have the
>>>> same effect as with Approach 1.
>>>> 
>>>> 
>>>> 
>>>> There are more alternatives, but these are the most viable, IMO.
>>>> 
>>>> 
>>>> 
>>>> I think, if we want to remove the combinable annotation, we should do it
>>>> now.
>>>> 
>>>> Given the three options, would go for Approach 1. Yes, breaks a lot of
>>> code
>>>> and yes there is the possibility of computing incorrect results.
>>> Approach 2
>>>> is safer but would mean another API breaking change in the future.
>>> Approach
>>>> 3 comes with fewer breaking changes but has the same problem of silent
>>>> failures.
>>>> 
>>>> IMO, the breaking API changes of Approach 1 are even desirable because
>>> they
>>>> will make users aware that this feature changed.
>>>> 
>>>> 
>>>> 
>>>> What do you think?
>>>> 
>>>> 
>>>> 
>>>> Cheers, Fabian
>>>> 

Regards,
Chiwan Park




Re: Naive question

2016-01-12 Thread Chiwan Park
Hi Ram,

Because some Scala IDE (Eclipse) plugins are needed, I recommend avoiding the 
`mvn eclipse:eclipse` command. Could you try just running `mvn clean install 
-DskipTests` and importing the project into Scala IDE directly? In the middle of 
the import process, Scala IDE suggests the needed plugins.

And which version of Scala IDE you are using?

> On Jan 12, 2016, at 7:58 PM, Vasudevan, Ramkrishna S 
> <ramkrishna.s.vasude...@intel.com> wrote:
> 
> Yes. I added it as Maven project only. I did mvn eclipse:eclipse to create 
> the project and also built the code using mvn clean install -DskipTests.
> 
> Regards
> Ram
> 
> -Original Message-
> From: ewenstep...@gmail.com [mailto:ewenstep...@gmail.com] On Behalf Of 
> Stephan Ewen
> Sent: Tuesday, January 12, 2016 4:10 PM
> To: dev@flink.apache.org
> Subject: Re: Naive question
> 
> Sorry to hear that it did not work out with Eclipse at all in the end, even 
> with all adjustments.
> 
> Just making sure: You imported Flink as a Maven project, not manually adding 
> the big Flink dependency JAR?
> 
> On Tue, Jan 12, 2016 at 5:15 AM, Vasudevan, Ramkrishna S < 
> ramkrishna.s.vasude...@intel.com> wrote:
> 
>> Thanks to all. I tried with Scala Eclipse IDE with all these 
>> 'change-scala-version.sh'. But in vain.
>> 
>> So I switched over to Intellij and thing work fine over there. I am 
>> new to Intellij so will try using it.
>> 
>> Once again thanks for helping me out.
>> 
>> Regards
>> Ram
>> 
>> -Original Message-
>> From: Chiwan Park [mailto:chiwanp...@apache.org]
>> Sent: Monday, January 11, 2016 4:37 PM
>> To: dev@flink.apache.org
>> Subject: Re: Naive question
>> 
>> Hi Ram,
>> 
>> If you want to build Flink with Scala 2.10, just checkout Flink 
>> repository from github or download source code from homepage, run `mvn 
>> clean install -DskipTests` and import projects to your IDE. If you 
>> want to build Flink with Scala 2.11, you have to run 
>> `tools/change-scala-version.sh 2.11` before build the project. You can 
>> revert Scala version change by running `tools/change-scala-version.sh 2.10`.
>> 
>> About IDE, Flink community recommends IntelliJ IDEA because Scala IDE 
>> have some problems in Java/Scala mixed project like Flink. But I 
>> tested importing Flink project with Scala IDE 4.3.0, Scala 2.11.7 and 
>> Flink 0.10.0 source code. Note that you should import the project as maven 
>> project.
>> 
>> By the way, the community welcomes any questions. Please feel free to 
>> post questions. :)
>> 
>>> On Jan 11, 2016, at 7:30 PM, Vasudevan, Ramkrishna S <
>> ramkrishna.s.vasude...@intel.com> wrote:
>>> 
>>> Thank you very much for the reply.
>>> I tried different ways and when I tried setting up the root pom.xml 
>>> to
>>> 2.11
>>> 
>>>  2.11.6
>>>  2.11
>>> 
>>> I got the following error
>>> [INFO]
>>> 
>>> --
>>> -- [ERROR] Failed to execute goal on project flink-scala: Could not 
>>> resolve depende ncies for project
>>> org.apache.flink:flink-scala:jar:1.0-SNAPSHOT: Could not find 
>>> artifact
>>> org.scalamacros:quasiquotes_2.11:jar:2.0.1 in central 
>>> (http://repo.mave
>>> n.apache.org/maven2) -> [Help 1]
>>> 
>>> If I leave the scala.binary.verson to be at 2.10 and the scala 
>>> version to be at 2.11.6 then I get the following problem [INFO]
>>> C:\flink\flink\flink-runtime\src\test\scala:-1: info: compiling 
>>> [INFO] Compiling 366 source files to 
>>> C:\flink\flink\flink-runtime\target\test-cl
>>> asses at 1452508064750
>>> [ERROR]
>>> C:\flink\flink\flink-runtime\src\test\scala\org\apache\flink\runtime
>>> \j
>>> ob
>>> manager\JobManagerITCase.scala:700: error: can't expand macros 
>>> compiled by previ ous versions of Scala
>>> [ERROR]   assert(cachedGraph2.isArchived)
>>> [ERROR]   ^
>>> 
>>> So am not pretty sure how to proceed with this. If I try to change 
>>> the
>> version of scala to 2.10 in the IDE then I get lot of compilation issues.
>> IS there any way to over come this?
>>> 
>>> Once again thanks a lot and apologies for the naïve question.
>>> 
>>> Regards
>>> Ram
>>> -Original Message-
>>> From: ewenstep...@gmail.com [mailto:ewenstep...@gm

Re: Naive question

2016-01-12 Thread Chiwan Park
Because I tested with Scala IDE 4.3.0 only, the process in the documentation is 
slightly different from my experience.

> On Jan 12, 2016, at 8:21 PM, Stephan Ewen <se...@apache.org> wrote:
> 
> @Chiwan: Is this still up to date from your experience?
> 
> https://ci.apache.org/projects/flink/flink-docs-release-0.10/internals/ide_setup.html
> 
> On Tue, Jan 12, 2016 at 12:04 PM, Chiwan Park <chiwanp...@apache.org> wrote:
> 
>> Hi Ram,
>> 
>> Because there are some Scala IDE (Eclipse) plugins needed, I recommend to
>> avoid `mvn eclipse:eclipse` command. Could you try just run `mvn clean
>> install -DskipTests` and import the project to Scala IDE directly? In
>> middle of importing process, Scala IDE suggests some plugins needed.
>> 
>> And which version of Scala IDE you are using?
>> 
>>> On Jan 12, 2016, at 7:58 PM, Vasudevan, Ramkrishna S <
>> ramkrishna.s.vasude...@intel.com> wrote:
>>> 
>>> Yes. I added it as Maven project only. I did mvn eclipse:eclipse to
>> create the project and also built the code using mvn clean install
>> -DskipTests.
>>> 
>>> Regards
>>> Ram
>>> 
>>> -Original Message-
>>> From: ewenstep...@gmail.com [mailto:ewenstep...@gmail.com] On Behalf Of
>> Stephan Ewen
>>> Sent: Tuesday, January 12, 2016 4:10 PM
>>> To: dev@flink.apache.org
>>> Subject: Re: Naive question
>>> 
>>> Sorry to hear that it did not work out with Eclipse at all in the end,
>> even with all adjustments.
>>> 
>>> Just making sure: You imported Flink as a Maven project, not manually
>> adding the big Flink dependency JAR?
>>> 
>>> On Tue, Jan 12, 2016 at 5:15 AM, Vasudevan, Ramkrishna S <
>> ramkrishna.s.vasude...@intel.com> wrote:
>>> 
>>>> Thanks to all. I tried with Scala Eclipse IDE with all these
>>>> 'change-scala-version.sh'. But in vain.
>>>> 
>>>> So I switched over to Intellij and thing work fine over there. I am
>>>> new to Intellij so will try using it.
>>>> 
>>>> Once again thanks for helping me out.
>>>> 
>>>> Regards
>>>> Ram
>>>> 
>>>> -Original Message-
>>>> From: Chiwan Park [mailto:chiwanp...@apache.org]
>>>> Sent: Monday, January 11, 2016 4:37 PM
>>>> To: dev@flink.apache.org
>>>> Subject: Re: Naive question
>>>> 
>>>> Hi Ram,
>>>> 
>>>> If you want to build Flink with Scala 2.10, just checkout Flink
>>>> repository from github or download source code from homepage, run `mvn
>>>> clean install -DskipTests` and import projects to your IDE. If you
>>>> want to build Flink with Scala 2.11, you have to run
>>>> `tools/change-scala-version.sh 2.11` before build the project. You can
>>>> revert Scala version change by running `tools/change-scala-version.sh
>> 2.10`.
>>>> 
>>>> About IDE, Flink community recommends IntelliJ IDEA because Scala IDE
>>>> have some problems in Java/Scala mixed project like Flink. But I
>>>> tested importing Flink project with Scala IDE 4.3.0, Scala 2.11.7 and
>>>> Flink 0.10.0 source code. Note that you should import the project as
>> maven project.
>>>> 
>>>> By the way, the community welcomes any questions. Please feel free to
>>>> post questions. :)
>>>> 
>>>>> On Jan 11, 2016, at 7:30 PM, Vasudevan, Ramkrishna S <
>>>> ramkrishna.s.vasude...@intel.com> wrote:
>>>>> 
>>>>> Thank you very much for the reply.
>>>>> I tried different ways and when I tried setting up the root pom.xml
>>>>> to
>>>>> 2.11
>>>>> 
>>>>> 2.11.6
>>>>> 2.11
>>>>> 
>>>>> I got the following error
>>>>> [INFO]
>>>>> 
>>>>> --
>>>>> -- [ERROR] Failed to execute goal on project flink-scala: Could not
>>>>> resolve depende ncies for project
>>>>> org.apache.flink:flink-scala:jar:1.0-SNAPSHOT: Could not find
>>>>> artifact
>>>>> org.scalamacros:quasiquotes_2.11:jar:2.0.1 in central
>>>>> (http://repo.mave
>>>>> n.apache.org/maven2) -> [Help 1]
>>>>> 
>>>>> If I leave the scala.binary.verson to be at 2.10

Re: Naive question

2016-01-11 Thread Chiwan Park
   at
>> akka.actor.ReflectiveDynamicAccess$$anonfun$createInstanceFor$3.apply(DynamicAccess.scala:84)
>> at
>> akka.actor.ReflectiveDynamicAccess$$anonfun$createInstanceFor$3.apply(DynamicAccess.scala:84)
>> at scala.util.Success.flatMap(Try.scala:230)
>> at
>> akka.actor.ReflectiveDynamicAccess.createInstanceFor(DynamicAccess.scala:84)
>> at akka.actor.ActorSystemImpl.liftedTree1$1(ActorSystem.scala:585)
>> at akka.actor.ActorSystemImpl.(ActorSystem.scala:578)
>> at akka.actor.ActorSystem$.apply(ActorSystem.scala:142)
>> at akka.actor.ActorSystem$.apply(ActorSystem.scala:119)
>> at akka.actor.ActorSystem$.create(ActorSystem.scala:67)
>> at
>> org.apache.flink.runtime.akka.AkkaUtils$.createActorSystem(AkkaUtils.scala:84)
>> at
>> org.apache.flink.runtime.minicluster.FlinkMiniCluster.startJobManagerActorSystem(FlinkMiniCluster.scala:196)
>> at
>> org.apache.flink.runtime.minicluster.FlinkMiniCluster.singleActorSystem$lzycompute$1(FlinkMiniCluster.scala:225)
>> at org.apache.flink.runtime.minicluster.FlinkMiniCluster.org
>> $apache$flink$runtime$minicluster$FlinkMiniCluster$$singleActorSystem$1(FlinkMiniCluster.scala:225)
>> at
>> org.apache.flink.runtime.minicluster.FlinkMiniCluster$$anonfun$1.apply(FlinkMiniCluster.scala:230)
>> at
>> org.apache.flink.runtime.minicluster.FlinkMiniCluster$$anonfun$1.apply(FlinkMiniCluster.scala:228)
>> at
>> scala.collection.TraversableLike$$anonfun$map$1.apply(TraversableLike.scala:245)
>> at
>> scala.collection.TraversableLike$$anonfun$map$1.apply(TraversableLike.scala:245)
>> at scala.collection.immutable.Range.foreach(Range.scala:166)
>> at
>> scala.collection.TraversableLike$class.map(TraversableLike.scala:245)
>> at scala.collection.AbstractTraversable.map(Traversable.scala:104)
>> at
>> org.apache.flink.runtime.minicluster.FlinkMiniCluster.start(FlinkMiniCluster.scala:228)
>> at
>> org.apache.flink.runtime.minicluster.FlinkMiniCluster.start(FlinkMiniCluster.scala:219)
>> at
>> org.apache.flink.streaming.api.environment.LocalStreamEnvironment.execute(LocalStreamEnvironment.java:104)
>> at
>> org.apache.flink.streaming.examples.wordcount.WordCount.main(WordCount
>> .java:80)
>> 
>> I know this is a naïve question but I would like to get some help in 
>> order to over come this issue. I tried various options like setting 
>> scala-2.10 as the compiler for the project (then it shows completely 
>> different error) and many of the projects don't even compile. But with 
>> 2.11 version I get the above stack trace. Any help here is welcome.
>> 
>> Regards
>> Ram
>> 

Regards,
Chiwan Park




Re: Effort to add SQL / StreamSQL to Flink

2016-01-10 Thread Chiwan Park
We still don’t have a consensus about the streaming SQL and the CEP library on 
Flink. Some people want to merge these two libraries. Maybe we have to discuss 
this on the mailing list.

> On Jan 11, 2016, at 10:53 AM, Nick Dimiduk <ndimi...@gmail.com> wrote:
> 
> What's the relationship between the streaming SQL proposed here and the CEP
> syntax proposed earlier in the week?
> 
> On Sunday, January 10, 2016, Henry Saputra <henry.sapu...@gmail.com> wrote:
> 
>> Awesome! Thanks for the reply, Fabian.
>> 
>> - Henry
>> 
>> On Sunday, January 10, 2016, Fabian Hueske <fhue...@gmail.com
>> <javascript:;>> wrote:
>> 
>>> Hi Henry,
>>> 
>>> There is https://issues.apache.org/jira/browse/FLINK-2099 and a few
>>> subissues.
>>> I'll reorganize these and add more issues for the tasks described in the
>>> design document in the next days.
>>> 
>>> Thanks, Fabian
>>> 
>>> 2016-01-10 2:45 GMT+01:00 Henry Saputra <henry.sapu...@gmail.com
>> <javascript:;>
>>> <javascript:;>>:
>>> 
>>>> HI Fabian,
>>>> 
>>>> Have you created JIRA ticket to keep track of this new feature?
>>>> 
>>>> - Henry
>>>> 
>>>> On Thu, Jan 7, 2016 at 6:05 AM, Fabian Hueske <fhue...@gmail.com
>> <javascript:;>
>>> <javascript:;>> wrote:
>>>>> Hi everybody,
>>>>> 
>>>>> in the last days, Timo and I refined the design document for adding a
>>>> SQL /
>>>>> StreamSQL interface on top of Flink that was started by Stephan.
>>>>> 
>>>>> The document proposes an architecture that is centered around Apache
>>>>> Calcite. Calcite is an Apache top-level project and includes a SQL
>>>> parser,
>>>>> a semantic validator for relational queries, and a rule- and
>> cost-based
>>>>> relational optimizer. Calcite is used by Apache Hive and Apache Drill
>>>>> (among other projects). In a nutshell, the plan is to translate Table
>>> API
>>>>> and SQL queries into Calcite's relational expression trees, optimize
>>>> these
>>>>> trees, and translate them into DataSet and DataStream programs.The
>>>> document
>>>>> breaks down the work into several tasks and subtasks.
>>>>> 
>>>>> Please review the design document and comment.
>>>>> 
>>>>> -- >
>>>>> 
>>>> 
>>> 
>> https://docs.google.com/document/d/1TLayJNOTBle_-m1rQfgA6Ouj1oYsfqRjPcp1h2TVqdI/edit?usp=sharing
>>>>> 
>>>>> Unless there are major concerns with the design, Timo and I want to
>>> start
>>>>> next week to move the current Table API on top of Apache Calcite
>> (Task
>>> 1
>>>> in
>>>>> the document). The goal of this task is to have the same
>> functionality
>>> as
>>>>> currently, but with Calcite in the translation process. This is a
>>>> blocking
>>>>> task that we hope to complete soon. Afterwards, we can independently
>>> work
>>>>> on different aspects such as extending the Table API, adding a SQL
>>>>> interface (basically just a parser), integration with external data
>>>>> sources, better code generation, optimization rules, streaming
>> support
>>>> for
>>>>> the Table API, StreamSQL, etc..
>>>>> 
>>>>> Timo and I plan to work on a WIP branch to implement Task 1 and merge
>>> it
>>>> to
>>>>> the master branch once the task is completed. Of course, everybody is
>>>>> welcome to contribute to this effort. Please let us know such that we
>>> can
>>>>> coordinate our efforts.
>>>>> 
>>>>> Thanks,
>>>>> Fabian

Regards,
Chiwan Park




Re: Naive question

2016-01-08 Thread Chiwan Park
Hi,

Because I’m not an Eclipse user I’m not sure, but I think the IDE Setup 
documentation [1] on the Flink homepage could help you.

[1] 
https://ci.apache.org/projects/flink/flink-docs-master/internals/ide_setup.html

> On Jan 8, 2016, at 8:30 PM, Stephan Ewen <se...@apache.org> wrote:
> 
> Hi!
> 
> This looks like a mismatch between the Scala dependency in Flink and Scala
> in your Eclipse. Make sure you use the same for both. By default, Flink
> reference Scala 2.10
> 
> If your IDE is set up for Scala 2.11, set the Scala version variable in the
> Flink root pom.xml also to 2.11
> 
> Greetings,
> Stephan
> 
> 
> 
> 
> On Fri, Jan 8, 2016 at 12:06 PM, Vasudevan, Ramkrishna S <
> ramkrishna.s.vasude...@intel.com> wrote:
> 
>> I have been trying to install, learn and understand Flink. I am using
>> Scala- EclipseIDE as my IDE.
>> 
>> I have downloaded the flink source coded, compiled and created the project.
>> 
>> My work laptop is Windows based and I don't have eclipse based workstation
>> but I do have linux boxes for running and testing things.
>> 
>> Some of the examples given in Flink source code do run directly from
>> Eclipse but when I try to run the Wordcount example from Eclipse I get this
>> error
>> 
>> Exception in thread "main" java.lang.NoSuchMethodError:
>> scala.collection.immutable.HashSet$.empty()Lscala/collection/immutable/HashSet;
>> at akka.actor.ActorCell$.(ActorCell.scala:336)
>> at akka.actor.ActorCell$.(ActorCell.scala)
>> at akka.actor.RootActorPath.$div(ActorPath.scala:159)
>> at akka.actor.LocalActorRefProvider.(ActorRefProvider.scala:464)
>> at akka.actor.LocalActorRefProvider.(ActorRefProvider.scala:452)
>> at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native
>> Method)
>> at sun.reflect.NativeConstructorAccessorImpl.newInstance(Unknown
>> Source)
>> at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(Unknown
>> Source)
>> at java.lang.reflect.Constructor.newInstance(Unknown Source)
>> at
>> akka.actor.ReflectiveDynamicAccess$$anonfun$createInstanceFor$2.apply(DynamicAccess.scala:78)
>> at scala.util.Try$.apply(Try.scala:191)
>> at
>> akka.actor.ReflectiveDynamicAccess.createInstanceFor(DynamicAccess.scala:73)
>> at
>> akka.actor.ReflectiveDynamicAccess$$anonfun$createInstanceFor$3.apply(DynamicAccess.scala:84)
>> at
>> akka.actor.ReflectiveDynamicAccess$$anonfun$createInstanceFor$3.apply(DynamicAccess.scala:84)
>> at scala.util.Success.flatMap(Try.scala:230)
>> at
>> akka.actor.ReflectiveDynamicAccess.createInstanceFor(DynamicAccess.scala:84)
>> at akka.actor.ActorSystemImpl.liftedTree1$1(ActorSystem.scala:585)
>> at akka.actor.ActorSystemImpl.(ActorSystem.scala:578)
>> at akka.actor.ActorSystem$.apply(ActorSystem.scala:142)
>> at akka.actor.ActorSystem$.apply(ActorSystem.scala:119)
>> at akka.actor.ActorSystem$.create(ActorSystem.scala:67)
>> at
>> org.apache.flink.runtime.akka.AkkaUtils$.createActorSystem(AkkaUtils.scala:84)
>> at
>> org.apache.flink.runtime.minicluster.FlinkMiniCluster.startJobManagerActorSystem(FlinkMiniCluster.scala:196)
>> at
>> org.apache.flink.runtime.minicluster.FlinkMiniCluster.singleActorSystem$lzycompute$1(FlinkMiniCluster.scala:225)
>> at org.apache.flink.runtime.minicluster.FlinkMiniCluster.org
>> $apache$flink$runtime$minicluster$FlinkMiniCluster$$singleActorSystem$1(FlinkMiniCluster.scala:225)
>> at
>> org.apache.flink.runtime.minicluster.FlinkMiniCluster$$anonfun$1.apply(FlinkMiniCluster.scala:230)
>> at
>> org.apache.flink.runtime.minicluster.FlinkMiniCluster$$anonfun$1.apply(FlinkMiniCluster.scala:228)
>> at
>> scala.collection.TraversableLike$$anonfun$map$1.apply(TraversableLike.scala:245)
>> at
>> scala.collection.TraversableLike$$anonfun$map$1.apply(TraversableLike.scala:245)
>> at scala.collection.immutable.Range.foreach(Range.scala:166)
>> at
>> scala.collection.TraversableLike$class.map(TraversableLike.scala:245)
>> at scala.collection.AbstractTraversable.map(Traversable.scala:104)
>> at
>> org.apache.flink.runtime.minicluster.FlinkMiniCluster.start(FlinkMiniCluster.scala:228)
>> at
>> org.apache.flink.runtime.minicluster.FlinkMiniCluster.start(FlinkMiniCluster.scala:219)
>> at
>> org.apache.flink.streaming.api.environment.LocalStreamEnvironment.execute(LocalStreamEnvironment.java:104)
>> at
>> org.apache.flink.streaming.examples.wordcount.WordCount.main(WordCount.java:80)
>> 
>> I know this is a naïve question but I would like to get some help in order
>> to over come this issue. I tried various options like setting scala-2.10 as
>> the compiler for the project (then it shows completely different error) and
>> many of the projects don't even compile. But with 2.11 version I get the
>> above stack trace. Any help here is welcome.
>> 
>> Regards
>> Ram
>> 

Regards,
Chiwan Park




Re: Effort to add SQL / StreamSQL to Flink

2016-01-07 Thread Chiwan Park
Really good! Many people want to use SQL. :)

> On Jan 8, 2016, at 2:36 AM, Kostas Tzoumas <ktzou...@apache.org> wrote:
> 
> Wow! Thanks Fabian, this looks fantastic!
> 
> On Thu, Jan 7, 2016 at 4:35 PM, Stephan Ewen <se...@apache.org> wrote:
> 
>> Super, thanks for that detailed effort, Fabian!
>> 
>> On Thu, Jan 7, 2016 at 3:40 PM, Matthias J. Sax <mj...@apache.org> wrote:
>> 
>>> Pretty cool!
>>> 
>>> On 01/07/2016 03:05 PM, Fabian Hueske wrote:
>>>> Hi everybody,
>>>> 
>>>> in the last days, Timo and I refined the design document for adding a
>>> SQL /
>>>> StreamSQL interface on top of Flink that was started by Stephan.
>>>> 
>>>> The document proposes an architecture that is centered around Apache
>>>> Calcite. Calcite is an Apache top-level project and includes a SQL
>>> parser,
>>>> a semantic validator for relational queries, and a rule- and cost-based
>>>> relational optimizer. Calcite is used by Apache Hive and Apache Drill
>>>> (among other projects). In a nutshell, the plan is to translate Table
>> API
>>>> and SQL queries into Calcite's relational expression trees, optimize
>>> these
>>>> trees, and translate them into DataSet and DataStream programs.The
>>> document
>>>> breaks down the work into several tasks and subtasks.
>>>> 
>>>> Please review the design document and comment.
>>>> 
>>>> -- >
>>>> 
>>> 
>> https://docs.google.com/document/d/1TLayJNOTBle_-m1rQfgA6Ouj1oYsfqRjPcp1h2TVqdI/edit?usp=sharing
>>>> 
>>>> Unless there are major concerns with the design, Timo and I want to
>> start
>>>> next week to move the current Table API on top of Apache Calcite (Task
>> 1
>>> in
>>>> the document). The goal of this task is to have the same functionality
>> as
>>>> currently, but with Calcite in the translation process. This is a
>>> blocking
>>>> task that we hope to complete soon. Afterwards, we can independently
>> work
>>>> on different aspects such as extending the Table API, adding a SQL
>>>> interface (basically just a parser), integration with external data
>>>> sources, better code generation, optimization rules, streaming support
>>> for
>>>> the Table API, StreamSQL, etc..
>>>> 
>>>> Timo and I plan to work on a WIP branch to implement Task 1 and merge
>> it
>>> to
>>>> the master branch once the task is completed. Of course, everybody is
>>>> welcome to contribute to this effort. Please let us know such that we
>> can
>>>> coordinate our efforts.
>>>> 
>>>> Thanks,
>>>> Fabian
>>>> 
>>> 
>>> 
>> 

Regards,
Chiwan Park




Re: JavaScript Bindings

2016-01-06 Thread Chiwan Park
Really good news!
A new language binding would be helpful to extend Flink’s user pool.

If you have any questions, please feel free to send an email to this mailing list.

> On Jan 7, 2016, at 2:01 AM, Adam Dutko <a...@runbymany.com> wrote:
> 
> I would like to explore adding JavaScript bindings to empower users to
> write code in JavaScript and leverage Apache Flink.
> 
> I'm somewhat familiar with the Python API and have started exploring adding
> support for JavaScript using the Python code as a template. I would like to
> continue explore adding support for JavaScript possibly leveraging some of
> the Nashorn work being done by Oracle.
> 
> I mistakenly posted to an old Github thread ...
> https://github.com/stratosphere/stratosphere/issues/377 ... and rmetzger
> pointed me to this list.
> 
> I'm new to the ASF. After reading the Flink contribution guidelines and
> talking to rmetzger via Github I figured I'd continue the discussion
> through this outlet.
> 
> Any guidance would be much appreciated!
> 
> -- 
> 
> Enjoy life!
> 
> -Adam

Regards,
Chiwan Park




Re: LabeledVector with label vector

2016-01-05 Thread Chiwan Park
Hi Hilmi,

Thanks for the suggestion about the type of the labeled vector. Basically, I agree 
that your suggestion is reasonable. But I would like to generalize `LabeledVector` 
like the following example:

```
case class LabeledVector[T <: Serializable](label: T, vector: Vector) extends Serializable {
  // some implementations for LabeledVector
}
```
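
With such a generic definition, the sequence-labeling case could be expressed, for 
example, like this (hypothetical usage, only a sketch that assumes the class above 
and FlinkML's DenseVector):

```
import org.apache.flink.ml.math.DenseVector

// Hypothetical sample: the label is itself a vector of encoded tags.
val sample = LabeledVector[DenseVector](
  label  = DenseVector(1.0, 2.0, 1.0),   // encoded tag sequence
  vector = DenseVector(0.2, 0.5, 0.7))   // encoded word sequence
```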

How about this implementation? If there are any other opinions, please send an 
email to the mailing list.

> On Jan 5, 2016, at 7:36 PM, Hilmi Yildirim <hilmi.yildi...@dfki.de> wrote:
> 
> Hi,
> in the ML-Pipeline of Flink we have the "LabeledVector" class. It consists of 
> a vector and a label as a double value. Unfortunately, it is not applicable 
> for sequence learning where the label is also a vector. For example, in NLP 
> we have a vector of words and the label is a vector of the corresponding 
> labels.
> 
> The optimize function of the "Solver" class has a DateSet[LabeledVector] as 
> input and, therefore, it is not applicable for sequence learning. I think the 
> LabeledVector should be adapted that the label is a vector instead of a 
> single Double value. What do you think?
> 
> Best Regards,
> 
> -- 
> ==
> Hilmi Yildirim, M.Sc.
> Researcher
> 
> DFKI GmbH
> Intelligente Analytik für Massendaten
> DFKI Projektbüro Berlin
> Alt-Moabit 91c
> D-10559 Berlin
> Phone: +49 30 23895 1814
> 
> E-Mail: hilmi.yildi...@dfki.de
> 
> -
> Deutsches Forschungszentrum fuer Kuenstliche Intelligenz GmbH
> Firmensitz: Trippstadter Strasse 122, D-67663 Kaiserslautern
> 
> Geschaeftsfuehrung:
> Prof. Dr. Dr. h.c. mult. Wolfgang Wahlster (Vorsitzender)
> Dr. Walter Olthoff
> 
> Vorsitzender des Aufsichtsrats:
> Prof. Dr. h.c. Hans A. Aukes
> 
> Amtsgericht Kaiserslautern, HRB 2313
> -
> 

Regards,
Chiwan Park




Re: LabeledVector with label vector

2016-01-05 Thread Chiwan Park
Hi Theodore,

Thanks for explaining the reason. :)

So how about changing LabeledVector to contain two vectors? One of the vectors 
would be for the label and the other one for the value. I think this approach 
would be okay because a double-valued label could be represented as a 
DenseVector(Array(LABEL_VALUE)).

The only problem with this approach is some overhead from processing the Vector 
type in the case of a single double label. If the overhead is significant, we 
should create two types of LabeledVector, such as DoubleLabeledVector and 
VectorLabeledVector.

Which one is preferred? 
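
Just to make the two alternatives concrete, here are rough sketches (these are not 
existing FlinkML classes, only illustrations):

```
import org.apache.flink.ml.math.Vector

// Option 1: the label is always a Vector; a scalar label y would be wrapped
// as DenseVector(Array(y)).
case class LabeledVector(label: Vector, vector: Vector)

// Option 2: two dedicated types, avoiding the wrapping overhead for the
// common scalar-label case.
case class DoubleLabeledVector(label: Double, vector: Vector)
case class VectorLabeledVector(label: Vector, vector: Vector)
```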

> On Jan 5, 2016, at 11:38 PM, Theodore Vasiloudis 
> <theodoros.vasilou...@gmail.com> wrote:
> 
> Generalizing the type of the label for the label vector is an idea we
> played with when designing the current optimization framework.
> 
> We ended up deciding against it as the double type allows us to do
> regressions and (multiclass) classification which should be the majority of
> the use cases out there, while keeping the code simple.
> 
> Generalizing this to [T <: Serializable] is too broad I think. [T <:
> Vector] is I think more reasonable, I cannot think of many cases where the
> label in an optimization problems is something other than a vector/double.
> 
> Any change would require a number of changes in the optimization of course,
> as optimizing for vector and double labels requires different handling of
> error calculation etc but it should be doable.
> Note however that since LabeledVector is such a core part of the library
> any changes would involve a number of adjustments downstream.
> 
> Perhaps having different optimizers etc. for Vectors and double labels
> makes sense, but I haven't put much though into this.
> 
> 
> On Tue, Jan 5, 2016 at 12:17 PM, Chiwan Park <chiwanp...@apache.org> wrote:
> 
>> Hi Hilmi,
>> 
>> Thanks for suggestion about type of labeled vector. Basically, I agree
>> that your suggestion is reasonable. But, I would like to generialize
>> `LabeledVector` like following example:
>> 
>> ```
>> case class LabeledVector[T <: Serializable](label: T, vector: Vector)
>> extends Serializable {
>>  // some implementations for LabeledVector
>> }
>> ```
>> 
>> How about this implementation? If there are any other opinions, please
>> send a email to mailing list.
>> 
>>> On Jan 5, 2016, at 7:36 PM, Hilmi Yildirim <hilmi.yildi...@dfki.de>
>> wrote:
>>> 
>>> Hi,
>>> in the ML-Pipeline of Flink we have the "LabeledVector" class. It
>> consists of a vector and a label as a double value. Unfortunately, it is
>> not applicable for sequence learning where the label is also a vector. For
>> example, in NLP we have a vector of words and the label is a vector of the
>> corresponding labels.
>>> 
>>> The optimize function of the "Solver" class has a DateSet[LabeledVector]
>> as input and, therefore, it is not applicable for sequence learning. I
>> think the LabeledVector should be adapted that the label is a vector
>> instead of a single Double value. What do you think?
>>> 
>>> Best Regards,
>>> 
>>> --
>>> ==
>>> Hilmi Yildirim, M.Sc.
>>> Researcher
>>> 
>>> DFKI GmbH
>>> Intelligente Analytik für Massendaten
>>> DFKI Projektbüro Berlin
>>> Alt-Moabit 91c
>>> D-10559 Berlin
>>> Phone: +49 30 23895 1814
>>> 
>>> E-Mail: hilmi.yildi...@dfki.de
>>> 
>>> ---------
>>> Deutsches Forschungszentrum fuer Kuenstliche Intelligenz GmbH
>>> Firmensitz: Trippstadter Strasse 122, D-67663 Kaiserslautern
>>> 
>>> Geschaeftsfuehrung:
>>> Prof. Dr. Dr. h.c. mult. Wolfgang Wahlster (Vorsitzender)
>>> Dr. Walter Olthoff
>>> 
>>> Vorsitzender des Aufsichtsrats:
>>> Prof. Dr. h.c. Hans A. Aukes
>>> 
>>> Amtsgericht Kaiserslautern, HRB 2313
>>> -
>>> 
>> 
>> Regards,
>> Chiwan Park

Regards,
Chiwan Park




Re: Lack of review on PRs

2015-12-06 Thread Chiwan Park
Hi Sachin,

I’m sorry about your unsatisfying experience with the lack of reviews.

As you know, because there are only a few committers (including me) whose interest 
is ML, the review of PRs can be slow. But I admit my fault in leaving the PRs 
unreviewed for 3-5 months.

I’ll try to review your PRs as soon as possible.

> On Dec 6, 2015, at 1:00 AM, Sachin Goel <sachingoel0...@gmail.com> wrote:
> 
> Hi all
> Sorry about a weekend email.
> 
> This email is to express my displeasure over the lack of any review on my
> PRs on extending the ML library. Five of my PRs have been without any
> review for times varying from 3-5 months now.
> When I took up the task of extending the ML library by implementing core
> algorithms such as Decision Tree and k-means clustering [with several
> initialization schemes], I had hoped that the community will be actively
> involved in it since ML is a very important component of any big data
> system these days. However, it appears I have been wrong.
> Surely, the initial reviews required a lot of changes from my side over
> coding style mistakes [first time programmer in Scala], and
> less-than-optimal implementations. I like to think that I have learned a
> lot about maintaining better coding style compatible with Flink code base,
> and spending time to optimize my work due to this.
> However, if a PR requires work, that doesn't automatically disqualify it
> from being reviewed actively, since the author has spent a lot of time on
> it and has voluntarily taken up the task of contributing.
> 
> Machine learning is my core area of interest and I am able to contribute
> much more to the library; however, a lack of review after repeated
> reminders automatically discourages me from picking up more issues.
> 
> However minor some of my commits maybe, I have been actively involved in
> the development work [with a total of 29 commits.]. I have also spent a lot
> of time in release testing and diagnosing-slash-fixing lots of issues with
> Web Dashboard. However, as with any contributor, my main goal is to
> contribute to my area of interest, while also diversifying my work by
> fixing other issues.
> 
> The PRs are 710, 757, 861, 918 and 1032. I propose the following order for
> anyone who wants to review my work:
> 1032 [very simple feature.]
> 918 [very short PR]
> 861 [followed by 710 after a complete rebase] [major work for Histograms
> and Decision Trees]
> 757 [major work for K-Means clustering and initialization schemes]
> 
> If I have come across as rude, I apologize.
> 
> Happy reviewing and thanks for bearing with me. :)
> 
> Cheers!
> Sachin
> 
> -- Sachin Goel
> Computer Science, IIT Delhi
> m. +91-9871457685

Regards,
Chiwan Park





Re: [VOTE] [RESULT] Release Apache Flink 0.10.0 (release-0.10.0-rc8)

2015-11-13 Thread Chiwan Park
Great. Thanks Max! :)

> On Nov 13, 2015, at 4:06 PM, Vasiliki Kalavri <vasilikikala...@gmail.com> 
> wrote:
> 
> \o/ \o/ \o/
> Thank you Max!
> On Nov 13, 2015 2:23 AM, "Nick Dimiduk" <ndimi...@gmail.com> wrote:
> 
>> Woo hoo!
>> 
>> On Thu, Nov 12, 2015 at 3:01 PM, Maximilian Michels <m...@apache.org>
>> wrote:
>> 
>>> Thanks for voting! The vote passes.
>>> 
>>> The following votes have been cast:
>>> 
>>> +1 votes: 7
>>> 
>>> Stephan
>>> Aljoscha
>>> Robert
>>> Max
>>> Chiwan*
>>> Henry
>>> Fabian
>>> 
>>> * non-binding
>>> 
>>> -1 votes: none
>>> 
>>> I'll upload the release artifacts and release the Maven artifacts.
>>> Once the changes are effective, the community may announce the
>>> release.
>>> 

Regards,
Chiwan Park





Re: Error when building the docs

2015-11-05 Thread Chiwan Park
Hi Martin,

I had the same problem. From my investigation, the current custom Jekyll plugin 
for Flink is not compatible with Jekyll 3.x. If you remove Jekyll 3.x and install 
Jekyll 2.x, you can build the docs. I’m using Jekyll 2.5.3 to build the docs.

Regards,
Chiwan Park


On November 6, 2015 at 4:58:34 AM, Martin Junghanns (m.jungha...@mailbox.org) 
wrote:

Hi, not sure if that's an issue or just a misconfiguration (not familiar  
with Ruby).  

I followed the docs/README.md and ran into:  

s1ck@s1ck-T450s:~/Devel/Java/flink$ docs/build_docs.sh  
Configuration file: /home/s1ck/Devel/Java/flink/docs/_config.yml  
/home/s1ck/Devel/Java/flink/docs/_plugins/removeDuplicateLicenseHeaders.rb:63:in
  
`': cannot load such file -- jekyll/post (LoadError)  
from  
/home/s1ck/Devel/Java/flink/docs/_plugins/removeDuplicateLicenseHeaders.rb:25:in
  
`'  

When I delete removeDuplicateLicenseHeaders.rb from the _plugins folder  
everything runs fine.  

I am using:  
3.19.0-32-generic #37-Ubuntu x86_64 x86_64 x86_64 GNU/Linux  
ruby 2.1.2p95 (2014-05-08) [x86_64-linux-gnu]  
jekyll (3.0.0)  
kramdown (1.9.0)  
pygments.rb (0.6.3) // wasn't part of the README but needs to be there  

Cheers,  
Martin  


Re: Vector(DenseVector) as a type?

2015-11-02 Thread Chiwan Park
Hi Daniel,

I think that you are confused about the names of the classes. The Vector in your 
mail is not org.apache.flink.ml.math.Vector, but scala.collection.immutable.Vector, 
which is an immutable collection with random access.

So if you want to create a method which receives such values, you should clarify 
which Vector type you mean. You can import the class with renaming like the 
following:

```
import org.apache.flink.ml.math.{Vector => FlinkVector}
```
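
For example, a method that receives such values could be declared like this (a 
small sketch based on the values you printed):

```
import org.apache.flink.ml.math.{DenseVector, Vector => FlinkVector}

// The outer Vector is scala.collection.immutable.Vector (a Scala collection);
// its elements are FlinkML vectors.
def myFun(points: Vector[FlinkVector]): Int = points.length

myFun(Vector(DenseVector(-0.206, -0.276), DenseVector(-0.206, -0.076)))
```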

I hope that this answer helps you. :)

Regards,
Chiwan Park

On November 3, 2015 at 6:11:03 AM, Daniel Blazevski 
(daniel.blazev...@gmail.com) wrote:

Hello,  

I am working on the exact knn algorithm in Flink, and I'd like to make the  
structure more modular.  

​I am working off of the initial work of @chiwanpark, and when I print out  
a variable to the screen, I get something like:  
```  
training.values = Vector(DenseVector(-0.206, -0.276), DenseVector(-0.206,  
-0.076),...,)  
testing.values = Vector((0,DenseVector(0.0, 0.0)))  
```  
How can I pass such a variable to a function. It seems that  
```  
myFun(A : Vector(DenseVector))  
myFun(A : Vector[DenseVector])  
```  
do not work. Tried to look around other Flink-ml code, but couldn't find  
exactly what I was looking for.  

Thanks,  
Dan  


Re: Scala 2.10/2.11 Maven dependencies

2015-11-02 Thread Chiwan Park
If we choose a selective Scala version suffix for artifacts, we have to tell 
newcomers which artifacts have the version suffix. Some artifacts such as 
"flink-java", "flink-streaming-java" are easily recognized. But IMO, knowing 
whether artifacts such as "flink-ml", "flink-clients", "flink-table" have the 
version suffix or not is difficult for newcomers.

This is why we are currently adding the version suffix to all Scala 2.11 artifacts. 
For Scala 2.10 artifacts, we aren’t adding the version suffix, for the sake of 
Java-only Flink users.

I’m for adding the version suffix to Scala 2.10 artifacts as well. But I’m not 
sure that removing the version suffix from Java-only artifacts would be good. 
As I said above, it seems difficult for newcomers.

Regards,
Chiwan Park

On November 2, 2015 at 8:19:15 PM, Fabian Hueske (fhue...@gmail.com) wrote:

That would mean to have "flink-java_2.10" and "flink-java_2.11" artifacts  
(and others that depend on flink-java and have no other Scala dependency)  
in the 0.10.0 release and only "flink-java" in the next 1.0 release.  

Do we want that?  

2015-11-02 11:37 GMT+01:00 Maximilian Michels <m...@apache.org>:  

> I'm for leaving it as-is and renaming all artifacts which depend on  
> Scala for the release following 0.10.  
>  
> On Mon, Nov 2, 2015 at 11:32 AM, Fabian Hueske <fhue...@gmail.com> wrote:  
> > OK, let me try to summarize the discussion (and please correct me if I  
> got  
> > something wrong).  
> >  
> > 1) Flink deploys Scala 2.11 snapshot artifacts. Therefore, we have to  
> > release 2.11 artifacts for the 0.10.0 release version as well.  
> >  
> > 2) Everybody agrees to appropriately tag all artifacts that have a  
> > (transitive) Scala dependency. ATM, that would also include flink-java  
> > which is a bit awkward. The Scala dependency in flink-java originates  
> from  
> > the Chill library which is used to obtain a Kryo serializer which is  
> > initialized with serializers for Scala classes. We could resolve this  
> issue  
> > by providing Java and Scala specific implementations of the Kryo  
> > serializers and have KryoTypeInfos for Java and Scala.  
> >  
> > The question to answer right now is, do we want to have "correctly"  
> labeled  
> > artifacts for the next 0.10.0 release or do we defer that for 1.0?  
> > If we want to solve it for 0.10.0 we need to cancel the current RC and  
> > provide a fix to remove the Scala dependency in flink-java, IMO.  
> >  
> > Opinions?  
> >  
> > Cheers, Fabian  
> >  
> > 2015-11-02 8:55 GMT+01:00 Stephan Ewen <se...@apache.org>:  
> >  
> >> +1 for the approach discusses here, and for removing Scala dependencies  
> >> from modules that can be Scala independent.  
> >>  
> >> It would be nice if pure Java users would not see any Scala versioning  
> (on  
> >> flink-core, flink-java, later also flink-sreaming-java). I guess for any  
> >> runtime-related parts (including flink-client and currently all  
> streaming  
> >> projects), we need the Scala versions...  
> >>  
> >> On Sun, Nov 1, 2015 at 9:29 AM, Maximilian Michels <m...@apache.org>  
> wrote:  
> >>  
> >> > Good point. Didn't know that. We can still add them for the release.  
> >> >  
> >> > On Sat, Oct 31, 2015 at 1:51 PM, Alexander Alexandrov  
> >> > <alexander.s.alexand...@gmail.com> wrote:  
> >> > > My two cents - there are already Maven artifacts deployed for 2.11  
> in  
> >> the  
> >> > > SNAPSHOT repository. I think it might be confusing if they suddenly  
> >> > > disappear for the stable release.  
> >> > >  
> >> > >  
> >> > > 2015-10-29 11:58 GMT+01:00 Maximilian Michels <m...@apache.org>:  
> >> > >  
> >> > >> Seems like we agree that we need artifacts for different versions  
> of  
> >> > Scala  
> >> > >> on Maven. There also seems to be a preference for including the  
> >> version  
> >> > in  
> >> > >> the artifact name.  
> >> > >>  
> >> > >> I've created an issue and marked it to be resolved for 1.0. For the  
> >> 0.10  
> >> > >> release, we will have binaries but no Maven artifacts. The biggest  
> >> > >> challenge I see is to remove Scala from as many modules as  
> possible.  
> >> For  
> >> > >> example, flink-java depends on Scala at the mome

[jira] [Created] (FLINK-2950) Markdown presentation problem in SVM documentation

2015-10-30 Thread Chiwan Park (JIRA)
Chiwan Park created FLINK-2950:
--

 Summary: Markdown presentation problem in SVM documentation
 Key: FLINK-2950
 URL: https://issues.apache.org/jira/browse/FLINK-2950
 Project: Flink
  Issue Type: Bug
  Components: Documentation, Machine Learning Library
Affects Versions: 1.0
Reporter: Chiwan Park
Assignee: Chiwan Park
Priority: Minor


In the SVM documentation for the master branch 
(https://ci.apache.org/projects/flink/flink-docs-master/libs/ml/svm.html), the 
example section is injected into the parameters section.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Created] (FLINK-2947) Coloured Scala Shell

2015-10-30 Thread Chiwan Park (JIRA)
Chiwan Park created FLINK-2947:
--

 Summary: Coloured Scala Shell
 Key: FLINK-2947
 URL: https://issues.apache.org/jira/browse/FLINK-2947
 Project: Flink
  Issue Type: Improvement
  Components: Scala Shell
Reporter: Chiwan Park
Assignee: Chiwan Park
Priority: Trivial
 Fix For: 1.0


Since Scala 2.11.4, the Scala REPL uses some colours to print vals and types 
(http://www.scala-lang.org/news/2.11.4). If the Flink Scala shell used this 
feature, users could distinguish the results of execution more easily.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Created] (FLINK-2841) Broken roadmap link in FlinkML contribution guide

2015-10-09 Thread Chiwan Park (JIRA)
Chiwan Park created FLINK-2841:
--

 Summary: Broken roadmap link in FlinkML contribution guide
 Key: FLINK-2841
 URL: https://issues.apache.org/jira/browse/FLINK-2841
 Project: Flink
  Issue Type: Bug
  Components: Documentation, Machine Learning Library
Affects Versions: 0.10
Reporter: Chiwan Park


Because the roadmap of FlinkML has moved to the wiki, we need to update the 
roadmap link in the [FlinkML contribution 
guide|https://ci.apache.org/projects/flink/flink-docs-master/libs/ml/contribution_guide.html].



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


Re: [DISCUSS] flink-external

2015-10-08 Thread Chiwan Park
+1 for Vasia’s suggestion. From a long-term perspective, a site like Spark 
Packages [1] would be helpful for managing external contributions.

[1] http://spark-packages.org

> On Oct 8, 2015, at 12:28 PM, Matthias J. Sax <mj...@apache.org> wrote:
> 
> Thanks for the feedback.
> 
> I think, the repository does not need to build on a single Flink
> release. From my point of view, there should be a single parent module
> that contains *independent modules* for each extension/library (there
> should be no "cross dependencies" between the modules and each module
> can specify the flink dependencies it needs by itself). This make is
> most flexible. And if a library works on an old release, it might just
> stay there as is. If a library changes (due to Flink changes), it might
> just be contained multiple times for different Flink releases.
> 
> Each module should provide a short doc (README) that shows how to use an
> integrate it with Flink. Thus, the responsibility goes to the
> contributor to maintain the library. If it breaks and is not maintained
> any further, we can simple remove it.
> 
> I agree, that the community might not be able to maintain those
> extension/libraries right now. I would put the responsibility (more or
> less completely) on the contributor and delete project that do not fix
> any more.
> 
> @Vasia: a link to a library could be included in the README. If anybody
> only wants to share a library but not contribute code, the parent README
> could contain a list of additional links.
> 
> 
> -Matthias
> 
> 
> On 10/08/2015 12:15 PM, Vasiliki Kalavri wrote:
>> How about, for now, we simply create a page where we gather links/short
>> descriptions of all these contributions
>> and let the maintenance and dependency management to the tool/library
>> creators?
>> This way we will at least have these contributions in one place and link to
>> them somewhere from the website.
>> 
>> -Vasia.
>> 
>> On 8 October 2015 at 12:06, Maximilian Michels <m...@apache.org> wrote:
>> 
>>> Hi Matthias,
>>> 
>>> Thanks for bringing up this idea. Actually, it has been discussed a
>>> couple of times on the mailing list whether we should have a central
>>> place for third-party extensions/contributions/libraries. This could
>>> either be something package-based or, like you proposed, another
>>> repository.
>>> 
>>> An external place for contributions raises a couple of questions
>>> 
>>> - Which version should the external contributions be based on?
>>> - How do we make sure, the extensions are continuously updated?
>>> (dedicated maintainers or automatic compatibility checks)
>>> - How do we easily plug-in the external modules into Flink?
>>> 
>>> In the long term, we really need a solution for these questions. The
>>> code base of Flink is growing and more and more packages go to
>>> flink-contrib/flink-staging. I would find something packaged-based
>>> better than a repository. Quite frankly, momentarily, I think
>>> developing such a plugin system is out of scope for most Flink
>>> developers. At the current pace of Flink development, collecting these
>>> contributions externally without properly maintaining them, doesn't
>>> make much sense to me.
>>> 
>>> Cheers,
>>> Max
>>> 
>>> 
>>> 
>>> On Wed, Oct 7, 2015 at 11:42 AM, Matthias J. Sax <mj...@apache.org> wrote:
>>>> 
>>>> Hi,
>>>> 
>>>> many people are building quite exiting stuff on top of Flink. It is hard
>>>> to keep an good overview on what stuff is available and what not. What
>>>> do you think about starting a second git repository "flink-external"
>>>> that collects all those code?
>>>> 
>>>> The ideas would be to collect stuff in a central point, such that people
>>>> can access it easily and get an overview what is already available (this
>>>> might also avoid duplicate development). It might also be a good point
>>>> to show common patterns. In order to collect as much as possible, the
>>>> contributing requirement (with respect to testing etc) could be lower
>>>> than for Flink itself.
>>>> 
>>>> For example, I recently started a small flink-clojure module with a
>>>> simple word-count example to answer a question on SO. Including this in
>>>> Flink would not be appropriate. However, for a flink-external repro it
>>>> might be nice to have.
>>>> 
>>>> What do you think about it?
>>>> 
>>>> 
>>>> -Matthias
>>>> 
>>> 
>> 
> 



Regards,
Chiwan Park





Re: Will to contribute

2015-10-07 Thread Chiwan Park
Hi Dawid,

Welcome to the Flink community! :)
I left some comments on your pull request.

> On Oct 7, 2015, at 7:29 PM, Dawid Wysakowicz <wysakowicz.da...@gmail.com> 
> wrote:
> 
> Hi all,
> 
> I liked the idea behind flink and I would be happy if I could help. On
> project's github page you mentioned you can help find some starter tasks
> that I would appreciate. I tried to search some on my own I even created a
> PR for FLINK-2156, but I couldn't find any bigger one.
> 
> Looking forward for any response.
> 
> Regards
> Dawid


Regards,
Chiwan Park





Re: Extending and improving our "How to contribute" page

2015-09-28 Thread Chiwan Park
@Fabian, could you cover FLINK-2712 in your pull request? I think that would be 
better than splitting it into a separate pull request.

Regards,
Chiwan Park

> On Sep 28, 2015, at 4:51 PM, Fabian Hueske <fhue...@gmail.com> wrote:
> 
> Thanks everybody for the discussion.
> I'll prepare a pull request to update the "How to contribute" and "Coding
> Guidelines".
> 
> Thanks,
> Fabian
> 
> 2015-09-26 9:06 GMT+02:00 Maximilian Michels <m...@apache.org>:
> 
>> Hi Fabian,
>> 
>> This is a very important topic. Thanks for starting the discussion.
>> 
>> 1) JIRA discussion
>> 
>> Absolutely. No new feature should be introduced without a discussion.
>> Frankly, I see the problem that sometimes discussions only come up
>> when the pull request has been opened. However, this can be overcome
>> by the design document.
>> 
>> 2) Design document
>> 
>> +1 for the document. It increases transparency but also helps the
>> contributor to think his idea through before starting to code. The
>> document could also be written directly in JIRA. That way, it is more
>> accessible. JIRA offers mark up; even images can be attached and
>> displayed in the JIRA description.
>> 
>> I'd like to propose another section "Limitations" for the design
>> document. Breaking API changes should also be listed on a special Wiki
>> page.
>> 
>> 3) Coding style
>> 
>> In addition to updating the document, do we want to enforce coding
>> styles also by adding new Maven Checkstyle rules? IMHO strict rules
>> could cause more annoyances than they actually contribute to the
>> readability of the code. Perhaps this should be discussed in a
>> separate thread.
>> 
>> +1 for collecting common problems and design patterns to include them
>> in the document. I was thinking, that we should also cover some of the
>> features of tools and dependencies we heavily use, e.g. Travis,
>> Mockito, Guava, Log4j, FlinkMiniCluster, Unit testing vs IT cases,
>> etc.
>> 
>> 4 ) Restructuring the how to contribute guide
>> 
>> Good idea to have a meta document that explains how contributing works
>> in general, and another document for technical things.
>> 
>> 
>> Cheers,
>> Max
>> 
>> 
>> On Thu, Sep 24, 2015 at 2:53 PM, Fabian Hueske <fhue...@gmail.com> wrote:
>>> 
>>> Thanks everybody for feedback and comments.
>>> 
>>> Regarding 1) and 2):
>>> 
>>> I like the idea of keeping the discussion of new features and
>> improvements
>>> in JIRA as Kostas proposed.
>>> Our coding guidelines [1] already request a JIRA issue for each pull
>>> request.
>>> 
>>> How about we highlight this requirement more prominently and follow this
>>> rule more strict from now on.
>>> JIRA issues for new features and improvements should clearly specify the
>>> scope and requirements for the new feature / improvement.
>>> The level of detail is up to the reporter of the issue, but the community
>>> can request more detail or change the scope and requirements by
>> discussion.
>>> When a JIRA issue for a new feature or improvement is opened, the
>> community
>>> can start a discussion whether the feature is desirable for Flink or not.
>>> Any contributor (including the reporter) can also attach a
>>> "design-doc-requested" label to the issue. A design document can be
>>> proposed by anybody, including the reporter or assignee of the JIRA
>> issue.
>>> However, the issue cannot be resolved and a corresponding PR not be
>> merged
>>> before a design document has been accepted by lazy consensus. Hence, an
>>> assignee should propose a design doc before starting to code to avoid
>> major
>>> redesigns of the implementation.
>>> 
>>> This way it is up to the community when to start a discussion about
>> whether
>>> a feature request is accepted or to request a design document. We can
>> make
>>> design documents mandatory for changes that touch the public API.
>>> 
>>> Regarding 3):
>>> 
>>> I agree with Vasia, that we should collect suggestions for common
>> patterns
>>> and also continuously update the coding guidelines.
>>> @Henry, I had best practices (exception handling, tests, etc.) in mind.
>>> Syntactic code style is important as well, but we should have a separate
>>> discussion about that, IMO.
>>> 
>>> Proposal for a design do

[jira] [Created] (FLINK-2768) Wrong Java version requirements in "Quickstart: Scala API" page

2015-09-25 Thread Chiwan Park (JIRA)
Chiwan Park created FLINK-2768:
--

 Summary: Wrong Java version requirements in "Quickstart: Scala 
API" page
 Key: FLINK-2768
 URL: https://issues.apache.org/jira/browse/FLINK-2768
 Project: Flink
  Issue Type: Bug
Affects Versions: 0.10
Reporter: Chiwan Park


Since Flink 0.10, we have dropped Java 6 support. But the "[Quickstart: Scala 
API|https://ci.apache.org/projects/flink/flink-docs-master/quickstart/scala_api_quickstart.html]" 
page says that Java 6 is one of the minimum requirements.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Created] (FLINK-2767) Add support Scala 2.11 to Scala shell

2015-09-25 Thread Chiwan Park (JIRA)
Chiwan Park created FLINK-2767:
--

 Summary: Add support Scala 2.11 to Scala shell
 Key: FLINK-2767
 URL: https://issues.apache.org/jira/browse/FLINK-2767
 Project: Flink
  Issue Type: Improvement
  Components: Scala Shell
Affects Versions: 0.10
Reporter: Chiwan Park
Assignee: Chiwan Park


Since FLINK-2200 is resolved, the Flink community provides JARs for Scala 2.11. 
But currently, there is no Scala shell for Scala 2.11. If we add Scala 2.11 
support to the Scala shell, users on Scala 2.11 could use Flink easily.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


Re: Extending and improving our "How to contribute" page

2015-09-24 Thread Chiwan Park
Thanks Fabian for starting the discussion.

+1 for the overall approach.

About (1), stating in the “How to contribute” page that consensus is required for 
a new feature is very nice. Some pull requests were sent without consensus, and 
the contributors had to rewrite their pull requests.

Agree with (2), (3) and (4).

Regards,
Chiwan Park

> On Sep 24, 2015, at 2:23 AM, Henry Saputra <henry.sapu...@gmail.com> wrote:
> 
> Thanks again, Fabian for starting the discussions.
> 
> For (1) and (2) I think it is good idea and will help people to
> understand and follow the author thought process.
> Following up with Stephan's reply, some new features solutions could
> be explained thoroughly in the PR descriptions but some requires
> additional reviews of the proposed design.
> I like the idea of using tag in JIRA whether new features should or
> should not being accompanied by design document.
> 
> Agree with (3) and (4).
> As for (3) are you thinking about more of style of code syntax via
> checkstyle updates, or best practices in term of no mutable state if
> possible, throw precise Exception if possible for interfaces, etc. ?
> 
> - Henry
> 
> 
> 
> 
> On Wed, Sep 23, 2015 at 9:31 AM, Stephan Ewen <se...@apache.org> wrote:
>> Thanks, Fabian for driving this!
>> 
>> I agree with your points.
>> 
>> Concerning Vasia's comment to not raise the bar too high:
>> That is true, the requirements should be reasonable. We can definitely tag
>> issues as "simple" which means they do not require a design document. That
>> should be more for new features and needs not be very detailed.
>> 
>> We could also make the inverse, meaning we explicitly tag certain issues as
>> "requires design document".
>> 
>> Greetings,
>> Stephan
>> 
>> 
>> 
>> 
>> On Wed, Sep 23, 2015 at 5:05 PM, Vasiliki Kalavri <vasilikikala...@gmail.com
>>> wrote:
>> 
>>> Hi,
>>> 
>>> I agree with you Fabian. Clarifying these issues in the "How to Contribute"
>>> guide will save lots of time both to reviewers and contributors. It is a
>>> really disappointing situation when someone spends time implementing
>>> something and their PR ends up being rejected because either the feature
>>> was not needed or the implementation details were never agreed on.
>>> 
>>> That said, I think we should also make sure that we don't raise the bar too
>>> high for simple contributions.
>>> 
>>> Regarding (1) and (2), I think we should clarify what kind of
>>> additions/changes require this process to be followed. e.g. do we need to
>>> discuss additions for which JIRAs already exist? Ideas described in the
>>> roadmaps? Adding a new algorithm to Gelly/Flink-ML?
>>> 
>>> Regarding (3), maybe we can all suggest some examples/patterns that we've
>>> seen when reviewing PRs and then choose the most common (or all).
>>> 
>>> (4) sounds good to me.
>>> 
>>> Cheers,
>>> Vasia.
>>> 
>>> On 23 September 2015 at 15:08, Kostas Tzoumas <ktzou...@apache.org> wrote:
>>> 
>>>> Big +1.
>>>> 
>>>> For (1), a discussion in JIRA would also be an option IMO
>>>> 
>>>> For (2), let us come up with few examples on what constitutes a feature
>>>> that needs a design doc, and what should be in the doc (IMO
>>>> architecture/general approach, components touched, interfaces changed)
>>>> 
>>>> 
>>>> 
>>>> On Wed, Sep 23, 2015 at 2:24 PM, Fabian Hueske <fhue...@gmail.com>
>>> wrote:
>>>> 
>>>>> Hi everybody,
>>>>> 
>>>>> I guess we all have noticed that the Flink community is quickly growing
>>>> and
>>>>> more and more contributions are coming in. Recently, a few
>>> contributions
>>>>> proposed new features without being discussed on the mailing list. Some
>>>> of
>>>>> these contributions were not accepted in the end. In other cases, pull
>>>>> requests had to be heavily reworked because the approach taken was not
>>>> the
>>>>> best one. These are situations which should be avoided because both the
>>>>> contributor as well as the person who reviewed the contribution
>>> invested
>>>> a
>>>>> lot of time for nothing.
>>>>> 
>>>>> I had a look at our “How to contribute” and “Coding guideline” pages
>>> and
>>

Re: Tests - Unit Tests versus Integration Tests

2015-09-19 Thread Chiwan Park
Okay, I’ll create a JIRA issue and send a pull request for it. :)

Regards,
Chiwan Park

> On Sep 19, 2015, at 7:35 PM, Ufuk Celebi <u...@apache.org> wrote:
> 
> Thanks Stephan for pointing this out. I agree with you. +1
> 
> @Chiwan: Good idea with the Wiki. Actually maybe even better to add it to the 
> contribution guide? Do you have time to open a PR with Stephan’s suggestions?
> 
> @Martin: I agree that it does not suffice to just point this out as a new 
> guideline. I think the main problem is that it is very time consuming and 
> error prone. We have seen some “minor refactorings” lately, which looked 
> harmless, but actually introduced bugs. This is a danger as well with 
> refactoring tests (we refactor them, but don’t have the same amount of test 
> coverage, which results in bugs at some point in time in the future).
> 
> Are there any known “heavy hitters”, which take a lot of time, but which 
> could be tested in unit tests instead? I would start with those if you want 
> to do it. But in general I would do this incrementally instead of aiming for 
> a complete rewrite.
> 
> – Ufuk
> 
>> On 19 Sep 2015, at 10:53, Martin Liesenberg <martin.liesenb...@gmail.com> 
>> wrote:
>> 
>> Should there be a concerted effort to reduce the amount of unnecessary
>> integration tests and cover those cases by unit tests?
>> 
>> We could collect the cases in a ticket and work through the list one by
>> one, no?
>> 
>> Best regards,
>> Martin
>> 
>> Chiwan Park <chiwanp...@apache.org> schrieb am Fr., 18. Sep. 2015 um
>> 12:33 Uhr:
>> 
>>> Hi Stephan,
>>> 
>>> Thanks for nice guide! I think we can upload this to the wiki or how to
>>> contribute documentation.
>>> This guide would be helpful for newcomers.
>>> 
>>> Regards,
>>> Chiwan Park
>>> 
>>>> On Sep 17, 2015, at 9:33 PM, Stephan Ewen <se...@apache.org> wrote:
>>>> 
>>>> Hi all!
>>>> 
>>>> The build time of Flink with all tests is nearing 1h on Travis for the
>>>> shortest run.
>>>> It is good that we do excessive testing, there are many mechanisms that
>>>> need that.
>>>> 
>>>> I have also seen that a lot of fixes that could be tested in a UnitTest
>>>> style are actually tested as a full Flink program (Integration test
>>> style)
>>>> 
>>>> While these tests are always easier to write, they have two problems:
>>>> - They bring up the build time by about 5 secs per test
>>>> - They are often not as targeted to the problem as a UnitTest
>>>> 
>>>> I would like to encourage everyone to keep this in mind and do Unit tests
>>>> in the cases where they are the preferred choice. Please also keep that
>>> in
>>>> mind when reviewing pull requests.
>>>> 
>>>> For Example:
>>>> - API / TypeInformation changes can be very well tested without running
>>>> the program. Simply create the program and test the operator's type info.
>>>> - Custom functions can be very well tested in isolation
>>>> - Input/Output formats actually test well in UnitTests.
>>>> 
>>>> Integration tests need to be used when verifying behavior across
>>> components
>>>> / layers, so keep using them when they need to be used.
>>>> 
>>>> 
>>>> Greetings,
>>>> Stephan
>>> 
>>> 
>>> 
>>> 
>>> 
>>> 
> 






[jira] [Created] (FLINK-2712) Add some description about tests to "How to Contribute" documentation

2015-09-19 Thread Chiwan Park (JIRA)
Chiwan Park created FLINK-2712:
--

 Summary: Add some description about tests to "How to Contribute" 
documentation
 Key: FLINK-2712
 URL: https://issues.apache.org/jira/browse/FLINK-2712
 Project: Flink
  Issue Type: Task
Reporter: Chiwan Park
Assignee: Chiwan Park
Priority: Minor


In the mailing list, [~StephanEwen] posted a guideline about unit tests and 
integration tests 
(http://mail-archives.apache.org/mod_mbox/flink-dev/201509.mbox/%3cCANC1h_vvekciNVDzqCb8N4E5Kfzu4e1Mosnse1=v11hxnd2...@mail.gmail.com%3e).
 If we add the guideline to the "How to Contribute" documentation, it would be 
helpful for newcomers.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


Re: Tests - Unit Tests versus Integration Tests

2015-09-19 Thread Chiwan Park
I just created a JIRA issue [1].

Regards,
Chiwan Park

[1] https://issues.apache.org/jira/browse/FLINK-2712

> On Sep 20, 2015, at 1:33 AM, Chiwan Park <chiwanp...@apache.org> wrote:
> 
> Okay, I’ll create a JIRA issue and send a pull request for it. :)
> 
> Regards,
> Chiwan Park
> 
>> On Sep 19, 2015, at 7:35 PM, Ufuk Celebi <u...@apache.org> wrote:
>> 
>> Thanks Stephan for pointing this out. I agree with you. +1
>> 
>> @Chiwan: Good idea with the Wiki. Actually maybe even better to add it to 
>> the contribution guide? Do you have time to open a PR with Stephan’s 
>> suggestions?
>> 
>> @Martin: I agree that it does not suffice to just point this out as a new 
>> guideline. I think the main problem is that it is very time consuming and 
>> error prone. We have seen some “minor refactorings” lately, which looked 
>> harmless, but actually introduced bugs. This is a danger as well with 
>> refactoring tests (we refactor them, but don’t have the same amount of test 
>> coverage, which results in bugs at some point in time in the future).
>> 
>> Are there any known “heavy hitters”, which take a lot of time, but which 
>> could be tested in unit tests instead? I would start with those if you want 
>> to do it. But in general I would do this incrementally instead of aiming for 
>> a complete rewrite.
>> 
>> – Ufuk
>> 
>>> On 19 Sep 2015, at 10:53, Martin Liesenberg <martin.liesenb...@gmail.com> 
>>> wrote:
>>> 
>>> Should there be a concerted effort to reduce the amount of unnecessary
>>> integration tests and cover those cases by unit tests?
>>> 
>>> We could collect the cases in a ticket and work through the list one by
>>> one, no?
>>> 
>>> Best regards,
>>> Martin
>>> 
>>> Chiwan Park <chiwanp...@apache.org> schrieb am Fr., 18. Sep. 2015 um
>>> 12:33 Uhr:
>>> 
>>>> Hi Stephan,
>>>> 
>>>> Thanks for nice guide! I think we can upload this to the wiki or how to
>>>> contribute documentation.
>>>> This guide would be helpful for newcomers.
>>>> 
>>>> Regards,
>>>> Chiwan Park
>>>> 
>>>>> On Sep 17, 2015, at 9:33 PM, Stephan Ewen <se...@apache.org> wrote:
>>>>> 
>>>>> Hi all!
>>>>> 
>>>>> The build time of Flink with all tests is nearing 1h on Travis for the
>>>>> shortest run.
>>>>> It is good that we do excessive testing, there are many mechanisms that
>>>>> need that.
>>>>> 
>>>>> I have also seen that a lot of fixes that could be tested in a UnitTest
>>>>> style are actually tested as a full Flink program (Integration test
>>>> style)
>>>>> 
>>>>> While these tests are always easier to write, they have two problems:
>>>>> - They bring up the build time by about 5 secs per test
>>>>> - They are often not as targeted to the problem as a UnitTest
>>>>> 
>>>>> I would like to encourage everyone to keep this in mind and do Unit tests
>>>>> in the cases where they are the preferred choice. Please also keep that
>>>> in
>>>>> mind when reviewing pull requests.
>>>>> 
>>>>> For Example:
>>>>> - API / TypeInformation changes can be very well tested without running
>>>>> the program. Simply create the program and test the operator's type info.
>>>>> - Custom functions can be very well tested in isolation
>>>>> - Input/Output formats actually test well in UnitTests.
>>>>> 
>>>>> Integration tests need to be used when verifying behavior across
>>>> components
>>>>> / layers, so keep using them when they need to be used.
>>>>> 
>>>>> 
>>>>> Greetings,
>>>>> Stephan
>>>> 
>>>> 
>>>> 
>>>> 
>>>> 
>>>> 
>> 
> 
> 
> 
> 


Re: Tests - Unit Tests versus Integration Tests

2015-09-18 Thread Chiwan Park
Hi Stephan,

Thanks for the nice guide! I think we can upload this to the wiki or the "How to 
Contribute" documentation.
This guide would be helpful for newcomers.

Regards,
Chiwan Park

> On Sep 17, 2015, at 9:33 PM, Stephan Ewen <se...@apache.org> wrote:
> 
> Hi all!
> 
> The build time of Flink with all tests is nearing 1h on Travis for the
> shortest run.
> It is good that we do excessive testing, there are many mechanisms that
> need that.
> 
> I have also seen that a lot of fixes that could be tested in a UnitTest
> style are actually tested as a full Flink program (Integration test style)
> 
> While these tests are always easier to write, they have two problems:
>  - They bring up the build time by about 5 secs per test
>  - They are often not as targeted to the problem as a UnitTest
> 
> I would like to encourage everyone to keep this in mind and do Unit tests
> in the cases where they are the preferred choice. Please also keep that in
> mind when reviewing pull requests.
> 
> For Example:
>  - API / TypeInformation changes can be very well tested without running
> the program. Simply create the program and test the operator's type info.
>  - Custom functions can be very well tested in isolation
>  - Input/Output formats actually test well in UnitTests.
> 
> Integration tests need to be used when verifying behavior across components
> / layers, so keep using them when they need to be used.
> 
> 
> Greetings,
> Stephan
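
To make Stephan's point concrete: a custom function can be exercised directly, with no
ExecutionEnvironment and no job submission. A minimal sketch in Scala (the class and
test names are made up for illustration):

    import org.apache.flink.api.common.functions.MapFunction
    import org.junit.Assert.assertEquals
    import org.junit.Test

    // A user-defined function under test -- no cluster, no ExecutionEnvironment.
    class UppercaseMapper extends MapFunction[String, String] {
      override def map(value: String): String = value.toUpperCase
    }

    class UppercaseMapperTest {
      @Test
      def mapperUppercasesItsInput(): Unit = {
        val mapper = new UppercaseMapper
        // Calling the function directly keeps the test in the millisecond range
        // instead of the roughly 5 seconds a full job submission adds to the build.
        assertEquals("FLINK", mapper.map("flink"))
      }
    }

An integration-test version of the same check would build a program around the mapper
and submit it, which is exactly the overhead this thread is trying to avoid.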







[jira] [Created] (FLINK-2619) Some Scala Tests not being executed by Maven

2015-09-04 Thread Chiwan Park (JIRA)
Chiwan Park created FLINK-2619:
--

 Summary: Some Scala Tests not being executed by Maven
 Key: FLINK-2619
 URL: https://issues.apache.org/jira/browse/FLINK-2619
 Project: Flink
  Issue Type: Bug
  Components: Tests
Reporter: Chiwan Park
Priority: Critical


Some Scala tests are not executed by Maven. This issue was originally reported 
by [~StephanEwen]. I also executed {{mvn clean verify}} and observed the same 
behavior.

Original post is 
[here|http://mail-archives.apache.org/mod_mbox/flink-dev/201508.mbox/%3ccanc1h_tlgt-rrtybua6c0ypihhv8w1bwyb4sogahn9_cfck...@mail.gmail.com%3e]



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


Re: Several Scala Tests not being executed by Maven

2015-09-04 Thread Chiwan Park
I just created the JIRA issue [1].

Regards,
Chiwan Park

[1] https://issues.apache.org/jira/browse/FLINK-2619

> On Sep 4, 2015, at 6:43 PM, Chiwan Park <chiwanp...@apache.org> wrote:
> 
> I also observed the same behavior. Although I added a failing test case to 
> ExecutionGraphRestartTest, `mvn clean verify` doesn’t fail. I will create an 
> issue covering this.
> 
> Regards,
> Chiwan Park
> 
> 
>> On Aug 29, 2015, at 10:13 PM, Stephan Ewen <se...@apache.org> wrote:
>> 
>> Hi!
>> 
>> I found quite a few tests that are not actually executed by Maven as part
>> of the builds. Some actually are in error.
>> 
>> ExecutionGraphRestartTest
>> TaskManagerLossFailsTasksTest
>> JobManagerRegistrationTest
>> ...
>> 
>> All of those are Scala tests with WordSpecLike traits. Seems that this
>> configuration has a problem with the Maven unit test plugin.
>> 
>> Anyone else observed something like this before?
>> 
>> Stephan
> 
> 
> 






Re: Several Scala Tests not being executed by Maven

2015-09-04 Thread Chiwan Park
I also observed the same behavior. Although I added a failing test case to 
ExecutionGraphRestartTest, `mvn clean verify` doesn’t fail. I will create an 
issue covering this.

Regards,
Chiwan Park


> On Aug 29, 2015, at 10:13 PM, Stephan Ewen <se...@apache.org> wrote:
> 
> Hi!
> 
> I found quite a few tests that are not actually executed by Maven as part
> of the builds. Some actually are in error.
> 
> ExecutionGraphRestartTest
> TaskManagerLossFailsTasksTest
> JobManagerRegistrationTest
> ...
> 
> All of those are Scala tests with WordSpecLike traits. Seems that this
> configuration has a problem with the Maven unit test plugin.
> 
> Anyone else observed something like this before?
> 
> Stephan
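
One common reason a ScalaTest suite is silently skipped by the Maven Surefire plugin is
that Surefire only discovers classes it recognizes as JUnit tests; a plain WordSpecLike
suite is not one unless it is routed through the JUnit runner (or run via the
scalatest-maven-plugin). Whether that is the actual cause here is what FLINK-2619 needs
to confirm; the sketch below is a made-up example, not one of the affected Flink tests:

    import org.junit.runner.RunWith
    import org.scalatest.junit.JUnitRunner
    import org.scalatest.{Matchers, WordSpecLike}

    // Without the @RunWith annotation, Surefire has no idea this class contains tests.
    @RunWith(classOf[JUnitRunner])
    class ExampleSpec extends WordSpecLike with Matchers {
      "the example component" should {
        "pass a trivial check" in {
          (1 + 1) shouldEqual 2
        }
      }
    }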





Re: [ANNOUNCE] Welcome Matthias Sax as new committer

2015-09-02 Thread Chiwan Park
Welcome Matthias! :)

Regards,
Chiwan Park

> On Sep 2, 2015, at 8:30 PM, Kostas Tzoumas <ktzou...@apache.org> wrote:
> 
> The Project Management Committee (PMC) of Apache Flink has asked Matthias
> Sax to become a committer, and we are pleased to announce that he has
> accepted.
> 
> Matthias has been very active with Flink, and he is the original
> contributor of the Storm compatibility functionality.
> 
> Being a committer enables easier contribution to the project since there is no
> need to go via the pull request submission process. This should enable better
> productivity. Being a PMC member enables assistance with the management and
> to guide the direction of the project.
> 
> Please join me in welcoming Matthias as a new committer!



Re: [VOTE] Release Apache Flink 0.9.1 (RC1)

2015-08-31 Thread Chiwan Park
+1

File signature looks correct
Compiled and passed all tests
Built well with Hadoop 2.7 / Scala 2.10 and Scala 2.11 both
Ran examples in local mode and cluster with 3 machines using FliRTT
  - ConnectedComponents
  - EnumTrianglesBasic
  - EnumTrianglesOpt
  - KMeans
  - PageRankBasic
  - TransitiveClosure
  - WebLogAnalysis
  - WordCount
  - WordCountPOJO

Regards,
Chiwan Park

> On Aug 31, 2015, at 1:24 PM, Henry Saputra <henry.sapu...@gmail.com> wrote:
> 
> +1
> 
> LICENSE file looks good
> NOTICE file looks good
> Signature files look good
> Hash files look good
> Source compile and pass tests
> Run on Hadoop YARN 2.7
> Standalone tests work
> No 3rd party exes in source artifacts
> 
> - Henry
> 
> On Fri, Aug 28, 2015 at 4:27 AM, Ufuk Celebi <u...@apache.org> wrote:
>> Dear community,
>> 
>> Please vote on releasing the following candidate as Apache Flink version
>> 0.9.1. This is a maintenance release for Flink's latest stable version. The
>> candidate fixes 38 issues [1] and adds 47 commits.
>> 
>> This is the second RC for this release.
>> 
>> -
>> The commit to be voted on:
>> be932cd52ba41cfc6f45846acfd4b8a3a473ced2
>> 
>> Branch:
>> release-0.9.1-rc1 (
>> https://git1-us-west.apache.org/repos/asf/flink/?p=flink.git;a=shortlog;h=refs/heads/release-0.9.1-rc1
>> )
>> 
>> The release artifacts to be voted on can be found at:
>> http://people.apache.org/~uce/flink-0.9.1-rc1/
>> 
>> Release artifacts are signed with the key with fingerprint 9D403309:
>> http://www.apache.org/dist/flink/KEYS
>> 
>> The staging repository for this release can be found at:
>> https://repository.apache.org/content/repositories/orgapacheflink-1044
>> -
>> 
>> Please vote on releasing this package as Apache Flink 0.9.1.
>> 
>> The vote is open for the next 72 hours and passes if a majority of at least
>> three +1 PMC votes are cast.
>> 
>> The vote ends on Monday (August 31, 2015).
>> 
>> [ ] +1 Release this package as Apache Flink 0.9.1
>> [ ] -1 Do not release this package, because...
>> 
>> – Ufuk
>> 
>> [1]
>> https://issues.apache.org/jira/browse/FLINK-2572?jql=project%20%3D%20FLINK%20AND%20status%20in%20(Open%2C%20%22In%20Progress%22%2C%20Reopened%2C%20Resolved%2C%20Closed)%20AND%20fixVersion%20%3D%200.9.1






Re: [DISCUSSION] Release current master as 0.9.1 (mod few changes)

2015-08-26 Thread Chiwan Park
Robert's suggestion looks good. +1

Sent from my iPhone

 On Aug 26, 2015, at 9:55 PM, Aljoscha Krettek aljos...@apache.org wrote:
 
 +1 seems to be a viable solution
 
 On Wed, 26 Aug 2015 at 14:51 Stephan Ewen se...@apache.org wrote:
 
 That sounds like a very good compromise.
 
 +1
 
 On Wed, Aug 26, 2015 at 2:48 PM, Fabian Hueske fhue...@gmail.com wrote:
 
 I'm +1 for Robert's proposal as well.
 
 2015-08-26 14:46 GMT+02:00 Ufuk Celebi u...@apache.org:
 
 +1
 
 I very much like Robert's suggestion. This way we can proceed with the
 0.9.1 release as planned for the remaining part and have
 0.10-milestone1
 with the fix.
 
 What about the others? Please give feedback early to allow me to
 proceed
 with the release.
 
 
 


Re: Flink color scheme

2015-08-23 Thread Chiwan Park
Thank you for sharing!

Regards,
Chiwan Park

 On Aug 23, 2015, at 10:36 PM, Kostas Tzoumas ktzou...@apache.org wrote:
 
 Hi folks,
 
 I have a color scheme for Flink that people can use for presentations, blog
 posts, etc, based on the Flink logo colors:
 
 https://www.dropbox.com/sh/dlstvzw2xzt09hx/AADpzAAmVUuAunWR2RJh7zjYa?dl=0
 
 I'm not saying that we have to use it, just something that is out there in
 case someone wants it.
 
 Best,
 Kostas





Re: [ANNOUNCE] New Committer Chesnay Schepler

2015-08-20 Thread Chiwan Park
Congrats Chesnay!

Regards,
Chiwan Park

 On Aug 20, 2015, at 7:39 PM, Gyula Fóra gyf...@apache.org wrote:
 
 Welcome! :)
 
 On Thu, Aug 20, 2015 at 12:34 PM Matthias J. Sax 
 mj...@informatik.hu-berlin.de wrote:
 
 Congrats! The squirrel army is growing fast. :)
 
 On 08/20/2015 11:18 AM, Robert Metzger wrote:
 The Project Management Committee (PMC) for Apache Flink has asked Chesnay
 Schepler to become a committer and we are pleased to announce that they
 have accepted.
 
 Chesnay has been very involved with the Flink project since its pre-ASF
 days. He has worked on several components including the Java API,
 documentation, and execution engine. Recently he made a big contribution
 and added a Python API to Flink.
 
 Being a committer enables easier contribution to the project since there
 is
 no need to go via the pull request submission process. This should enable
 better productivity. Being a PMC member enables assistance with the
 management and to guide the direction of the project.
 
 
 







Re: Code style guideline for Scala

2015-08-18 Thread Chiwan Park
Okay, I'll create a JIRA issue covering this topic.

Regards,
Chiwan Park

 On Aug 17, 2015, at 1:17 AM, Stephan Ewen se...@apache.org wrote:
 
 +1 for formatting templates for Eclipse and IntelliJ.
 
 On Sun, Aug 16, 2015 at 6:06 PM, Sachin Goel sachingoel0...@gmail.com
 wrote:
 
 We should also write up a matching configuration file to be used in the
 IDEs and provide it with the source. This might help in reducing any style
 mistakes due to a reformat, which is actually very helpful with spaces
 around braces and operators. Especially with Scala, indentations and
 continuation etc. can be hard to get exactly right [At least that was my
 experience].
 
 All in all, big plus one to this.
 
 -- Sachin Goel
 Computer Science, IIT Delhi
 m. +91-9871457685
 
 On Sun, Aug 16, 2015 at 7:36 PM, Stephan Ewen se...@apache.org wrote:
 
 Hi!
 
 I very much support that. A bit stricter rules in the style checkers lead
 to more uniform and better readable code. We can have stricter rules both
 in Java and Scala.
 
 Note that the hardest part of adding the style checks is actually
 adjusting
 all the existing code that violates the style.
 
 The best approach would probably be for someone to make a suggestion what
 should go into the checkstyle, and then reiterate on it.
 
 Greetings,
 Stephan
 
 
 
 
 On Sun, Aug 16, 2015 at 12:14 PM, Chiwan Park chiwanp...@apache.org
 wrote:
 
 Hi All,
 
 I’m reviewing some pull requests written in Scala. While reviewing, I
 think that scala style checker is too loose and documentation about
 code
 style guideline in wiki [1] is poor. The code style for Scala doesn’t
 seems
 unified as that for Java.
 
 I suggest upgrading version of scalastyle-maven-plugin to 0.7.0, adding
 some rules such as NoWhitespaceBeforeLeftBracketChecker,
 EnsureSingleSpaceAfterTokenChecker, IndentationChecker, and
 MagicNumberChecker and updating the documentation in wiki.
 
 I hope to discuss the code style for Scala. How think you about this?
 
 Regards,
 Chiwan Park
 
 [1]
 
 
 https://cwiki.apache.org/confluence/display/FLINK/Coding+Guidelines+for+Scala
 
 






Re: Code style guideline for Scala

2015-08-18 Thread Chiwan Park
I have created a JIRA issue [1].

Regards,
Chiwan Park

[1] https://issues.apache.org/jira/browse/FLINK-2539


 On Aug 18, 2015, at 5:28 PM, Till Rohrmann trohrm...@apache.org wrote:
 
 Good initiative Chiwan. +1 for a more unified code style.
 
 On Tue, Aug 18, 2015 at 10:25 AM, Chiwan Park chiwanp...@apache.org wrote:
 
 Okay, I'll create a JIRA issue covering this topic.
 
 Regards,
 Chiwan Park
 
 On Aug 17, 2015, at 1:17 AM, Stephan Ewen se...@apache.org wrote:
 
 +1 for formatting templates for Eclipse and IntelliJ.
 
 On Sun, Aug 16, 2015 at 6:06 PM, Sachin Goel sachingoel0...@gmail.com
 wrote:
 
 We should also write up a matching configuration file to be used in the
 IDEs and provide it with the source. This might help in reducing any
 style
 mistakes due to a reformat, which is actually very helpful with spaces
 around braces and operators. Especially with Scala, indentations and
 continuation etc. can be hard to get exactly right [At least that was my
 experience].
 
 All in all, big plus one to this.
 
 -- Sachin Goel
 Computer Science, IIT Delhi
 m. +91-9871457685
 
 On Sun, Aug 16, 2015 at 7:36 PM, Stephan Ewen se...@apache.org wrote:
 
 Hi!
 
 I very much support that. A bit stricter rules in the style checkers
 lead
 to more uniform and better readable code. We can have stricter rules
 both
 in Java and Scala.
 
 Note that the hardest part of adding the style checks is actually
 adjusting
 all the existing code that violates the style.
 
 The best approach would probably be for someone to make a suggestion
 what
 should go into the checkstyle, and then reiterate on it.
 
 Greetings,
 Stephan
 
 
 
 
 On Sun, Aug 16, 2015 at 12:14 PM, Chiwan Park chiwanp...@apache.org
 wrote:
 
 Hi All,
 
 I’m reviewing some pull requests written in Scala. While reviewing, I
 think that scala style checker is too loose and documentation about
 code
 style guideline in wiki [1] is poor. The code style for Scala doesn’t
 seems
 unified as that for Java.
 
 I suggest upgrading version of scalastyle-maven-plugin to 0.7.0,
 adding
 some rules such as NoWhitespaceBeforeLeftBracketChecker,
 EnsureSingleSpaceAfterTokenChecker, IndentationChecker, and
 MagicNumberChecker and updating the documentation in wiki.
 
 I hope to discuss the code style for Scala. How think you about this?
 
 Regards,
 Chiwan Park
 
 [1]
 
 
 
 https://cwiki.apache.org/confluence/display/FLINK/Coding+Guidelines+for+Scala
 
 


[jira] [Created] (FLINK-2539) More unified code style for Scala code

2015-08-18 Thread Chiwan Park (JIRA)
Chiwan Park created FLINK-2539:
--

 Summary: More unified code style for Scala code
 Key: FLINK-2539
 URL: https://issues.apache.org/jira/browse/FLINK-2539
 Project: Flink
  Issue Type: Improvement
Reporter: Chiwan Park
Priority: Minor


We need a more specific code style guide for Scala to prevent code style 
divergence. We discussed this in the [mailing 
list|http://apache-flink-mailing-list-archive.1008284.n3.nabble.com/Code-style-guideline-for-Scala-td7526.html].

Following works are needed:
* Providing code formatting configuration for Eclipse and IntelliJ IDEA
* More detail description in 
[wiki|https://cwiki.apache.org/confluence/display/FLINK/Coding+Guidelines+for+Scala]
 and [Coding Guidelines|http://flink.apache.org/coding-guidelines.html] in 
homepage
* More strict rules in scala style checker (We need to discuss more)



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


Code style guideline for Scala

2015-08-16 Thread Chiwan Park
Hi All,

I'm reviewing some pull requests written in Scala. While reviewing, I think 
that the Scala style checker is too loose and the documentation about the code style 
guideline in the wiki [1] is poor. The code style for Scala doesn't seem as unified 
as that for Java.

I suggest upgrading the version of scalastyle-maven-plugin to 0.7.0, adding some 
rules such as NoWhitespaceBeforeLeftBracketChecker, 
EnsureSingleSpaceAfterTokenChecker, IndentationChecker, and MagicNumberChecker, 
and updating the documentation in the wiki.

I hope to discuss the code style for Scala. What do you think about this?

Regards,
Chiwan Park

[1] 
https://cwiki.apache.org/confluence/display/FLINK/Coding+Guidelines+for+Scala
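
To make the proposed checkers concrete, here is a small made-up snippet showing the kind
of code they would flag and the form they would push us towards (whether the magic-number
rule fires on a simple val assignment depends on its configuration):

    object StyleExample {
      // Likely flagged: whitespace before '[' (NoWhitespaceBeforeLeftBracketChecker)
      // and missing spaces after ',' and around '->'.
      val flagged = Map [String,Int]("retries"->3)

      // The form the stricter rules aim for, with the literal given a name.
      val DefaultRetries = 3
      val preferred = Map[String, Int]("retries" -> DefaultRetries)
    }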

Re: Getting an eeror while running the code

2015-07-27 Thread Chiwan Park
Oh, I confused the Streaming API with the Batch API. :) Stephan's comment will help you.

Regards,
Chiwan Park

 On Jul 27, 2015, at 4:22 PM, Stephan Ewen se...@apache.org wrote:
 
 Your program gives this exception: java.lang.UnsupportedClassVersionError:
 
 This usually means that a JVM tries to load code that has been compiled
 with a newer Java version. For example, Java 7 running a Java 8 program.
 
 Per stack Overflow:
 http://stackoverflow.com/questions/10382929/how-to-fix-unsupported-major-minor-version-51-0-error
 
 
 
 On Mon, Jul 27, 2015 at 7:22 AM, Chiwan Park chiwanp...@apache.org wrote:
 
  Hi, the print() method runs the program immediately. After that execution, there is
  no sink left in
  the program. You should remove the call to the execute() method after calling the
  print() method.
 
 There is more detail description [1][2] in Flink documentation. I hope
 that this helps.
 
 Regards,
 Chiwan Park
 
 [1]
 https://ci.apache.org/projects/flink/flink-docs-release-0.9/apis/programming_guide.html#lazy-evaluation
 [2]
 https://ci.apache.org/projects/flink/flink-docs-release-0.9/apis/programming_guide.html#data-sinks
 
 On Jul 27, 2015, at 1:41 PM, bharathkarnam mailbhara...@gmail.com
 wrote:
 
 bin/flink run -c com.hello.flink.StreamData /home/a544403/Flinkstream.jar
 org.apache.flink.client.program.ProgramInvocationException: The program's
 entry point class 'com.hello.flink.StreamData' could not be loaded due
 to a
 linkage failure.
   at
 
 org.apache.flink.client.program.PackagedProgram.loadMainClass(PackagedProgram.java:527)
   at
 
  org.apache.flink.client.program.PackagedProgram.<init>(PackagedProgram.java:142)
   at
 org.apache.flink.client.CliFrontend.buildProgram(CliFrontend.java:654)
   at org.apache.flink.client.CliFrontend.run(CliFrontend.java:256)
   at
 org.apache.flink.client.CliFrontend.parseParameters(CliFrontend.java:880)
   at org.apache.flink.client.CliFrontend.main(CliFrontend.java:922)
 Caused by: java.lang.UnsupportedClassVersionError:
 com/hello/flink/StreamData : Unsupported major.minor version 52.0
   at java.lang.ClassLoader.defineClass1(Native Method)
   at java.lang.ClassLoader.defineClass(ClassLoader.java:800)
   at
 java.security.SecureClassLoader.defineClass(SecureClassLoader.java:142)
   at java.net.URLClassLoader.defineClass(URLClassLoader.java:449)
   at java.net.URLClassLoader.access$100(URLClassLoader.java:71)
   at java.net.URLClassLoader$1.run(URLClassLoader.java:361)
   at java.net.URLClassLoader$1.run(URLClassLoader.java:355)
   at java.security.AccessController.doPrivileged(Native Method)
   at java.net.URLClassLoader.findClass(URLClassLoader.java:354)
   at java.lang.ClassLoader.loadClass(ClassLoader.java:425)
   at java.lang.ClassLoader.loadClass(ClassLoader.java:358)
   at java.lang.Class.forName0(Native Method)
   at java.lang.Class.forName(Class.java:274)
   at
 
 org.apache.flink.client.program.PackagedProgram.loadMainClass(PackagedProgram.java:516)
   ... 5 more
 
 here is my code
 
 package com.hello.flink;
 
 import
 org.apache.flink.streaming.api.environment.StreamExecutionEnvironment;
 import org.apache.flink.streaming.connectors.kafka.api.KafkaSource;
 import org.apache.flink.streaming.util.serialization.SimpleStringSchema;
 
 public class StreamData {
 
  public static void main(String[] args) {
 
  StreamExecutionEnvironment env =
 StreamExecutionEnvironment.getExecutionEnvironment();
 
 
   env.addSource(new KafkaSource<String>("localhost:2181",
  "syslog_framework", new SimpleStringSchema())).print();
 
 
  try {
   env.execute("MyJob");
  } catch (Exception e) {
  // TODO Auto-generated catch block
  e.printStackTrace();
  }
 
  }
 
 }
 
 
 
 
 
 --
 View this message in context:
 http://apache-flink-mailing-list-archive.1008284.n3.nabble.com/Getting-an-eeror-while-running-the-code-tp7154.html
 Sent from the Apache Flink Mailing List archive. mailing list archive at
 Nabble.com.
 
 
 





Re: Getting an eeror while running the code

2015-07-26 Thread Chiwan Park
Hi, the print() method runs the program immediately. After that execution, there is no 
sink left in
the program. You should remove the call to the execute() method after calling the print() 
method.

There is more detail description [1][2] in Flink documentation. I hope that 
this helps.

Regards,
Chiwan Park

[1] 
https://ci.apache.org/projects/flink/flink-docs-release-0.9/apis/programming_guide.html#lazy-evaluation
[2] 
https://ci.apache.org/projects/flink/flink-docs-release-0.9/apis/programming_guide.html#data-sinks

 On Jul 27, 2015, at 1:41 PM, bharathkarnam mailbhara...@gmail.com wrote:
 
 bin/flink run -c com.hello.flink.StreamData /home/a544403/Flinkstream.jar
 org.apache.flink.client.program.ProgramInvocationException: The program's
 entry point class 'com.hello.flink.StreamData' could not be loaded due to a
 linkage failure.
at
 org.apache.flink.client.program.PackagedProgram.loadMainClass(PackagedProgram.java:527)
at
  org.apache.flink.client.program.PackagedProgram.<init>(PackagedProgram.java:142)
at
 org.apache.flink.client.CliFrontend.buildProgram(CliFrontend.java:654)
at org.apache.flink.client.CliFrontend.run(CliFrontend.java:256)
at
 org.apache.flink.client.CliFrontend.parseParameters(CliFrontend.java:880)
at org.apache.flink.client.CliFrontend.main(CliFrontend.java:922)
 Caused by: java.lang.UnsupportedClassVersionError:
 com/hello/flink/StreamData : Unsupported major.minor version 52.0
at java.lang.ClassLoader.defineClass1(Native Method)
at java.lang.ClassLoader.defineClass(ClassLoader.java:800)
at
 java.security.SecureClassLoader.defineClass(SecureClassLoader.java:142)
at java.net.URLClassLoader.defineClass(URLClassLoader.java:449)
at java.net.URLClassLoader.access$100(URLClassLoader.java:71)
at java.net.URLClassLoader$1.run(URLClassLoader.java:361)
at java.net.URLClassLoader$1.run(URLClassLoader.java:355)
at java.security.AccessController.doPrivileged(Native Method)
at java.net.URLClassLoader.findClass(URLClassLoader.java:354)
at java.lang.ClassLoader.loadClass(ClassLoader.java:425)
at java.lang.ClassLoader.loadClass(ClassLoader.java:358)
at java.lang.Class.forName0(Native Method)
at java.lang.Class.forName(Class.java:274)
at
 org.apache.flink.client.program.PackagedProgram.loadMainClass(PackagedProgram.java:516)
... 5 more
 
 here is my code
 
 package com.hello.flink;
 
 import
 org.apache.flink.streaming.api.environment.StreamExecutionEnvironment;
 import org.apache.flink.streaming.connectors.kafka.api.KafkaSource;
 import org.apache.flink.streaming.util.serialization.SimpleStringSchema;
 
 public class StreamData {
 
   public static void main(String[] args) {
 
   StreamExecutionEnvironment env =
 StreamExecutionEnvironment.getExecutionEnvironment();
 
   
    env.addSource(new KafkaSource<String>("localhost:2181",
  "syslog_framework", new SimpleStringSchema())).print();
 
 
   try {
    env.execute("MyJob");
   } catch (Exception e) {
   // TODO Auto-generated catch block
   e.printStackTrace();
   }
 
   }
 
 }
 
 
 
 
 
 --
 View this message in context: 
 http://apache-flink-mailing-list-archive.1008284.n3.nabble.com/Getting-an-eeror-while-running-the-code-tp7154.html
 Sent from the Apache Flink Mailing List archive. mailing list archive at 
 Nabble.com.




Re: The documentation site is cut on the top

2015-07-11 Thread Chiwan Park
AFAIK, There is no JIRA issue related this problem.

Regards,
Chiwan Park

 On Jul 12, 2015, at 2:37 AM, Henry Saputra henry.sapu...@gmail.com wrote:
 
 Ah I saw Matthias already report this. Is there a JIRA filed for this?
 
 If not I could create one.
 
 - Henry
 
 On Sat, Jul 11, 2015 at 10:32 AM, Henry Saputra henry.sapu...@gmail.com 
 wrote:
 It seemed like the documentation for 0.9 and latest is cut on the top:
 
 https://ci.apache.org/projects/flink/flink-docs-release-0.9/
 
 I remember it was ok before the release. Anyone know about changes
 made to the layout.
 
  This needs to be fixed ASAP.
 
 - Henry




Re: Documentation Webpage Rendering Problem

2015-07-10 Thread Chiwan Park
Oh, I misunderstood the problem. In Firefox, the problem occurs. [1]

Regards,
Chiwan Park

[1] http://imgur.com/js5nZQ1

 On Jul 10, 2015, at 9:24 PM, Vasiliki Kalavri vasilikikala...@gmail.com 
 wrote:
 
 Hi,
 
 I have the same rendering problem as Matthias in Chrome. Looks OK in Safari.
 I had seen this problem when we first introduced the new website but Ufuk
 had managed to fix it.
 I think it had to do with some css not loading properly in Chrome...
 
 -Vasia.
 
 On 10 July 2015 at 14:12, Matthias J. Sax mj...@informatik.hu-berlin.de
 wrote:
 
 It is good that it is a local problem. However, I cannot follow how it
 can be related to network or DNS setting. Can you explain in more
 detail? How can I fix the settings?
 
 -Matthias
 
 On 07/10/2015 02:06 PM, Chiwan Park wrote:
 I think that the problem is on your network or DNS setting.
 In my computer, the documentation is rendered properly. I attached the
 screenshot. [1]
 
 Regards,
 Chiwan Park
 
 [1] http://imgur.com/4mSohDQ
 
 On Jul 10, 2015, at 9:00 PM, Matthias J. Sax 
 mj...@informatik.hu-berlin.de wrote:
 
 Hi,
 
  I just encountered that the documentation web page is not rendered
  properly. The first lines of the text are hidden by the menu. Please see
 the attached screenshot.
 
 It affects all pages in 0.9 and 0.10 documentation.
 
 Does anyone know how to fix it (and can do the fix)?
 
 
 -Matthias
 Screenshot - 07102015 - 01:55:24 PM.png
 
 
 
 
 
 
 
 






Re: Flink 0.9 built with Scala 2.11

2015-07-05 Thread Chiwan Park
@Stephan: Okay, I'll find the mentions in the other documents. I think that we
can postpone updating the downloads page in flink-web until the 0.10 release.

@Alexander: Thank you for the comments. I'll apply your suggestions.

In your example, *flink-pure-java* is not a pure Java module. If a module needs to
link against a Scala-dependent module, that module is also Scala-dependent. Because
we are using Scala in our runtime, all modules are Scala-dependent modules.

So in your example, *flink-some-scala-A*, *flink-some-scala-B*, and
*flink-pure-java* should all carry the `_2.11` suffix if the user wants to run Flink
with Scala 2.11. (For Scala 2.10, we don't need it.)

I agree that this makes for many modules. But it is clear from the user's perspective:
the users just decide which Scala version their cluster uses and add the suffix to
every dependency if that version is 2.11.

Regards,
Chiwan Park

 On Jul 3, 2015, at 9:26 PM, Stephan Ewen se...@apache.org wrote:
 
 @Chiwan:
 
 There are a few mentionings of the Scala version in the docs as well. For
 example in docs/index.md and on the website under downloads.
 
 We should make sure we explain on these pages that there are downloads for
 various Scala versions.
 
 Cheers,
 Stephan
 
 
 On Fri, Jul 3, 2015 at 2:01 PM, Alexander Alexandrov 
 alexander.s.alexand...@gmail.com wrote:
 
 Great, I just posted some comments / improvement suggestions.
 
 I have to say I'm still not 100% convinced by the strategy not to add a
 suffix to all modules. Here is a small example that illustrates my
 concerns.
 
 Consider the following chained dependency situation. We have pure Java
 artifact *flink-pure-java* which depends on a Scala artifact
 *flink-some-scala-A*, which in turn depends on *flink-some-scala-B*.
 
 Let's say the user has directly included *flink-pure-java* and
  *flink-some-scala-B* in his project and wants to build for Scala 2.11.
 We end up with a situation like this
 
 - flink-pure-java
  `- flink-some-scala-A
 `- flink-some-scala-B
 - flink-some-scala-B_2.11
 
 We end up having both versions of *flink-some-scala-B* in our project.
 
 
 
 2015-07-03 12:24 GMT+02:00 Chiwan Park chiwanp...@apache.org:
 
 Hi All,
 I created a PR for this issue. [1] Please check and comment about the PR.
 
 Regards,
 Chiwan Park
 
 [1] https://github.com/apache/flink/pull/885
 
 On Jul 2, 2015, at 5:59 PM, Chiwan Park chiwanp...@apache.org wrote:
 
 @Alexander I’m happy to hear that you want to help me. If you help me,
 I
 really appreciate. :)
 
 Regards,
 Chiwan Park
 
 
 On Jul 2, 2015, at 2:57 PM, Alexander Alexandrov 
 alexander.s.alexand...@gmail.com wrote:
 
  @Chiwan: let me know if you need hands-on support. I'll be more than
 happy to help (as my downstream project is using Scala 2.11).
 
 2015-07-01 17:43 GMT+02:00 Chiwan Park chiwanp...@apache.org:
 Okay, I will apply this suggestion.
 
 Regards,
 Chiwan Park
 
 On Jul 1, 2015, at 5:41 PM, Ufuk Celebi u...@apache.org wrote:
 
 
 On 01 Jul 2015, at 10:34, Stephan Ewen se...@apache.org wrote:
 
 +1, like that approach
 
 +1
 
 I like that this is not breaking for non-Scala users :-)
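
For readers wondering what the suffix ends up looking like from the dependency side, here
is a sketch in sbt syntax (chosen only because a build.sbt is Scala; the same coordinates
apply to Maven, and the version string is a placeholder):

    // Explicit artifact id carrying the Scala binary version suffix:
    libraryDependencies += "org.apache.flink" % "flink-scala_2.11" % "<flink-version>"
    // Or let sbt append the project's own Scala binary version automatically:
    libraryDependencies += "org.apache.flink" %% "flink-scala" % "<flink-version>"

Per the discussion above, the unsuffixed artifacts keep serving the Scala 2.10 build, so
nothing changes for existing users.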
 
 
 
 
 
 
 
 
 
 
 
 






Re: Flink 0.9 built with Scala 2.11

2015-07-03 Thread Chiwan Park
Hi All,
I created a PR for this issue. [1] Please check and comment about the PR.

Regards,
Chiwan Park

[1] https://github.com/apache/flink/pull/885

 On Jul 2, 2015, at 5:59 PM, Chiwan Park chiwanp...@apache.org wrote:
 
  @Alexander I'm happy to hear that you want to help me. If you help me, I 
  will really appreciate it. :)
 
 Regards,
 Chiwan Park
 
 
 On Jul 2, 2015, at 2:57 PM, Alexander Alexandrov 
 alexander.s.alexand...@gmail.com wrote:
 
  @Chiwan: let me know if you need hands-on support. I'll be more than happy 
 to help (as my downstream project is using Scala 2.11).
 
 2015-07-01 17:43 GMT+02:00 Chiwan Park chiwanp...@apache.org:
 Okay, I will apply this suggestion.
 
 Regards,
 Chiwan Park
 
 On Jul 1, 2015, at 5:41 PM, Ufuk Celebi u...@apache.org wrote:
 
 
 On 01 Jul 2015, at 10:34, Stephan Ewen se...@apache.org wrote:
 
 +1, like that approach
 
 +1
 
 I like that this is not breaking for non-Scala users :-)
 
 
 
 
 







Re: [flink-ml] How to use ParameterMap in predict method?

2015-06-30 Thread Chiwan Park
Thanks Till :)

I reimplemented it using PredictDataSetOperation.

Regards,
Chiwan Park


 On Jun 29, 2015, at 7:41 PM, Till Rohrmann till.rohrm...@gmail.com wrote:
 
 Hi Chiwan,
 
 at the moment the single element PredictOperation only supports
 non-distributed models. This means that it expects the model to be a single
 element DataSet which can be broadcasted to the predict mappers.
 
 If you need more flexibility, you can either extend the PredictOperation
 interface or you simply use the PredictDataSetOperation, where you have
 full control over what data flow you execute.
 
 Cheers,
 Till
 ​
 
 On Mon, Jun 29, 2015 at 12:16 PM, Chiwan Park chiwanp...@apache.org wrote:
 
 Thank you Till.
 
 I have another question. Can I use a DataSet object as Model? In KNN, we
 need
 to DataSet given in fit operation.
 
 But when I defined Model generic parameter to DataSet in PredictOperation,
 the getModel method’s return type is DataSet[DataSet]. I’m confused with
 this
 situation.
 
 If any advice about this to me, I will really appreciate.
 
 
 Regards,
 Chiwan Park
 
 On Jun 29, 2015, at 4:43 PM, Till Rohrmann trohrm...@apache.org wrote:
 
 Hi Chiwan,
 
 when you use the single element predict operation, you always have to
 implement the `getModel` method. There you have access to the resulting
 parameters and even to the instance to which the `PredictOperation`
 belongs. Within in this `getModel` method you can initialize all the
 information you need for the `predict` operation.
 
 You can take a look at the `StandardScalerTransformOperation` [1] where
 the
 mean and the std are set in the `getModel` method.
 
 Cheers,
 Till
 
 [1]
 
 https://github.com/apache/flink/blob/master/flink-staging/flink-ml/src/main/scala/org/apache/flink/ml/preprocessing/StandardScaler.scala#L197
 
 On Sun, Jun 28, 2015 at 1:49 PM, Chiwan Park chiwanp...@apache.org
 wrote:
 
 Hi, I’m implementing k-nearest-neighbors classification based flink-ml
 structure.
 
 In recent commit (7a7a2940 [1]), the pipeline is restructured by
 dividing
 predict operation
 into case of a single element and case of data set. In case of data set,
 parameter map is
 given as a method parameter but in case of a single element there is no
 method to access
 parameter map.
 
 But in k-nearest-neighbors classification, we need to know k in predict
 method to select top
 k values.
 
 How can I solve this problem?
 
 Regards,
 Chiwan Park
 
 [1]
 
 https://github.com/apache/flink/commit/7a7a294033ef99c596e59f670e2e4ae9262f5c5f
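
For readers following the thread: the "single-element model DataSet broadcast to the
predict mappers" that Till describes can be sketched with the plain DataSet API as below.
The type and method names (ThresholdModel, classify) are invented for illustration and
are not FlinkML API; only the broadcast mechanics are the point.

    import org.apache.flink.api.common.functions.RichMapFunction
    import org.apache.flink.api.scala._
    import org.apache.flink.configuration.Configuration

    object BroadcastModelSketch {
      // Made-up, trivially small "model".
      case class ThresholdModel(threshold: Double)

      def classify(input: DataSet[Double], model: DataSet[ThresholdModel]): DataSet[Boolean] = {
        input.map(new RichMapFunction[Double, Boolean] {
          private var localModel: ThresholdModel = _

          override def open(parameters: Configuration): Unit = {
            // The single-element model DataSet arrives as a broadcast variable.
            localModel = getRuntimeContext
              .getBroadcastVariable[ThresholdModel]("model")
              .get(0)
          }

          override def map(value: Double): Boolean = value >= localModel.threshold
        }).withBroadcastSet(model, "model")
      }
    }

A distributed model, like the training DataSet a KNN predictor needs, does not fit this
pattern, which is why the PredictDataSetOperation route taken above leaves the data flow
entirely up to the implementer.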
 
 
 
 
 
 
 







Re: [flink-ml] How to use ParameterMap in predict method?

2015-06-29 Thread Chiwan Park
Thank you Till.

I have another question. Can I use a DataSet object as the Model? In KNN, we need
the DataSet given in the fit operation.

But when I set the Model generic parameter to a DataSet in PredictOperation,
the getModel method's return type becomes DataSet[DataSet]. I'm confused by this
situation.

If you have any advice about this, I will really appreciate it.


Regards,
Chiwan Park

 On Jun 29, 2015, at 4:43 PM, Till Rohrmann trohrm...@apache.org wrote:
 
 Hi Chiwan,
 
 when you use the single element predict operation, you always have to
 implement the `getModel` method. There you have access to the resulting
 parameters and even to the instance to which the `PredictOperation`
  belongs. Within this `getModel` method you can initialize all the
 information you need for the `predict` operation.
 
 You can take a look at the `StandardScalerTransformOperation` [1] where the
 mean and the std are set in the `getModel` method.
 
 Cheers,
 Till
 
 [1]
 https://github.com/apache/flink/blob/master/flink-staging/flink-ml/src/main/scala/org/apache/flink/ml/preprocessing/StandardScaler.scala#L197
 
 On Sun, Jun 28, 2015 at 1:49 PM, Chiwan Park chiwanp...@apache.org wrote:
 
 Hi, I’m implementing k-nearest-neighbors classification based flink-ml
 structure.
 
 In recent commit (7a7a2940 [1]), the pipeline is restructured by dividing
 predict operation
 into case of a single element and case of data set. In case of data set,
 parameter map is
 given as a method parameter but in case of a single element there is no
 method to access
 parameter map.
 
 But in k-nearest-neighbors classification, we need to know k in predict
 method to select top
 k values.
 
 How can I solve this problem?
 
 Regards,
 Chiwan Park
 
 [1]
 https://github.com/apache/flink/commit/7a7a294033ef99c596e59f670e2e4ae9262f5c5f
 
 






Re: FLINK-2066

2015-06-29 Thread Chiwan Park
We should assign FLINK-2066 to Nuno. :)

Regards,
Chiwan Park

 On Jun 29, 2015, at 1:21 PM, Márton Balassi balassi.mar...@gmail.com wrote:
 
 Hey,
 
 Thanks for picking up the issue. This value can be specified as
 execution-retries.delay in the flink-conf.yaml. Hence you can check the
 associated value in the ConfigConstants [1] and track the way it is used.
 It is passed a couple of times, but is ultimately used in ExecutionGraph.
 [2]
 
 [1]
 https://github.com/apache/flink/blob/master/flink-core/src/main/java/org/apache/flink/configuration/ConfigConstants.java#L54
 [2]
 https://github.com/apache/flink/blob/master/flink-runtime/src/main/java/org/apache/flink/runtime/executiongraph/ExecutionGraph.java#L714
 
 Best,
 
 Marton
 
 
 On Sun, Jun 28, 2015 at 1:56 PM, Nuno Santos n.marques.san...@gmail.com
 wrote:
 
 Hi guys.
 
 I've been digging around the docs for the last few days and I am now ready
 to have a go at my first contribution.
 
 I chose FLINK-2066 https://issues.apache.org/jira/browse/FLINK-2066 and
 I
 am looking for some guidance.
 
  I understand the change will be associated with the ExecutionConfig class,
  which is referenced by the ExecutionEnvironment.
 
 I made my way through the code to the LocalEnvironment and LocalExecutor.
 
 However, besides setters and getters, I do not see where the
 numberOfExecutionRetries is used in the execution itself?
 
 Any pointers on where to look will be appreciated, I will continue to make
 my way through the code to see what else I can find.
 
 Thanks,
 Nuno.
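
For orientation while digging: the user-facing side of the retry handling that already
exists looks roughly like the sketch below (0.9-era API). FLINK-2066 is about adding a
comparable per-job setting for the retry delay, which so far only exists as the
execution-retries.delay entry in flink-conf.yaml that Márton mentioned.

    import org.apache.flink.api.scala._

    object RetriesExample {
      def main(args: Array[String]): Unit = {
        val env = ExecutionEnvironment.getExecutionEnvironment
        // Existing per-job knob: how often a failed execution is retried.
        env.getConfig.setNumberOfExecutionRetries(3)
        // ... define and run the actual job here ...
      }
    }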
 




Re: Student looking to contribute to Stratosphere

2015-06-27 Thread Chiwan Park
Hi, you can choose any unassigned issue about the Flink Machine Learning Library 
(flink-ml) in JIRA. [1]
There are some starter issues in flink-ml such as FLINK-1737 [2], 
FLINK-1748 [3], and FLINK-1994 [4].

First, it would be better to read some articles about contributing to Flink. 
[5][6]
And if you decide on an issue to contribute to, please assign it to yourself. If you don't 
have permission to
assign, just comment on the issue. Then other people will give you the permission 
and assign
the issue to you.

Regards,
Chiwan Park

[1] https://issues.apache.org/jira/
[2] https://issues.apache.org/jira/browse/FLINK-1737
[3] https://issues.apache.org/jira/browse/FLINK-1748
[4] https://issues.apache.org/jira/browse/FLINK-1994
[5] http://flink.apache.org/how-to-contribute.html
[6] http://flink.apache.org/coding-guidelines.html

 On Jun 27, 2015, at 11:20 PM, Rohit Shinde rohit.shinde12...@gmail.com 
 wrote:
 
 Hello everyone,
 
 I came across Stratosphere while looking for GSOC organisations working in
 Machine Learning. I got to know that it had become Apache Flink.
 
 I am interested in this project:
 https://github.com/stratosphere/stratosphere/wiki/Google-Summer-of-Code-2014#implement-one-or-multiple-machine-learning-algorithms-for-stratosphere
 
  Background: I am proficient in C++, Java, Python and Scheme. I have taken
 undergrad courses in machine learning and data mining. How can I contribute
 to the above project?
 
 Thank you,
 Rohit Shinde.







Re: Drafting the 0.9.0 release announcement

2015-06-24 Thread Chiwan Park
Great! We should post the announcement mail to the user mailing list. :)

Regards,
Chiwan Park

 On Jun 24, 2015, at 9:22 PM, Stephan Ewen se...@apache.org wrote:
 
 Great that this release is out, finally :-)
 
 On Wed, Jun 24, 2015 at 2:19 PM, Maximilian Michels m...@apache.org wrote:
 
 I've published the announcement:
 
 http://flink.apache.org/news/2015/06/24/announcing-apache-flink-0.9.0-release.html
 
 Let me know if anything is not right. Otherwise, please spread the word!
 
 Cheers,
 Max
 
 On Wed, Jun 24, 2015 at 1:59 PM, Maximilian Michels m...@apache.org
 wrote:
 
 Thanks for your feedback. We managed to improve the initial document
 quite
 a bit. The thing about the known issues is that it might give a false
 impression to people reading the release announcement because the real
 issues we are working on, can hardly be summarized by a couple of JIRA
 issues. Of course we will fix those JIRA issues (and I made sure this is
 reflected in JIRA) but the right place for keeping track of them IMHO is
 not the release announcement. With that in mind, I have removed the known
 issues section.
 
 I'm converting the document to markdown now to publish it. The website is
 already set up for the 0.9.0.
 
 On Wed, Jun 24, 2015 at 1:39 PM, Vasiliki Kalavri 
 vasilikikala...@gmail.com wrote:
 
 Hi,
 
 thank you Max for the nice document!
 Can we please add FLINK-2271 to the list of known issues?
 
 -Vasia.
 
 On 24 June 2015 at 13:08, Stephan Ewen se...@apache.org wrote:
 
 I like the announcement now.
 
 Would like to add a note to Gelly and Flink ML that these are early
 versions of the libraries, that they are subject to changes and not
 yet
 performance optimized.
 
 On Wed, Jun 24, 2015 at 1:07 PM, Stephan Ewen se...@apache.org
 wrote:
 
 I reworked the text on streaming fault tolerance.
 
 I think the release announcement needs to be about the core
 points/improvements, rather than have many exact details. We post a
 link
 for details for those interested in the details.
 
 
 On Wed, Jun 24, 2015 at 11:32 AM, Aljoscha Krettek 
 aljos...@apache.org
 wrote:
 
 I fixed a typo in the state checkpoint text: In case *of
 *recovery.
 
 On Wed, 24 Jun 2015 at 10:34 Maximilian Michels m...@apache.org
 wrote:
 
 Thanks for your contributions. Now that the release artifacts are
 published
 on the Maven repositories, I would like to get out the release
 announcement. If nobody has anything to add, I will do a final
 pass
 and
 publish the announcement.
 
 On Tue, Jun 23, 2015 at 3:22 PM, Stephan Ewen se...@apache.org
 wrote:
 
 I also like a separate paragraph about it :-)
 
 On Tue, Jun 23, 2015 at 3:20 PM, Robert Metzger 
 rmetz...@apache.org
 wrote:
 
 I would add a separate paragraph about it!
 
 On Tue, Jun 23, 2015 at 2:21 PM, Timo Walther 
 twal...@apache.org
 
 wrote:
 
 Is the static code analysis thing worth it to write a
 paragraph
 about
 it
 or is better located in the More Improvements and Fixes
 section?
 
 
 On 23.06.2015 14:13, Márton Balassi wrote:
 
 Thanks, Max. I would like to present the streaming stuff a
 bit
 differently,
 updating it.
 
 On Tue, Jun 23, 2015 at 1:59 PM, Maximilian Michels 
 m...@apache.org
 
 wrote:
 
 Hi everyone,
 
 With the vote now being passed, we should think about the
 release
 announcement. I've created a document which basically
 contains
 the
 milestone release announcement with a few changes.
 
 
 
 
 
 
 
 
 
 
 https://docs.google.com/document/d/1s5H0q961Ucvu6fR56RleY4oN4YFqmeGkN77XPPsco3k/edit#heading=h.5fiopdwccz0x
 
 Please feel free to make your comments and adjustments.
 
 Best,
 Max
 
 
 
 
 
 
 
 
 
 
 
 
 
 




Re: New contributor

2015-06-24 Thread Chiwan Park
Hi.

FLINK-2021 and FLINK-2066 are good issues to start contributing to Flink. :)
You can choose any issue which you want.

If you want to contribute to the issues, assign them to yourself in JIRA. If you 
don't have permission,
just comment in JIRA, then we'll give the permission to you.

As you know, you can contribute easily and safely with the "How to Contribute" 
guide [1] on the web page.
But if you have questions, just post a mail to the dev mailing list. We will 
reply to your mail.

I hope your time spent will be enjoyable.

Regards,
Chiwan Park

[1] http://flink.apache.org/how-to-contribute.html


 On Jun 25, 2015, at 4:03 AM, Nuno Santos n.marques.san...@gmail.com wrote:
 
 Hi Ufuk.
 
  Thanks for the welcome! I have also checked those pages and will revisit
 them until it is carved into my brain. :)
 
  I found a couple of issues that seem to be a good shot, namely:
 
  https://issues.apache.org/jira/browse/FLINK-2066
 
  https://issues.apache.org/jira/browse/FLINK-2021
 
  If you have any pointers as to which one would be best to start, please
 shout.
 
  If not, I'll just pick one and start digging into this.
 
  I want to see if I can understand this well so I can get involved in the
 user mailing list as well.
 
 Thanks!
 Nuno
 
 2015-06-23 22:30 GMT+01:00 Ufuk Celebi u...@apache.org:
 
 Hey nuno! Welcome to the Flink community. :) The points you mentioned sound
 very reasonable. There is also a how to contribute guide and a coding
 guidelines document on the web page you can check out. Is there a specific
 starter issue you are interested in? Then it will be easier to give
 pointers to the respective parts of the system you can look into.
 
 – Ufuk
 
 On Tuesday, June 23, 2015, Nuno Santos n.marques.san...@gmail.com wrote:
 
 Hello everyone,
 
  I am a new contributor (well soon to be!) to this project and I am just
 getting to know the ropes around the framework.
 
  I love what flink is about which is what makes me want to contribute to
 it.
 
  I have read the documentation and will still re-read it a couple more
 times. I am glancing now over the Jira issues for easy starters to see
 what
 I can pick up.
 
  Besides these steps, could any one give me pointers as to where to look
 in the code in order to understand it better?
 
  I know tests are a great example of that, are there any classes in
 particular I should look into so I can start to get a better grasp of the
 framework as a whole?
 
 Many thanks,
 Nuno santos.
 
 






Re: execute() and collect()/print()/count()

2015-06-19 Thread Chiwan Park
+1 for ignoring the execute() call with a warning.

But I'm concerned about how the user catches the error in a program without any 
data sinks.

By the way, eager execution is not well documented in the data sinks section, but it is 
in the program
skeleton section. [1] This causes the user's confusion. We should clean up the 
documents.
There are many code examples calling the execute() method after the print() method. [2][3]

We should add a description of the count() method to the documents too.

[1] 
http://ci.apache.org/projects/flink/flink-docs-master/apis/programming_guide.html#data-sinks
[2] 
http://ci.apache.org/projects/flink/flink-docs-master/apis/programming_guide.html#parallel-execution
[3] 
http://ci.apache.org/projects/flink/flink-docs-master/apis/programming_guide.html#iteration-operators

Regards,
Chiwan Park

 On Jun 19, 2015, at 9:15 PM, Maximilian Michels m...@apache.org wrote:
 
 Dear Flink community,
 
  I have stopped counting how many people on the user list and during Flink
  trainings have asked why their Flink program throws an Exception when they
  just want to print a DataSet. The reason for this is that print() now
  executes eagerly, thus executes the Flink program. Subsequent calls to
 execute() need to define new DataSinks and throw an exception otherwise.
 
 We have recently introduced a flag in the ExecutionEnvironment that checks
 whether the user executed before (explicitly via execute() or implicitly
 through collect()/print()/count()). That enabled us to print a nicer
 exception message. However, users either do not read the exception message
 or do not understand it. They do ask this question a lot.
 
 That's why I propose to ignore calls to execute() entirely if no sinks are
 defined. That will get rid of one of the core annoyances for Flink users. I
 know that this is painful for us programmers because we understand how
 Flink works internally, but let's step back once and see that it wouldn't be
 so bad if execute didn't do anything in case of no new sinks.
 
 What would be the downside of this change? Users might call execute() and
 wonder that nothing happens. We would then simply print a warning that
 their program didn't define any sinks. That is a big difference to the
 behavior before because users are scared of exceptions. If they just get a
 warning they will double-check their program and investigate why nothing
 happens. Most of the cases they do actually have defined sinks but simply
 left a call to execute() when they were printing a DataSet.
 
 What are you opinions on this issue? I have opened a JIRA for this as well:
 https://issues.apache.org/jira/browse/FLINK-2249
 
 Best,
 Max
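
A minimal sketch of the situation being discussed (my own example, not taken from
the thread; it assumes the 0.9-era Scala DataSet API and made-up data):

  import org.apache.flink.api.scala._

  object PrintThenExecute {
    def main(args: Array[String]): Unit = {
      val env = ExecutionEnvironment.getExecutionEnvironment
      val words = env.fromElements("to", "be", "or", "not", "to", "be")

      // print() is an eager sink since 0.9: it triggers execution and prints the result.
      words.map(w => (w, 1)).groupBy(0).sum(1).print()

      // Currently this throws because no new sinks were defined after print();
      // under the proposal above it would only log a warning.
      env.execute()
    }
  }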






Re: Quickstart POMs

2015-06-18 Thread Chiwan Park
Is it okay if the user runs the built jar in a LocalEnvironment (just running it 
with the `java -jar` command)?
I know that it is a special case, but it is a possible scenario for local testing.

If we change the Quickstart POM to use provided scope for dependencies, we should 
add a guide about this to the documentation.

Regards,
Chiwan Park

 On Jun 19, 2015, at 12:53 AM, Aljoscha Krettek aljos...@apache.org wrote:
 
 I'm also for simplification but let's hear what those who put the build-jar
 profile there have to say about it.?
 
 On Thu, 18 Jun 2015 at 17:25 Ufuk Celebi u...@apache.org wrote:
 
 
 On 18 Jun 2015, at 16:58, Fabian Hueske fhue...@gmail.com wrote:
 
 Why?
 
 mvn package
 
 builds the program correctly, no?
 
 Yes, but:
 
 - Dependencies not specified by the user may be included (Metrics,
 javaassist)
 - Dependencies specified by the user may be excluded
 - If you use the build-jar profile you have to understand what the
 difference to the default profile is and then you have to include your
 dependencies again for the profile
 - The pom comments are confusing
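
Regarding the LocalEnvironment question at the top of this message, a minimal sketch
(my own example; the classpath implication is my understanding and worth verifying):
when the built jar is started directly with `java -jar`, getExecutionEnvironment falls
back to a local environment, so the Flink classes would have to be on the application
classpath rather than provided:

  import org.apache.flink.api.scala._

  object LocalRun {
    def main(args: Array[String]): Unit = {
      // Started with `java -jar` (not via bin/flink), this resolves to a local environment.
      val env = ExecutionEnvironment.getExecutionEnvironment
      env.fromElements(1, 2, 3).map(_ * 2).print()
    }
  }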






Re: Testing Apache Flink 0.9.0-rc1

2015-06-09 Thread Chiwan Park
I attached jps and jstack log about hanging 
TaskManagerFailsWithSlotSharingITCase to JIRA FLINK-2183.

Regards,
Chiwan Park

 On Jun 10, 2015, at 12:28 AM, Aljoscha Krettek aljos...@apache.org wrote:
 
 I discovered something that might be a feature, rather than a bug. When you
 submit an example using the web client without giving parameters the
 program fails with this:
 
 org.apache.flink.client.program.ProgramInvocationException: The main method
 caused an error.
 
 at
 org.apache.flink.client.program.PackagedProgram.callMainMethod(PackagedProgram.java:452)
 
 at
 org.apache.flink.client.program.PackagedProgram.invokeInteractiveModeForExecution(PackagedProgram.java:353)
 
 at org.apache.flink.client.program.Client.run(Client.java:315)
 
 at
 org.apache.flink.client.web.JobSubmissionServlet.doGet(JobSubmissionServlet.java:302)
 
 at javax.servlet.http.HttpServlet.service(HttpServlet.java:668)
 
 at javax.servlet.http.HttpServlet.service(HttpServlet.java:770)
 
 at org.eclipse.jetty.servlet.ServletHolder.handle(ServletHolder.java:532)
 
 at
 org.eclipse.jetty.servlet.ServletHandler.doHandle(ServletHandler.java:453)
 
 at
 org.eclipse.jetty.server.session.SessionHandler.doHandle(SessionHandler.java:227)
 
 at
 org.eclipse.jetty.server.handler.ContextHandler.doHandle(ContextHandler.java:965)
 
 at org.eclipse.jetty.servlet.ServletHandler.doScope(ServletHandler.java:388)
 
 at
 org.eclipse.jetty.server.session.SessionHandler.doScope(SessionHandler.java:187)
 
 at
 org.eclipse.jetty.server.handler.ContextHandler.doScope(ContextHandler.java:901)
 
 at
 org.eclipse.jetty.server.handler.ScopedHandler.handle(ScopedHandler.java:117)
 
 at org.eclipse.jetty.server.handler.HandlerList.handle(HandlerList.java:47)
 
 at
 org.eclipse.jetty.server.handler.HandlerWrapper.handle(HandlerWrapper.java:113)
 
 at org.eclipse.jetty.server.Server.handle(Server.java:352)
 
 at
 org.eclipse.jetty.server.HttpConnection.handleRequest(HttpConnection.java:596)
 
 at
 org.eclipse.jetty.server.HttpConnection$RequestHandler.headerComplete(HttpConnection.java:1048)
 
 at org.eclipse.jetty.http.HttpParser.parseNext(HttpParser.java:549)
 
 at org.eclipse.jetty.http.HttpParser.parseAvailable(HttpParser.java:211)
 
 at org.eclipse.jetty.server.HttpConnection.handle(HttpConnection.java:425)
 
 at
 org.eclipse.jetty.io.nio.SelectChannelEndPoint.run(SelectChannelEndPoint.java:489)
 
 at
 org.eclipse.jetty.util.thread.QueuedThreadPool$2.run(QueuedThreadPool.java:436)
 
 at java.lang.Thread.run(Thread.java:745)
 
 Caused by: java.lang.NullPointerException
 
 at
 org.apache.flink.api.common.JobExecutionResult.getAccumulatorResult(JobExecutionResult.java:78)
 
 at org.apache.flink.api.java.DataSet.collect(DataSet.java:409)
 
 at org.apache.flink.api.java.DataSet.print(DataSet.java:1345)
 
 at
 org.apache.flink.examples.java.wordcount.WordCount.main(WordCount.java:80)
 
 at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
 
 at
 sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
 
 at
 sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
 
 at java.lang.reflect.Method.invoke(Method.java:497)
 
 at
 org.apache.flink.client.program.PackagedProgram.callMainMethod(PackagedProgram.java:437)
 
 ... 24 more
 
 
 This also only occurs when you uncheck the suspend execution while showing
 plan.
 
 I think this arises because the new print() uses collect() which tries to
 get the job execution result. I guess the result is Null since the job is
 submitted asynchronously when the checkbox is unchecked.
 
 
 Other than that, the new print() is pretty sweet when you run the builtin
 examples from the CLI. You get all the state changes and also the result,
 even when running in cluster mode on several task managers. :D
 
 
 On Tue, Jun 9, 2015 at 3:41 PM, Aljoscha Krettek aljos...@apache.org
 wrote:
 
 I discovered another problem:
 https://issues.apache.org/jira/browse/FLINK-2191 The closure cleaner
 cannot be disabled in part of the Streaming Java API and all of the
 Streaming Scala API. I think this is a release blocker (in addition
 the the other bugs found so far.)
 
 On Tue, Jun 9, 2015 at 2:35 PM, Aljoscha Krettek aljos...@apache.org
 wrote:
 I found the bug in the failing YARNSessionFIFOITCase: It was comparing
 the hostname to a hostname in some yarn config. In one case it was
 capitalised, in the other case it wasn't.
 
 Pushing fix to master and release-0.9 branch.
 
 On Tue, Jun 9, 2015 at 2:18 PM, Sachin Goel sachingoel0...@gmail.com
 wrote:
 A re-ran lead to reproducibility of 11 failures again.
 TaskManagerTest.testSubmitAndExecuteTask was failing with a time-out but
 managed to succeed in a re-run. Here is the log output again:
 http://pastebin.com/raw.php?i=N4cm1J18
 
 Setup: JDK 1.8.0_40 on windows 8.1
 System memory: 8GB, quad-core with maximum 8 threads.
 
 Regards
 Sachin Goel
 
 On Tue, Jun 9, 2015 at 5:34 PM, Ufuk Celebi u...@apache.org wrote:
 
 
 On 09 Jun 2015

Re: Checkstyle in IntelliJ

2015-06-09 Thread Chiwan Park
Hi. IntelliJ IDEA can reformat only changed code.
In the Reformat File dialog (you can open it by pressing Alt + Cmd + L on OS X), 
you can choose the scope of the reformatting.
After the scope is changed, you can reformat only the changed code without opening 
the dialog.

Following links will be helpful.

https://www.jetbrains.com/idea/help/reformat-code-dialog.html
http://imgur.com/muEVEZT

Regards,
Chiwan Park

 On Jun 9, 2015, at 8:39 PM, Matthias J. Sax mj...@informatik.hu-berlin.de 
 wrote:
 
  One side comment:
 
  Eclipse allows auto-formatting on save and applying the formatting rules to
 changed lines only. Using this feature, it is possible to be style
 compatible without reformatting unchanged code. Having a format.xml for
 Eclipse, would help a lot to get a unique code style. The change would
 be applied step-be-step in every commit to the changes lines only.
 
 I personally would love to have this. Not sure if Intellij has a similar
 Feature.
 
 -Matthias
 
 On 06/09/2015 12:44 PM, Till Rohrmann wrote:
 But then we should also provide a code style profile for Eclipse and have
 to keep them in sync.
 
 On Tue, Jun 9, 2015 at 12:33 PM Aljoscha Krettek aljos...@apache.org
 wrote:
 
  <code_scheme name="Flink">
    <option name="CLASS_COUNT_TO_USE_IMPORT_ON_DEMAND" value="100" />
    <option name="RIGHT_MARGIN" value="100" />
    <XML>
      <option name="XML_LEGACY_SETTINGS_IMPORTED" value="true" />
    </XML>
    <codeStyleSettings language="JAVA">
      <option name="ALIGN_MULTILINE_PARAMETERS" value="false" />
      <option name="CALL_PARAMETERS_WRAP" value="5" />
      <option name="METHOD_PARAMETERS_WRAP" value="5" />
      <indentOptions>
        <option name="USE_TAB_CHARACTER" value="true" />
        <option name="SMART_TABS" value="true" />
      </indentOptions>
    </codeStyleSettings>
  </code_scheme>
 
 This is the contents of Flink.xml in ~/Library/Preferences/IdeaIC14/
 codestyles which is the folder for codestyles on OS X. It is pretty much
 the standard IntelliJ code style except that I changed it not to align in
 parameter lists. So it seems possible to get rid of the alignment. Maybe we
 can tweak such an IntelliJ code style and put it on the website somewhere.
 
 On Tue, Jun 9, 2015 at 12:10 PM, Pieter-Jan Van Aeken 
 pieterjan.vanae...@euranova.eu wrote:
 
 Hi Aljoscha,
 
 Yes, I get the style errors in my IDE (although I set the level to
 warning rather than error). I try to pay close attention to writing my
 code without checkstyle errors but I simply cannot resist pressing
 auto format shortkey every now and then. That way all my effort into
 writing properly styled code goes undone.
 
 I am modifying my auto format settings to prevent this and it works
 for Scala but I have not been able to do this for Java. Whenever a
 line gets wrapped in Java, IntelliJ auto aligns the next line, and
 uses spaces to do so when the required indent is not dividable by 4.
 
 Regards,
 
 Pieter-Jan Van Aeken
 
 Op Dinsdag, 09/06/2015 om 12:04 schreef Aljoscha Krettek:
 
 By the way, do you have the Flink checkstyle and scalastyle profiles
 set in IntelliJ? This way you at least get red errors directly in the
 IDE. For checkstyle there is Checkstyle-IDEA and for scalastyle you
 can put the scalastyle config of Flink into the .idea directory to
 have it recognised:
 
 cp tools/maven/scalastyle-config.xml .idea/scalastyle_config.xml
 
 On Tue, Jun 9, 2015 at 11:55 AM, Maximilian Michels  wrote:
 Hi Pieter-Jan,
 
 It would be great to have a plugin for IntelliJ/Eclipse to make new
 code
 stylecheck-compliant. However, as Till mentioned, the problem is
 that most
 such plugins touch more lines than necessary. We try to only commit
 changes
 to the Git repository which are related to the feature/pull request.
 That
 way, commits are more readable and code fragments can be more easily
 attributed to the person that originally created it (instead of the
 one
 reformatting it).
 
 Let us know if you find a useful plugin or method to deal with the
 mentioned problems.
 
 Best regards,
 Max
 
 On Tue, Jun 9, 2015 at 11:30 AM, Pieter-Jan Van Aeken 
 pieterjan.vanae...@euranova.eu wrote:
 
 Hi Till,
 
 If I recall correctly, there is a possibility to import checkstyle
 XML's into Eclipse so that the auto format feature would result in
 style compliant code. This imported Eclipse config could then be
 exported and reimported into IntelliJ but you can imagine that is
 not
 a reason for me to install Eclipse.
 
 That being said, I understand your concerns with auto-format but it
 also has its benefits. I've used auto format succesfully to ensure
 maximum line length, removal of star imports, ... The only thing I
 had
 an issue with was leading spaces when wrapping lines. I just
 removed
 manually about 100 leading spaces but if I auto format again (it's
 a
 hard habbit to get rid off) I will have to do the same thing all
 over
 again. After a while it just becomes silly and a real waste of
 development time.
 
 If we were to provide a common Eclipse and IntelliJ style config,
 we
 could resolve all the style issues

Re: Testing Apache Flink 0.9.0-rc1

2015-06-08 Thread Chiwan Park
Hi. I have a problem running the `mvn clean verify` command.
TaskManagerFailsWithSlotSharingITCase hangs on Oracle JDK 7 (1.7.0_80), but on 
Oracle JDK 8 the test case doesn't hang.

I've investigated this problem but I cannot find the bug.

Regards,
Chiwan Park

 On Jun 9, 2015, at 2:11 AM, Márton Balassi balassi.mar...@gmail.com wrote:
 
 Added F7 Running against Kafka cluster for me in the doc. Doing it
 tomorrow.
 
 On Mon, Jun 8, 2015 at 7:00 PM, Chiwan Park chiwanp...@icloud.com wrote:
 
 Hi. I’m very excited about preparing a new major release. :)
 I just picked two tests. I will report status as soon as possible.
 
 Regards,
 Chiwan Park
 
 On Jun 9, 2015, at 1:52 AM, Maximilian Michels m...@apache.org wrote:
 
 Hi everyone!
 
 As previously discussed, the Flink developer community is very eager to
 get
 out a new major release. Apache Flink 0.9.0 will contain lots of new
 features and many bugfixes. This time, I'll try to coordinate the release
 process. Feel free to correct me if I'm doing something wrong because I
 don't know any better :)
 
 To release a great version of Flink to the public, I'd like to ask
 everyone
 to test the release candidate. Recently, Flink has received a lot of
 attention. The expectations are quite high. Only through thorough testing
 we will be able to satisfy all the Flink users out there.
 
 Below is a list from the Wiki that we use to ensure the legal and
 functional aspects of a release [1]. What I would like you to do is pick
 at
 least one of the tasks, put your name as assignee in the link below, and
 report back once you verified it. That way, I hope we can quickly and
 thoroughly test the release candidate.
 
 
 https://docs.google.com/document/d/1BhyMPTpAUYA8dG8-vJ3gSAmBUAa0PBSRkxIBPsZxkLs/edit
 
 Best,
 Max
 
 Git branch: release-0.9-rc1
 Release binaries: http://people.apache.org/~mxm/flink-0.9.0-rc1/
 Maven artifacts:
 https://repository.apache.org/content/repositories/orgapacheflink-1037/
 PGP public key for verifying the signatures:
 http://pgp.mit.edu/pks/lookup?op=vindexsearch=0xDE976D18C2909CBF
 
 
 Legal
 
 
 L.1 Check if checksums and GPG files match the corresponding release
 files
 
 L.2 Verify that the source archives do NOT contains any binaries
 
 L.3 Check if the source release is building properly with Maven
 (including
 license header check (default) and checkstyle). Also the tests should be
 executed (mvn clean verify)
 
 L.4 Verify that the LICENSE and NOTICE file is correct for the binary and
 source release.
 
 L.5 All dependencies must be checked for their license and the license
 must
 be ASL 2.0 compatible (
 http://www.apache.org/legal/resolved.html#category-x)
 * The LICENSE and NOTICE files in the root directory refer to
 dependencies
 in the source release, i.e., files in the git repository (such as fonts,
 css, JavaScript, images)
 * The LICENSE and NOTICE files in flink-dist/src/main/flink-bin refer to
 the binary distribution and mention all of Flink's Maven dependencies as
 well
 
 L.6 Check that all POM files point to the same version (mostly relevant
 to
 examine quickstart artifact files)
 
 L.7 Read the README.md file
 
 
 Functional
 
 
 F.1 Run the start-local.sh/start-local-streaming.sh,
 start-cluster.sh/start-cluster-streaming.sh, start-webclient.sh scripts
 and
 verify that the processes come up
 
 F.2 Examine the *.out files (should be empty) and the log files (should
 contain no exceptions)
 * Test for Linux, OS X, Windows (for Windows as far as possible, not all
 scripts exist)
 * Shutdown and verify there are no exceptions in the log output (after
 shutdown)
 * Check all start+submission scripts for paths with and without spaces
 (./bin/* scripts are quite fragile for paths with spaces)
 
 F.3 local mode (start-local.sh, see criteria below)
 F.4 cluster mode (start-cluster.sh, see criteria below)
 F.5 multi-node cluster (can simulate locally by starting two
 taskmanagers,
 see criteria below)
 
 Criteria for F.3 F.4 F.5
 
 * Verify that the examples are running from both ./bin/flink and from the
 web-based job submission tool
 * flink-conf.yml should define more than one task slot
 * Results of job are produced and correct
 ** Check also that the examples are running with the build-in data and
 external sources.
 * Examine the log output - no error messages should be encountered
 ** Web interface shows progress and finished job in history
 
 
 F.6 Test on a cluster with HDFS.
 * Check that a good amount of input splits is read locally (JobManager
 log
 reveals local assignments)
 
 F.7 Test against a Kafka installation
 
 F.8 Test the ./bin/flink command line client
 * Test info option, paste the JSON into the plan visualizer HTML file,
 check that plan is rendered
 * Test the parallelism flag (-p) to override the configured default
 parallelism
 
 F.9 Verify the plan visualizer with different browsers/operating systems
 
 F.10 Verify that the quickstarts for scala

Re: Testing Apache Flink 0.9.0-rc1

2015-06-08 Thread Chiwan Park
Hi. I’m very excited about preparing a new major release. :)
I just picked two tests. I will report status as soon as possible.

Regards,
Chiwan Park

 On Jun 9, 2015, at 1:52 AM, Maximilian Michels m...@apache.org wrote:
 
 Hi everyone!
 
 As previously discussed, the Flink developer community is very eager to get
 out a new major release. Apache Flink 0.9.0 will contain lots of new
 features and many bugfixes. This time, I'll try to coordinate the release
 process. Feel free to correct me if I'm doing something wrong because I
 don't know any better :)
 
 To release a great version of Flink to the public, I'd like to ask everyone
 to test the release candidate. Recently, Flink has received a lot of
 attention. The expectations are quite high. Only through thorough testing
 we will be able to satisfy all the Flink users out there.
 
 Below is a list from the Wiki that we use to ensure the legal and
 functional aspects of a release [1]. What I would like you to do is pick at
 least one of the tasks, put your name as assignee in the link below, and
 report back once you verified it. That way, I hope we can quickly and
 thoroughly test the release candidate.
 
 https://docs.google.com/document/d/1BhyMPTpAUYA8dG8-vJ3gSAmBUAa0PBSRkxIBPsZxkLs/edit
 
 Best,
 Max
 
 Git branch: release-0.9-rc1
 Release binaries: http://people.apache.org/~mxm/flink-0.9.0-rc1/
 Maven artifacts:
 https://repository.apache.org/content/repositories/orgapacheflink-1037/
 PGP public key for verifying the signatures:
 http://pgp.mit.edu/pks/lookup?op=vindexsearch=0xDE976D18C2909CBF
 
 
 Legal
 
 
 L.1 Check if checksums and GPG files match the corresponding release files
 
 L.2 Verify that the source archives do NOT contains any binaries
 
 L.3 Check if the source release is building properly with Maven (including
 license header check (default) and checkstyle). Also the tests should be
 executed (mvn clean verify)
 
 L.4 Verify that the LICENSE and NOTICE file is correct for the binary and
 source release.
 
 L.5 All dependencies must be checked for their license and the license must
 be ASL 2.0 compatible (http://www.apache.org/legal/resolved.html#category-x)
 * The LICENSE and NOTICE files in the root directory refer to dependencies
 in the source release, i.e., files in the git repository (such as fonts,
 css, JavaScript, images)
 * The LICENSE and NOTICE files in flink-dist/src/main/flink-bin refer to
 the binary distribution and mention all of Flink's Maven dependencies as
 well
 
 L.6 Check that all POM files point to the same version (mostly relevant to
 examine quickstart artifact files)
 
 L.7 Read the README.md file
 
 
 Functional
 
 
 F.1 Run the start-local.sh/start-local-streaming.sh,
 start-cluster.sh/start-cluster-streaming.sh, start-webclient.sh scripts and
 verify that the processes come up
 
 F.2 Examine the *.out files (should be empty) and the log files (should
 contain no exceptions)
 * Test for Linux, OS X, Windows (for Windows as far as possible, not all
 scripts exist)
 * Shutdown and verify there are no exceptions in the log output (after
 shutdown)
 * Check all start+submission scripts for paths with and without spaces
 (./bin/* scripts are quite fragile for paths with spaces)
 
 F.3 local mode (start-local.sh, see criteria below)
 F.4 cluster mode (start-cluster.sh, see criteria below)
 F.5 multi-node cluster (can simulate locally by starting two taskmanagers,
 see criteria below)
 
 Criteria for F.3 F.4 F.5
 
 * Verify that the examples are running from both ./bin/flink and from the
 web-based job submission tool
 * flink-conf.yml should define more than one task slot
 * Results of job are produced and correct
 ** Check also that the examples are running with the build-in data and
 external sources.
 * Examine the log output - no error messages should be encountered
 ** Web interface shows progress and finished job in history
 
 
 F.6 Test on a cluster with HDFS.
 * Check that a good amount of input splits is read locally (JobManager log
 reveals local assignments)
 
 F.7 Test against a Kafka installation
 
 F.8 Test the ./bin/flink command line client
 * Test info option, paste the JSON into the plan visualizer HTML file,
 check that plan is rendered
 * Test the parallelism flag (-p) to override the configured default
 parallelism
 
 F.9 Verify the plan visualizer with different browsers/operating systems
 
 F.10 Verify that the quickstarts for scala and java are working with the
 staging repository for both IntelliJ and Eclipse.
 * In particular the dependencies of the quickstart project need to be set
 correctly and the QS project needs to build from the staging repository
 (replace the snapshot repo URL with the staging repo URL)
 * The dependency tree of the QuickStart project must not contain any
 dependencies we shade away upstream (guava, netty, ...)
 
 F.11 Run examples on a YARN cluster
 
 F.12 Run all examples from the IDE (Eclipse  IntelliJ)
 
 F.13 Run

Re: pull request for FLINK-2155 documentation

2015-06-04 Thread Chiwan Park
Hi. You should send your PR to the apache/flink-web repository, not to your own flink-web 
repository.

Regards,
Chiwan Park

 On Jun 5, 2015, at 2:46 PM, Lokesh Rajaram rajaram.lok...@gmail.com wrote:
 
 Hello,
 
 For JIRA FLINK-2155 updated the document and created a pull request with
 flink-web project as https://github.com/lokeshrajaram/flink-web/pull/1
 
  I followed the how to contribute guide to create this pull request. Can someone
 please help me verify if this pull request is the right way of updating
 flink-web documents.
 
 Thanks in advance.
 
 Thanks,
 Lokesh





Re: ALS implementation

2015-06-04 Thread Chiwan Park
I think that the NPE in the second configuration is a bug in the HashTable.
I just found that ConnectedComponents with small memory segments causes the same 
error. (I thought I had fixed the bug, but it is still alive.)

Regards,
Chiwan Park
 
 On Jun 5, 2015, at 2:35 AM, Felix Neutatz neut...@googlemail.com wrote:
 
 now the question is, which join in the ALS implementation is the problem :)
 
 2015-06-04 19:09 GMT+02:00 Andra Lungu lungu.an...@gmail.com:
 
 Hi Felix,
 
 Passing a JoinHint to your function should help.
 see:
 
 http://mail-archives.apache.org/mod_mbox/flink-user/201504.mbox/%3ccanc1h_vffbqyyiktzcdpihn09r4he4oluiursjnci_rwc+c...@mail.gmail.com%3E
 
 Cheers,
 Andra
 
 On Thu, Jun 4, 2015 at 7:07 PM, Felix Neutatz neut...@googlemail.com
 wrote:
 
 after bug fix:
 
 for 100 blocks and standard jvm heap space
 
 Caused by: java.lang.RuntimeException: Hash join exceeded maximum number
 of
 recursions, without reducing partitions enough to be memory resident.
 Probably cause: Too many duplicate keys.
 at
 
 
 org.apache.flink.runtime.operators.hash.MutableHashTable.buildTableFromSpilledPartition(MutableHashTable.java:718)
 at
 
 
 org.apache.flink.runtime.operators.hash.MutableHashTable.prepareNextPartition(MutableHashTable.java:506)
 at
 
 
 org.apache.flink.runtime.operators.hash.MutableHashTable.nextRecord(MutableHashTable.java:543)
 at
 
 
 org.apache.flink.runtime.operators.hash.NonReusingBuildFirstHashMatchIterator.callWithNextKey(NonReusingBuildFirstHashMatchIterator.java:104)
 at
 org.apache.flink.runtime.operators.MatchDriver.run(MatchDriver.java:173)
 at
 
 
 org.apache.flink.runtime.operators.RegularPactTask.run(RegularPactTask.java:496)
 at
 
 
 org.apache.flink.runtime.operators.RegularPactTask.invoke(RegularPactTask.java:362)
 at org.apache.flink.runtime.taskmanager.Task.run(Task.java:559)
 at java.lang.Thread.run(Thread.java:745)
 
 
 for 150 blocks and 5G jvm heap space
 
 Caused by: java.lang.NullPointerException
 at
 
 
 org.apache.flink.runtime.operators.hash.HashPartition.spillPartition(HashPartition.java:310)
 ...
 
 Best regards,
 Felix
 
 2015-06-04 10:19 GMT+02:00 Felix Neutatz neut...@googlemail.com:
 
 Yes, I will try it again with the newest update :)
 
 2015-06-04 10:17 GMT+02:00 Till Rohrmann till.rohrm...@gmail.com:
 
 If the first error is not fixed by Chiwans PR, then we should create a
 JIRA
 for it to not forget it.
 
 @Felix: Chiwan's PR is here [1]. Could you try to run ALS again with
 this
 version?
 
 Cheers,
 Till
 
 [1] https://github.com/apache/flink/pull/751
 
 On Thu, Jun 4, 2015 at 10:10 AM, Chiwan Park chiwanp...@icloud.com
 wrote:
 
 Hi. The second bug is fixed by the recent change in PR.
 But there is just no test case for first bug.
 
 Regards,
 Chiwan Park
 
 On Jun 4, 2015, at 5:09 PM, Ufuk Celebi u...@apache.org wrote:
 
 I think both are bugs. They are triggered by the different memory
 configurations.
 
 @chiwan: is the 2nd error fixed by your recent change?
 
 @felix: if yes, can you try the 2nd run again with the changes?
 
 On Thursday, June 4, 2015, Felix Neutatz neut...@googlemail.com
 wrote:
 
 Hi,
 
 I played a bit with the ALS recommender algorithm. I used the
 movielens
 dataset:
 
 http://files.grouplens.org/datasets/movielens/ml-latest-README.html
 
 The rating matrix has 21.063.128 entries (ratings).
 
 I run the algorithm with 3 configurations:
 
 1. standard jvm heap space:
 
 val als = ALS()
  .setIterations(10)
  .setNumFactors(10)
  .setBlocks(100)
 
 throws:
 java.lang.RuntimeException: Hash Join bug in memory management:
 Memory
 buffers leaked.
 at
 
 
 
 
 
 org.apache.flink.runtime.operators.hash.MutableHashTable.buildTableFromSpilledPartition(MutableHashTable.java:733)
 at
 
 
 
 
 
 org.apache.flink.runtime.operators.hash.MutableHashTable.prepareNextPartition(MutableHashTable.java:508)
 at
 
 
 
 
 
 org.apache.flink.runtime.operators.hash.MutableHashTable.nextRecord(MutableHashTable.java:541)
 at
 
 
 
 
 
 org.apache.flink.runtime.operators.hash.NonReusingBuildFirstHashMatchIterator.callWithNextKey(NonReusingBuildFirstHashMatchIterator.java:104)
 at
 
 org.apache.flink.runtime.operators.MatchDriver.run(MatchDriver.java:173)
 at
 
 
 
 
 
 org.apache.flink.runtime.operators.RegularPactTask.run(RegularPactTask.java:496)
 at
 
 
 
 
 
 org.apache.flink.runtime.operators.RegularPactTask.invoke(RegularPactTask.java:362)
 at org.apache.flink.runtime.taskmanager.Task.run(Task.java:559)
 at java.lang.Thread.run(Thread.java:745)
 
 2. 5G jvm heap space
 
 val als = ALS()
  .setIterations(10)
  .setNumFactors(10)
  .setBlocks(150)
 
 throws:
 
 java.lang.NullPointerException
 at
 
 
 
 
 
 org.apache.flink.runtime.operators.hash.HashPartition.spillPartition(HashPartition.java:310)
 at
 
 
 
 
 
 org.apache.flink.runtime.operators.hash.MutableHashTable.spillPartition(MutableHashTable.java:1090)
 at
 
 
 
 
 
 org.apache.flink.runtime.operators.hash.MutableHashTable.insertBucketEntry(MutableHashTable.java:923

Re: ALS implementation

2015-06-04 Thread Chiwan Park
Hi. The second bug is fixed by the recent change in the PR.
But there is just no test case for the first bug.

Regards,
Chiwan Park

 On Jun 4, 2015, at 5:09 PM, Ufuk Celebi u...@apache.org wrote:
 
 I think both are bugs. They are triggered by the different memory
 configurations.
 
 @chiwan: is the 2nd error fixed by your recent change?
 
 @felix: if yes, can you try the 2nd run again with the changes?
 
 On Thursday, June 4, 2015, Felix Neutatz neut...@googlemail.com wrote:
 
 Hi,
 
 I played a bit with the ALS recommender algorithm. I used the movielens
 dataset:
 http://files.grouplens.org/datasets/movielens/ml-latest-README.html
 
 The rating matrix has 21.063.128 entries (ratings).
 
 I run the algorithm with 3 configurations:
 
 1. standard jvm heap space:
 
 val als = ALS()
   .setIterations(10)
   .setNumFactors(10)
   .setBlocks(100)
 
 throws:
 java.lang.RuntimeException: Hash Join bug in memory management: Memory
 buffers leaked.
 at
 
 org.apache.flink.runtime.operators.hash.MutableHashTable.buildTableFromSpilledPartition(MutableHashTable.java:733)
 at
 
 org.apache.flink.runtime.operators.hash.MutableHashTable.prepareNextPartition(MutableHashTable.java:508)
 at
 
 org.apache.flink.runtime.operators.hash.MutableHashTable.nextRecord(MutableHashTable.java:541)
 at
 
 org.apache.flink.runtime.operators.hash.NonReusingBuildFirstHashMatchIterator.callWithNextKey(NonReusingBuildFirstHashMatchIterator.java:104)
 at org.apache.flink.runtime.operators.MatchDriver.run(MatchDriver.java:173)
 at
 
 org.apache.flink.runtime.operators.RegularPactTask.run(RegularPactTask.java:496)
 at
 
 org.apache.flink.runtime.operators.RegularPactTask.invoke(RegularPactTask.java:362)
 at org.apache.flink.runtime.taskmanager.Task.run(Task.java:559)
 at java.lang.Thread.run(Thread.java:745)
 
 2. 5G jvm heap space
 
 val als = ALS()
   .setIterations(10)
   .setNumFactors(10)
   .setBlocks(150)
 
 throws:
 
 java.lang.NullPointerException
 at
 
 org.apache.flink.runtime.operators.hash.HashPartition.spillPartition(HashPartition.java:310)
 at
 
 org.apache.flink.runtime.operators.hash.MutableHashTable.spillPartition(MutableHashTable.java:1090)
 at
 
 org.apache.flink.runtime.operators.hash.MutableHashTable.insertBucketEntry(MutableHashTable.java:923)
 at
 
 org.apache.flink.runtime.operators.hash.MutableHashTable.buildTableFromSpilledPartition(MutableHashTable.java:779)
 at
 
 org.apache.flink.runtime.operators.hash.MutableHashTable.prepareNextPartition(MutableHashTable.java:508)
 at
 
 org.apache.flink.runtime.operators.hash.MutableHashTable.nextRecord(MutableHashTable.java:541)
 at
 
 org.apache.flink.runtime.operators.hash.NonReusingBuildFirstHashMatchIterator.callWithNextKey(NonReusingBuildFirstHashMatchIterator.java:104)
 at org.apache.flink.runtime.operators.MatchDriver.run(MatchDriver.java:173)
 at
 
 org.apache.flink.runtime.operators.RegularPactTask.run(RegularPactTask.java:496)
 at
 
 org.apache.flink.runtime.operators.RegularPactTask.invoke(RegularPactTask.java:362)
 at org.apache.flink.runtime.taskmanager.Task.run(Task.java:559)
 at java.lang.Thread.run(Thread.java:745)
 
 3. 14G jvm heap space
 
 val als = ALS()
   .setIterations(10)
   .setNumFactors(10)
   .setBlocks(150)
   .setTemporaryPath(/tmp/tmpALS)
 
 - works
 
 Is this a Flink problem or is it just my bad configuration?
 
 Best regards,
 Felix
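
To make the JoinHint suggestion from earlier in this thread concrete, a rough sketch
(my own illustration; the case classes are made up, and whether this exact join
overload is available in the 0.9-era Scala API is an assumption):

  import org.apache.flink.api.scala._
  import org.apache.flink.api.common.operators.base.JoinOperatorBase.JoinHint

  case class Rating(user: Int, item: Int, rating: Double)
  case class Factor(id: Int, factors: Array[Double])

  object JoinHintExample {
    def joinWithHint(ratings: DataSet[Rating], factors: DataSet[Factor]) = {
      // Ask for a sort-merge join instead of a hash-based strategy, one way to
      // sidestep the spilling hash-join code path seen in the stack traces above.
      ratings
        .join(factors, JoinHint.REPARTITION_SORT_MERGE)
        .where("item")
        .equalTo("id")
    }
  }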
 







Re: question please

2015-05-22 Thread Chiwan Park
Hi.

Hadoop is a framework for reliable, scalable, distributed computing, so there 
are many components for this purpose, such as HDFS, YARN and Hadoop MapReduce. 
Flink is an alternative to the Hadoop MapReduce component. It also has some tools 
to write map-reduce programs and extends the model to support many more operations.

You can see a more detailed description on Flink's homepage [1].

[1] http://flink.apache.org/faq.html#is-flink-a-hadoop-project


Regards.
Chiwan Park


 On May 22, 2015, at 3:02 PM, Eng Fawzya eng.faw...@gmail.com wrote:
 
 hi,
 i want to know what is the difference between FLink and Hadoop?
 
 -- 
 Fawzya Ramadan Sayed,
 Teaching Assistant,
 Computer Science Department,
 Faculty of Computers and Information,
 Fayoum University



Re: Problems building the current master

2015-05-19 Thread Chiwan Park
Hi.

I think that you are building on an encrypted file system such as eCryptfs.
Some encrypted file systems do not support long file names, but Scala classes 
frequently have long file names.

You can choose between two options to solve this problem.

1. Build on a non-encrypted file system.
2. Add `-Xmax-classfile-name` args to the configuration of the scala-maven-plugin in 
pom.xml to restrict the file name length. The following is an example:

<arg>-Xmax-classfile-name</arg>
<arg>128</arg>


Regards.
Chiwan Park (Sent with iPhone)


 On May 19, 2015, at 11:53 PM, Tamara Mendt tammyme...@gmail.com wrote:
 
 Hello,
 
 I am getting errors when trying to build the current master and I wonder if
 anyone has had the same problem or could help me figure it out.
 
 @aalexandrov says he is having the same issue.
 
 I am using
 Maven Version: 3.0.5
 Java version: 1.8.0_45
 
 When I try to compile I get following Errors and the build fails:
 
 [ERROR] error:
 [INFO]  while compiling: YarnTaskManager.scala
 ...
 [ERROR] error: File name too long
 ...
 [ERROR] Failed to execute goal
 net.alchim31.maven:scala-maven-plugin:3.1.4:compile (scala-compile-first)
 on project flink-yarn: wrap: org.apache.commons.exec.ExecuteException:
 Process exited with an error: 1 (Exit value: 1) - [Help 1]
 
 Cheers,
 
 Tamara



Re: New project website

2015-05-11 Thread Chiwan Park
Great! :)
+1 for this version.

Regards.
Chiwan Park (Sent with iPhone)



 On May 11, 2015, at 4:51 PM, Ufuk Celebi u...@apache.org wrote:
 
 Hey all,
 
 I reworked the project website the last couple of days and would like to 
 share the preview:
 
 http://uce.github.io/flink-web/
 
 I would like to get this in asap. We can push incremental updates at any 
 time, but I think this version is a big improvement over the current status 
 quo. If I get some +1s I'll go ahead and update the website today. 
 
 – Ufuk



[jira] [Created] (FLINK-2001) DistanceMetric cannot be serialized

2015-05-11 Thread Chiwan Park (JIRA)
Chiwan Park created FLINK-2001:
--

 Summary: DistanceMetric cannot be serialized
 Key: FLINK-2001
 URL: https://issues.apache.org/jira/browse/FLINK-2001
 Project: Flink
  Issue Type: Bug
  Components: Machine Learning Library
Reporter: Chiwan Park
Assignee: Chiwan Park
Priority: Critical


Because the DistanceMeasure trait doesn't extend Serializable, a task using 
DistanceMeasure raises the following exception.

{code}
Task not serializable
org.apache.flink.api.common.InvalidProgramException: Task not serializable
at 
org.apache.flink.api.scala.ClosureCleaner$.ensureSerializable(ClosureCleaner.scala:179)
at 
org.apache.flink.api.scala.ClosureCleaner$.clean(ClosureCleaner.scala:171)
at org.apache.flink.api.scala.DataSet.clean(DataSet.scala:123)
at org.apache.flink.api.scala.DataSet$$anon$10.init(DataSet.scala:691)
at org.apache.flink.api.scala.DataSet.combineGroup(DataSet.scala:690)
at org.apache.flink.ml.classification.KNNModel.transform(KNN.scala:78)
at 
org.apache.flink.ml.classification.KNNITSuite$$anonfun$1.apply$mcV$sp(KNNSuite.scala:25)
at 
org.apache.flink.ml.classification.KNNITSuite$$anonfun$1.apply(KNNSuite.scala:12)
at 
org.apache.flink.ml.classification.KNNITSuite$$anonfun$1.apply(KNNSuite.scala:12)
at 
org.scalatest.Transformer$$anonfun$apply$1.apply$mcV$sp(Transformer.scala:22)
at org.scalatest.OutcomeOf$class.outcomeOf(OutcomeOf.scala:85)
at org.scalatest.OutcomeOf$.outcomeOf(OutcomeOf.scala:104)
at org.scalatest.Transformer.apply(Transformer.scala:22)
at org.scalatest.Transformer.apply(Transformer.scala:20)
at org.scalatest.FlatSpecLike$$anon$1.apply(FlatSpecLike.scala:1647)
at org.scalatest.Suite$class.withFixture(Suite.scala:1122)
at org.scalatest.FlatSpec.withFixture(FlatSpec.scala:1683)
at 
org.scalatest.FlatSpecLike$class.invokeWithFixture$1(FlatSpecLike.scala:1644)
at 
org.scalatest.FlatSpecLike$$anonfun$runTest$1.apply(FlatSpecLike.scala:1656)
at 
org.scalatest.FlatSpecLike$$anonfun$runTest$1.apply(FlatSpecLike.scala:1656)
at org.scalatest.SuperEngine.runTestImpl(Engine.scala:306)
at org.scalatest.FlatSpecLike$class.runTest(FlatSpecLike.scala:1656)
at 
org.apache.flink.ml.classification.KNNITSuite.org$scalatest$BeforeAndAfter$$super$runTest(KNNSuite.scala:9)
at org.scalatest.BeforeAndAfter$class.runTest(BeforeAndAfter.scala:200)
at 
org.apache.flink.ml.classification.KNNITSuite.runTest(KNNSuite.scala:9)
at 
org.scalatest.FlatSpecLike$$anonfun$runTests$1.apply(FlatSpecLike.scala:1714)
at 
org.scalatest.FlatSpecLike$$anonfun$runTests$1.apply(FlatSpecLike.scala:1714)
at 
org.scalatest.SuperEngine$$anonfun$traverseSubNodes$1$1.apply(Engine.scala:413)
at 
org.scalatest.SuperEngine$$anonfun$traverseSubNodes$1$1.apply(Engine.scala:401)
at scala.collection.immutable.List.foreach(List.scala:318)
at org.scalatest.SuperEngine.traverseSubNodes$1(Engine.scala:401)
at 
org.scalatest.SuperEngine.org$scalatest$SuperEngine$$runTestsInBranch(Engine.scala:390)
at 
org.scalatest.SuperEngine$$anonfun$traverseSubNodes$1$1.apply(Engine.scala:427)
at 
org.scalatest.SuperEngine$$anonfun$traverseSubNodes$1$1.apply(Engine.scala:401)
at scala.collection.immutable.List.foreach(List.scala:318)
at org.scalatest.SuperEngine.traverseSubNodes$1(Engine.scala:401)
at 
org.scalatest.SuperEngine.org$scalatest$SuperEngine$$runTestsInBranch(Engine.scala:396)
at org.scalatest.SuperEngine.runTestsImpl(Engine.scala:483)
at org.scalatest.FlatSpecLike$class.runTests(FlatSpecLike.scala:1714)
at org.scalatest.FlatSpec.runTests(FlatSpec.scala:1683)
at org.scalatest.Suite$class.run(Suite.scala:1424)
at 
org.scalatest.FlatSpec.org$scalatest$FlatSpecLike$$super$run(FlatSpec.scala:1683)
at 
org.scalatest.FlatSpecLike$$anonfun$run$1.apply(FlatSpecLike.scala:1760)
at 
org.scalatest.FlatSpecLike$$anonfun$run$1.apply(FlatSpecLike.scala:1760)
at org.scalatest.SuperEngine.runImpl(Engine.scala:545)
at org.scalatest.FlatSpecLike$class.run(FlatSpecLike.scala:1760)
at 
org.apache.flink.ml.classification.KNNITSuite.org$scalatest$BeforeAndAfter$$super$run(KNNSuite.scala:9)
at org.scalatest.BeforeAndAfter$class.run(BeforeAndAfter.scala:241)
at org.apache.flink.ml.classification.KNNITSuite.run(KNNSuite.scala:9)
at org.scalatest.tools.SuiteRunner.run(SuiteRunner.scala:55)
at 
org.scalatest.tools.Runner$$anonfun$doRunRunRunDaDoRunRun$3.apply(Runner.scala:2563)
at 
org.scalatest.tools.Runner$$anonfun$doRunRunRunDaDoRunRun$3.apply(Runner.scala:2557)
at scala.collection.immutable.List.foreach

Re: Migrating our website from SVN to Git

2015-04-30 Thread Chiwan Park
Great! :)

Regards.
Chiwan Park (Sent with iPhone)


 On Apr 30, 2015, at 6:52 PM, Fabian Hueske fhue...@gmail.com wrote:
 
 excellent! :-)
 
 2015-04-30 11:47 GMT+02:00 Stephan Ewen se...@apache.org:
 
 git for the win!
 
 On Thu, Apr 30, 2015 at 11:39 AM, Robert Metzger rmetz...@apache.org
 wrote:
 
 Great, thank you for taking care of this.
 
 On Thu, Apr 30, 2015 at 11:29 AM, Maximilian Michels m...@apache.org
 wrote:
 
 Hi everyone,
 
 As of today [0], the ASF officially offers Git-based repositories for
 the
 project websites. I filed a JIRA [1] to get us a Git repository for our
 website. I would assume that everyone likes the idea of switching to
 Git.
 If not, please raise your objections.
 
 The new repository is already created [2]. Of course, we won't switch
 immediately. The only change for your workflow seems to be that the
 actual
 published files are in a directory called content instead of site.
 
 If we're ready and the setup has been verified, we can make the switch.
 
 Cheers,
 Max
 
 [0] https://blogs.apache.org/infra/entry/git_based_websites_available
 [1]
 
 https://issues.apache.org/jira/servicedesk/agent/INFRA/issue/INFRA-9560
 [2] https://git-wip-us.apache.org/repos/asf?p=flink-web.git;a=summary
 
 
 



[jira] [Created] (FLINK-1937) Cannot create SparseVector with only one non-zero element.

2015-04-23 Thread Chiwan Park (JIRA)
Chiwan Park created FLINK-1937:
--

 Summary: Cannot create SparseVector with only one non-zero element.
 Key: FLINK-1937
 URL: https://issues.apache.org/jira/browse/FLINK-1937
 Project: Flink
  Issue Type: Bug
  Components: Machine Learning Library
Reporter: Chiwan Park


I tried creating a SparseVector with only one non-zero element, but I couldn't 
create it. The following code causes the problem.

{code}
val vec2 = SparseVector.fromCOO(3, (1, 1))
{code}

I got the following compile error:

{code:none}
Error:(60, 29) overloaded method value fromCOO with alternatives:
  (size: Int,entries: Iterable[(Int, 
Double)])org.apache.flink.ml.math.SparseVector and
  (size: Int,entries: (Int, Double)*)org.apache.flink.ml.math.SparseVector
 cannot be applied to (Int, (Int, Int))
val vec2 = SparseVector.fromCOO(3, (1, 1))
^
{code}
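
As a side note, a sketch of a possible workaround under the signatures quoted above (my 
own illustration; the eventual fix in Flink may differ). The literal (1, 1) is an 
(Int, Int) tuple, so neither overload applies; making the value an explicit Double compiles:

{code}
val vec2 = SparseVector.fromCOO(3, (1, 1.0))        // (Int, Double) matches the varargs overload
val vec3 = SparseVector.fromCOO(3, List((1, 1.0)))  // or pass an Iterable[(Int, Double)]
{code}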



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Created] (FLINK-1933) Add distance measure interface and basic implementation to machine learning library

2015-04-22 Thread Chiwan Park (JIRA)
Chiwan Park created FLINK-1933:
--

 Summary: Add distance measure interface and basic implementation 
to machine learning library
 Key: FLINK-1933
 URL: https://issues.apache.org/jira/browse/FLINK-1933
 Project: Flink
  Issue Type: New Feature
  Components: Machine Learning Library
Reporter: Chiwan Park
Assignee: Chiwan Park


Add a distance measure interface to calculate the distance between two vectors, 
and some implementations of the interface. In FLINK-1745, [~till.rohrmann] suggests 
the following interface:

{code}
trait DistanceMeasure {
  def distance(a: Vector, b: Vector): Double
}
{code}

I think that the following list of implementations is sufficient to provide 
initially to ML library users.

* Manhattan distance [1]
* Cosine distance [2]
* Euclidean distance (and Squared) [3]
* Tanimoto distance [4]
* Minkowski distance [5]
* Chebyshev distance [6]

[1]: http://en.wikipedia.org/wiki/Taxicab_geometry
[2]: http://en.wikipedia.org/wiki/Cosine_similarity
[3]: http://en.wikipedia.org/wiki/Euclidean_distance
[4]: 
http://en.wikipedia.org/wiki/Jaccard_index#Tanimoto_coefficient_.28extended_Jaccard_coefficient.29
[5]: http://en.wikipedia.org/wiki/Minkowski_distance
[6]: http://en.wikipedia.org/wiki/Chebyshev_distance
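
For illustration, a small sketch of what one entry from the list above could look like on 
top of the suggested trait (my own example, not a committed implementation; it assumes 
org.apache.flink.ml.math.Vector exposes size and apply(i), and mixes in Serializable so 
the measure can be shipped inside closures, cf. FLINK-2001):

{code}
import org.apache.flink.ml.math.Vector

object EuclideanDistanceMeasure extends DistanceMeasure with Serializable {
  override def distance(a: Vector, b: Vector): Double = {
    require(a.size == b.size, "Both vectors must have the same size.")
    var sum = 0.0
    var i = 0
    while (i < a.size) {
      val diff = a(i) - b(i)
      sum += diff * diff
      i += 1
    }
    math.sqrt(sum)
  }
}
{code}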



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


Re: [DISCUSS] Add a Beta badge in the documentation to components in flink-staging

2015-03-29 Thread Chiwan Park
+1

Good idea. Users can accept API changes in the "flink-staging" module when they see the 
"Beta" badge.

Regards.
Chiwan Park (Sent with iPhone)


 On Mar 29, 2015, at 11:38 PM, Robert Metzger rmetz...@apache.org wrote:
 
 Hi,
 
 In an offline discussion with other Flink committers, we came up with the
 idea to mark new components from the flink-staging module with a Beta
 badge in the documentation.
 This way, we make it very clear that the component is still under heavy
 development.
 
 If we agree on this, I'll file a JIRA and add the badge to the
 documentation.
 
 
 Best,
 Robert



Re: Subscription to mailing list

2015-03-21 Thread Chiwan Park
Hi,

You can subscribe to the mailing list by sending an email to 
dev-subscr...@flink.apache.org.
Other mailing lists about Flink are listed at 
http://flink.apache.org/community.html#mailing-lists

Regards.
Chiwan Park (Sent with iPhone)



 On Mar 22, 2015, at 3:46 AM, Devesh Gade deveshgade152...@gmail.com wrote:
 
 Hi,
 
 I would like to subscribe to the Apache Flink developer mailing list.
 
 Regards,
 Devesh Gade.
 
 -- 
 Tough times dont last,Tough People Do.



Re: Subscription to mailing list

2015-03-21 Thread Chiwan Park
Hi,

You can subscribe to the mailing list by sending an email to 
dev-subscr...@flink.apache.org.
Other mailing lists about Flink are listed at 
http://flink.apache.org/community.html#mailing-lists.

Regards.
Chiwan Park (Sent with iPhone)

P.S. Sorry for resending email because I dropped Devesh’s email address.


 On Mar 22, 2015, at 3:46 AM, Devesh Gade deveshgade152...@gmail.com wrote:
 
 Hi,
 
 I would like to subscribe to the Apache Flink developer mailing list.
 
 Regards,
 Devesh Gade.
 
 -- 
 Tough times dont last,Tough People Do.



How to test including ITCase using maven?

2015-03-18 Thread Chiwan Park
 org.apache.flink.test.distributedCache.DistributedCacheTest
Tests run: 19, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 2.554 sec - in 
org.apache.flink.api.scala.types.TypeInformationGenTest
Tests run: 6, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 2.649 sec - in 
org.apache.flink.test.compiler.examples.RelationalQueryCompilerTest
Tests run: 2, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 2.553 sec - in 
org.apache.flink.test.compiler.iterations.ConnectedComponentsTest
Running org.apache.flink.test.misc.GenericTypeInfoTest
Running 
org.apache.flink.test.recordJobs.relational.query1Util.LineItemFilterTest
Running org.apache.flink.test.recordJobTests.CollectionSourceTest
Tests run: 1, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 3.332 sec - in 
org.apache.flink.api.scala.ScalaAPICompletenessTest
Tests run: 1, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 2.311 sec - in 
org.apache.flink.test.compiler.iterations.MultipleJoinsWithSolutionSetCompilerTest
Running org.apache.flink.test.recordJobTests.CollectionValidationTest
Running org.apache.flink.test.testPrograms.util.tests.IntTupleDataInFormatTest
Tests run: 8, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 5.571 sec - in 
org.apache.flink.api.scala.runtime.TupleSerializerTest
Running org.apache.flink.test.testPrograms.util.tests.TupleTest
Tests run: 6, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 1.722 sec - in 
org.apache.flink.test.compiler.plandump.PreviewPlanDumpTest
Tests run: 1, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 0.907 sec - in 
org.apache.flink.test.misc.GenericTypeInfoTest
Tests run: 2, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 0.681 sec - in 
org.apache.flink.test.recordJobTests.CollectionValidationTest
Tests run: 1, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 2.333 sec - in 
org.apache.flink.test.compiler.iterations.PageRankCompilerTest
Tests run: 1, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 0.543 sec - in 
org.apache.flink.test.testPrograms.util.tests.IntTupleDataInFormatTest
Tests run: 18, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 0.278 sec - in 
org.apache.flink.test.testPrograms.util.tests.TupleTest
Tests run: 5, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 1.241 sec - in 
org.apache.flink.test.recordJobs.relational.query1Util.LineItemFilterTest
Tests run: 7, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 2.651 sec - in 
org.apache.flink.test.compiler.plandump.DumpCompiledPlanTest
Tests run: 1, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 2.544 sec - in 
org.apache.flink.test.recordJobTests.CollectionSourceTest
Tests run: 1, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 8.006 sec - in 
org.apache.flink.test.distributedCache.DistributedCacheTest

Results :

Tests run: 367, Failures: 0, Errors: 0, Skipped: 1

[INFO] 
[INFO] BUILD SUCCESS
[INFO] 
[INFO] Total time: 32.988 s
[INFO] Finished at: 2015-03-18T19:15:30+09:00
[INFO] Final Memory: 58M/1039M
[INFO] 


As you can see, there is no test log for any test case named ~ITCase. I think that 
something is wrong. How can I run all tests, including the ~ITCase tests, using Maven?

Regards.
Chiwan Park (Sent with iPhone)





Re: How to test including ITCase using maven?

2015-03-18 Thread Chiwan Park
Thanks @Andra, @Stephan.
I will try it.

Regards.
Chiwan Park (Sent with iPhone)


 On Mar 18, 2015, at 7:33 PM, Andra Lungu lungu.an...@gmail.com wrote:
 
 The way I do it is mvn -e test :)
 
 On Wed, Mar 18, 2015 at 11:21 AM, Chiwan Park chiwanp...@icloud.com wrote:
 
 Hello.
 I have a question about test using maven.
 
 I tested with `mvn -pl flink-tests test` command to test flink-tests
 module. I got followed execution logs. (I removed some unnecessary logs.)
 
 [INFO] Scanning for projects...
 [INFO]
 [INFO]
 
 [INFO] Building flink-tests 0.9-SNAPSHOT
 [INFO]
 
 
 ---
 T E S T S
 ---
 
 ---
 T E S T S
 ---
 Running
 org.apache.flink.api.scala.operators.translation.AggregateTranslationTest
 Running org.apache.flink.api.scala.operators.JoinOperatorTest
 Running org.apache.flink.api.scala.operators.AggregateOperatorTest
 Running org.apache.flink.api.scala.operators.CoGroupOperatorTest
 Running
 org.apache.flink.api.scala.compiler.PartitionOperatorTranslationTest
 Running org.apache.flink.api.scala.operators.GroupingTest
 Running org.apache.flink.api.scala.operators.FirstNOperatorTest
 Running
 org.apache.flink.api.scala.functions.SemanticPropertiesTranslationTest
 Running org.apache.flink.api.scala.DeltaIterationSanityCheckTest
 Running org.apache.flink.api.scala.io.CsvInputFormatTest
 Running org.apache.flink.api.scala.io.CollectionInputFormatTest
 Running org.apache.flink.api.scala.operators.DistinctOperatorTest
 Tests run: 3, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 2.021 sec
 - in org.apache.flink.api.scala.operators.AggregateOperatorTest
 Tests run: 3, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 2.006 sec
 - in org.apache.flink.api.scala.operators.FirstNOperatorTest
 Tests run: 1, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 2.069 sec
 - in
 org.apache.flink.api.scala.operators.translation.AggregateTranslationTest
 Tests run: 5, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 2.306 sec
 - in org.apache.flink.api.scala.functions.SemanticPropertiesTranslationTest
 Tests run: 12, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 2.33 sec
 - in org.apache.flink.api.scala.operators.DistinctOperatorTest
 Tests run: 19, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 2.46 sec
 - in org.apache.flink.api.scala.operators.GroupingTest
 Running
 org.apache.flink.api.scala.operators.translation.CoGroupCustomPartitioningTest
 Tests run: 14, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 2.583 sec
 - in org.apache.flink.api.scala.io.CsvInputFormatTest
 Running
 org.apache.flink.api.scala.operators.translation.CoGroupGroupSortTranslationTest
 Tests run: 10, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 2.76 sec
 - in org.apache.flink.api.scala.DeltaIterationSanityCheckTest
 Running
 org.apache.flink.api.scala.operators.translation.CustomPartitioningGroupingKeySelectorTest
 Tests run: 21, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 2.882 sec
 - in org.apache.flink.api.scala.operators.JoinOperatorTest
 Tests run: 21, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 2.907 sec
 - in org.apache.flink.api.scala.operators.CoGroupOperatorTest
 Tests run: 1, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 3.041 sec
 - in org.apache.flink.api.scala.compiler.PartitionOperatorTranslationTest
 Running
 org.apache.flink.api.scala.operators.translation.CustomPartitioningGroupingPojoTest
 Running
 org.apache.flink.api.scala.operators.translation.CustomPartitioningGroupingTupleTest
 Running
 org.apache.flink.api.scala.operators.translation.CustomPartitioningTest
 Running
 org.apache.flink.api.scala.operators.translation.DeltaIterationTranslationTest
 Tests run: 2, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 3.453 sec
 - in org.apache.flink.api.scala.io.CollectionInputFormatTest
 Running
 org.apache.flink.api.scala.operators.translation.DistinctTranslationTest
 Running
 org.apache.flink.api.scala.operators.translation.JoinCustomPartitioningTest
 Running
 org.apache.flink.api.scala.operators.translation.ReduceTranslationTest
 Running org.apache.flink.api.scala.runtime.CaseClassComparatorTest
 Running org.apache.flink.api.scala.runtime.GenericPairComparatorTest
 Tests run: 2, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 1.146 sec
 - in org.apache.flink.api.scala.runtime.GenericPairComparatorTest
 Tests run: 3, Failures: 0, Errors: 0, Skipped: 1, Time elapsed: 2.736 sec
 - in
 org.apache.flink.api.scala.operators.translation.CoGroupGroupSortTranslationTest
 Tests run: 1, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 2.126 sec
 - in
 org.apache.flink.api.scala.operators.translation.DistinctTranslationTest
 Tests run: 6, Failures: 0

Re: Website documentation minor bug

2015-03-10 Thread Chiwan Park
Looks good! +1 for the new one.

Regards.
Chiwan Park (Sent with iPhone)


 On Mar 10, 2015, at 7:28 PM, Hermann Gábor reckone...@gmail.com wrote:
 
 Looks nice, +1 for the new one.
 
 On Tue, Mar 10, 2015 at 11:24 AM Maximilian Michels m...@apache.org wrote:
 
 Seems like my smart data crawling web mail took the linked images out.
 So here we go again:
 
 New
 http://i.imgur.com/KK7fhiR.png
 
 Old
 http://i.imgur.com/kP2LPnY.png
 
 On Tue, Mar 10, 2015 at 11:17 AM, Stephan Ewen se...@apache.org wrote:
 Looks the same to me ;-)
 
 The mailing lists do not support attachments...
 
 On Tue, Mar 10, 2015 at 11:15 AM, Maximilian Michels m...@apache.org
 wrote:
 
 So here are the proposed changes.
 
 New
 
 
 Old
 
 
 
 
 If there are no objections, I will merge this by the end of the day.
 
 Best regards,
 Max
 
 On Mon, Mar 9, 2015 at 4:22 PM, Hermann Gábor reckone...@gmail.com
 wrote:
 
 Thanks Gyula, that helps a lot :D
 
 Nice solution. Thank you Max!
 I also support the reduced header size!
 
 Cheers,
 Gabor
 
 On Mon, Mar 9, 2015 at 3:36 PM Márton Balassi 
 balassi.mar...@gmail.com
 wrote:
 
 +1 for the proposed solution from Max
 +1 for decreasing the size: but let's have preview, I also think
 that
 the
 current one is a bit too large
 
 On Mon, Mar 9, 2015 at 2:16 PM, Maximilian Michels m...@apache.org
 wrote:
 
 We can fix this for the headings by adding the following CSS rule:
 
 h1, h2, h3, h4 {
padding-top: 100px;
margin-top: -100px;
 }
 
 In the course of changing this, we could also reduce the size of
 the
 navigation header in the docs. It is occupies too much space and
 doesn't have a lot of functionality. I'd suggest to half its size.
 The
 positioning at the top is fine for me.
 
 
 Kind regards,
 Max
 
 On Mon, Mar 9, 2015 at 2:08 PM, Hermann Gábor 
 reckone...@gmail.com
 wrote:
 I think the navigation looks nice this way.
 
 It's rather a small CSS/HTML problem that the header shades the
 title
 when
 clicking on an anchor link.
 (It's that the content starts at top, but there is the header
 covering
 it.)
 
 I'm not much into web stuff, but I would gladly fix it.
 
 Can someone help me with this?
 
 On Sun, Mar 8, 2015 at 9:52 PM Stephan Ewen se...@apache.org
 wrote:
 
 I agree, it is not optimal.
 
 What would be a better way to do this? Have the main navigation
 (currently
 on the left) at the top, and the per-page navigation on the
 side?
 
 Do you want to take a stab at this?
 
 On Sun, Mar 8, 2015 at 7:08 PM, Hermann Gábor 
 reckone...@gmail.com
 
 wrote:
 
 Hey,
 
 Currently following an anchor link (e.g. #transformations
 
 http://ci.apache.org/projects/flink/flink-docs-master/
 programming_guide.html#transformations
 )
 results in the header occupying the top of the page, thus the
 title
 and
 some of the first lines cannot be seen. This is not a big
 deal,
 but
 it's
 user-facing and a bit irritating.
 
 Can someone fix it, please?
 
 (I tried it on Firefox and Chromium on Ubuntu 14.10)
 
 Cheers,
 Gabor
 
 
 
 
 
 
 



[jira] [Created] (FLINK-1654) Wrong scala example of POJO type in documentation

2015-03-04 Thread Chiwan Park (JIRA)
Chiwan Park created FLINK-1654:
--

 Summary: Wrong scala example of POJO type in documentation
 Key: FLINK-1654
 URL: https://issues.apache.org/jira/browse/FLINK-1654
 Project: Flink
  Issue Type: Bug
  Components: Documentation
Affects Versions: 0.9
Reporter: Chiwan Park
Priority: Trivial


In 
[documentation|https://github.com/chiwanpark/flink/blob/master/docs/programming_guide.md#pojos],
 there is a scala example of POJO

{code}
class WordWithCount(val word: String, val count: Int) {
  def this() {
this(null, -1)
  }
}
{code}

I think that this is wrong because Flink POJOs require public fields or private 
fields with getters and setters. Fields in a Scala class are private by default. We 
should change the field declarations to use the `var` keyword, or change the class 
declaration to a case class.
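
For illustration, a sketch of the two options mentioned above (my own example; the actual 
documentation fix may differ):

{code}
// Option 1: make the fields vars so Flink sees a mutable POJO with a default constructor
class WordWithCount(var word: String, var count: Int) {
  def this() {
    this(null, -1)
  }
}

// Option 2: use a case class, which the Scala API handles natively
case class WordWithCountCC(word: String, count: Int)
{code}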



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


Re: Access flink-conf.yaml data

2015-03-02 Thread Chiwan Park
I think that you can use `org.apache.flink.configuration.GlobalConfiguration` 
to obtain the configuration object.

Regards.
Chiwan Park (Sent with iPhone)


 On Mar 3, 2015, at 12:17 PM, Dulaj Viduranga vidura...@icloud.com wrote:
 
 Hi,
 Can someone help me on how to access the flink-conf.yaml configuration values 
 inside the flink sources? Are these readily available as a map somewhere?
 
 Thanks.
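
A minimal sketch of what that could look like (my own example; the method and constant
names are from memory of the 0.9-era API and should be treated as assumptions):

  import org.apache.flink.configuration.{ConfigConstants, GlobalConfiguration}

  // GlobalConfiguration reads flink-conf.yaml from the configuration directory and
  // exposes the values as a Configuration object.
  val conf = GlobalConfiguration.getConfiguration()
  val jobManagerAddress =
    conf.getString(ConfigConstants.JOB_MANAGER_IPC_ADDRESS_KEY, "localhost")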


