[jira] [Created] (FLINK-4874) Enabling Flink web interface in local execution

2016-10-20 Thread Krishna Prasad Anna Ramesh Kumar (JIRA)
Krishna Prasad Anna Ramesh Kumar created FLINK-4874: Summary: Enabling Flink web interface in local execution Key: FLINK-4874 URL: https://issues.apache.org/jira/browse/FLINK-4874

[jira] [Created] (FLINK-4873) Add config option to specify "home directory" for YARN client resource sharing

2016-10-20 Thread Gyula Fora (JIRA)
Gyula Fora created FLINK-4873: Summary: Add config option to specify "home directory" for YARN client resource sharing Key: FLINK-4873 URL: https://issues.apache.org/jira/browse/FLINK-4873 Project:

Re: [DISCUSS] Defining the Semantics of StreamingSQL

2016-10-20 Thread Tyler Akidau
On Thu, Oct 20, 2016 at 5:55 AM Fabian Hueske wrote:
> Hi everybody,
>
> I cross posted the proposal also to the Apache Calcite dev mailing list to
> collect some feedback from the community.
> Tyler Akidau (Apache Beam committer) responded and commented on the
> proposal.
>
>

Re: Removing flink-contrib/flink-operator-stats

2016-10-20 Thread Stephan Ewen
+1 for removing it
- It seems quite unstable (is responsible for almost all build failures right now)
- It is not integrated with the metric system. Having more metrics is desirable, but is a separate effort and needs a different approach.
On Wed, Oct 19, 2016 at 4:23 PM, Greg Hogan

[jira] [Created] (FLINK-4872) Type erasure problem exclusively on cluster execution

2016-10-20 Thread Martin Junghanns (JIRA)
Martin Junghanns created FLINK-4872: Summary: Type erasure problem exclusively on cluster execution Key: FLINK-4872 URL: https://issues.apache.org/jira/browse/FLINK-4872 Project: Flink

[jira] [Created] (FLINK-4871) Add memory calculation for TaskManagers and forward MetricRegistry

2016-10-20 Thread Till Rohrmann (JIRA)
Till Rohrmann created FLINK-4871: Summary: Add memory calculation for TaskManagers and forward MetricRegistry Key: FLINK-4871 URL: https://issues.apache.org/jira/browse/FLINK-4871 Project: Flink

Re: Implicit class RichExecutionEnvironment - Can't use MlUtils.readLibSVM(path) in QUickStart guide

2016-10-20 Thread Thomas FOURNIER
Yep I've done it:
import org.apache.flink.api.scala._
I had reported this issue but still have the same problem. My code is the following (with imports)
import org.apache.flink.api.scala._
import org.apache.flink.ml._
import org.apache.flink.ml.classification.SVM
import

TopSpeedWindowing - in error: Could not forward element to next operator

2016-10-20 Thread Ovidiu Cristian Marcu
Could you check the following issue on master? When running the example org.apache.flink.streaming.examples.windowing.TopSpeedWindowing with the default configuration, I get no errors. When I change the state backend to RocksDB, I receive this error: java.lang.RuntimeException: Could not

[jira] [Created] (FLINK-4870) ContinuousFileMonitoringFunction does not properly handle absolut Windows paths

2016-10-20 Thread Chesnay Schepler (JIRA)
Chesnay Schepler created FLINK-4870: Summary: ContinuousFileMonitoringFunction does not properly handle absolut Windows paths Key: FLINK-4870 URL: https://issues.apache.org/jira/browse/FLINK-4870

[jira] [Created] (FLINK-4869) Store record pointer after record keys

2016-10-20 Thread Greg Hogan (JIRA)
Greg Hogan created FLINK-4869: Summary: Store record pointer after record keys Key: FLINK-4869 URL: https://issues.apache.org/jira/browse/FLINK-4869 Project: Flink Issue Type: Sub-task

[jira] [Created] (FLINK-4868) Insertion sort could avoid the swaps

2016-10-20 Thread Gabor Gevay (JIRA)
Gabor Gevay created FLINK-4868: Summary: Insertion sort could avoid the swaps Key: FLINK-4868 URL: https://issues.apache.org/jira/browse/FLINK-4868 Project: Flink Issue Type: Sub-task
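The issue body is not shown in this listing, so the following is only a general sketch of what the title suggests: an insertion sort that shifts larger elements one slot to the right and writes the saved value once, instead of swapping adjacent pairs. It is the textbook variant on a plain int array, not Flink's sorter code; the class name ShiftInsertionSort is hypothetical.

```java
// Hypothetical illustration; not taken from the Flink code base.
final class ShiftInsertionSort {
    // Sorts the array in place in ascending order.
    static void sort(int[] a) {
        for (int i = 1; i < a.length; i++) {
            int key = a[i];          // value to insert into the sorted prefix a[0..i-1]
            int j = i - 1;
            // Shift larger elements right: one write per step instead of a three-write swap.
            while (j >= 0 && a[j] > key) {
                a[j + 1] = a[j];
                j--;
            }
            a[j + 1] = key;          // single final write of the saved value
        }
    }
}
```

Each inner-loop step performs one array write rather than the three assignments of a swap, which is presumably the saving the title refers to.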

[jira] [Created] (FLINK-4867) Investigate code generation for improving sort performance

2016-10-20 Thread Gabor Gevay (JIRA)
Gabor Gevay created FLINK-4867: Summary: Investigate code generation for improving sort performance Key: FLINK-4867 URL: https://issues.apache.org/jira/browse/FLINK-4867 Project: Flink Issue

Re: Implicit class RichExecutionEnvironment - Can't use MlUtils.readLibSVM(path) in QUickStart guide

2016-10-20 Thread Theodore Vasiloudis
This is caused by not doing a wildcard import of the Scala API; it was reported and has already been fixed on master [1].
[1] http://apache-flink-mailing-list-archive.1008284.n3.nabble.com/jira-Created-FLINK-4792-Update-documentation-QuickStart-FlinkML-td13936.html
-- Sent from a mobile device. May

[jira] [Created] (FLINK-4866) Make Trigger.clear() Abstract to Enforce Implementation

2016-10-20 Thread Aljoscha Krettek (JIRA)
Aljoscha Krettek created FLINK-4866: Summary: Make Trigger.clear() Abstract to Enforce Implementation Key: FLINK-4866 URL: https://issues.apache.org/jira/browse/FLINK-4866 Project: Flink

Re: FlinkML - Evaluate function should manage LabeledVector

2016-10-20 Thread Thomas FOURNIER
Done here: FLINK-4865
2016-10-20 14:07 GMT+02:00 Thomas FOURNIER :
> Ok thanks.
>
> I'm going to create a specific JIRA on this. Ok ?
>
> 2016-10-20 12:54 GMT+02:00 Theodore Vasiloudis <
>

[jira] [Created] (FLINK-4865) FlinkML - Add EvaluateDataSet operation for LabeledVector

2016-10-20 Thread Thomas FOURNIER (JIRA)
Thomas FOURNIER created FLINK-4865: Summary: FlinkML - Add EvaluateDataSet operation for LabeledVector Key: FLINK-4865 URL: https://issues.apache.org/jira/browse/FLINK-4865 Project: Flink

[jira] [Created] (FLINK-4864) Shade Calcite dependency in flink-table

2016-10-20 Thread Fabian Hueske (JIRA)
Fabian Hueske created FLINK-4864: Summary: Shade Calcite dependency in flink-table Key: FLINK-4864 URL: https://issues.apache.org/jira/browse/FLINK-4864 Project: Flink Issue Type:

Implicit class RichExecutionEnvironment - Can't use MlUtils.readLibSVM(path) in QUickStart guide

2016-10-20 Thread Thomas FOURNIER
Hello,
Following the QuickStart guide in FlinkML, I have to do the following:
val astroTrain:DataSet[LabeledVector] = MLUtils.readLibSVM(env, "src/main/resources/svmguide1")
Instead of:
val astroTrain:DataSet[LabeledVector] = MLUtils.readLibSVM("src/main/resources/svmguide1")
Nonetheless, this

Re: FlinkML - Evaluate function should manage LabeledVector

2016-10-20 Thread Theodore Vasiloudis
I think this might be problematic with the current way we define the predict operations because they require that both the Testing and PredictionValue types are available. Here's what I had to do to get it to work (in ml/pipeline/Predictor.scala): import org.apache.flink.ml.math.{Vector =>

Re: Add partitionedKeyBy to DataStream

2016-10-20 Thread Till Rohrmann
Hi Xiaowei, I like the idea of reusing a partitioning and thus saving a shuffle operation. It would be great if we could fail at runtime in case the partitioning changed somehow. That way a logical user failure won't go unnoticed. Would it make sense to name the method partitionedByKey(...)

Re: Efficient Batch Operator in Streaming

2016-10-20 Thread Till Rohrmann
Hi Xiaowei, thanks for sharing this proposal. How would fault tolerance work with the BatchFunction? Since the batch function seems to manage its own buffer, users would also have to make sure that in-flight elements which are buffered but not yet processed are checkpointed, wouldn't they?

Re: Efficient Batch Operator in Streaming

2016-10-20 Thread Chesnay Schepler
Could you not do the same thing today with a FlatMap function that stores incoming elements and only computes and collects a result when a certain threshold is reached? On 20.10.2016 09:50, Xiaowei Jiang wrote: Very often, it's more efficient to process a batch of records at once instead of
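The buffering approach suggested in this reply can be sketched roughly as follows. This is a plain-Java stand-in under stated assumptions, not the Flink FlatMapFunction API: the class ThresholdBatcher, its accept method, and the Consumer used in place of a collector are all hypothetical names chosen for illustration.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Consumer;

// Hypothetical stand-in for a flat-map style operator; not the Flink API.
final class ThresholdBatcher<T> {
    private final int threshold;
    private final List<T> buffer = new ArrayList<>();

    ThresholdBatcher(int threshold) {
        if (threshold <= 0) {
            throw new IllegalArgumentException("threshold must be positive");
        }
        this.threshold = threshold;
    }

    // Stores the element; emits a copy of the batch only once the threshold is reached.
    void accept(T element, Consumer<List<T>> out) {
        buffer.add(element);
        if (buffer.size() >= threshold) {
            out.accept(new ArrayList<>(buffer));
            buffer.clear();
        }
    }

    // Elements buffered but not yet emitted. In a real job these would be lost
    // on failure unless they were included in a checkpoint.
    List<T> pending() {
        return new ArrayList<>(buffer);
    }
}
```

With a threshold of 3 and seven inputs, two full batches are emitted and one element stays in the buffer. That leftover is the in-flight data the fault-tolerance question elsewhere in this thread is concerned with.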

FLIP-6 and running many "small" jobs

2016-10-20 Thread Maciek Próchniak
Hi, we're looking at FLIP-6 and while it looks really great we started to wonder how it fits in our use case. We currently have around 20 processes but the idea is to have many more of them. Many of them are pretty "small" - they don't have large sources, are stateless, mainly filtering data.

Add partitionedKeyBy to DataStream

2016-10-20 Thread Xiaowei Jiang
After we do any interesting operations (e.g. reduce) on a KeyedStream, the result becomes a DataStream. In a lot of cases, the output still has the same or compatible keys with the KeyedStream (logically). But to do further operations on these keys, we are forced to use keyBy again. This works

Efficient Batch Operator in Streaming

2016-10-20 Thread Xiaowei Jiang
Very often, it's more efficient to process a batch of records at once instead of processing them one by one. We can use a window to achieve this functionality. However, a window will store all records in state, which can be costly. It's desirable to have an efficient implementation of a batch operator.

[jira] [Created] (FLINK-4863) states of merging window and trigger are set to different TimeWindows on merge

2016-10-20 Thread Manu Zhang (JIRA)
Manu Zhang created FLINK-4863: Summary: states of merging window and trigger are set to different TimeWindows on merge Key: FLINK-4863 URL: https://issues.apache.org/jira/browse/FLINK-4863 Project: