[jira] [Created] (FLINK-3657) Change access of DataSetUtils.countElements() to 'public'

2016-03-22 Thread Suneel Marthi (JIRA)
Suneel Marthi created FLINK-3657: Summary: Change access of DataSetUtils.countElements() to 'public' Key: FLINK-3657 URL: https://issues.apache.org/jira/browse/FLINK-3657 Project: Flink

Re: Behavior of lib directory shipping on YARN

2016-03-22 Thread Ufuk Celebi
On Tue, Mar 22, 2016 at 8:42 PM, Stefano Baghino wrote: > My feeling is that running a job on YARN should end up in having more or less the same effect, regardless of the way the job is run. +1 I think that the current behaviour is buggy. The resource

[jira] [Created] (FLINK-3656) Rework TableAPI tests

2016-03-22 Thread Vasia Kalavri (JIRA)
Vasia Kalavri created FLINK-3656: Summary: Rework TableAPI tests Key: FLINK-3656 URL: https://issues.apache.org/jira/browse/FLINK-3656 Project: Flink Issue Type: Improvement

Behavior of lib directory shipping on YARN

2016-03-22 Thread Stefano Baghino
Hello everybody, in the past few days my colleagues and I ran some tests with Flink on YARN and noticed a possibly inconsistent behavior in the way the contents of the flink/lib directory are shipped to the cluster when running on YARN, depending on whether the jobs are deployed individually

Re: RollingSink

2016-03-22 Thread Vijay Srinivasaraghavan
I have tried both the log4j logger and the System.out.println option, but neither worked. From what I have seen so far, the Filesystem streaming connector classes are not packaged in the grand jar (flink-dist_2.10-1.1-SNAPSHOT.jar) that is copied under the /build-target/lib location as

[jira] [Created] (FLINK-3655) Allow comma-separated multiple directories to be specified for FileInputFormat

2016-03-22 Thread Gna Phetsarath (JIRA)
Gna Phetsarath created FLINK-3655: Summary: Allow comma-separated multiple directories to be specified for FileInputFormat Key: FLINK-3655 URL: https://issues.apache.org/jira/browse/FLINK-3655

Re: Flink-1.0.0 JobManager is not running in Docker Container on AWS

2016-03-22 Thread Deepak Jha
Hi Maximilian, Thanks for the email and for looking into the issue. I'm using Scala 2.11, so it sounds perfect to me... I will be more than happy to test it out. On Tue, Mar 22, 2016 at 2:48 AM, Maximilian Michels wrote: > Hi Deepak, > We have looked further into this and have a

[jira] [Created] (FLINK-3654) Disable Write-Ahead-Log in RocksDB State

2016-03-22 Thread Aljoscha Krettek (JIRA)
Aljoscha Krettek created FLINK-3654: Summary: Disable Write-Ahead-Log in RocksDB State Key: FLINK-3654 URL: https://issues.apache.org/jira/browse/FLINK-3654 Project: Flink Issue Type:

[jira] [Created] (FLINK-3653) recovery.zookeeper.storageDir is not documented on the configuration page

2016-03-22 Thread Stefano Baghino (JIRA)
Stefano Baghino created FLINK-3653: Summary: recovery.zookeeper.storageDir is not documented on the configuration page Key: FLINK-3653 URL: https://issues.apache.org/jira/browse/FLINK-3653 Project:

[jira] [Created] (FLINK-3652) Enable UnusedImports check for Scala checkstyle

2016-03-22 Thread Maximilian Michels (JIRA)
Maximilian Michels created FLINK-3652: Summary: Enable UnusedImports check for Scala checkstyle Key: FLINK-3652 URL: https://issues.apache.org/jira/browse/FLINK-3652 Project: Flink

RichMapPartitionFunction - problems with collect

2016-03-22 Thread Sergio Ramírez
Hi all, I've been having some problems with RichMapPartitionFunction. First, I tried unsuccessfully to convert the iterable into an array. Then I used some buffers to store the values per column. I am trying to transpose the local matrix of LabeledVectors that I have in each
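A common way to work around the single-pass nature of the Iterable in mapPartition is to buffer the partition's rows first and then emit one record per column. The following is only a guess at what is being attempted here, assuming plain double[] rows instead of LabeledVectors and a hypothetical TransposePartition class; it is not the poster's code.

import java.util.ArrayList;
import java.util.List;

import org.apache.flink.api.common.functions.RichMapPartitionFunction;
import org.apache.flink.api.java.tuple.Tuple2;
import org.apache.flink.util.Collector;

// Buffers the partition-local rows and emits (columnIndex, columnValues) pairs,
// i.e. a transpose of the matrix held by this partition.
public class TransposePartition
        extends RichMapPartitionFunction<double[], Tuple2<Integer, double[]>> {

    @Override
    public void mapPartition(Iterable<double[]> rows, Collector<Tuple2<Integer, double[]>> out) {
        // Buffer first: the Iterable may not support being traversed more than once.
        List<double[]> buffer = new ArrayList<>();
        for (double[] row : rows) {
            buffer.add(row);
        }
        if (buffer.isEmpty()) {
            return;
        }
        int numCols = buffer.get(0).length;
        for (int col = 0; col < numCols; col++) {
            double[] column = new double[buffer.size()];
            for (int row = 0; row < buffer.size(); row++) {
                column[row] = buffer.get(row)[col];
            }
            out.collect(new Tuple2<>(col, column));
        }
    }
}

Applied with dataSet.mapPartition(new TransposePartition()), this emits one column per record; whether that matches the intended semantics depends on the rest of the pipeline.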

[jira] [Created] (FLINK-3651) Fix faulty RollingSink Restore

2016-03-22 Thread Aljoscha Krettek (JIRA)
Aljoscha Krettek created FLINK-3651: Summary: Fix faulty RollingSink Restore Key: FLINK-3651 URL: https://issues.apache.org/jira/browse/FLINK-3651 Project: Flink Issue Type: Bug

[jira] [Created] (FLINK-3650) Add maxBy/minBy to Scala DataSet API

2016-03-22 Thread Till Rohrmann (JIRA)
Till Rohrmann created FLINK-3650: Summary: Add maxBy/minBy to Scala DataSet API Key: FLINK-3650 URL: https://issues.apache.org/jira/browse/FLINK-3650 Project: Flink Issue Type: Improvement

[jira] [Created] (FLINK-3649) Document stable API methods maxBy/minBy

2016-03-22 Thread Till Rohrmann (JIRA)
Till Rohrmann created FLINK-3649: Summary: Document stable API methods maxBy/minBy Key: FLINK-3649 URL: https://issues.apache.org/jira/browse/FLINK-3649 Project: Flink Issue Type:

[jira] [Created] (FLINK-3648) Introduce Trigger Test Harness

2016-03-22 Thread Aljoscha Krettek (JIRA)
Aljoscha Krettek created FLINK-3648: Summary: Introduce Trigger Test Harness Key: FLINK-3648 URL: https://issues.apache.org/jira/browse/FLINK-3648 Project: Flink Issue Type: Sub-task

[jira] [Created] (FLINK-3647) Change StreamSource to use Processing-Time Clock Service

2016-03-22 Thread Aljoscha Krettek (JIRA)
Aljoscha Krettek created FLINK-3647: Summary: Change StreamSource to use Processing-Time Clock Service Key: FLINK-3647 URL: https://issues.apache.org/jira/browse/FLINK-3647 Project: Flink

[jira] [Created] (FLINK-3646) Use Processing-Time Clock in Window Assigners/Triggers

2016-03-22 Thread Aljoscha Krettek (JIRA)
Aljoscha Krettek created FLINK-3646: Summary: Use Processing-Time Clock in Window Assigners/Triggers Key: FLINK-3646 URL: https://issues.apache.org/jira/browse/FLINK-3646 Project: Flink

Re: RollingSink

2016-03-22 Thread Aljoscha Krettek
Hi, how are you printing the debug statements? But yes, all the logic of renaming in-progress files and cleaning up after a failed job happens in restoreState(BucketState state). The steps are roughly these: 1. move the current in-progress file to its final location; 2. truncate the file if necessary
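To make those two steps concrete, here is a rough sketch against the plain Hadoop FileSystem API; it is illustrative only and not the actual RollingSink code, and the in-progress path, final path, and recorded valid length are assumed inputs (truncate requires Hadoop >= 2.7).

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

public class RestoreSketch {

    // Step 1: move the in-progress file to its final location.
    // Step 2: cut the file back to the length recorded in the checkpoint.
    public static void restoreInProgressFile(
            Path inProgress, Path finalLocation, long validLength) throws Exception {
        FileSystem fs = FileSystem.get(new Configuration());

        if (fs.exists(inProgress)) {
            fs.rename(inProgress, finalLocation);
        }

        if (fs.exists(finalLocation) && fs.getFileStatus(finalLocation).getLen() > validLength) {
            fs.truncate(finalLocation, validLength);
        }
    }
}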

Re: Flink-1.0.0 JobManager is not running in Docker Container on AWS

2016-03-22 Thread Maximilian Michels
Hi Deepak, We have looked further into this and have a pretty easy fix. However, it will only work with Flink's Scala 2.11 version because newer versions of the Akka library are incompatible with Scala 2.10 (Flink's default Scala version). Would that be a viable option for you? We're currently

RollingSink

2016-03-22 Thread Vijay Srinivasaraghavan
Hello, I have enabled checkpointing and I am using RollingSink to sink the data to HDFS (2.7.x) from the Kafka consumer. To simulate failover/recovery, I stopped a TaskManager and the job gets rescheduled to another TaskManager instance. At that moment, the current "in-progress" gets closed and

Re: [DISCUSS] Improving Trigger/Window API and Semantics

2016-03-22 Thread Aljoscha Krettek
Hi, yes, I have some thoughts about Evictors as well, but I haven't written them down yet. The basic idea is this: class Evictor { Predicate getPredicate(Iterable elements, int size, W window); } class Predicate { boolean evict(StreamRecord element); } The evictor
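Written out as Java interfaces, the proposal above could look roughly like the sketch below; the generic parameters (T for the element type, W for the window type) and the use of Flink's StreamRecord are assumptions on my part, not a final API.

import org.apache.flink.streaming.runtime.streamrecord.StreamRecord;

// The evictor inspects the window contents once and returns a predicate that
// is then applied element by element to decide what gets evicted.
public interface Evictor<T, W> {

    Predicate<T> getPredicate(Iterable<StreamRecord<T>> elements, int size, W window);

    interface Predicate<T> {
        boolean evict(StreamRecord<T> element);
    }
}

Presumably the point of the split is that any expensive scan of the window buffer happens once in getPredicate, while the per-element evict call stays cheap.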

[jira] [Created] (FLINK-3645) HDFSCopyUtilitiesTest fails in a Hadoop cluster

2016-03-22 Thread Chiwan Park (JIRA)
Chiwan Park created FLINK-3645: Summary: HDFSCopyUtilitiesTest fails in a Hadoop cluster Key: FLINK-3645 URL: https://issues.apache.org/jira/browse/FLINK-3645 Project: Flink Issue Type: Bug

Re: [DISCUSS] Improving Trigger/Window API and Semantics

2016-03-22 Thread Fabian Hueske
Thanks for the write-up, Aljoscha. I think it is a really good idea to separate the different aspects (firing, purging, lateness) a bit. At the moment, all of these need to be handled in the Trigger, and a custom trigger is necessary whenever you want to handle some of these aspects slightly differently