Re: taskmanager.heap.mb recommendation

2016-04-28 Thread Gyula Fóra
I agree that this might actually be pretty bad recommendation especially for streaming programs depending on the state backend. Gyula On Thu, Apr 28, 2016 at 11:18 PM Greg Hogan wrote: > Hi all, > > The Flink docs recommend "if the cluster is exclusively running Flink, the > total amount of avai

taskmanager.heap.mb recommendation

2016-04-28 Thread Greg Hogan
Hi all, The Flink docs recommend "if the cluster is exclusively running Flink, the total amount of available memory per machine minus some memory for the operating system (maybe 1-2 GB) is a good value" [1]. Would it be worthwhile to recommend a higher memory reserve and note that the file write

Re: Read JSON file as input

2016-04-28 Thread Punit Naik
I am so sorry. Please ignore my previous reply. Actually my input was too big so it hung. So stupid of me. Thanks a lot! Your example worked! On Fri, Apr 29, 2016 at 12:35 AM, Punit Naik wrote: > I tried exactly what you told me. But when I execute this code, first of > all it gives me a warning

Re: flatMap issue

2016-04-28 Thread Punit Naik
Yes you have. Thanks a lot Stefano Sir! On Thu, Apr 28, 2016 at 6:40 PM, Stefano Baghino < stefano.bagh...@radicalbit.io> wrote: > The behavior you described actually makes sense: by passing the identity > function (x => x) to flatMap, you're basically just flattening your data > set, and since i

Re: Read JSON file as input

2016-04-28 Thread Punit Naik
I tried exactly what you told me. But when I execute this code, first of all it gives me a warning saying "Type Any has no fields that are visible from Scala Type analysis. Falling back to Java Type Analysis (TypeExtractor)." in eclipse, and when I run it, the code just hangs and does not print a t

[jira] [Created] (FLINK-3851) Add interface to register external catalogs in the TableEnvironment

2016-04-28 Thread Fabian Hueske (JIRA)
Fabian Hueske created FLINK-3851: Summary: Add interface to register external catalogs in the TableEnvironment Key: FLINK-3851 URL: https://issues.apache.org/jira/browse/FLINK-3851 Project: Flink

Re: [DISCUSS] Graph algorithms for vertex and edge degree

2016-04-28 Thread Greg Hogan
Hi Fabian, It should be trivial to implement inDegrees and outDegrees using the new algorithms. We are now doing this for the translate methods in FLINK-3771. The algorithms implement additional customization but the Graph methods are kept simple. ScatterGatherIteration could be simplified as well

Re: RichMapPartitionFunction - problems with collect

2016-04-28 Thread Sergio Ramírez
Hello, OK, now I understand everything. So if I want to re-use my DataSet in several different operations, what should I do? Is there any way to maintain the save the data from re-computation? I am not only talking about iteration. I mean re-use of data. For example, imagine I want filter so

[jira] [Created] (FLINK-3850) Add forward field annotations to DataSet operators generated by the Table API

2016-04-28 Thread Fabian Hueske (JIRA)
Fabian Hueske created FLINK-3850: Summary: Add forward field annotations to DataSet operators generated by the Table API Key: FLINK-3850 URL: https://issues.apache.org/jira/browse/FLINK-3850 Project:

[jira] [Created] (FLINK-3849) Add FilterableTableSource interface and translation rule

2016-04-28 Thread Fabian Hueske (JIRA)
Fabian Hueske created FLINK-3849: Summary: Add FilterableTableSource interface and translation rule Key: FLINK-3849 URL: https://issues.apache.org/jira/browse/FLINK-3849 Project: Flink Issue

Re: Eclipse Problems

2016-04-28 Thread Matthias J. Sax
Hi Greg, Not sure if you followed the whole thread. The "main" problem discussed here is, that Eclipse does show an compile error for some tests that actually compile via maven. The tests uses a ParentClass and ChildClass and call fromElements(ParentClass.class, new ParentClass(), new ChildClass

[jira] [Created] (FLINK-3848) Add ProjectableTableSource interface and translation rule

2016-04-28 Thread Fabian Hueske (JIRA)
Fabian Hueske created FLINK-3848: Summary: Add ProjectableTableSource interface and translation rule Key: FLINK-3848 URL: https://issues.apache.org/jira/browse/FLINK-3848 Project: Flink Issue

[jira] [Created] (FLINK-3847) Reorganize package structure of flink-table

2016-04-28 Thread Fabian Hueske (JIRA)
Fabian Hueske created FLINK-3847: Summary: Reorganize package structure of flink-table Key: FLINK-3847 URL: https://issues.apache.org/jira/browse/FLINK-3847 Project: Flink Issue Type: Task

[jira] [Created] (FLINK-3846) Graph.removeEdges also removes duplicate edges

2016-04-28 Thread Greg Hogan (JIRA)
Greg Hogan created FLINK-3846: - Summary: Graph.removeEdges also removes duplicate edges Key: FLINK-3846 URL: https://issues.apache.org/jira/browse/FLINK-3846 Project: Flink Issue Type: Bug

[jira] [Created] (FLINK-3845) Gelly allows duplicate vertices in Graph.addVertices

2016-04-28 Thread Greg Hogan (JIRA)
Greg Hogan created FLINK-3845: - Summary: Gelly allows duplicate vertices in Graph.addVertices Key: FLINK-3845 URL: https://issues.apache.org/jira/browse/FLINK-3845 Project: Flink Issue Type: Bug

Re: Intellij code style

2016-04-28 Thread Theodore Vasiloudis
Do we plan to include something like this in the contribution guide as well? On Thu, Apr 28, 2016 at 3:16 PM, Stefano Baghino < stefano.bagh...@radicalbit.io> wrote: > Awesome Dawid! Thanks for taking the time to do this. :) > > On Thu, Apr 28, 2016 at 1:45 PM, Dawid Wysakowicz < > wysakowicz.da.

Re: Eclipse Problems

2016-04-28 Thread Greg Hogan
Matthias, Won't this be a compile-time error as long as the user is parameterizing the return type since .fromElements(OUT...) returns DataStreamSource and will bind to the nearest common superclass? The new .fromElements(Class, OUT...) does give the user the choice of common superclass. Greg O

Re: Read JSON file as input

2016-04-28 Thread Stefano Baghino
Hi Punit, what you want to do is something like this: val env = ExecutionEnvironment.getExecutionEnvironment env. readTextFile("path/to/test.json"). flatMap(line => JSON.parseFull(line)). print The JSON.parseFull function in the Scala standard library takes a string (a

[jira] [Created] (FLINK-3844) Checkpoint failures should not always lead to job failures

2016-04-28 Thread Gyula Fora (JIRA)
Gyula Fora created FLINK-3844: - Summary: Checkpoint failures should not always lead to job failures Key: FLINK-3844 URL: https://issues.apache.org/jira/browse/FLINK-3844 Project: Flink Issue Type

Re: [DISCUSS] Release Flink 1.0.3

2016-04-28 Thread Ufuk Celebi
Thanks to everyone who participated in the discussion. I'm now going to trigger the first RC. Aljoscha also merged another commit into release-1.0 for the RocksDB state backend. – Ufuk On Wed, Apr 27, 2016 at 4:10 PM, Fabian Hueske wrote: > Just pushed the fix. > > 2016-04-27 15:16 GMT+02:00 Fa

Re: flatMap issue

2016-04-28 Thread Stefano Baghino
The behavior you described actually makes sense: by passing the identity function (x => x) to flatMap, you're basically just flattening your data set, and since in Scala strings are also a collection of characters, you are presented with a collection of characters. If you just one to do something o

Re: Intellij code style

2016-04-28 Thread Stefano Baghino
Awesome Dawid! Thanks for taking the time to do this. :) On Thu, Apr 28, 2016 at 1:45 PM, Dawid Wysakowicz < wysakowicz.da...@gmail.com> wrote: > Hi, > > I tried to create a code style that would follow Flink code-style. It may > be not "production" ready, but I think it can be a good start. > Ho

Re: Eclipse Problems

2016-04-28 Thread Matthias J. Sax
Maybe. As we cannot change the interface anyway and is does what the user expects (even if only "by accident") I would just leave it as is... And for "mixed" types, specifying the common base class is acceptable "overhead" for the user IMHO. Nevertheless, the Eclipse problem is still not solved.

Re: Intellij code style

2016-04-28 Thread Dawid Wysakowicz
Hi, I tried to create a code style that would follow Flink code-style. It may be not "production" ready, but I think it can be a good start. Hope it will be useful for someone. Also I will be glad for any comments on that. 2016-04-10 13:59 GMT+02:00 Stephan Ewen : > I don't know how close Phoeni

[jira] [Created] (FLINK-3843) StreamFaultToleranceTestBase hangs in case of test failure.

2016-04-28 Thread Kostas Kloudas (JIRA)
Kostas Kloudas created FLINK-3843: - Summary: StreamFaultToleranceTestBase hangs in case of test failure. Key: FLINK-3843 URL: https://issues.apache.org/jira/browse/FLINK-3843 Project: Flink

Re: Read JSON file as input

2016-04-28 Thread Punit Naik
I had one more request though. I have been struggling with JSONs and Flink for the past two days since I started using it. I have a JSON file which has one JSON object per line and I want to read it and store it as maps in another flink Dataset. In my JSON the values might be anything, for e.g. int

Re: Read JSON file as input

2016-04-28 Thread Punit Naik
I managed to fix this error. I basically had to do val j=data.map { x => ( x.replaceAll("\"","\\\"")) } instead of val j=data.map { x => ("\"\"\""+x+ "\"\"\"") } On Wed, Apr 27, 2016 at 4:05 PM, Punit Naik wrote: > I have my Apache Flink program: > > import org.apache.flink.api.scala._import sca

Re: flatMap issue

2016-04-28 Thread Punit Naik
No Sir, its one json per line. On Thu, Apr 28, 2016 at 3:19 PM, Fabian Hueske wrote: > readTextFile reads a file line-wise. > > Is it possible, that your first line only contains "{"? > > 2016-04-28 8:06 GMT+02:00 Punit Naik : > > > I have a test file which has a json per line. When I do a flatM

Re: Data locality and scheduler

2016-04-28 Thread Fabian Hueske
Hi, yes, that can cause network traffic. AFAIK, there are no plans to work on behavior. Best, Fabian 2016-04-26 18:17 GMT+02:00 CPC : > Hi > > But isnt this behaviour can cause a lot of network activity? Is there any > roadmap or plan to change this behaviour? > On Apr 26, 2016 7:06 PM, "Fabian

Re: flatMap issue

2016-04-28 Thread Fabian Hueske
readTextFile reads a file line-wise. Is it possible, that your first line only contains "{"? 2016-04-28 8:06 GMT+02:00 Punit Naik : > I have a test file which has a json per line. When I do a flatMap on it, it > automatically splits the whole json line on every character. Why does this > happen?

[jira] [Created] (FLINK-3842) Fix handling null record/row in generated code

2016-04-28 Thread Dawid Wysakowicz (JIRA)
Dawid Wysakowicz created FLINK-3842: --- Summary: Fix handling null record/row in generated code Key: FLINK-3842 URL: https://issues.apache.org/jira/browse/FLINK-3842 Project: Flink Issue Type

Re: Eclipse Problems

2016-04-28 Thread Till Rohrmann
I don't know whether it is that difficult to find out the nearest common super type. At least the Java compiler does the same when calling the pure var arg method. I think, we should be able to do something similar. Also throwing out the pure var arg implementation is not ideal in my opinion, becau