Re: flink loop

2015-02-05 Thread Vasiliki Kalavri
Hi, I'm not familiar with the particular algorithm, but you can most probably use one of the two iterate operators in Flink. You can read a description and see some examples in the documentation: http://flink.apache.org/docs/0.8/programming_guide.html#iteration-operators Let us know if you have

flink loop

2015-02-05 Thread tanguy racinet
Hi, We are trying to develop the Apriori algorith with the Flink for our Data minning project. In our understanding, Flink could handle loop within the workflow. However, our knowledge is limited and we cannot find a nice way to do it. Here is the flow of my algorithm : GenerateCandidates > C

Re: Job fails with FileNotFoundException from blobStore

2015-02-05 Thread Stephan Ewen
Sounds good. In the course of this, we should probably extend the IOManager that it keeps track of temp files and deletes them when a task is done. On Thu, Feb 5, 2015 at 4:40 PM, Ufuk Celebi wrote: > After talking to Robert and Till offline, what about the following: > > - We add a shutdown hoo

Re: Job fails with FileNotFoundException from blobStore

2015-02-05 Thread Ufuk Celebi
After talking to Robert and Till offline, what about the following: - We add a shutdown hook to the blob library cache manager to shutdown the blob service (just a delete call) - As Robert pointed out, we cannot do this with the IOManager paths right now, because they are essentially shared among

Re: Job fails with FileNotFoundException from blobStore

2015-02-05 Thread Stephan Ewen
I think that process killing (HALT signal) is a very typical way in Linux to shut down processes. It is the most robust way, since it does not require to send any custom messages to the process. This is sort of graceful, as the JVM gets the signal and may do a lot of things before shutting down, s

Re: Job fails with FileNotFoundException from blobStore

2015-02-05 Thread Till Rohrmann
Hmm this is not very gentleman-like to terminate the Job/TaskManagers. I'll check how the ActorSystem behaves in case of killing the process. Why can't we implement a more graceful termination mechanism? For example, we could send a termination message to the JobManager and TaskManagers. On Thu,

Re: Job fails with FileNotFoundException from blobStore

2015-02-05 Thread Ufuk Celebi
Thank you very much, Robert! The problem is that the job/task manager shutdown methods are never called. When using the scripts, the task/job manager processes get killed and therefore shutdown methods are never called. @Till: Do you know whether there is a mechanism in Akka to register the actor

Re: Get 1 element of DataSet

2015-02-05 Thread Fabian Hueske
Have a look at "broadcast values/sets". Not sure if that solves you problem, but you could do: DataSet stringSet = ... DataSet first = stringSet.first(); DataSet myNewSet = stringSet.map(myMapFnc).withBroadcastSet(first,"first value"); This distributes the first DataSet to all Mapper functions

Re: Get 1 element of DataSet

2015-02-05 Thread Vinh June
Hi Stefan, DataSet.first(n) produces a child DataSet, while I need the element Specifically, I have a CSV with header line and I want to make the maps of each (header,value) pair for each line -- View this message in context: http://apache-flink-incubator-user-mailing-list-archive.2336050.n4.

Re: Get 1 element of DataSet

2015-02-05 Thread Stefan Bunk
Hi Vinh, have a look at the first function: http://flink.apache.org/docs/0.8/dataset_transformations.html#first-n Stefan On 5 February 2015 at 15:14, Vinh June wrote: > Hi, > > Is there any way to get 1 element of a DataSet, for example: > > val stringS

Re: Job fails with FileNotFoundException from blobStore

2015-02-05 Thread Till Rohrmann
Hi Robert, thanks for the info. If the TaskManager/JobManager does not shutdown properly, i.e. killing of the process, then it is indeed the case that the BlobManager cannot properly remove all stored files. I don't know if this was lately the case for you. Furthermore, the files are not directly

Re: Job fails with FileNotFoundException from blobStore

2015-02-05 Thread Robert Waury
I talked with the admins. The problem seemed to have been that the disk was full and Flink couldn't create the directory. Maybe the the error message should reflect if that is the cause. While cleaning up the disk we noticed that a lot of temporary blobStore files were not deleted by Flink after

Get 1 element of DataSet

2015-02-05 Thread Vinh June
Hi, Is there any way to get 1 element of a DataSet, for example: val stringSet: DataSet[String] = ... val str: String = stringSet.getFunction() str -- View this message in context: http://apache-flink-incubator-u

Re: Job fails with FileNotFoundException from blobStore

2015-02-05 Thread Ufuk Celebi
On Thu, Feb 5, 2015 at 11:23 AM, Robert Waury wrote: > Hi, > > I can reproduce the error on my cluster. > > Unfortunately I can't check whether the parent directories were created on > the different nodes since I have no way of accessing them. I start all the > jobs from a gateway. > I've added

Re: Job fails with FileNotFoundException from blobStore

2015-02-05 Thread Robert Waury
Hi, I can reproduce the error on my cluster. Unfortunately I can't check whether the parent directories were created on the different nodes since I have no way of accessing them. I start all the jobs from a gateway. Cheers, Robert On Thu, Feb 5, 2015 at 11:01 AM, Ufuk Celebi wrote: > Hey R

Re: Job fails with FileNotFoundException from blobStore

2015-02-05 Thread Ufuk Celebi
Hey Robert, is this error reproducible? I've looked into the blob store and the error occurs when the blob cache tries to *create* a local file before requesting it from the job manager. I will add a check to the blob store to ensure that the parent directories have been created. Other than th

Re: Expressing `grep` with many search terms in Flink

2015-02-05 Thread Stephan Ewen
Concerning your question how to run the programs one after another: In the core method of the program, you can simply have a loop around the part between "getExecutionEnvironment()" and "env.execute()". That way, you trigger the programs one after another. On Wed, Feb 4, 2015 at 9:34 PM, Fabian

Re: Job fails with FileNotFoundException from blobStore

2015-02-05 Thread Robert Waury
I compiled from the release-0.8 branch. On Thu, Feb 5, 2015 at 8:55 AM, Stephan Ewen wrote: > Hey Robert! > > On which version are you? 0.8 or 0.9- SNAPSHOT? > Am 04.02.2015 14:49 schrieb "Robert Waury" : > > Hi, >> >> I'm suddenly getting FileNotFoundExceptions because the blobStore cannot >> f