Monitoring REST API

2016-12-21 Thread Lydia Ickler
Hi all, I have a question regarding the Monitoring REST API: I want to analyze the behavior of my program with regard to I/O MiB/s, network MiB/s, and CPU % as the authors of this paper did (https://hal.inria.fr/hal-01347638v2/document). From the

Monitoring Flink on Yarn

2016-12-19 Thread Lydia Ickler
Hi all, I am using Flink 1.1.3 on Yarn and I wanted to ask how I can save the monitoring logs, e.g. for I/O or network, to HDFS or local FS? Since Yarn closes the Flink session after finishing the job I can't access the log via REST API. I am looking forward to your answer! Best regards,

multiple k-means in parallel

2016-11-27 Thread Lydia Ickler
Hi, I want to run k-means with different k in parallel, so each worker should calculate its own k-means. Is that possible? If I do a map on a list of integers to then apply k-means, I get the following error: Task not serializable. I am looking forward to your answers! Lydia

Write matrix/vector

2016-05-29 Thread Lydia Ickler
Hi, I would like to know how to write a Matrix or Vector (Dense/Sparse) to file? Thanks in advance! Best regards, Lydia

sparse matrix

2016-05-29 Thread Lydia Ickler
Hi all, I have two questions regarding sparse matrices: 1. I have a sparse Matrix: val sparseMatrix = SparseMatrix.fromCOO(row, col, csvInput.collect()) and now I would like to extract all values that are in a specific row X. How would I tackle that? flatMap() and filter() do not seem to be

Re: Scatter-Gather Iteration aggregators

2016-05-13 Thread Lydia Ickler
value will be available in the next > superstep. You can retrieve it calling the getPreviousIterationAggregate() > method. > Let me know if that clears things up! > > -Vasia. > > On 13 May 2016 at 08:57, Lydia Ickler <ickle...@googlemail.com > <mailto:ickle...@go

Re: Scatter-Gather Iteration aggregators

2016-05-13 Thread Lydia Ickler
016 at 08:04, Vasiliki Kalavri <vasilikikala...@gmail.com> wrote: > > Hi Lydia, > > registered aggregators through the ScatterGatherConfiguration are accessible > both in the VertexUpdateFunction and in the MessageFunction. > > Cheers, > -Vasia. > > On 12

normalize vertex values

2016-05-12 Thread Lydia Ickler
Hi all, if I have a Graph g and I would like to normalize all vertex values by the absolute maximum of all vertex values, which API function would I choose? Thanks in advance! Lydia
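The normalization asked about here amounts to two passes: first find the largest absolute vertex value, then divide every value by it. The following is only a minimal plain-Java sketch of that logic, not Gelly code; the class name NormalizeByAbsMax and the Map-based vertex representation are assumptions made for illustration. In Gelly the second pass could plausibly be expressed with Graph.mapVertices once the maximum has been computed from the vertex DataSet.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper, not part of Flink or Gelly: vertices are modeled
// as a plain map from vertex id to vertex value.
class NormalizeByAbsMax {

    static Map<Long, Double> normalize(Map<Long, Double> vertexValues) {
        // Pass 1: find the largest absolute value over all vertices.
        double maxAbs = 0.0;
        for (double v : vertexValues.values()) {
            maxAbs = Math.max(maxAbs, Math.abs(v));
        }

        // Pass 2: divide every value by that maximum.
        // If all values are zero, keep them at zero to avoid dividing by zero.
        Map<Long, Double> normalized = new LinkedHashMap<>();
        for (Map.Entry<Long, Double> e : vertexValues.entrySet()) {
            double scaled = (maxAbs == 0.0) ? 0.0 : e.getValue() / maxAbs;
            normalized.put(e.getKey(), scaled);
        }
        return normalized;
    }
}
```

After this step every vertex value lies in the range [-1, 1], and the vertex with the largest magnitude ends up at exactly -1 or 1.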

Find differences

2016-04-07 Thread Lydia Ickler
Hi, if I have 2 DataSets A and B of type Tuple3, how would I get the subset of A (based on the fields (0,1)) that does not occur in B? Is there maybe an already implemented method? Best regards, Lydia Sent from my iPhone
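What is being asked for is an anti-join: keep the elements of A whose key (fields 0 and 1) has no match in B. The following is a minimal plain-Java sketch of that logic under stated assumptions; the Triple class is a hypothetical stand-in for a Tuple3<Integer, Integer, Double>, and the field types are guesses since the question does not give them. In the DataSet API the same idea could likely be expressed as a coGroup on fields (0, 1) of both inputs that emits the elements from A only when the group from B is empty.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

class AntiJoinSketch {

    /** Hypothetical stand-in for a Tuple3<Integer, Integer, Double>. */
    static final class Triple {
        final int f0;
        final int f1;
        final double f2;

        Triple(int f0, int f1, double f2) {
            this.f0 = f0;
            this.f1 = f1;
            this.f2 = f2;
        }
    }

    /** Returns the elements of a whose (f0, f1) key does not occur in b. */
    static List<Triple> notIn(List<Triple> a, List<Triple> b) {
        // Collect all (f0, f1) keys present in b.
        Set<List<Integer>> keysInB = new HashSet<>();
        for (Triple t : b) {
            keysInB.add(Arrays.asList(t.f0, t.f1));
        }

        // Keep only the elements of a whose key was not seen in b.
        List<Triple> result = new ArrayList<>();
        for (Triple t : a) {
            if (!keysInB.contains(Arrays.asList(t.f0, t.f1))) {
                result.add(t);
            }
        }
        return result;
    }
}
```

The third field plays no part in the comparison, so two tuples with the same first two fields but different values count as the same key.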

varying results: local VS cluster

2016-04-04 Thread Lydia Ickler
Hi all, I have an issue regarding execution on 1 machine VS 5 machines. If I execute the following code the results are not the same though I would expect them to be since the input file is the same. Do you have any suggestions? Thanks in advance! Lydia ExecutionEnvironment env =

Re: wait until BulkIteration finishes

2016-04-01 Thread Lydia Ickler
i Lydia, >> >> all downstream operators which depend on the bulk iteration will wait >> implicitly until data from the iteration operator is available. >> >> Cheers, >> Till >> >> On Thu, Mar 31, 2016 at 9:39 AM, Lydia Ickler <ickle...@googlemail.com >> <mailto:ickle...@googlemail.com>> wrote: >> Hi all, >> >> is there a way to tell the program that it should wait until the >> BulkIteration finishes before the rest of the program is executed? >> >> Best regards, >> Lydia >> > > >

Re: wait until BulkIteration finishes

2016-03-31 Thread Lydia Ickler
> Hi Lydia, > > all downstream operators which depend on the bulk iteration will wait > implicitly until data from the iteration operator is available. > > Cheers, > Till > > On Thu, Mar 31, 2016 at 9:39 AM, Lydia Ickler <ickle...@googlemail.com > <

BulkIteration and BroadcastVariables

2016-03-30 Thread Lydia Ickler
Hi all, I have a question regarding the BulkIteration and BroadcastVariables: The BulkIteration by default has one input variable and sends one variable into the next iteration, right? What if I need to collect some intermediate results in each iteration? How would I do that? For example in my

Re: normalizing DataSet with cross()

2016-03-22 Thread Lydia Ickler
rmalization result should change as > well, I would assume. > > > On Tue, Mar 22, 2016 at 3:15 PM, Lydia Ickler <ickle...@googlemail.com > <mailto:ickle...@googlemail.com>> wrote: > Hi Till, > > maybe it is doing so because I rewrite the ds in the next step

normalizing DataSet with cross()

2016-03-22 Thread Lydia Ickler
Hi all, I have a question. If I have a DataSet DataSet> ds and I want to normalize all values (at position 2) in it by the maximum of the DataSet (ds.aggregate(Aggregations.MAX, 2)). How do I tackle that? If I use the cross operator my result changes every

filter dataset

2016-02-29 Thread Lydia Ickler
Hi all, I have a DataSet and I want to apply a filter to only get back all entries with e.g. first Integer in tuple == 0. With a normal filter I do not have the possibility to pass an additional argument but I have to set that parameter inside the filter
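One common way to parameterize a filter is to put the filter logic in its own class and pass the parameter through the constructor. The following is a minimal plain-Java sketch of that pattern using java.util.function.Predicate; the class name FirstFieldEquals and the int[] tuple representation are assumptions for illustration. The same pattern should carry over to a class implementing Flink's FilterFunction, since user functions are serialized together with their fields when the job is shipped to the workers.

```java
import java.io.Serializable;
import java.util.ArrayList;
import java.util.List;
import java.util.function.Predicate;

// Hypothetical filter class: the value to compare against is a constructor
// argument instead of a constant hard-coded inside the filter body.
class FirstFieldEquals implements Predicate<int[]>, Serializable {

    private final int wanted;

    FirstFieldEquals(int wanted) {
        this.wanted = wanted;
    }

    @Override
    public boolean test(int[] tuple) {
        // Keep the tuple only if its first field equals the configured value.
        return tuple[0] == wanted;
    }

    /** Applies the parameterized filter to a list of tuples. */
    static List<int[]> apply(List<int[]> data, int wanted) {
        Predicate<int[]> filter = new FirstFieldEquals(wanted);
        List<int[]> kept = new ArrayList<>();
        for (int[] tuple : data) {
            if (filter.test(tuple)) {
                kept.add(tuple);
            }
        }
        return kept;
    }
}
```

Calling FirstFieldEquals.apply(data, 0) keeps only the tuples whose first field is 0; changing the second argument changes the filter without touching the filter class.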

DistributedMatrix in Flink

2016-02-04 Thread Lydia Ickler
Hi all, as mentioned before I am trying to import the RowMatrix from Spark to Flink… In the code I already ran into a dead end… In the function multiplyGramianMatrixBy() (see end of mail) there is the line: rows.context.broadcast(v) (rows is a DataSet[Vector]). What exactly is this line doing?

Re: cluster execution

2016-02-01 Thread Lydia Ickler
xD… a simple "hdfs dfs -chmod -R 777 /users" fixed it! > On 01.02.2016 at 12:17, Till Rohrmann <trohrm...@apache.org> wrote: > > Hi Lydia, > > It looks like that. I guess you should check your hdfs access rights. > > Cheers, > Till > > On Mon

cluster execution

2016-01-28 Thread Lydia Ickler
Hi all, I am doing some operations on a DataSet> … (see code below) When I run my program on a cluster with 3 machines I can see within the web client that only my master is executing the program. Do I have to specify somewhere that all machines have to

Re: MatrixMultiplication

2016-01-25 Thread Lydia Ickler
e 1000 times longer than the multiplication of the 100 x 100 matrix. Have > you waited so long to see whether it completes or is there another problem? > > Cheers, > Till > > On Mon, Jan 25, 2016 at 2:13 PM, Lydia Ickler <ickle...@googlemail.com > <mailto:ickle...@goo

rowmatrix equivalent

2016-01-24 Thread Lydia Ickler
Hi all, this is maybe a stupid question, but what within Flink is the equivalent to Spark's RowMatrix? Thanks in advance, Lydia

eigenvalue solver

2016-01-12 Thread Lydia Ickler
Hi, I wanted to know if there are any implementations yet within the Machine Learning Library or generally that can efficiently solve eigenvalue problems in Flink? Or if not do you have suggestions on how to approach a parallel execution maybe with BLAS or Breeze? Thanks in advance! Lydia

Re: eigenvalue solver

2016-01-12 Thread Lydia Ickler
no Eigenvalue solver included in FlinkML yet. But if you want to, > then you can give it a try :-) > > [1] http://www.cs.newpaltz.edu/~lik/publications/Ruixuan-Li-CCPE-2015.pdf > > Cheers, > Till > >> On Tue, Jan 12, 2016 at 9:47 AM, Lydia Ickler <ickle...@google

writeAsCsv

2015-10-07 Thread Lydia Ickler
Hi, stupid question: Why is this not saved to file? I want to transform an array to a DataSet but the Graph stops at collect(). //Transform Spectrum to DataSet List> dataList = new LinkedList>(); double[][] arr =

Re: writeAsCsv

2015-10-07 Thread Lydia Ickler
ok, thanks! :) I will try that! > On 07.10.2015 at 21:35, Lydia Ickler <ickle...@googlemail.com> wrote: > > Hi, > > stupid question: Why is this not saved to file? > I want to transform an array to a DataSet but the Graph stops at collect(). > > //Transf

source binary file

2015-10-06 Thread Lydia Ickler
Hi, how would I read a BinaryFile from HDFS with the Flink Java API? I can only find the Scala way… All the best, Lydia

Re: data flow example on cluster

2015-10-02 Thread Lydia Ickler
> On 02.10.2015 at 11:55, Lydia Ickler <ickle...@googlemail.com> wrote: > > 0.10-SNAPSHOT

Re: data flow example on cluster

2015-10-02 Thread Lydia Ickler
> Are you relying on a feature only available in 0.10-SNAPSHOT? > Otherwise, I would recommend to use the latest stable release (0.9.1) for > your flink job and on the cluster. > > On Fri, Oct 2, 2015 at 11:55 AM, Lydia Ickler <ickle...@googlemail.com > <mailto:ickle...@googlemail.com

Re: data flow example on cluster

2015-10-02 Thread Lydia Ickler
d recommend to use the latest stable release (0.9.1) for > your flink job and on the cluster. > > On Fri, Oct 2, 2015 at 11:55 AM, Lydia Ickler <ickle...@googlemail.com > <mailto:ickle...@googlemail.com>> wrote: > Hi, > > but inside the pom of flink-job is

DataSet transformation

2015-10-01 Thread Lydia Ickler
Hi all, so I have a case class Spectrum(mz: Float, intensity: Float) and a DataSet[Spectrum] to read my data in. Now I want to know if there is a smart way to transform my DataSet into a two dimensional Array ? Thanks in advance, Lydia

data flow example on cluster

2015-09-30 Thread Lydia Ickler
Hi all, I want to run the data-flow Wordcount example on a Flink cluster. The local execution with "mvn exec:exec -Dinput=kinglear.txt -Doutput=wordcounts.txt" is already working. What is the command to execute it on the cluster? Best regards, Lydia

Re: HBase issue

2015-09-24 Thread Lydia Ickler
roblem: >> https://issues.apache.org/jira/browse/HBASE-10304 >> >> In your log I see the same exception. Anyone has any idea what we could do >> about this? >> >> >>> On Tue, 22 Sep 2015 at 22:40 Lydia Ickler <ickle...@googlemail.com> wrote:

Re: HBase issue

2015-09-24 Thread Lydia Ickler
, 2015 at 11:16 AM, Aljoscha Krettek <aljos...@apache.org > <mailto:aljos...@apache.org>> wrote: > It might be that this is causing the problem: > https://issues.apache.org/jira/browse/HBASE-10304 > <https://issues.apache.org/jira/browse/HBASE-10304> >

Re: no valid hadoop home directory can be found

2015-09-23 Thread Lydia Ickler
he JVM property > ‘hadoop.home.dir’ (-Dhadoop.home.dir=…) > > – Ufuk > >> On 23 Sep 2015, at 12:43, Lydia Ickler <ickle...@googlemail.com> wrote: >> >> Hi all, >> >> I get the following error message that no valid hadoop home directory can be >> found when t

no valid hadoop home directory can be found

2015-09-23 Thread Lydia Ickler
Hi all, I get the following error message that no valid hadoop home directory can be found when trying to initialize the HBase configuration. Where would I specify that path? 12:41:02,043 INFO org.apache.flink.addons.hbase.TableInputFormat - Initializing HBaseConfiguration

Job stuck at Assigning split to host...

2015-07-23 Thread Lydia Ickler
Hi, I am trying to read data from an HBase table via the HBaseReadExample.java. Unfortunately, my run always gets stuck at the same position. Do you guys have any suggestions? In the master node it says: 14:05:04,239 INFO org.apache.flink.runtime.jobmanager.JobManager - Received job

DataSet Conversion

2015-07-13 Thread Lydia Ickler
Hi guys, is it possible to convert a Java DataSet to a Scala DataSet? Right now I get the following error: Error:(102, 29) java: incompatible types: 'org.apache.flink.api.java.DataSet<java.lang.String> cannot be converted to org.apache.flink.api.scala.DataSet<java.lang.String>' Thanks in advance,

HBase Machine Learning

2015-07-11 Thread Lydia Ickler
Dear Sir or Madam, I would like to use the Flink-HBase addon to read out data that then serves as an input for the machine learning algorithms, specifically SVM and MLR. Right now I first write the extracted data to a temporary file and then read it in via the libSVM method...but I guess