Re: Monitoring single-run job statistics

2016-01-06 Thread Filip Łęczycki
Hi Stephan,

Thank you for your answer. I would love to contribute, but I currently have
no capacity as I am buried in my thesis.

I will reach out after graduating :)

Best regards,
Filip

Pozdrawiam,
Filip Łęczycki

2016-01-05 10:35 GMT+01:00 Stephan Ewen <se...@apache.org>:

> Hi Filip!
>
> There are thoughts and efforts to extend Flink to push the result
> statistics of Flink jobs to the YARN timeline server. That way, you can
> explore jobs that are completed.
>
> Since the whole web dashboard in Flink has a pure REST design, this is a
> quite straightforward fix.
>
> From the capacities I see in the community, I can not promise that to be
> fixed immediately. Let me know, though, if you are interested in
> contributing an addition there, and I can walk you through the steps that
> would be needed.
>
> Greetings,
> Stephan
>
>
> On Mon, Jan 4, 2016 at 9:17 PM, Filip Łęczycki <filipleczy...@gmail.com>
> wrote:
>
>> Hi Till,
>>
>> Thank you for your answer; however, I am sorry to hear that. I was reluctant
>> to run jobs on a long-running Flink cluster because multiple jobs would
>> cloud the YARN statistics for CPU and memory time, as well as Flink's
>> garbage collector statistics in the log, since they would be stored for the
>> whole Flink cluster instead of for a single job.
>>
>> Do you know whether there is a way to extract these stats (CPU time,
>> memory time, GC time) for a single job run on a long-running Flink cluster?
>>
>> I would be very grateful for an answer :)
>>
>> Best regards,
>> Filip
>>
>> Pozdrawiam,
>> Filip Łęczycki
>>
>> 2016-01-04 10:05 GMT+01:00 Till Rohrmann <till.rohrm...@gmail.com>:
>>
>>> Hi Filip,
>>>
>>> At the moment it is not possible to retrieve the job statistics after
>>> the job has finished with flink run -m yarn-cluster. The reason is that
>>> the YARN cluster is only alive as long as the job is executing. Thus, I
>>> would recommend running your jobs on a long-running Flink cluster
>>> on YARN.
>>>
>>> Cheers,
>>> Till
>>>
>>>
>>> On Fri, Jan 1, 2016 at 11:29 PM, Filip Łęczycki <filipleczy...@gmail.com
>>> > wrote:
>>>
>>>> Hi all,
>>>>
>>>> I am running Flink apps on a YARN cluster and I am trying to get some
>>>> benchmarks. When I start a long-running Flink cluster on my YARN cluster, I
>>>> have access to the web UI and REST API, which provide statistics for the
>>>> deployed jobs (as described here:
>>>> https://ci.apache.org/projects/flink/flink-docs-master/internals/monitoring_rest_api.html).
>>>> I was wondering whether it is possible to get such information about a
>>>> single-run job triggered with 'flink run -m yarn-cluster ...'? After the job
>>>> has finished there is no Flink client running, so I cannot use the REST API
>>>> to get stats.
>>>>
>>>> Thanks for any help:)
>>>>
>>>>
>>>> Best regards/Pozdrawiam,
>>>> Filip Łęczycki
>>>>
>>>
>>>
>>
>


Re: Monitoring single-run job statistics

2016-01-04 Thread Filip Łęczycki
Hi Till,

Thank you for your answer; however, I am sorry to hear that. I was reluctant
to run jobs on a long-running Flink cluster because multiple jobs would
cloud the YARN statistics for CPU and memory time, as well as Flink's
garbage collector statistics in the log, since they would be stored for the
whole Flink cluster instead of for a single job.

Do you know whether there is a way to extract these stats (CPU time,
memory time, GC time) for a single job run on a long-running Flink cluster?

I would be very grateful for an answer :)

Best regards,
Filip

Pozdrawiam,
Filip Łęczycki

2016-01-04 10:05 GMT+01:00 Till Rohrmann <till.rohrm...@gmail.com>:

> Hi Filip,
>
> At the moment it is not possible to retrieve the job statistics after the
> job has finished with flink run -m yarn-cluster. The reason is that the
> YARN cluster is only alive as long as the job is executing. Thus, I would
> recommend running your jobs on a long-running Flink cluster on
> YARN.
>
> Cheers,
> Till
>
>
> On Fri, Jan 1, 2016 at 11:29 PM, Filip Łęczycki <filipleczy...@gmail.com>
> wrote:
>
>> Hi all,
>>
>> I am running Flink apps on a YARN cluster and I am trying to get some
>> benchmarks. When I start a long-running Flink cluster on my YARN cluster, I
>> have access to the web UI and REST API, which provide statistics for the
>> deployed jobs (as described here:
>> https://ci.apache.org/projects/flink/flink-docs-master/internals/monitoring_rest_api.html).
>> I was wondering whether it is possible to get such information about a
>> single-run job triggered with 'flink run -m yarn-cluster ...'? After the job
>> has finished there is no Flink client running, so I cannot use the REST API
>> to get stats.
>>
>> Thanks for any help:)
>>
>>
>> Best regards/Pozdrawiam,
>> Filip Łęczycki
>>
>
>
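
Editor's sketch, not part of the original message: the thread does not give a
Flink-provided way to get per-job CPU or GC time. One generic JVM-level
approach is to read the cumulative counters from java.lang.management before
and after a run and subtract them. The class name JvmStatsSnapshot and its
fields are made up for illustration. The numbers only describe the JVM this
code runs in, so they would isolate a single job only if no other job shares
that JVM, and they say nothing about other JVMs in the cluster.

```java
import java.lang.management.GarbageCollectorMXBean;
import java.lang.management.ManagementFactory;
import java.lang.management.OperatingSystemMXBean;

/** Cumulative JVM counters; subtracting two snapshots gives a per-run delta. */
final class JvmStatsSnapshot {
    final long wallNanos; // monotonic clock reading, or elapsed time in a delta
    final long cpuNanos;  // process CPU time, or -1 if the JVM does not expose it
    final long gcMillis;  // collection time summed over all garbage collectors
    final long gcCount;   // number of collections summed over all collectors

    private JvmStatsSnapshot(long wallNanos, long cpuNanos, long gcMillis, long gcCount) {
        this.wallNanos = wallNanos;
        this.cpuNanos = cpuNanos;
        this.gcMillis = gcMillis;
        this.gcCount = gcCount;
    }

    /** Reads the current counters of the JVM this code runs in. */
    static JvmStatsSnapshot take() {
        long millis = 0;
        long count = 0;
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            // Both getters return -1 when a collector does not report the value.
            millis += Math.max(0, gc.getCollectionTime());
            count += Math.max(0, gc.getCollectionCount());
        }
        long cpu = -1;
        OperatingSystemMXBean os = ManagementFactory.getOperatingSystemMXBean();
        // getProcessCpuTime() is on a com.sun.management extension interface,
        // which HotSpot/OpenJDK provide but other JVMs may not.
        if (os instanceof com.sun.management.OperatingSystemMXBean) {
            cpu = ((com.sun.management.OperatingSystemMXBean) os).getProcessCpuTime();
        }
        return new JvmStatsSnapshot(System.nanoTime(), cpu, millis, count);
    }

    /** Returns the difference between this snapshot and an earlier one. */
    JvmStatsSnapshot since(JvmStatsSnapshot earlier) {
        long cpu = (cpuNanos < 0 || earlier.cpuNanos < 0) ? -1 : cpuNanos - earlier.cpuNanos;
        return new JvmStatsSnapshot(
                wallNanos - earlier.wallNanos,
                cpu,
                gcMillis - earlier.gcMillis,
                gcCount - earlier.gcCount);
    }

    public static void main(String[] args) {
        JvmStatsSnapshot before = take();
        // Placeholder workload that allocates; replace with the code to measure.
        long checksum = 0;
        for (int i = 0; i < 200_000; i++) {
            checksum += new int[64].length;
        }
        JvmStatsSnapshot delta = take().since(before);
        System.out.println("checksum=" + checksum
                + " wallMs=" + delta.wallNanos / 1_000_000
                + " cpuMs=" + (delta.cpuNanos < 0 ? -1 : delta.cpuNanos / 1_000_000)
                + " gcMs=" + delta.gcMillis
                + " gcCount=" + delta.gcCount);
    }
}
```

Taking one snapshot right before the job is submitted and one right after it
returns gives a rough delta; for cross-system comparisons, an external
monitor is probably still the more reliable source.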


Monitoring single-run job statistics

2016-01-01 Thread Filip Łęczycki
Hi all,

I am running Flink apps on a YARN cluster and I am trying to get some
benchmarks. When I start a long-running Flink cluster on my YARN cluster, I
have access to the web UI and REST API, which provide statistics for the
deployed jobs (as described here:
https://ci.apache.org/projects/flink/flink-docs-master/internals/monitoring_rest_api.html).
I was wondering whether it is possible to get such information about a
single-run job triggered with 'flink run -m yarn-cluster ...'? After the job
has finished there is no Flink client running, so I cannot use the REST API
to get stats.

Thanks for any help:)


Best regards/Pozdrawiam,
Filip Łęczycki
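
Editor's sketch, not part of the original message: while a long-running
cluster is up, the REST API mentioned above can be queried with a plain HTTP
GET. The example assumes the JobManager web interface listens on port 8081 and
that a job summary is served under /jobs/<jobid>; both should be checked
against the linked documentation for the Flink version in use. The class name
JobManagerRestClient is made up for illustration.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.net.HttpURLConnection;
import java.net.URI;
import java.nio.charset.StandardCharsets;

/** Tiny helper for reading job information from the JobManager over HTTP. */
final class JobManagerRestClient {
    private final String baseUrl;

    JobManagerRestClient(String host, int port) {
        this.baseUrl = "http://" + host + ":" + port;
    }

    /** Builds the URL for one job; the "/jobs/<jobid>" path is an assumption. */
    String jobUrl(String jobId) {
        return baseUrl + "/jobs/" + jobId;
    }

    /** Issues a GET request and returns the response body as a string. */
    static String get(String url) throws IOException {
        HttpURLConnection conn = (HttpURLConnection) URI.create(url).toURL().openConnection();
        conn.setRequestMethod("GET");
        conn.setConnectTimeout(5000);
        conn.setReadTimeout(5000);
        try (InputStream in = conn.getInputStream()) {
            ByteArrayOutputStream out = new ByteArrayOutputStream();
            byte[] buf = new byte[4096];
            int n;
            while ((n = in.read(buf)) != -1) {
                out.write(buf, 0, n);
            }
            return new String(out.toByteArray(), StandardCharsets.UTF_8);
        } finally {
            conn.disconnect();
        }
    }

    public static void main(String[] args) throws IOException {
        if (args.length != 3) {
            System.out.println("usage: JobManagerRestClient <host> <port> <jobid>");
            return;
        }
        JobManagerRestClient client = new JobManagerRestClient(args[0], Integer.parseInt(args[1]));
        // Prints whatever JSON the server returns; parsing is left to the caller.
        System.out.println(get(client.jobUrl(args[2])));
    }
}
```

This only works while the JobManager is alive, so it does not remove the
limitation described in this thread for jobs started with
'flink run -m yarn-cluster ...'.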


Re: Job Statistics

2015-06-18 Thread Maximilian Michels
Hi Jean,

I think it would be a nice-to-have feature to display some metrics on the
command line after a job has completed. We already have the run time and
the accumulator results available at the CLI and printing those would be
easy. What metrics in particular are you looking for?

Best,
Max

On Thu, Jun 18, 2015 at 3:41 PM, Jean Bez jeanluca...@gmail.com wrote:

 Hi Fabian,

 I am trying to compare some examples on Hadoop, Spark and Flink. If
 possible I would like to see the job statistics like the report given by
 Hadoop. Since I am running these examples on a large cluster it would be
 much better if I could obtain such data directly from the console.

 Thanks!
 Jean
 On 18/06/2015 04:55, Fabian Hueske fhue...@gmail.com wrote:

 Hi Jean,

 what kind of job execution stats are you interested in?

 Cheers, Fabian

 2015-06-18 9:01 GMT+02:00 Matthias J. Sax mj...@informatik.hu-berlin.de:

 Hi,

 the CLI cannot show any job statistics. However, you can use the
 JobManager web interface that is accessible at port 8081 from a browser.

 -Matthias


 On 06/17/2015 10:13 PM, Jean Bez wrote:
  Hello,
 
  Is it possible to view job statistics directly on the command line after
  a job has finished executing? If so, could you please explain how? I
  could not find any mention of this in the docs. I also tried setting
  the logs to debug mode, but no other information was presented.
 
  Thank you!
 
  Regards,
  Jean





Re: Job Statistics

2015-06-18 Thread Jean Bez
Hi Fabian,

I am trying to compare some examples on Hadoop, Spark and Flink. If
possible I would like to see the job statistics like the report given by
Hadoop. Since I am running these examples on a large cluster it would be
much better if I could obtain such data directly from the console.

Thanks!
Jean
On 18/06/2015 04:55, Fabian Hueske fhue...@gmail.com wrote:

 Hi Jean,

 what kind of job execution stats are you interested in?

 Cheers, Fabian

 2015-06-18 9:01 GMT+02:00 Matthias J. Sax mj...@informatik.hu-berlin.de:

 Hi,

 the CLI cannot show any job statistics. However, you can use the
 JobManager web interface that is accessible at port 8081 from a browser.

 -Matthias


 On 06/17/2015 10:13 PM, Jean Bez wrote:
  Hello,
 
  Is it possible to view job statistics directly on the command line after
  a job has finished executing? If so, could you please explain how? I
  could not find any mention of this in the docs. I also tried setting
  the logs to debug mode, but no other information was presented.
 
  Thank you!
 
  Regards,
  Jean





Re: Job Statistics

2015-06-18 Thread Jean Bez
Hi Maximilian,

The metrics I am interested in are I/O, run time and communication. Could you
please provide an example of how to obtain such results?

Thank you!!

2015-06-18 10:45 GMT-03:00 Maximilian Michels m...@apache.org:

 Hi Jean,

 I think it would be a nice to have feature to display some metrics on the
 command line after a job has completed. We already have the run time and
 the accumulator results available at the CLI and printing those would be
 easy. What metrics in particular are you looking for?

 Best,
 Max

 On Thu, Jun 18, 2015 at 3:41 PM, Jean Bez jeanluca...@gmail.com wrote:

 Hi Fabian,

 I am trying to compare some examples on Hadoop, Spark and Flink. If
 possible I would like to see the job statistics like the report given by
 Hadoop. Since I am running these examples on a large cluster it would be
 much better if I could obtain such data directly from the console.

 Thanks!
 Jean
 On 18/06/2015 04:55, Fabian Hueske fhue...@gmail.com wrote:

 Hi Jean,

 what kind of job execution stats are you interested in?

 Cheers, Fabian

 2015-06-18 9:01 GMT+02:00 Matthias J. Sax mj...@informatik.hu-berlin.de:

 Hi,

 the CLI cannot show any job statistics. However, you can use the
 JobManager web interface that is accessible at port 8081 from a browser.

 -Matthias


 On 06/17/2015 10:13 PM, Jean Bez wrote:
  Hello,
 
  Is it possible to view job statistics directly on the command line after
  a job has finished executing? If so, could you please explain how? I
  could not find any mention of this in the docs. I also tried setting
  the logs to debug mode, but no other information was presented.
 
  Thank you!
 
  Regards,
  Jean






Re: Job Statistics

2015-06-18 Thread Jean Bez
Hello Max,

I will try to do that! Do you know if I could obtain data about the I/O and
communication as well? From what I could understand I can get the runtime
and the accumulator results only. Is that right?

2015-06-18 11:37 GMT-03:00 Maximilian Michels m...@apache.org:

 Hi Jean,

 As I said, there is currently only the run time available. You can print
 the run time and accumulators results to std out by retrieving the
 JobExecutionResult from the ExecutionEnvironment:

 JobExecutionResult result = env.execute();
 System.out.println("runtime: " + result.getNetRuntime());
 for (Map.Entry<String, Object> entry :
     result.getAllAccumulatorResults().entrySet()) {
   System.out.println(entry.getKey() + ": " + entry.getValue());
 }

 You would do that in your Flink program. You could also store metrics in
 the accumulators. However, since you're trying to compare different systems
 I'd advise you to use some external tools for monitoring resource usage
 like Ganglia or collectd.

 Best,
 Max
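
Editor's sketch, not part of the original message: the same two values can be
turned into a small console report. The example is deliberately free of Flink
dependencies. It takes a long and a Map<String, Object>, which are the types
returned by getNetRuntime() (milliseconds) and getAllAccumulatorResults(). The
class name JobReport, the report layout and the accumulator names are made up
for illustration.

```java
import java.util.Map;
import java.util.TreeMap;

/** Formats a job's net runtime and accumulator results as plain text. */
final class JobReport {
    static String format(long netRuntimeMillis, Map<String, Object> accumulators) {
        StringBuilder sb = new StringBuilder();
        sb.append("runtime: ").append(netRuntimeMillis).append(" ms\n");
        // Sort by accumulator name so that reports of repeated runs are easy to diff.
        for (Map.Entry<String, Object> entry : new TreeMap<>(accumulators).entrySet()) {
            sb.append(entry.getKey()).append(": ").append(entry.getValue()).append('\n');
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        // Hypothetical values; in a Flink program they would come from
        // result.getNetRuntime() and result.getAllAccumulatorResults().
        Map<String, Object> accumulators = new TreeMap<>();
        accumulators.put("records-read", 1000L);
        accumulators.put("bytes-written", 4096L);
        System.out.print(format(1234L, accumulators));
    }
}
```

In a Flink program the call would presumably look like
JobReport.format(result.getNetRuntime(), result.getAllAccumulatorResults()).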

 On Thu, Jun 18, 2015 at 4:11 PM, Jean Bez jeanluca...@gmail.com wrote:

 Hi Maximilian,

 The metrics I am interested in are I/O, run time and communication. Could
 you please provide an example of how to obtain such results?

 Thank you!!

 2015-06-18 10:45 GMT-03:00 Maximilian Michels m...@apache.org:

 Hi Jean,

 I think it would be a nice to have feature to display some metrics on
 the command line after a job has completed. We already have the run time
 and the accumulator results available at the CLI and printing those would
 be easy. What metrics in particular are you looking for?

 Best,
 Max

 On Thu, Jun 18, 2015 at 3:41 PM, Jean Bez jeanluca...@gmail.com wrote:

 Hi Fabian,

 I am trying to compare some examples on Hadoop, Spark and Flink. If
 possible I would like to see the job statistics like the report given by
 Hadoop. Since I am running these examples on a large cluster it would be
 much better if I could obtain such data directly from the console.

 Thanks!
 Jean
 On 18/06/2015 04:55, Fabian Hueske fhue...@gmail.com wrote:

 Hi Jean,

 what kind of job execution stats are you interested in?

 Cheers, Fabian

 2015-06-18 9:01 GMT+02:00 Matthias J. Sax 
 mj...@informatik.hu-berlin.de:

 Hi,

 the CLI cannot show any job statistics. However, you can use the
 JobManager web interface that is accessible at port 8081 from a
 browser.

 -Matthias


 On 06/17/2015 10:13 PM, Jean Bez wrote:
  Hello,
 
  Is it possible to view job statistics directly on the command line after
  a job has finished executing? If so, could you please explain how? I
  could not find any mention of this in the docs. I also tried setting
  the logs to debug mode, but no other information was presented.
 
  Thank you!
 
  Regards,
  Jean








Re: Job Statistics

2015-06-18 Thread Jean Bez
Hi,

I tried to view directly from the web interface but I could not find any
other information about the completed jobs. I have the list, but when I
open it, no further information is provided. Is this correct?

2015-06-18 15:10 GMT-03:00 Jean Bez jeanluca...@gmail.com:

 Hello Max,

 I will try to do that! Do you know if I could obtain data about the I/O
 and communication as well? From what I could understand I can get the
 runtime and the accumulator results only. Is that right?

 2015-06-18 11:37 GMT-03:00 Maximilian Michels m...@apache.org:

 Hi Jean,

 As I said, there is currently only the run time available. You can print
 the run time and accumulators results to std out by retrieving the
 JobExecutionResult from the ExecutionEnvironment:

 JobExecutionResult result = env.execute();
 System.out.println("runtime: " + result.getNetRuntime());
 for (Map.Entry<String, Object> entry :
     result.getAllAccumulatorResults().entrySet()) {
   System.out.println(entry.getKey() + ": " + entry.getValue());
 }

 You would do that in your Flink program. You could also store metrics in
 the accumulators. However, since you're trying to compare different systems
 I'd advise you to use some external tools for monitoring resource usage
 like Ganglia or collectd.

 Best,
 Max

 On Thu, Jun 18, 2015 at 4:11 PM, Jean Bez jeanluca...@gmail.com wrote:

 Hi Maximilian,

 The metrics I am interested in are I/O, run time and communication. Could
 you please provide an example of how to obtain such results?

 Thank you!!

 2015-06-18 10:45 GMT-03:00 Maximilian Michels m...@apache.org:

 Hi Jean,

 I think it would be a nice to have feature to display some metrics on
 the command line after a job has completed. We already have the run time
 and the accumulator results available at the CLI and printing those would
 be easy. What metrics in particular are you looking for?

 Best,
 Max

 On Thu, Jun 18, 2015 at 3:41 PM, Jean Bez jeanluca...@gmail.com
 wrote:

 Hi Fabian,

 I am trying to compare some examples on Hadoop, Spark and Flink. If
 possible I would like to see the job statistics like the report given by
 Hadoop. Since I am running these examples on a large cluster it would be
 much better if I could obtain such data directly from the console.

 Thanks!
 Jean
 On 18/06/2015 04:55, Fabian Hueske fhue...@gmail.com wrote:

 Hi Jean,

 what kind of job execution stats are you interested in?

 Cheers, Fabian

 2015-06-18 9:01 GMT+02:00 Matthias J. Sax 
 mj...@informatik.hu-berlin.de:

 Hi,

 the CLI cannot show any job statistics. However, you can use the
 JobManager web interface that is accessible at port 8081 from a
 browser.

 -Matthias


 On 06/17/2015 10:13 PM, Jean Bez wrote:
  Hello,
 
  Is it possible to view job statistics directly on the command line after
  a job has finished executing? If so, could you please explain how? I
  could not find any mention of this in the docs. I also tried setting
  the logs to debug mode, but no other information was presented.
 
  Thank you!
 
  Regards,
  Jean









Re: Job Statistics

2015-06-18 Thread Stephan Ewen
Hi!

There are no I/O or record statistics collected at the moment. It is work
in progress. Also a new Web Frontend that visualizes those is in the works,
so this is going to improve soon, but for now, there is no easy way to grab
those numbers.

If you are interested in contributing, I could pull you into some of the
discussions about collecting and reporting metrics.

Greetings,
Stephan


On Thu, Jun 18, 2015 at 1:42 PM, Jean Bez jeanluca...@gmail.com wrote:

 Hi,

 I tried to view directly from the web interface but I could not find any
 other information about the completed jobs. I have the list, but when I
 open it, no further information is provided. Is this correct?

 2015-06-18 15:10 GMT-03:00 Jean Bez jeanluca...@gmail.com:

 Hello Max,

 I will try to do that! Do you know if I could obtain data about the I/O
 and communication as well? From what I could understand I can get the
 runtime and the accumulator results only. Is that right?

 2015-06-18 11:37 GMT-03:00 Maximilian Michels m...@apache.org:

 Hi Jean,

 As I said, there is currently only the run time available. You can print
 the run time and accumulators results to std out by retrieving the
 JobExecutionResult from the ExecutionEnvironment:

 JobExecutionResult result = env.execute();
 System.out.println("runtime: " + result.getNetRuntime());
 for (Map.Entry<String, Object> entry :
     result.getAllAccumulatorResults().entrySet()) {
   System.out.println(entry.getKey() + ": " + entry.getValue());
 }

 You would do that in your Flink program. You could also store metrics in
 the accumulators. However, since you're trying to compare different systems
 I'd advise you to use some external tools for monitoring resource usage
 like Ganglia or collectd.

 Best,
 Max

 On Thu, Jun 18, 2015 at 4:11 PM, Jean Bez jeanluca...@gmail.com wrote:

 Hi Maximilian,

 The metrics I am interested in are I/O, run time and communication. Could
 you please provide an example of how to obtain such results?

 Thank you!!

 2015-06-18 10:45 GMT-03:00 Maximilian Michels m...@apache.org:

 Hi Jean,

 I think it would be a nice to have feature to display some metrics on
 the command line after a job has completed. We already have the run time
 and the accumulator results available at the CLI and printing those would
 be easy. What metrics in particular are you looking for?

 Best,
 Max

 On Thu, Jun 18, 2015 at 3:41 PM, Jean Bez jeanluca...@gmail.com
 wrote:

 Hi Fabian,

 I am trying to compare some examples on Hadoop, Spark and Flink. If
 possible I would like to see the job statistics like the report given by
 Hadoop. Since I am running these examples on a large cluster it would be
 much better if I could obtain such data directly from the console.

 Thanks!
 Jean
 On 18/06/2015 04:55, Fabian Hueske fhue...@gmail.com wrote:

 Hi Jean,

 what kind of job execution stats are you interested in?

 Cheers, Fabian

 2015-06-18 9:01 GMT+02:00 Matthias J. Sax 
 mj...@informatik.hu-berlin.de:

 Hi,

 the CLI cannot show any job statistics. However, you can use the
 JobManager web interface that is accessible at port 8081 from a
 browser.

 -Matthias


 On 06/17/2015 10:13 PM, Jean Bez wrote:
  Hello,
 
  Is it possible to view job statistics directly on the command line after
  a job has finished executing? If so, could you please explain how? I
  could not find any mention of this in the docs. I also tried setting
  the logs to debug mode, but no other information was presented.
 
  Thank you!
 
  Regards,
  Jean










Re: Job Statistics

2015-06-18 Thread Matthias J. Sax
Hi,

the CLI cannot show any job statistics. However, you can use the
JobManager web interface that is accessible at port 8081 from a browser.

-Matthias


On 06/17/2015 10:13 PM, Jean Bez wrote:
 Hello,
 
 Is it possible to view job statistics directly on the command line after
 a job has finished executing? If so, could you please explain how? I
 could not find any mention of this in the docs. I also tried setting
 the logs to debug mode, but no other information was presented.
 
 Thank you!
 
 Regards,
 Jean





Re: Job Statistics

2015-06-18 Thread Fabian Hueske
Hi Jean,

what kind of job execution stats are you interested in?

Cheers, Fabian

2015-06-18 9:01 GMT+02:00 Matthias J. Sax mj...@informatik.hu-berlin.de:

 Hi,

 the CLI cannot show any job statistics. However, you can use the
 JobManager web interface that is accessible at port 8081 from a browser.

 -Matthias


 On 06/17/2015 10:13 PM, Jean Bez wrote:
  Hello,
 
  Is it possible to view job statistics directly on the command line after
  a job has finished executing? If so, could you please explain how? I
  could not find any mention of this in the docs. I also tried setting
  the logs to debug mode, but no other information was presented.
 
  Thank you!
 
  Regards,
  Jean




Job Statistics

2015-06-17 Thread Jean Bez
Hello,

Is it possible to view job statistics directly on the command line after a
job has finished executing? If so, could you please explain how? I could not
find any mention of this in the docs. I also tried setting the logs to debug
mode, but no other information was presented.

Thank you!

Regards,
Jean