[jira] [Commented] (YARN-4074) [timeline reader] implement support for querying for flows and flow runs

2016-07-10 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-4074?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15369713#comment-15369713
 ] 

Hudson commented on YARN-4074:
--

SUCCESS: Integrated in Hadoop-trunk-Commit #10074 (See 
[https://builds.apache.org/job/Hadoop-trunk-Commit/10074/])
YARN-4074. [timeline reader] implement support for querying for flows (sjlee: 
rev 10fa6da7d8a6013698767c6136ae20f0e04415e9)
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-timelineservice/src/main/java/org/apache/hadoop/yarn/server/timelineservice/storage/flow/FlowRunRowKey.java
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-timelineservice/src/main/java/org/apache/hadoop/yarn/server/timelineservice/storage/TimelineEntityReaderFactory.java
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-timelineservice/src/main/java/org/apache/hadoop/yarn/server/timelineservice/storage/GenericEntityReader.java
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-timelineservice/src/test/java/org/apache/hadoop/yarn/server/timelineservice/storage/flow/TestHBaseStorageFlowActivity.java
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-api/src/main/java/org/apache/hadoop/yarn/api/records/timelineservice/TimelineEntityType.java
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-timelineservice/src/test/java/org/apache/hadoop/yarn/server/timelineservice/storage/flow/TestHBaseStorageFlowRun.java
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-timelineservice/src/main/java/org/apache/hadoop/yarn/server/timelineservice/storage/apptoflow/AppToFlowRowKey.java
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-api/src/main/java/org/apache/hadoop/yarn/api/records/timelineservice/FlowActivityEntity.java
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-api/src/main/java/org/apache/hadoop/yarn/api/records/timelineservice/FlowEntity.java
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-timelineservice/src/main/java/org/apache/hadoop/yarn/server/timelineservice/storage/TimelineEntityReader.java
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-timelineservice/src/main/java/org/apache/hadoop/yarn/server/timelineservice/storage/flow/FlowScanner.java
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-timelineservice/src/test/java/org/apache/hadoop/yarn/server/timelineservice/storage/TestHBaseTimelineStorage.java
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-tests/src/test/java/org/apache/hadoop/yarn/server/timelineservice/TestTimelineServiceClientIntegration.java
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-timelineservice/src/main/java/org/apache/hadoop/yarn/server/timelineservice/storage/application/ApplicationRowKey.java
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-timelineservice/src/main/java/org/apache/hadoop/yarn/server/timelineservice/storage/common/BaseTable.java
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-timelineservice/src/test/java/org/apache/hadoop/yarn/server/timelineservice/storage/flow/TestFlowDataGenerator.java
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-timelineservice/src/main/java/org/apache/hadoop/yarn/server/timelineservice/storage/flow/FlowActivityRowKey.java
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-timelineservice/src/main/java/org/apache/hadoop/yarn/server/timelineservice/collector/TimelineCollectorWebService.java
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-timelineservice/src/main/java/org/apache/hadoop/yarn/server/timelineservice/storage/HBaseTimelineReaderImpl.java
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common/src/test/java/org/apache/hadoop/yarn/api/records/timelineservice/TestTimelineServiceRecords.java
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-timelineservice/src/main/java/org/apache/hadoop/yarn/server/timelineservice/storage/FlowRunEntityReader.java
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-timelineservice/src/main/java/org/apache/hadoop/yarn/server/timelineservice/storage/FlowActivityEntityReader.java
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-timelineservice/src/main/java/org/apache/hadoop/yarn/server/timelineservice/storage/ApplicationEntityReader.java
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-timelineservice/src/main/java/org/apache/hadoop/yarn/server/timelineservice/storage/entity/EntityRowKey.java
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-api/src/main/java/org/apache/hadoop/yarn/api/records/timelineservice/FlowRunEntity.java


> [timeline reader] implement support for querying for flows and flow runs
> 

[jira] [Commented] (YARN-4074) [timeline reader] implement support for querying for flows and flow runs

2015-09-22 Thread Vrushali C (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-4074?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14903246#comment-14903246
 ] 

Vrushali C commented on YARN-4074:
--

Chatted with Li offline and decided to file 
https://issues.apache.org/jira/browse/YARN-4200 to deal with the refactoring of 
package names and proceed with this patch. 

> [timeline reader] implement support for querying for flows and flow runs
> 
>
> Key: YARN-4074
> URL: https://issues.apache.org/jira/browse/YARN-4074
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: timelineserver
>Affects Versions: YARN-2928
>Reporter: Sangjin Lee
>Assignee: Sangjin Lee
> Attachments: YARN-4074-YARN-2928.007.patch, 
> YARN-4074-YARN-2928.008.patch, YARN-4074-YARN-2928.POC.001.patch, 
> YARN-4074-YARN-2928.POC.002.patch, YARN-4074-YARN-2928.POC.003.patch, 
> YARN-4074-YARN-2928.POC.004.patch, YARN-4074-YARN-2928.POC.005.patch, 
> YARN-4074-YARN-2928.POC.006.patch
>
>
> Implement support for querying for flows and flow runs.
> We should be able to query for the most recent N flows, etc.
> This includes changes to the {{TimelineReader}} API if necessary, as well as 
> implementation of the API.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-4074) [timeline reader] implement support for querying for flows and flow runs

2015-09-22 Thread Li Lu (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-4074?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14903284#comment-14903284
 ] 

Li Lu commented on YARN-4074:
-

Sure, please go ahead with the current patch. Thanks for the work folks! 

> [timeline reader] implement support for querying for flows and flow runs
> 
>
> Key: YARN-4074
> URL: https://issues.apache.org/jira/browse/YARN-4074
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: timelineserver
>Affects Versions: YARN-2928
>Reporter: Sangjin Lee
>Assignee: Sangjin Lee
> Attachments: YARN-4074-YARN-2928.007.patch, 
> YARN-4074-YARN-2928.008.patch, YARN-4074-YARN-2928.POC.001.patch, 
> YARN-4074-YARN-2928.POC.002.patch, YARN-4074-YARN-2928.POC.003.patch, 
> YARN-4074-YARN-2928.POC.004.patch, YARN-4074-YARN-2928.POC.005.patch, 
> YARN-4074-YARN-2928.POC.006.patch
>
>
> Implement support for querying for flows and flow runs.
> We should be able to query for the most recent N flows, etc.
> This includes changes to the {{TimelineReader}} API if necessary, as well as 
> implementation of the API.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-4074) [timeline reader] implement support for querying for flows and flow runs

2015-09-22 Thread Li Lu (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-4074?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14903158#comment-14903158
 ] 

Li Lu commented on YARN-4074:
-

Sorry I missed your message yesterday... I was thinking about putting those 
hbase reader classes (like ApplicationEntityReader) to a sub dir to indicate 
they only work with HBase. It's also fine to commit the patch as-is if that's 
troublesome. I'm OK with both. 

> [timeline reader] implement support for querying for flows and flow runs
> 
>
> Key: YARN-4074
> URL: https://issues.apache.org/jira/browse/YARN-4074
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: timelineserver
>Affects Versions: YARN-2928
>Reporter: Sangjin Lee
>Assignee: Sangjin Lee
> Attachments: YARN-4074-YARN-2928.007.patch, 
> YARN-4074-YARN-2928.008.patch, YARN-4074-YARN-2928.POC.001.patch, 
> YARN-4074-YARN-2928.POC.002.patch, YARN-4074-YARN-2928.POC.003.patch, 
> YARN-4074-YARN-2928.POC.004.patch, YARN-4074-YARN-2928.POC.005.patch, 
> YARN-4074-YARN-2928.POC.006.patch
>
>
> Implement support for querying for flows and flow runs.
> We should be able to query for the most recent N flows, etc.
> This includes changes to the {{TimelineReader}} API if necessary, as well as 
> implementation of the API.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-4074) [timeline reader] implement support for querying for flows and flow runs

2015-09-22 Thread Vrushali C (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-4074?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14903411#comment-14903411
 ] 

Vrushali C commented on YARN-4074:
--

Committed patch v8. Thanks [~sjlee0] for the contribution and everyone for the 
review! 

> [timeline reader] implement support for querying for flows and flow runs
> 
>
> Key: YARN-4074
> URL: https://issues.apache.org/jira/browse/YARN-4074
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: timelineserver
>Affects Versions: YARN-2928
>Reporter: Sangjin Lee
>Assignee: Sangjin Lee
> Attachments: YARN-4074-YARN-2928.007.patch, 
> YARN-4074-YARN-2928.008.patch, YARN-4074-YARN-2928.POC.001.patch, 
> YARN-4074-YARN-2928.POC.002.patch, YARN-4074-YARN-2928.POC.003.patch, 
> YARN-4074-YARN-2928.POC.004.patch, YARN-4074-YARN-2928.POC.005.patch, 
> YARN-4074-YARN-2928.POC.006.patch
>
>
> Implement support for querying for flows and flow runs.
> We should be able to query for the most recent N flows, etc.
> This includes changes to the {{TimelineReader}} API if necessary, as well as 
> implementation of the API.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-4074) [timeline reader] implement support for querying for flows and flow runs

2015-09-21 Thread Vrushali C (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-4074?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14901019#comment-14901019
 ] 

Vrushali C commented on YARN-4074:
--

Thanks everyone for the review, I will commit this patch in today. 

> [timeline reader] implement support for querying for flows and flow runs
> 
>
> Key: YARN-4074
> URL: https://issues.apache.org/jira/browse/YARN-4074
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: timelineserver
>Affects Versions: YARN-2928
>Reporter: Sangjin Lee
>Assignee: Sangjin Lee
> Attachments: YARN-4074-YARN-2928.007.patch, 
> YARN-4074-YARN-2928.008.patch, YARN-4074-YARN-2928.POC.001.patch, 
> YARN-4074-YARN-2928.POC.002.patch, YARN-4074-YARN-2928.POC.003.patch, 
> YARN-4074-YARN-2928.POC.004.patch, YARN-4074-YARN-2928.POC.005.patch, 
> YARN-4074-YARN-2928.POC.006.patch
>
>
> Implement support for querying for flows and flow runs.
> We should be able to query for the most recent N flows, etc.
> This includes changes to the {{TimelineReader}} API if necessary, as well as 
> implementation of the API.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-4074) [timeline reader] implement support for querying for flows and flow runs

2015-09-21 Thread Li Lu (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-4074?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14901067#comment-14901067
 ] 

Li Lu commented on YARN-4074:
-

Hi [~sjlee0] [~vrushalic], thanks for the work and sorry I could not get back 
earlier. Overall the patch LGTM. I like the refactor here and it's almost a 
must to put it in soon. One nit is, on naming and code organization, we're 
putting all derived readers in the storage package, but inevitably associating 
them with our (specific) HBase storage. If it's quick and easy, maybe we can 
put them in a package inside storage? If I'm missing anything here and it's 
hard, let proceed with this patch. Your call. 

> [timeline reader] implement support for querying for flows and flow runs
> 
>
> Key: YARN-4074
> URL: https://issues.apache.org/jira/browse/YARN-4074
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: timelineserver
>Affects Versions: YARN-2928
>Reporter: Sangjin Lee
>Assignee: Sangjin Lee
> Attachments: YARN-4074-YARN-2928.007.patch, 
> YARN-4074-YARN-2928.008.patch, YARN-4074-YARN-2928.POC.001.patch, 
> YARN-4074-YARN-2928.POC.002.patch, YARN-4074-YARN-2928.POC.003.patch, 
> YARN-4074-YARN-2928.POC.004.patch, YARN-4074-YARN-2928.POC.005.patch, 
> YARN-4074-YARN-2928.POC.006.patch
>
>
> Implement support for querying for flows and flow runs.
> We should be able to query for the most recent N flows, etc.
> This includes changes to the {{TimelineReader}} API if necessary, as well as 
> implementation of the API.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-4074) [timeline reader] implement support for querying for flows and flow runs

2015-09-21 Thread Vrushali C (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-4074?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14901299#comment-14901299
 ] 

Vrushali C commented on YARN-4074:
--

Hi [~gtCarrera9]

To confirm my understanding, did you mean putting all reader classes into a 
package like org.apache.hadoop.yarn.server.timelineservice.storage.reader ? 

There is a org.apache.hadoop.yarn.server.timelineservice.reader but that is for 
the web services related code. 
thanks
Vrushali

> [timeline reader] implement support for querying for flows and flow runs
> 
>
> Key: YARN-4074
> URL: https://issues.apache.org/jira/browse/YARN-4074
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: timelineserver
>Affects Versions: YARN-2928
>Reporter: Sangjin Lee
>Assignee: Sangjin Lee
> Attachments: YARN-4074-YARN-2928.007.patch, 
> YARN-4074-YARN-2928.008.patch, YARN-4074-YARN-2928.POC.001.patch, 
> YARN-4074-YARN-2928.POC.002.patch, YARN-4074-YARN-2928.POC.003.patch, 
> YARN-4074-YARN-2928.POC.004.patch, YARN-4074-YARN-2928.POC.005.patch, 
> YARN-4074-YARN-2928.POC.006.patch
>
>
> Implement support for querying for flows and flow runs.
> We should be able to query for the most recent N flows, etc.
> This includes changes to the {{TimelineReader}} API if necessary, as well as 
> implementation of the API.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-4074) [timeline reader] implement support for querying for flows and flow runs

2015-09-19 Thread Varun Saxena (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-4074?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14876931#comment-14876931
 ] 

Varun Saxena commented on YARN-4074:


LGTM.

> [timeline reader] implement support for querying for flows and flow runs
> 
>
> Key: YARN-4074
> URL: https://issues.apache.org/jira/browse/YARN-4074
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: timelineserver
>Affects Versions: YARN-2928
>Reporter: Sangjin Lee
>Assignee: Sangjin Lee
> Attachments: YARN-4074-YARN-2928.007.patch, 
> YARN-4074-YARN-2928.008.patch, YARN-4074-YARN-2928.POC.001.patch, 
> YARN-4074-YARN-2928.POC.002.patch, YARN-4074-YARN-2928.POC.003.patch, 
> YARN-4074-YARN-2928.POC.004.patch, YARN-4074-YARN-2928.POC.005.patch, 
> YARN-4074-YARN-2928.POC.006.patch
>
>
> Implement support for querying for flows and flow runs.
> We should be able to query for the most recent N flows, etc.
> This includes changes to the {{TimelineReader}} API if necessary, as well as 
> implementation of the API.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-4074) [timeline reader] implement support for querying for flows and flow runs

2015-09-18 Thread Vrushali C (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-4074?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14876185#comment-14876185
 ] 

Vrushali C commented on YARN-4074:
--


Patch v8 looks good too me. Thanks for updating the test cases to use the 
reader objects. 
+1

> [timeline reader] implement support for querying for flows and flow runs
> 
>
> Key: YARN-4074
> URL: https://issues.apache.org/jira/browse/YARN-4074
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: timelineserver
>Affects Versions: YARN-2928
>Reporter: Sangjin Lee
>Assignee: Sangjin Lee
> Attachments: YARN-4074-YARN-2928.007.patch, 
> YARN-4074-YARN-2928.008.patch, YARN-4074-YARN-2928.POC.001.patch, 
> YARN-4074-YARN-2928.POC.002.patch, YARN-4074-YARN-2928.POC.003.patch, 
> YARN-4074-YARN-2928.POC.004.patch, YARN-4074-YARN-2928.POC.005.patch, 
> YARN-4074-YARN-2928.POC.006.patch
>
>
> Implement support for querying for flows and flow runs.
> We should be able to query for the most recent N flows, etc.
> This includes changes to the {{TimelineReader}} API if necessary, as well as 
> implementation of the API.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-4074) [timeline reader] implement support for querying for flows and flow runs

2015-09-18 Thread Sangjin Lee (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-4074?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14875864#comment-14875864
 ] 

Sangjin Lee commented on YARN-4074:
---

I would greatly appreciate your review. Thanks!

> [timeline reader] implement support for querying for flows and flow runs
> 
>
> Key: YARN-4074
> URL: https://issues.apache.org/jira/browse/YARN-4074
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: timelineserver
>Affects Versions: YARN-2928
>Reporter: Sangjin Lee
>Assignee: Sangjin Lee
> Attachments: YARN-4074-YARN-2928.007.patch, 
> YARN-4074-YARN-2928.008.patch, YARN-4074-YARN-2928.POC.001.patch, 
> YARN-4074-YARN-2928.POC.002.patch, YARN-4074-YARN-2928.POC.003.patch, 
> YARN-4074-YARN-2928.POC.004.patch, YARN-4074-YARN-2928.POC.005.patch, 
> YARN-4074-YARN-2928.POC.006.patch
>
>
> Implement support for querying for flows and flow runs.
> We should be able to query for the most recent N flows, etc.
> This includes changes to the {{TimelineReader}} API if necessary, as well as 
> implementation of the API.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-4074) [timeline reader] implement support for querying for flows and flow runs

2015-09-17 Thread Sangjin Lee (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-4074?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14803384#comment-14803384
 ] 

Sangjin Lee commented on YARN-4074:
---

{quote}
In TimelineEntityReader#readMetrics it seems safe to assume that if we have 
more than one value that this is a TimelineMetric.Type.TIME_SERIES.
Conversely it doesn't have to be true though right? I guess we'll just assume 
that for timelines we'd never have just one value? I can't quite oversee the 
impact of incorrectly assuming TimelineMetric.Type.SINGLE_VALUE if only one 
value has been written to HBase yet.
{quote}

That's right. We discussed this some time ago, and we think it'd be safer if 
the metric type (single value vs. time series) were stored/persisted. But there 
are other dimensions of metrics we may need to store (e.g. long vs. float, 
whether to aggregate, etc.). Also, there is a question of what if users wrote 
inconsistent data. So, at that time we went with a simple decision that's 
currently there (the code you see in {{TimelineEntityReader}} is refactored out 
of {{HBaseTimelineReaderImpl}} so it's not new code).

We should come to a conclusion on how to store/encode various dimensions of 
metrics, but not as part of this JIRA.

{quote}
Wrt. ApplicationRowKey: at some point (perhaps not this jira) we should 
consider making the app_id a compound object that is stored with a ? separator. 
The prefix (in most cases in yarn right now would be "application_") would be 
separate and the RM start time and the final numeric part would be stored as a 
numerical value with a separate Bytes.to... conversion.

Otherwise we'll end up getting incorrect order for rowkeys when the application 
id wraps to 10K and each power of ten after that. For example, lexically 
application_1442351767756_1 < application_1442351767756_

If we just access the application by specific key this doesn't matter, but if 
we do a row-scan and count on ordering to set an appropriate stop on the scan, 
we'll break things.
This happens on all rowkeys with the app_id in it.
{quote}

That's a good point. We need to fix this, or we'll have incorrect 
orders/results happening with queries. This impacts anywhere we rely on the app 
id order (as string). I'll file a separate JIRA to address this issue.

> [timeline reader] implement support for querying for flows and flow runs
> 
>
> Key: YARN-4074
> URL: https://issues.apache.org/jira/browse/YARN-4074
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: timelineserver
>Affects Versions: YARN-2928
>Reporter: Sangjin Lee
>Assignee: Sangjin Lee
> Attachments: YARN-4074-YARN-2928.007.patch, 
> YARN-4074-YARN-2928.POC.001.patch, YARN-4074-YARN-2928.POC.002.patch, 
> YARN-4074-YARN-2928.POC.003.patch, YARN-4074-YARN-2928.POC.004.patch, 
> YARN-4074-YARN-2928.POC.005.patch, YARN-4074-YARN-2928.POC.006.patch
>
>
> Implement support for querying for flows and flow runs.
> We should be able to query for the most recent N flows, etc.
> This includes changes to the {{TimelineReader}} API if necessary, as well as 
> implementation of the API.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-4074) [timeline reader] implement support for querying for flows and flow runs

2015-09-17 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-4074?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14804328#comment-14804328
 ] 

Hadoop QA commented on YARN-4074:
-

\\
\\
| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:red}-1{color} | pre-patch |  29m 14s | Findbugs (version ) appears to 
be broken on YARN-2928. |
| {color:green}+1{color} | @author |   0m  0s | The patch does not contain any 
@author tags. |
| {color:green}+1{color} | tests included |   0m  0s | The patch appears to 
include 6 new or modified test files. |
| {color:green}+1{color} | javac |  11m 49s | There were no new javac warning 
messages. |
| {color:green}+1{color} | javadoc |  14m 52s | There were no new javadoc 
warning messages. |
| {color:green}+1{color} | release audit |   0m 25s | The applied patch does 
not increase the total number of release audit warnings. |
| {color:red}-1{color} | checkstyle |   2m 13s | The applied patch generated  1 
new checkstyle issues (total was 31, now 32). |
| {color:green}+1{color} | whitespace |   0m 35s | The patch has no lines that 
end in whitespace. |
| {color:green}+1{color} | install |   1m 50s | mvn install still works. |
| {color:green}+1{color} | eclipse:eclipse |   0m 49s | The patch built with 
eclipse:eclipse. |
| {color:red}-1{color} | findbugs |   5m 39s | The patch appears to introduce 1 
new Findbugs (version 3.0.0) warnings. |
| {color:green}+1{color} | yarn tests |   0m 26s | Tests passed in 
hadoop-yarn-api. |
| {color:green}+1{color} | yarn tests |   2m 13s | Tests passed in 
hadoop-yarn-common. |
| {color:green}+1{color} | yarn tests |   2m  8s | Tests passed in 
hadoop-yarn-server-tests. |
| {color:green}+1{color} | yarn tests |   2m 32s | Tests passed in 
hadoop-yarn-server-timelineservice. |
| | |  75m 28s | |
\\
\\
|| Reason || Tests ||
| FindBugs | module:hadoop-yarn-api |
\\
\\
|| Subsystem || Report/Notes ||
| Patch URL | 
http://issues.apache.org/jira/secure/attachment/12757127/YARN-4074-YARN-2928.007.patch
 |
| Optional Tests | javadoc javac unit findbugs checkstyle |
| git revision | YARN-2928 / 4b37985 |
| checkstyle |  
https://builds.apache.org/job/PreCommit-YARN-Build/9194/artifact/patchprocess/diffcheckstylehadoop-yarn-api.txt
 |
| Findbugs warnings | 
https://builds.apache.org/job/PreCommit-YARN-Build/9194/artifact/patchprocess/newPatchFindbugsWarningshadoop-yarn-api.html
 |
| hadoop-yarn-api test log | 
https://builds.apache.org/job/PreCommit-YARN-Build/9194/artifact/patchprocess/testrun_hadoop-yarn-api.txt
 |
| hadoop-yarn-common test log | 
https://builds.apache.org/job/PreCommit-YARN-Build/9194/artifact/patchprocess/testrun_hadoop-yarn-common.txt
 |
| hadoop-yarn-server-tests test log | 
https://builds.apache.org/job/PreCommit-YARN-Build/9194/artifact/patchprocess/testrun_hadoop-yarn-server-tests.txt
 |
| hadoop-yarn-server-timelineservice test log | 
https://builds.apache.org/job/PreCommit-YARN-Build/9194/artifact/patchprocess/testrun_hadoop-yarn-server-timelineservice.txt
 |
| Test Results | 
https://builds.apache.org/job/PreCommit-YARN-Build/9194/testReport/ |
| Java | 1.7.0_55 |
| uname | Linux asf907.gq1.ygridcore.net 3.13.0-36-lowlatency #63-Ubuntu SMP 
PREEMPT Wed Sep 3 21:56:12 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux |
| Console output | 
https://builds.apache.org/job/PreCommit-YARN-Build/9194/console |


This message was automatically generated.

> [timeline reader] implement support for querying for flows and flow runs
> 
>
> Key: YARN-4074
> URL: https://issues.apache.org/jira/browse/YARN-4074
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: timelineserver
>Affects Versions: YARN-2928
>Reporter: Sangjin Lee
>Assignee: Sangjin Lee
> Attachments: YARN-4074-YARN-2928.007.patch, 
> YARN-4074-YARN-2928.POC.001.patch, YARN-4074-YARN-2928.POC.002.patch, 
> YARN-4074-YARN-2928.POC.003.patch, YARN-4074-YARN-2928.POC.004.patch, 
> YARN-4074-YARN-2928.POC.005.patch, YARN-4074-YARN-2928.POC.006.patch
>
>
> Implement support for querying for flows and flow runs.
> We should be able to query for the most recent N flows, etc.
> This includes changes to the {{TimelineReader}} API if necessary, as well as 
> implementation of the API.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-4074) [timeline reader] implement support for querying for flows and flow runs

2015-09-17 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-4074?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14804597#comment-14804597
 ] 

Hadoop QA commented on YARN-4074:
-

\\
\\
| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:red}-1{color} | pre-patch |  17m 56s | Findbugs (version ) appears to 
be broken on YARN-2928. |
| {color:green}+1{color} | @author |   0m  0s | The patch does not contain any 
@author tags. |
| {color:green}+1{color} | tests included |   0m  0s | The patch appears to 
include 6 new or modified test files. |
| {color:green}+1{color} | javac |   7m 59s | There were no new javac warning 
messages. |
| {color:green}+1{color} | javadoc |  10m  9s | There were no new javadoc 
warning messages. |
| {color:green}+1{color} | release audit |   0m 25s | The applied patch does 
not increase the total number of release audit warnings. |
| {color:green}+1{color} | checkstyle |   2m  5s | There were no new checkstyle 
issues. |
| {color:green}+1{color} | whitespace |   0m 32s | The patch has no lines that 
end in whitespace. |
| {color:green}+1{color} | install |   1m 36s | mvn install still works. |
| {color:green}+1{color} | eclipse:eclipse |   0m 41s | The patch built with 
eclipse:eclipse. |
| {color:green}+1{color} | findbugs |   4m 46s | The patch does not introduce 
any new Findbugs (version 3.0.0) warnings. |
| {color:green}+1{color} | yarn tests |   0m 23s | Tests passed in 
hadoop-yarn-api. |
| {color:green}+1{color} | yarn tests |   1m 59s | Tests passed in 
hadoop-yarn-common. |
| {color:green}+1{color} | yarn tests |   2m 29s | Tests passed in 
hadoop-yarn-server-tests. |
| {color:green}+1{color} | yarn tests |   2m 46s | Tests passed in 
hadoop-yarn-server-timelineservice. |
| | |  53m 56s | |
\\
\\
|| Subsystem || Report/Notes ||
| Patch URL | 
http://issues.apache.org/jira/secure/attachment/12757157/YARN-4074-YARN-2928.008.patch
 |
| Optional Tests | javadoc javac unit findbugs checkstyle |
| git revision | YARN-2928 / 4b37985 |
| hadoop-yarn-api test log | 
https://builds.apache.org/job/PreCommit-YARN-Build/9199/artifact/patchprocess/testrun_hadoop-yarn-api.txt
 |
| hadoop-yarn-common test log | 
https://builds.apache.org/job/PreCommit-YARN-Build/9199/artifact/patchprocess/testrun_hadoop-yarn-common.txt
 |
| hadoop-yarn-server-tests test log | 
https://builds.apache.org/job/PreCommit-YARN-Build/9199/artifact/patchprocess/testrun_hadoop-yarn-server-tests.txt
 |
| hadoop-yarn-server-timelineservice test log | 
https://builds.apache.org/job/PreCommit-YARN-Build/9199/artifact/patchprocess/testrun_hadoop-yarn-server-timelineservice.txt
 |
| Test Results | 
https://builds.apache.org/job/PreCommit-YARN-Build/9199/testReport/ |
| Java | 1.7.0_55 |
| uname | Linux asf909.gq1.ygridcore.net 3.13.0-36-lowlatency #63-Ubuntu SMP 
PREEMPT Wed Sep 3 21:56:12 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux |
| Console output | 
https://builds.apache.org/job/PreCommit-YARN-Build/9199/console |


This message was automatically generated.

> [timeline reader] implement support for querying for flows and flow runs
> 
>
> Key: YARN-4074
> URL: https://issues.apache.org/jira/browse/YARN-4074
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: timelineserver
>Affects Versions: YARN-2928
>Reporter: Sangjin Lee
>Assignee: Sangjin Lee
> Attachments: YARN-4074-YARN-2928.007.patch, 
> YARN-4074-YARN-2928.008.patch, YARN-4074-YARN-2928.POC.001.patch, 
> YARN-4074-YARN-2928.POC.002.patch, YARN-4074-YARN-2928.POC.003.patch, 
> YARN-4074-YARN-2928.POC.004.patch, YARN-4074-YARN-2928.POC.005.patch, 
> YARN-4074-YARN-2928.POC.006.patch
>
>
> Implement support for querying for flows and flow runs.
> We should be able to query for the most recent N flows, etc.
> This includes changes to the {{TimelineReader}} API if necessary, as well as 
> implementation of the API.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-4074) [timeline reader] implement support for querying for flows and flow runs

2015-09-17 Thread Sangjin Lee (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-4074?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14803340#comment-14803340
 ] 

Sangjin Lee commented on YARN-4074:
---

That's correct. In other words, those are used to do {{getEntity()}}. That's 
why I said "a reader for *single-entity reads*" (plural "reads"), as opposed to 
"a reader for a single entity read".

> [timeline reader] implement support for querying for flows and flow runs
> 
>
> Key: YARN-4074
> URL: https://issues.apache.org/jira/browse/YARN-4074
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: timelineserver
>Affects Versions: YARN-2928
>Reporter: Sangjin Lee
>Assignee: Sangjin Lee
> Attachments: YARN-4074-YARN-2928.007.patch, 
> YARN-4074-YARN-2928.POC.001.patch, YARN-4074-YARN-2928.POC.002.patch, 
> YARN-4074-YARN-2928.POC.003.patch, YARN-4074-YARN-2928.POC.004.patch, 
> YARN-4074-YARN-2928.POC.005.patch, YARN-4074-YARN-2928.POC.006.patch
>
>
> Implement support for querying for flows and flow runs.
> We should be able to query for the most recent N flows, etc.
> This includes changes to the {{TimelineReader}} API if necessary, as well as 
> implementation of the API.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-4074) [timeline reader] implement support for querying for flows and flow runs

2015-09-16 Thread Varun Saxena (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-4074?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14790830#comment-14790830
 ] 

Varun Saxena commented on YARN-4074:


The patch looks fine to me. I tested some parts related to flow as well.

I see that in flow activity table, the row key is cluster id followed by an 
inverted timestamp. This I guess is to retrieve entities by a certain time 
range. I havent added this in REST related JIRA and dont see support even here. 
Will handle it after PoC I guess. Correct ?
Also I had added user as an optional query param in REST API code. I think 
querying by user wont really be a good idea looking at the row key. Will remove 
it.

The major code added in this patch is about the different readers based on 
table and a factory. It looks fine. However, number of parameters to the 
methods are quite a lot just like Reader API. As you mentioned elsewhere, maybe 
I can refactor this later. We should club together some things into logical 
things like context, filters, etc. That will reduce number of params.

In the factory class, we have a sequence of if-else statements. Although its a 
matter of perspective, sequence of if-else look a little inelegant. But we may 
not have too many great options here. Thought of enums i.e. having create 
methods with implementation tied to each enum but entity type enum is not HBase 
specific. Any other option ? I guess if-else should be fine for now because not 
too many tables should be added in future, if any.

In xxxEntityReader classes, maybe createTable should be renamed to getTable 
because we are not really creating any table here. We are just getting/creating 
a table object. 
Also if I am not wrong, no need to create this object again and again as well. 
All it really holds is static information such as table name and conf.


> [timeline reader] implement support for querying for flows and flow runs
> 
>
> Key: YARN-4074
> URL: https://issues.apache.org/jira/browse/YARN-4074
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: timelineserver
>Affects Versions: YARN-2928
>Reporter: Sangjin Lee
>Assignee: Sangjin Lee
> Attachments: YARN-4074-YARN-2928.POC.001.patch, 
> YARN-4074-YARN-2928.POC.002.patch, YARN-4074-YARN-2928.POC.003.patch, 
> YARN-4074-YARN-2928.POC.004.patch, YARN-4074-YARN-2928.POC.005.patch
>
>
> Implement support for querying for flows and flow runs.
> We should be able to query for the most recent N flows, etc.
> This includes changes to the {{TimelineReader}} API if necessary, as well as 
> implementation of the API.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-4074) [timeline reader] implement support for querying for flows and flow runs

2015-09-16 Thread Sangjin Lee (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-4074?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14790948#comment-14790948
 ] 

Sangjin Lee commented on YARN-4074:
---

Thanks for your comments [~varun_saxena]!

bq. This I guess is to retrieve entities by a certain time range. I havent 
added this in REST related JIRA and dont see support even here. Will handle it 
after PoC I guess. Correct ?

The flow activity table is a time-based set of data. The timestamp (day marker 
really) is there to order the activity in time. It is feasible to query the 
flow activity table based on time (e.g. "give me all the activity in the past 3 
days"). I didn't get around to it, but it should be pretty straightforward to 
support that after the POC. I'll file a JIRA for adding that support.

bq. Also I had added user as an optional query param in REST API code. I think 
querying by user wont really be a good idea looking at the row key. Will remove 
it.

Yes, the way the data is laid out, cluster + user will not be an efficient 
query, as time is the component that gets in before the user.

{quote}
The major code added in this patch is about the different readers based on 
table and a factory. It looks fine. However, number of parameters to the 
methods are quite a lot just like Reader API. As you mentioned elsewhere, maybe 
I can refactor this later. We should club together some things into logical 
things like context, filters, etc. That will reduce number of params.
{quote}

That is spot on. I really didn't like having to repeat the long list of 
arguments. But since you're looking into a better way of capturing the filters 
and predicates, I'm not really changing things as part of this JIRA. Hope that 
is consistent with your understanding.

{quote}
In the factory class, we have a sequence of if-else statements. Although its a 
matter of perspective, sequence of if-else look a little inelegant. But we may 
not have too many great options here. Thought of enums i.e. having create 
methods with implementation tied to each enum but entity type enum is not HBase 
specific. Any other option ? I guess if-else should be fine for now because not 
too many tables should be added in future, if any.
{quote}

I agree. I wanted to use the switch-case statements, but the main issue was 
that the input is string, not enums. If it were enums, it could have been 
trivial...

{quote}
In xxxEntityReader classes, maybe createTable should be renamed to getTable 
because we are not really creating any table here. We are just getting/creating 
a table object. 
Also if I am not wrong, no need to create this object again and again as well. 
All it really holds is static information such as table name and conf.
{quote}

Those are good suggestions. Yes, the {{BaseTable}} instances are thread-safe, 
and I think they can be reused. I'll update the patch to make those changes.

> [timeline reader] implement support for querying for flows and flow runs
> 
>
> Key: YARN-4074
> URL: https://issues.apache.org/jira/browse/YARN-4074
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: timelineserver
>Affects Versions: YARN-2928
>Reporter: Sangjin Lee
>Assignee: Sangjin Lee
> Attachments: YARN-4074-YARN-2928.POC.001.patch, 
> YARN-4074-YARN-2928.POC.002.patch, YARN-4074-YARN-2928.POC.003.patch, 
> YARN-4074-YARN-2928.POC.004.patch, YARN-4074-YARN-2928.POC.005.patch
>
>
> Implement support for querying for flows and flow runs.
> We should be able to query for the most recent N flows, etc.
> This includes changes to the {{TimelineReader}} API if necessary, as well as 
> implementation of the API.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-4074) [timeline reader] implement support for querying for flows and flow runs

2015-09-16 Thread Joep Rottinghuis (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-4074?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14791493#comment-14791493
 ] 

Joep Rottinghuis commented on YARN-4074:


Wrt. javadoc comment and method names: "Instantiates a reader for single-entity 
reads." refers to the # of entities returned in each query, and not the number 
of times that reader can be used to issue a read right?

> [timeline reader] implement support for querying for flows and flow runs
> 
>
> Key: YARN-4074
> URL: https://issues.apache.org/jira/browse/YARN-4074
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: timelineserver
>Affects Versions: YARN-2928
>Reporter: Sangjin Lee
>Assignee: Sangjin Lee
> Attachments: YARN-4074-YARN-2928.POC.001.patch, 
> YARN-4074-YARN-2928.POC.002.patch, YARN-4074-YARN-2928.POC.003.patch, 
> YARN-4074-YARN-2928.POC.004.patch, YARN-4074-YARN-2928.POC.005.patch, 
> YARN-4074-YARN-2928.POC.006.patch
>
>
> Implement support for querying for flows and flow runs.
> We should be able to query for the most recent N flows, etc.
> This includes changes to the {{TimelineReader}} API if necessary, as well as 
> implementation of the API.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-4074) [timeline reader] implement support for querying for flows and flow runs

2015-09-16 Thread Joep Rottinghuis (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-4074?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14791504#comment-14791504
 ] 

Joep Rottinghuis commented on YARN-4074:


In TimelineEntityReader#readMetrics it seems safe to assume that if we have 
more than one value that this is a TimelineMetric.Type.TIME_SERIES.
Conversely it doesn't have to be true though right? I guess we'll just assume 
that for timelines we'd never have just one value? I can't quite oversee the 
impact of incorrectly assuming TimelineMetric.Type.SINGLE_VALUE if only one 
value has been written to HBase yet.

Wrt. ApplicationRowKey: at some point (perhaps not this jira) we should 
consider making the app_id a compound object that is stored with a ? separator. 
The prefix (in most cases in yarn right now would be "application_") would be 
separate and the RM start time and the final numeric part would be stored as a 
numerical value with a separate Bytes.to... conversion.

Otherwise we'll end up getting incorrect order for rowkeys when the application 
id wraps to 10K and each power of ten after that. For example, lexically 
application_1442351767756_1 < application_1442351767756_

If we just access the application by specific key this doesn't matter, but if 
we do a row-scan and count on ordering to set an appropriate stop on the scan, 
we'll break things.
This happens on all rowkeys with the app_id in it.

> [timeline reader] implement support for querying for flows and flow runs
> 
>
> Key: YARN-4074
> URL: https://issues.apache.org/jira/browse/YARN-4074
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: timelineserver
>Affects Versions: YARN-2928
>Reporter: Sangjin Lee
>Assignee: Sangjin Lee
> Attachments: YARN-4074-YARN-2928.POC.001.patch, 
> YARN-4074-YARN-2928.POC.002.patch, YARN-4074-YARN-2928.POC.003.patch, 
> YARN-4074-YARN-2928.POC.004.patch, YARN-4074-YARN-2928.POC.005.patch, 
> YARN-4074-YARN-2928.POC.006.patch
>
>
> Implement support for querying for flows and flow runs.
> We should be able to query for the most recent N flows, etc.
> This includes changes to the {{TimelineReader}} API if necessary, as well as 
> implementation of the API.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-4074) [timeline reader] implement support for querying for flows and flow runs

2015-09-15 Thread Varun Saxena (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-4074?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14745672#comment-14745672
 ] 

Varun Saxena commented on YARN-4074:


Thanks [~sjlee0] for updating the patch. Will have a look at it.

> [timeline reader] implement support for querying for flows and flow runs
> 
>
> Key: YARN-4074
> URL: https://issues.apache.org/jira/browse/YARN-4074
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: timelineserver
>Affects Versions: YARN-2928
>Reporter: Sangjin Lee
>Assignee: Sangjin Lee
> Attachments: YARN-4074-YARN-2928.POC.001.patch, 
> YARN-4074-YARN-2928.POC.002.patch, YARN-4074-YARN-2928.POC.003.patch, 
> YARN-4074-YARN-2928.POC.004.patch, YARN-4074-YARN-2928.POC.005.patch
>
>
> Implement support for querying for flows and flow runs.
> We should be able to query for the most recent N flows, etc.
> This includes changes to the {{TimelineReader}} API if necessary, as well as 
> implementation of the API.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-4074) [timeline reader] implement support for querying for flows and flow runs

2015-09-08 Thread Sangjin Lee (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-4074?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14735545#comment-14735545
 ] 

Sangjin Lee commented on YARN-4074:
---

It'd be great if you could take a look at the latest patch and let me know your 
feedback. Thanks!

> [timeline reader] implement support for querying for flows and flow runs
> 
>
> Key: YARN-4074
> URL: https://issues.apache.org/jira/browse/YARN-4074
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: timelineserver
>Affects Versions: YARN-2928
>Reporter: Sangjin Lee
>Assignee: Sangjin Lee
> Attachments: YARN-4074-YARN-2928.POC.001.patch, 
> YARN-4074-YARN-2928.POC.002.patch, YARN-4074-YARN-2928.POC.003.patch
>
>
> Implement support for querying for flows and flow runs.
> We should be able to query for the most recent N flows, etc.
> This includes changes to the {{TimelineReader}} API if necessary, as well as 
> implementation of the API.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-4074) [timeline reader] implement support for querying for flows and flow runs

2015-09-03 Thread Sangjin Lee (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-4074?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14729556#comment-14729556
 ] 

Sangjin Lee commented on YARN-4074:
---

Just to be clear, the current POC patch already handles the null case, and I'm 
going to update it to check for negative values. Is that reasonable?

> [timeline reader] implement support for querying for flows and flow runs
> 
>
> Key: YARN-4074
> URL: https://issues.apache.org/jira/browse/YARN-4074
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: timelineserver
>Affects Versions: YARN-2928
>Reporter: Sangjin Lee
>Assignee: Sangjin Lee
> Attachments: YARN-4074-YARN-2928.POC.001.patch, 
> YARN-4074-YARN-2928.POC.002.patch
>
>
> Implement support for querying for flows and flow runs.
> We should be able to query for the most recent N flows, etc.
> This includes changes to the {{TimelineReader}} API if necessary, as well as 
> implementation of the API.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-4074) [timeline reader] implement support for querying for flows and flow runs

2015-09-03 Thread Varun Saxena (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-4074?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14729515#comment-14729515
 ] 

Varun Saxena commented on YARN-4074:


The 2nd point I guess even I can handle even in YARN-4075. I can verify limit 
and if its 0 or negative, forward null to storage layer. If its null, 
DEFAULT_LIMIT will be applied.

> [timeline reader] implement support for querying for flows and flow runs
> 
>
> Key: YARN-4074
> URL: https://issues.apache.org/jira/browse/YARN-4074
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: timelineserver
>Affects Versions: YARN-2928
>Reporter: Sangjin Lee
>Assignee: Sangjin Lee
> Attachments: YARN-4074-YARN-2928.POC.001.patch, 
> YARN-4074-YARN-2928.POC.002.patch
>
>
> Implement support for querying for flows and flow runs.
> We should be able to query for the most recent N flows, etc.
> This includes changes to the {{TimelineReader}} API if necessary, as well as 
> implementation of the API.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-4074) [timeline reader] implement support for querying for flows and flow runs

2015-09-03 Thread Varun Saxena (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-4074?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14729558#comment-14729558
 ] 

Varun Saxena commented on YARN-4074:


[~sjlee0], yeah it handles null case. I meant we can handle negative values 
even at the REST layer(as part of YARN-4075) i.e. if the limit is negative I 
can forward null to storage layer which would mean default limit being applied. 

> [timeline reader] implement support for querying for flows and flow runs
> 
>
> Key: YARN-4074
> URL: https://issues.apache.org/jira/browse/YARN-4074
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: timelineserver
>Affects Versions: YARN-2928
>Reporter: Sangjin Lee
>Assignee: Sangjin Lee
> Attachments: YARN-4074-YARN-2928.POC.001.patch, 
> YARN-4074-YARN-2928.POC.002.patch
>
>
> Implement support for querying for flows and flow runs.
> We should be able to query for the most recent N flows, etc.
> This includes changes to the {{TimelineReader}} API if necessary, as well as 
> implementation of the API.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-4074) [timeline reader] implement support for querying for flows and flow runs

2015-09-03 Thread Varun Saxena (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-4074?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14729560#comment-14729560
 ] 

Varun Saxena commented on YARN-4074:


Its fine though if you are handling negatives as part of YARN-4074.

> [timeline reader] implement support for querying for flows and flow runs
> 
>
> Key: YARN-4074
> URL: https://issues.apache.org/jira/browse/YARN-4074
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: timelineserver
>Affects Versions: YARN-2928
>Reporter: Sangjin Lee
>Assignee: Sangjin Lee
> Attachments: YARN-4074-YARN-2928.POC.001.patch, 
> YARN-4074-YARN-2928.POC.002.patch
>
>
> Implement support for querying for flows and flow runs.
> We should be able to query for the most recent N flows, etc.
> This includes changes to the {{TimelineReader}} API if necessary, as well as 
> implementation of the API.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-4074) [timeline reader] implement support for querying for flows and flow runs

2015-09-02 Thread Varun Saxena (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-4074?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14727402#comment-14727402
 ] 

Varun Saxena commented on YARN-4074:


Few more comments.
* {{Scan#setMaxResultSize}} only limits the number of rows fetched from server 
to client in a single call. If more rows are available, they are still fetched 
when {{ResultScanner#next}} is invoked. This leads to more entities than the 
limit being returned. setMaxResultSize works similar to JDBCs' 
ResultSet#setFetchSize
So to apply limits in getFlowActivityEntities, we need to have a check for 
limit in for loop as well in conjunction to using setMaxResultSize.

* How do we handle the case of limit being 0 or negative ? In FS based impl, I 
had changed limit to DEFAULT_LIMIT in both the cases. Do the same here ?

> [timeline reader] implement support for querying for flows and flow runs
> 
>
> Key: YARN-4074
> URL: https://issues.apache.org/jira/browse/YARN-4074
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: timelineserver
>Affects Versions: YARN-2928
>Reporter: Sangjin Lee
>Assignee: Sangjin Lee
> Attachments: YARN-4074-YARN-2928.POC.001.patch, 
> YARN-4074-YARN-2928.POC.002.patch
>
>
> Implement support for querying for flows and flow runs.
> We should be able to query for the most recent N flows, etc.
> This includes changes to the {{TimelineReader}} API if necessary, as well as 
> implementation of the API.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-4074) [timeline reader] implement support for querying for flows and flow runs

2015-09-01 Thread Varun Saxena (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-4074?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14725914#comment-14725914
 ] 

Varun Saxena commented on YARN-4074:


Few comments :

# As TreeSet has been used to sort FlowActivityEntity objects in 
{{HbaseTimelineReaderImpl#getFlowActivityEntities}}, we should either set the 
fields used in {{TimelineEntity#compareTo}} or provide a compareTo method in 
FlowActivityEntity class itself. Otherwise I guess entities need to be sorted 
by date. In which case latter needs to be done.

# In parameterized constructor, {{FlowActivityEntity(String cluster, long time, 
String user, String flowName)}}  better to pass the type into superclass 
constructor ? type can be useful in JSON response ?

# Things like configs have no meaning for flows. So things which have no 
meaning in FlowEntity and FlowActivityEntity should we explicitly set them to 
null ?No need to send them in JSON(even if empty) I guess. 


> [timeline reader] implement support for querying for flows and flow runs
> 
>
> Key: YARN-4074
> URL: https://issues.apache.org/jira/browse/YARN-4074
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: timelineserver
>Affects Versions: YARN-2928
>Reporter: Sangjin Lee
>Assignee: Sangjin Lee
> Attachments: YARN-4074-YARN-2928.POC.001.patch, 
> YARN-4074-YARN-2928.POC.002.patch
>
>
> Implement support for querying for flows and flow runs.
> We should be able to query for the most recent N flows, etc.
> This includes changes to the {{TimelineReader}} API if necessary, as well as 
> implementation of the API.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-4074) [timeline reader] implement support for querying for flows and flow runs

2015-09-01 Thread Varun Saxena (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-4074?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14725918#comment-14725918
 ] 

Varun Saxena commented on YARN-4074:


Correction to point 1 :
As TreeSet has been used to sort FlowActivityEntity objects in 
HbaseTimelineReaderImpl#getFlowActivityEntities, we should either set the 
fields used in TimelineEntity#compareTo or provide a compareTo method in 
FlowActivityEntity class itself. Otherwise *it leads to NPE*. I guess entities 
need to be sorted by date. In which case latter needs to be done.

> [timeline reader] implement support for querying for flows and flow runs
> 
>
> Key: YARN-4074
> URL: https://issues.apache.org/jira/browse/YARN-4074
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: timelineserver
>Affects Versions: YARN-2928
>Reporter: Sangjin Lee
>Assignee: Sangjin Lee
> Attachments: YARN-4074-YARN-2928.POC.001.patch, 
> YARN-4074-YARN-2928.POC.002.patch
>
>
> Implement support for querying for flows and flow runs.
> We should be able to query for the most recent N flows, etc.
> This includes changes to the {{TimelineReader}} API if necessary, as well as 
> implementation of the API.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-4074) [timeline reader] implement support for querying for flows and flow runs

2015-09-01 Thread Varun Saxena (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-4074?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14724856#comment-14724856
 ] 

Varun Saxena commented on YARN-4074:


Agree...+1 to changing the name.

> [timeline reader] implement support for querying for flows and flow runs
> 
>
> Key: YARN-4074
> URL: https://issues.apache.org/jira/browse/YARN-4074
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: timelineserver
>Affects Versions: YARN-2928
>Reporter: Sangjin Lee
>Assignee: Sangjin Lee
> Attachments: YARN-4074-YARN-2928.POC.001.patch, 
> YARN-4074-YARN-2928.POC.002.patch
>
>
> Implement support for querying for flows and flow runs.
> We should be able to query for the most recent N flows, etc.
> This includes changes to the {{TimelineReader}} API if necessary, as well as 
> implementation of the API.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-4074) [timeline reader] implement support for querying for flows and flow runs

2015-09-01 Thread Sangjin Lee (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-4074?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14726295#comment-14726295
 ] 

Sangjin Lee commented on YARN-4074:
---

I'm addressing [~varun_saxena]'s latest comments. Thanks for those.

> [timeline reader] implement support for querying for flows and flow runs
> 
>
> Key: YARN-4074
> URL: https://issues.apache.org/jira/browse/YARN-4074
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: timelineserver
>Affects Versions: YARN-2928
>Reporter: Sangjin Lee
>Assignee: Sangjin Lee
> Attachments: YARN-4074-YARN-2928.POC.001.patch, 
> YARN-4074-YARN-2928.POC.002.patch
>
>
> Implement support for querying for flows and flow runs.
> We should be able to query for the most recent N flows, etc.
> This includes changes to the {{TimelineReader}} API if necessary, as well as 
> implementation of the API.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-4074) [timeline reader] implement support for querying for flows and flow runs

2015-08-31 Thread Varun Saxena (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-4074?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14723646#comment-14723646
 ] 

Varun Saxena commented on YARN-4074:


Moreover, do we return metrics at all times for a flow run ? Or make returning 
on metric field conditional ? Should be fine as it is a single flow run. Just 
confirming because then I would not need fields as a query parameter in REST.

> [timeline reader] implement support for querying for flows and flow runs
> 
>
> Key: YARN-4074
> URL: https://issues.apache.org/jira/browse/YARN-4074
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: timelineserver
>Affects Versions: YARN-2928
>Reporter: Sangjin Lee
>Assignee: Sangjin Lee
> Attachments: YARN-4074-YARN-2928.POC.001.patch, 
> YARN-4074-YARN-2928.POC.002.patch
>
>
> Implement support for querying for flows and flow runs.
> We should be able to query for the most recent N flows, etc.
> This includes changes to the {{TimelineReader}} API if necessary, as well as 
> implementation of the API.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-4074) [timeline reader] implement support for querying for flows and flow runs

2015-08-31 Thread Vrushali C (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-4074?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14723678#comment-14723678
 ] 

Vrushali C commented on YARN-4074:
--


bq. Should we support filtering on the basis of flow start time and probably 
end time as well ?

Going forward, yes, we do need to have a timerange parameter for queries. But 
for the PoC we work with what is being fetched/returned.

bq. Moreover, do we return metrics at all times for a flow run ? Or make 
returning on metric field conditional ? Just confirming because then I would 
not need fields as a query parameter in REST.

For the PoC we can return everything that is being fetched/ whatever is easier. 
But as such, we do need filtering out of metrics and returning subsets of 
metrics etc, so these query parameters would need to be worked out. We have to 
think about how we can allow for filtering of metrics, but we would anyways 
need a basic API that returns everything for a flow run, so I think for more 
enhancements can be added in later after the PoC, what do you think? 

> [timeline reader] implement support for querying for flows and flow runs
> 
>
> Key: YARN-4074
> URL: https://issues.apache.org/jira/browse/YARN-4074
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: timelineserver
>Affects Versions: YARN-2928
>Reporter: Sangjin Lee
>Assignee: Sangjin Lee
> Attachments: YARN-4074-YARN-2928.POC.001.patch, 
> YARN-4074-YARN-2928.POC.002.patch
>
>
> Implement support for querying for flows and flow runs.
> We should be able to query for the most recent N flows, etc.
> This includes changes to the {{TimelineReader}} API if necessary, as well as 
> implementation of the API.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-4074) [timeline reader] implement support for querying for flows and flow runs

2015-08-31 Thread Varun Saxena (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-4074?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14723733#comment-14723733
 ] 

Varun Saxena commented on YARN-4074:


Ok...For PoC, this should be fine.

> [timeline reader] implement support for querying for flows and flow runs
> 
>
> Key: YARN-4074
> URL: https://issues.apache.org/jira/browse/YARN-4074
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: timelineserver
>Affects Versions: YARN-2928
>Reporter: Sangjin Lee
>Assignee: Sangjin Lee
> Attachments: YARN-4074-YARN-2928.POC.001.patch, 
> YARN-4074-YARN-2928.POC.002.patch
>
>
> Implement support for querying for flows and flow runs.
> We should be able to query for the most recent N flows, etc.
> This includes changes to the {{TimelineReader}} API if necessary, as well as 
> implementation of the API.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-4074) [timeline reader] implement support for querying for flows and flow runs

2015-08-31 Thread Sangjin Lee (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-4074?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14723773#comment-14723773
 ] 

Sangjin Lee commented on YARN-4074:
---

Just to clarify on the flow run "end time". Note that there is no formal 
definition of the "end time" of a flow run, in the absence of a formal flow API 
in YARN. The "end time" in the flow run is the latest end time of an 
application that is part of that flow run. Just so that we're clear on that 
definition.

On a related note, there is no formal definition of the state of a flow run 
(i.e. we cannot say with certainty whether a flow run is ended). The only 
definitive thing we can say about this is if a flow run is still running, which 
can be determined by having a running app.

> [timeline reader] implement support for querying for flows and flow runs
> 
>
> Key: YARN-4074
> URL: https://issues.apache.org/jira/browse/YARN-4074
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: timelineserver
>Affects Versions: YARN-2928
>Reporter: Sangjin Lee
>Assignee: Sangjin Lee
> Attachments: YARN-4074-YARN-2928.POC.001.patch, 
> YARN-4074-YARN-2928.POC.002.patch
>
>
> Implement support for querying for flows and flow runs.
> We should be able to query for the most recent N flows, etc.
> This includes changes to the {{TimelineReader}} API if necessary, as well as 
> implementation of the API.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-4074) [timeline reader] implement support for querying for flows and flow runs

2015-08-31 Thread Sangjin Lee (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-4074?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14723790#comment-14723790
 ] 

Sangjin Lee commented on YARN-4074:
---

Also, while we're at it, I find the name {{FlowEntity}} quite confusing as it 
really encapsulates a flow run. I find myself having to comment that it is a 
flow run constantly.

Would there be an appetite for renaming this class to {{FlowRunEntity}} as part 
of this? The impact should be minimal. Let me know your thoughts.

> [timeline reader] implement support for querying for flows and flow runs
> 
>
> Key: YARN-4074
> URL: https://issues.apache.org/jira/browse/YARN-4074
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: timelineserver
>Affects Versions: YARN-2928
>Reporter: Sangjin Lee
>Assignee: Sangjin Lee
> Attachments: YARN-4074-YARN-2928.POC.001.patch, 
> YARN-4074-YARN-2928.POC.002.patch
>
>
> Implement support for querying for flows and flow runs.
> We should be able to query for the most recent N flows, etc.
> This includes changes to the {{TimelineReader}} API if necessary, as well as 
> implementation of the API.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-4074) [timeline reader] implement support for querying for flows and flow runs

2015-08-31 Thread Li Lu (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-4074?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14724393#comment-14724393
 ] 

Li Lu commented on YARN-4074:
-

bq. Would there be an appetite for renaming this class to FlowRunEntity as part 
of this? The impact should be minimal. Let me know your thoughts.
LGTM. IMO we only have flow run objects, but no actual flow objects? Even in 
flow-based offline aggregation we do not instantiate "flow" objects. This JIRA 
appears to be the right place to fix it. 

> [timeline reader] implement support for querying for flows and flow runs
> 
>
> Key: YARN-4074
> URL: https://issues.apache.org/jira/browse/YARN-4074
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: timelineserver
>Affects Versions: YARN-2928
>Reporter: Sangjin Lee
>Assignee: Sangjin Lee
> Attachments: YARN-4074-YARN-2928.POC.001.patch, 
> YARN-4074-YARN-2928.POC.002.patch
>
>
> Implement support for querying for flows and flow runs.
> We should be able to query for the most recent N flows, etc.
> This includes changes to the {{TimelineReader}} API if necessary, as well as 
> implementation of the API.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-4074) [timeline reader] implement support for querying for flows and flow runs

2015-08-31 Thread Li Lu (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-4074?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14724534#comment-14724534
 ] 

Li Lu commented on YARN-4074:
-

bq. As a general question, since we're returning our timeline entities as jsons 
in our web service, we need to some sort "rebuild" those entities on the js 
client side, right? If this is the case, we need to provide some js object 
model to be consistent with our TimelineEntity object model? I'm not a 
front-end expert so I'd like to learn the typical practice on this problem.
bq. I'm not intimately familiar with that either. I hope someone who's familiar 
could comment?

I was told by the HDFS community that they are using Dust 
(github.com/linkedin/dustjs) templates to do this. They have code available in 
our codebase as well. I'm planning to look into this framework in our POC 
(YARN-4097). 

> [timeline reader] implement support for querying for flows and flow runs
> 
>
> Key: YARN-4074
> URL: https://issues.apache.org/jira/browse/YARN-4074
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: timelineserver
>Affects Versions: YARN-2928
>Reporter: Sangjin Lee
>Assignee: Sangjin Lee
> Attachments: YARN-4074-YARN-2928.POC.001.patch, 
> YARN-4074-YARN-2928.POC.002.patch
>
>
> Implement support for querying for flows and flow runs.
> We should be able to query for the most recent N flows, etc.
> This includes changes to the {{TimelineReader}} API if necessary, as well as 
> implementation of the API.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-4074) [timeline reader] implement support for querying for flows and flow runs

2015-08-31 Thread Varun Saxena (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-4074?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14723307#comment-14723307
 ] 

Varun Saxena commented on YARN-4074:


Just had a cursory glance at the patch. A couple of points.
# We will be returning a set of {{FlowActivityEntity}} to the user. Should this 
class be in hadoop-yarn-api instead then ? Because client will have to parse 
JSON as a set of FlowActivityEntity objects.
# Should we support filtering on the basis of flow start time and probably end 
time as well ?


> [timeline reader] implement support for querying for flows and flow runs
> 
>
> Key: YARN-4074
> URL: https://issues.apache.org/jira/browse/YARN-4074
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: timelineserver
>Affects Versions: YARN-2928
>Reporter: Sangjin Lee
>Assignee: Sangjin Lee
> Attachments: YARN-4074-YARN-2928.POC.001.patch, 
> YARN-4074-YARN-2928.POC.002.patch
>
>
> Implement support for querying for flows and flow runs.
> We should be able to query for the most recent N flows, etc.
> This includes changes to the {{TimelineReader}} API if necessary, as well as 
> implementation of the API.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-4074) [timeline reader] implement support for querying for flows and flow runs

2015-08-28 Thread Varun Saxena (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-4074?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14718120#comment-14718120
 ] 

Varun Saxena commented on YARN-4074:


bq. So improving the API would be taken up by Varun. Varun?
Yes

 [timeline reader] implement support for querying for flows and flow runs
 

 Key: YARN-4074
 URL: https://issues.apache.org/jira/browse/YARN-4074
 Project: Hadoop YARN
  Issue Type: Sub-task
  Components: timelineserver
Affects Versions: YARN-2928
Reporter: Sangjin Lee
Assignee: Sangjin Lee
 Attachments: YARN-4074-YARN-2928.POC.001.patch, 
 YARN-4074-YARN-2928.POC.002.patch


 Implement support for querying for flows and flow runs.
 We should be able to query for the most recent N flows, etc.
 This includes changes to the {{TimelineReader}} API if necessary, as well as 
 implementation of the API.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-4074) [timeline reader] implement support for querying for flows and flow runs

2015-08-28 Thread Li Lu (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-4074?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14720246#comment-14720246
 ] 

Li Lu commented on YARN-4074:
-

bq. One thing I forgot to mention is that the current POC patch is a diff 
against the patch for YARN-3901, to be able to isolate the changes for this 
JIRA. 
Thanks for reminding this! I'll take a look at it shortly. 

 [timeline reader] implement support for querying for flows and flow runs
 

 Key: YARN-4074
 URL: https://issues.apache.org/jira/browse/YARN-4074
 Project: Hadoop YARN
  Issue Type: Sub-task
  Components: timelineserver
Affects Versions: YARN-2928
Reporter: Sangjin Lee
Assignee: Sangjin Lee
 Attachments: YARN-4074-YARN-2928.POC.001.patch, 
 YARN-4074-YARN-2928.POC.002.patch


 Implement support for querying for flows and flow runs.
 We should be able to query for the most recent N flows, etc.
 This includes changes to the {{TimelineReader}} API if necessary, as well as 
 implementation of the API.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-4074) [timeline reader] implement support for querying for flows and flow runs

2015-08-27 Thread Varun Saxena (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-4074?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14717073#comment-14717073
 ] 

Varun Saxena commented on YARN-4074:


Ok..will have a look. We dont need to support a query like list all the flow 
runs for a flow ?

 [timeline reader] implement support for querying for flows and flow runs
 

 Key: YARN-4074
 URL: https://issues.apache.org/jira/browse/YARN-4074
 Project: Hadoop YARN
  Issue Type: Sub-task
  Components: timelineserver
Affects Versions: YARN-2928
Reporter: Sangjin Lee
Assignee: Sangjin Lee
 Attachments: YARN-4074-YARN-2928.POC.001.patch


 Implement support for querying for flows and flow runs.
 We should be able to query for the most recent N flows, etc.
 This includes changes to the {{TimelineReader}} API if necessary, as well as 
 implementation of the API.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-4074) [timeline reader] implement support for querying for flows and flow runs

2015-08-27 Thread Sangjin Lee (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-4074?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14717146#comment-14717146
 ] 

Sangjin Lee commented on YARN-4074:
---

cc [~gtCarrera9] and [~vrushalic] also for their thoughts.

There are some options for this, and there are pros and cons. I'm leaning 
towards the current proposal ((1) below) for now, but we could enhance this 
later as the UI jells more.

# do a specific entity query for each of the flow runs obtained from the flow 
activity entity
# return all flow runs (possibly with limits and time windows) for the given 
flow
# do a single query for all flow runs specified as a list of flow run id's

One interesting thing to note is that a flow activity entity (record) is an 
activity of that flow *for a given day*. In other words, there can be multiple 
flow activity entities for the same flow. The flow runs that are returned in 
the flow activity entity are only for that given day.

Then the question is, when I click that flow activity record, what flow runs do 
I expect to see? It's bit ambiguous, but I think it might make more sense to 
return only the flow runs that are referenced in that particular day if we're 
using the flow activity to render the landing page.

If we assume that, then (2) is probably not needed for this. Then it leaves us 
with (1) or (3). The benefit of (1) is that it fits easily into the existing 
reader API (getEntity). The downside is that you may need to make multiple 
reader calls to retrieve flow runs But normally the number of flow runs in a 
day for a given flow should be very small, so it might not be a big deal.

One hybrid approach may be that the REST API supports URLs based on the list 
but the web service code can make multiple reader getEntity() calls. We'd still 
need to define the form of the URLs to support that type of queries.

Thoughts?

 [timeline reader] implement support for querying for flows and flow runs
 

 Key: YARN-4074
 URL: https://issues.apache.org/jira/browse/YARN-4074
 Project: Hadoop YARN
  Issue Type: Sub-task
  Components: timelineserver
Affects Versions: YARN-2928
Reporter: Sangjin Lee
Assignee: Sangjin Lee
 Attachments: YARN-4074-YARN-2928.POC.001.patch


 Implement support for querying for flows and flow runs.
 We should be able to query for the most recent N flows, etc.
 This includes changes to the {{TimelineReader}} API if necessary, as well as 
 implementation of the API.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-4074) [timeline reader] implement support for querying for flows and flow runs

2015-08-27 Thread Junping Du (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-4074?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14717257#comment-14717257
 ] 

Junping Du commented on YARN-4074:
--

Thanks for uploading a patch, [~sjlee0]! Sorry for coming late on this, but 
have a critical question on TimelineReader interface:
bq. Currently I am not planning to add new flow-specific methods to the 
TimelineReader interface.
If so , how to query lastest N records with existing getEntities() API? 
Actually, I think we should refactor existing getEntities() API before things 
get worse. It include too many parameters, and most of them are optional. This 
is very un-handy, easily cause bug and very hard to extend in future. 
Instead, we should define something like EntityFilter class to include most of 
these optional fields (include time range, topN, info/config/metric 
sub-filters, etc.) which also be extended easily for other filters in future. 
Thoughts?

Still in walking through your POC patch, more comments come after.



 [timeline reader] implement support for querying for flows and flow runs
 

 Key: YARN-4074
 URL: https://issues.apache.org/jira/browse/YARN-4074
 Project: Hadoop YARN
  Issue Type: Sub-task
  Components: timelineserver
Affects Versions: YARN-2928
Reporter: Sangjin Lee
Assignee: Sangjin Lee
 Attachments: YARN-4074-YARN-2928.POC.001.patch


 Implement support for querying for flows and flow runs.
 We should be able to query for the most recent N flows, etc.
 This includes changes to the {{TimelineReader}} API if necessary, as well as 
 implementation of the API.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-4074) [timeline reader] implement support for querying for flows and flow runs

2015-08-27 Thread Li Lu (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-4074?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14717758#comment-14717758
 ] 

Li Lu commented on YARN-4074:
-

Thank [~sjlee0]! I looked at the current POC patch and have some comments:
# In general, I'm OK with this approach. I think the current FlowEntity design 
should provide sufficient information for the web UI POC. 
# As a general question, since we're returning our timeline entities as jsons 
in our web service, we need to some sort rebuild those entities on the js 
client side, right? If this is the case, we need to provide some js object 
model to be consistent with our TimelineEntity object model? I'm not a 
front-end expert  so I'd like to learn the typical practice on this problem. 
# Please make sure, in the final patch, to change timeline schema creator so 
that we're consistent with the list of tables. Maybe we'd like to find some 
better ways to keep all these tables consistent within writer, reader and 
schema creator in future. 
# I agree with all of you guys that we may want to refactor the current 
implementation. For example, we may not want to dispatch incoming timeline 
entity to different tables by a list of if-statements (deciding which table to 
go has already caused me some confusion when working on the offline aggregator 
patch rebase)? Also, the parsing logic can also be easily isolated I believe? 
# Some changes in files like FlowActivityRowKey.java are not included in this 
patch? 


 [timeline reader] implement support for querying for flows and flow runs
 

 Key: YARN-4074
 URL: https://issues.apache.org/jira/browse/YARN-4074
 Project: Hadoop YARN
  Issue Type: Sub-task
  Components: timelineserver
Affects Versions: YARN-2928
Reporter: Sangjin Lee
Assignee: Sangjin Lee
 Attachments: YARN-4074-YARN-2928.POC.001.patch, 
 YARN-4074-YARN-2928.POC.002.patch


 Implement support for querying for flows and flow runs.
 We should be able to query for the most recent N flows, etc.
 This includes changes to the {{TimelineReader}} API if necessary, as well as 
 implementation of the API.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-4074) [timeline reader] implement support for querying for flows and flow runs

2015-08-27 Thread Li Lu (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-4074?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14717681#comment-14717681
 ] 

Li Lu commented on YARN-4074:
-

Hi [~sjlee0], so far the first option looks good to me. The upside of this is 
that it fits our web UI POC requirements fine, and it's relatively clean to 
maintain. The downside is that in order to support some complex use cases, we 
need to make some compositions. For the current stage I think it's fine and we 
can use it to bootstrap our web UI renderers. 

 [timeline reader] implement support for querying for flows and flow runs
 

 Key: YARN-4074
 URL: https://issues.apache.org/jira/browse/YARN-4074
 Project: Hadoop YARN
  Issue Type: Sub-task
  Components: timelineserver
Affects Versions: YARN-2928
Reporter: Sangjin Lee
Assignee: Sangjin Lee
 Attachments: YARN-4074-YARN-2928.POC.001.patch, 
 YARN-4074-YARN-2928.POC.002.patch


 Implement support for querying for flows and flow runs.
 We should be able to query for the most recent N flows, etc.
 This includes changes to the {{TimelineReader}} API if necessary, as well as 
 implementation of the API.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-4074) [timeline reader] implement support for querying for flows and flow runs

2015-08-27 Thread Junping Du (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-4074?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14717721#comment-14717721
 ] 

Junping Du commented on YARN-4074:
--

Ok. Have a separated JIRA to track this refactor work should be fine. Thanks 
for pointing to that JIRA. 

 [timeline reader] implement support for querying for flows and flow runs
 

 Key: YARN-4074
 URL: https://issues.apache.org/jira/browse/YARN-4074
 Project: Hadoop YARN
  Issue Type: Sub-task
  Components: timelineserver
Affects Versions: YARN-2928
Reporter: Sangjin Lee
Assignee: Sangjin Lee
 Attachments: YARN-4074-YARN-2928.POC.001.patch, 
 YARN-4074-YARN-2928.POC.002.patch


 Implement support for querying for flows and flow runs.
 We should be able to query for the most recent N flows, etc.
 This includes changes to the {{TimelineReader}} API if necessary, as well as 
 implementation of the API.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-4074) [timeline reader] implement support for querying for flows and flow runs

2015-08-27 Thread Sangjin Lee (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-4074?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14717855#comment-14717855
 ] 

Sangjin Lee commented on YARN-4074:
---

Thanks [~gtCarrera9] for your comments.

{quote}
As a general question, since we're returning our timeline entities as jsons in 
our web service, we need to some sort rebuild those entities on the js client 
side, right? If this is the case, we need to provide some js object model to be 
consistent with our TimelineEntity object model? I'm not a front-end expert so 
I'd like to learn the typical practice on this problem.
{quote}
I'm not intimately familiar with that either. I hope someone who's familiar 
could comment?

I'm going to do some refactoring to move away from the if-else branch (yuck). 
There are aspects such as input validation, getting results from HBase, and 
creating the entity objects that can be isolated more clearly. I need to give 
some more thoughts on how to encapsulate that more clearly. This has some 
bearing on the filter-related work that Varun is doing, so I'll try not to 
touch that area in this JIRA.

One thing I forgot to mention is that the current POC patch is a diff against 
the patch for YARN-3901, to be able to isolate the changes for this JIRA. The 
patch for YARN-3901 needs to be reviewed and committed before this can be. 
That's why this patch is missing what's included in the YARN-3901 patch.

 [timeline reader] implement support for querying for flows and flow runs
 

 Key: YARN-4074
 URL: https://issues.apache.org/jira/browse/YARN-4074
 Project: Hadoop YARN
  Issue Type: Sub-task
  Components: timelineserver
Affects Versions: YARN-2928
Reporter: Sangjin Lee
Assignee: Sangjin Lee
 Attachments: YARN-4074-YARN-2928.POC.001.patch, 
 YARN-4074-YARN-2928.POC.002.patch


 Implement support for querying for flows and flow runs.
 We should be able to query for the most recent N flows, etc.
 This includes changes to the {{TimelineReader}} API if necessary, as well as 
 implementation of the API.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-4074) [timeline reader] implement support for querying for flows and flow runs

2015-08-26 Thread Vrushali C (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-4074?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14715208#comment-14715208
 ] 

Vrushali C commented on YARN-4074:
--

My take is that we can make things as generic as possible but we should have 
separate apis for flows and flow runs. 

I had put up an initial proposal for flow based queries in ATS when we started 
off on this at 
https://issues.apache.org/jira/secure/attachment/12695071/Flow%20based%20queries.docx
 

I believe for the two queries you have listed above [~sjlee0], there would be 
two rest apis as:

1) Get All Flows
Path: /listFlows/cluster/
Returns: paginated list of apps with aggregated stats (to populate the flows 
list tab on the UI)
Sample URL:
http://timelineservice.example.com/ws/v2/listFlows/clusterid?limit=2startTime=20140510endTime=20140601
This would be an UI related aggregation query

2) Get specific Flow's runs
Path: /flow/cluster/user/flowName/[version]
Returns: list of flows
Sample URL: 
http://timelineservice.example.com/ws/v2/flow/clusterid/userName/someFlowName_idenitying_a_flow?limit=2startTime=1390939248000endTime=139361764800




 [timeline reader] implement support for querying for flows and flow runs
 

 Key: YARN-4074
 URL: https://issues.apache.org/jira/browse/YARN-4074
 Project: Hadoop YARN
  Issue Type: Sub-task
  Components: timelineserver
Affects Versions: YARN-2928
Reporter: Sangjin Lee
Assignee: Sangjin Lee

 Implement support for querying for flows and flow runs.
 We should be able to query for the most recent N flows, etc.
 This includes changes to the {{TimelineReader}} API if necessary, as well as 
 implementation of the API.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-4074) [timeline reader] implement support for querying for flows and flow runs

2015-08-26 Thread Sangjin Lee (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-4074?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14715765#comment-14715765
 ] 

Sangjin Lee commented on YARN-4074:
---

I am about 90% done with the POC patch for this. I'm shooting for some time 
tomorrow to be able to post the patch.

In the meantime, in order to enable [~varun_saxena] and others to make 
progress, the following is the proposal that I'm implementing. Please *do* let 
me know if you have any questions or issues with the proposal so we can adjust 
accordingly.

(REST API)
In order to support the POC UI, we will implement 2 new queries:
# given the cluster, return the N most recent flows from the flow activity table
# given the cluster, user, flow id, and flow run id, return the flow run (with 
metrics) from the flow run table

At the REST level, they can be represented as follows for example:
# /listFlows/clusterId?limit=100
# /flow/clusterId/userId/flowName/flowRun

(UI)
With these URLs, the UI can invoke the first URL to render the landing page 
with the table. The REST output contains the flow activity records along with 
all the flow runs that were active during the day.

If the user drills down on a single flow, then the client side can generate the 
second queries against all the flow runs for that flow to fetch the metrics at 
the flow run level.

If the user further drills down into a single flow run, then it can do a 
(existing) query to retrieve all applications for a given flow run to get the 
application entities.

(reader interface)
Currently I am *not* planning to add new flow-specific methods to the 
{{TimelineReader}} interface. Instead, you can use the existing 
{{getEntities()}} and {{getEntity()}} methods to perform the above new queries:
# {{getEntities()}} with cluster specified and entity type = YARN_FLOW_ACTIVITY 
(a new timeline entity type)
# {{getEntity()}} with cluster, user, flow id, flow run id specified and entity 
type = YARN_FLOW


 [timeline reader] implement support for querying for flows and flow runs
 

 Key: YARN-4074
 URL: https://issues.apache.org/jira/browse/YARN-4074
 Project: Hadoop YARN
  Issue Type: Sub-task
  Components: timelineserver
Affects Versions: YARN-2928
Reporter: Sangjin Lee
Assignee: Sangjin Lee

 Implement support for querying for flows and flow runs.
 We should be able to query for the most recent N flows, etc.
 This includes changes to the {{TimelineReader}} API if necessary, as well as 
 implementation of the API.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-4074) [timeline reader] implement support for querying for flows and flow runs

2015-08-24 Thread Sangjin Lee (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-4074?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14709839#comment-14709839
 ] 

Sangjin Lee commented on YARN-4074:
---

The queries we will need to support are as follows (let me know if you believe 
it's not accurate):

- given cluster, query the most recent N flows (from the flow activity table)
- (optionally) given cluster, user, flow id, query all flow runs

In terms of the implementation, there are two approaches. We can either define 
specific methods for querying for flow and flow runs, and implement them, or 
reuse the {{getEntities()}} method to implement them.

With the former approach, we might be having a proliferation of methods that 
are specific to types. On the other hand with the latter, the API may remain 
clean but the implementation would become messier with more if-else type of 
code.

Personally I'm slightly leaning towards the latter, but I'd love others' 
opinion.

 [timeline reader] implement support for querying for flows and flow runs
 

 Key: YARN-4074
 URL: https://issues.apache.org/jira/browse/YARN-4074
 Project: Hadoop YARN
  Issue Type: Sub-task
  Components: timelineserver
Affects Versions: YARN-2928
Reporter: Sangjin Lee
Assignee: Sangjin Lee

 Implement support for querying for flows and flow runs.
 We should be able to query for the most recent N flows, etc.
 This includes changes to the {{TimelineReader}} API if necessary, as well as 
 implementation of the API.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-4074) [timeline reader] implement support for querying for flows and flow runs

2015-08-24 Thread Li Lu (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-4074?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14710102#comment-14710102
 ] 

Li Lu commented on YARN-4074:
-

I'd incline to use the latter approach to retrieve flows and flow runs, since 
we don't actually differentiate them on the backend. I also incline to keep the 
RESTful API layer simple, and to wrap it with a js native library for web UIs. 
In this way we can separate the process of TimelineEntity retrieval and the 
context of the timeline entities (e.g. is it a flow, or a application, or a DAG 
of applications?). It's also much easier to maintain this interface IMO. 

 [timeline reader] implement support for querying for flows and flow runs
 

 Key: YARN-4074
 URL: https://issues.apache.org/jira/browse/YARN-4074
 Project: Hadoop YARN
  Issue Type: Sub-task
  Components: timelineserver
Affects Versions: YARN-2928
Reporter: Sangjin Lee
Assignee: Sangjin Lee

 Implement support for querying for flows and flow runs.
 We should be able to query for the most recent N flows, etc.
 This includes changes to the {{TimelineReader}} API if necessary, as well as 
 implementation of the API.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-4074) [timeline reader] implement support for querying for flows and flow runs

2015-08-24 Thread Li Lu (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-4074?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14710142#comment-14710142
 ] 

Li Lu commented on YARN-4074:
-

bq. This also implies that the canonical stores for the flows and the flow runs 
are the flow activity table and the flow run table respectively...
Ah right... This makes the unified interface less appealing since we may need 
to branch a lot with the getEntities method. However, if we proceed in this 
direction, maybe we'd like to do a deeper refactor of that code. 

 [timeline reader] implement support for querying for flows and flow runs
 

 Key: YARN-4074
 URL: https://issues.apache.org/jira/browse/YARN-4074
 Project: Hadoop YARN
  Issue Type: Sub-task
  Components: timelineserver
Affects Versions: YARN-2928
Reporter: Sangjin Lee
Assignee: Sangjin Lee

 Implement support for querying for flows and flow runs.
 We should be able to query for the most recent N flows, etc.
 This includes changes to the {{TimelineReader}} API if necessary, as well as 
 implementation of the API.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-4074) [timeline reader] implement support for querying for flows and flow runs

2015-08-24 Thread Sangjin Lee (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-4074?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14710133#comment-14710133
 ] 

Sangjin Lee commented on YARN-4074:
---

Actually the backend will need to differentiate the queries for flows and flow 
runs from those for other entities, right? For the HBase backend, queries for 
the flows will need to be sent to the flow activity table, those for the flow 
runs will be sent to the flow run table. This also implies that the canonical 
stores for the flows and the flow runs are the flow activity table and the flow 
run table respectively...

We've already gone that way a little bit with the application table, and we 
need to be comfortable with that for us to implement the latter approach.

 [timeline reader] implement support for querying for flows and flow runs
 

 Key: YARN-4074
 URL: https://issues.apache.org/jira/browse/YARN-4074
 Project: Hadoop YARN
  Issue Type: Sub-task
  Components: timelineserver
Affects Versions: YARN-2928
Reporter: Sangjin Lee
Assignee: Sangjin Lee

 Implement support for querying for flows and flow runs.
 We should be able to query for the most recent N flows, etc.
 This includes changes to the {{TimelineReader}} API if necessary, as well as 
 implementation of the API.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)