[jira] [Commented] (YARN-3862) Decide which contents to retrieve and send back in response in TimelineReader

2015-11-24 Thread Sangjin Lee (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-3862?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15025219#comment-15025219
 ] 

Sangjin Lee commented on YARN-3862:
---

I agree that currently the config or metric ids are used directly as the column 
names and what is being done in the patch is probably correct, and we would get 
the same result if we went with *ColumnPrefix.getColumnPrefixBytes().

I think one of the reasons that we still want to leverage *ColumnPrefix is 
because that way we're basically insulated against future changes. If we went 
with the approach that the patch proposes and the column name format for config 
or metric should change later, we would need to remember to visit 
TimelineFilterUtils and modify this method accordingly. That would be rather 
brittle.

Another interesting reason is consistency. Currently when configs and metrics 
are written, they go through ColumnHelper.getColumnQualifier() to create the 
column name bytes. ColumnHelper properly encodes them if there are spaces for 
example. It would be consistent to treat them the same way for the read path. I 
don't know that we allow spaces in config or metric names (I don't think we 
discussed that possibility), but at least that way we'd be consistent.

My proposal was to take the byte array returned by

{code}
EntityColumnPrefix.CONFIG.getColumnPrefixBytes(prefix_from_the_filter)
{code}

and pass it as the argument to the BinaryPrefixComparator constructor. We'd 
need to work out how the column prefix can be passed into TimelineFilterUtils. 
Hope this helps.
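
To illustrate the consistency argument, here is a minimal sketch in plain Java, so that it runs without HBase on the classpath. Everything in it is a hypothetical stand-in: `SketchColumnPrefix` is not the real *ColumnPrefix enum, the space-to-`%20` rule is only an assumed encoding (the actual ColumnHelper encoding may differ), and `startsWith` merely imitates the byte-wise comparison a binary prefix comparator would perform.

```java
import java.nio.charset.StandardCharsets;

class PrefixBytesSketch {
  // Byte-wise prefix test, imitating what a binary prefix comparator checks.
  static boolean startsWith(byte[] qualifier, byte[] prefix) {
    if (prefix.length > qualifier.length) {
      return false;
    }
    for (int i = 0; i < prefix.length; i++) {
      if (qualifier[i] != prefix[i]) {
        return false;
      }
    }
    return true;
  }

  public static void main(String[] args) {
    // Write path: the stored qualifier goes through the shared encoding.
    byte[] stored = SketchColumnPrefix.CONFIG.getColumnQualifierBytes("my app.memory");
    // Read path, option 1: the prefix goes through the same encoding.
    byte[] viaPrefix = SketchColumnPrefix.CONFIG.getColumnPrefixBytes("my app");
    // Read path, option 2: the prefix bytes are computed by hand.
    byte[] byHand = "my app".getBytes(StandardCharsets.UTF_8);

    System.out.println(startsWith(stored, viaPrefix)); // true
    System.out.println(startsWith(stored, byHand));    // false
  }
}

// Hypothetical stand-in for a *ColumnPrefix enum; not the real YARN class.
enum SketchColumnPrefix {
  CONFIG, METRIC;

  // Assumed encoding rule, for illustration only: spaces become "%20".
  private static String encode(String s) {
    return s.replace(" ", "%20");
  }

  // Bytes of a full column qualifier, as the write path would store them.
  byte[] getColumnQualifierBytes(String name) {
    return encode(name).getBytes(StandardCharsets.UTF_8);
  }

  // Bytes of a user-supplied prefix, encoded the same way as the write path.
  byte[] getColumnPrefixBytes(String prefix) {
    return encode(prefix).getBytes(StandardCharsets.UTF_8);
  }
}
```

As long as names contain nothing that needs encoding, both read-path options agree, which matches the observation that the patch is probably correct today. They diverge only once the encoding rule matters or changes.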

While we're at it, can we also refactor the calls to 
ColumnHelper.getColumnQualifier() in ApplicationColumnPrefix.store(), 
EntityColumnPrefix.store(), etc. to use getColumnPrefixBytes()?

bq. So prefixes in createHBaseColQualPrefixFilter() can be anything and cannot 
be fetched via a call to ColumnPrefix.getColumnPrefixBytes().

I'm not quite sure under what scenario 
ColumnPrefix.getColumnPrefixBytes(prefix_passed_by_users) would not work for 
this purpose. Could you kindly elaborate?

bq. Maybe confs and metrics can be renamed as configsToRetrieve and 
metricsToRetrieve respectively. Thoughts ?

Those sound better.

{quote}
Current code is not hooked up to the REST layer, so it won't work end to end. 
However, the current patch has already become quite big. So we can handle REST 
related changes in another JIRA. I am fine with that.
{quote}

+1. We can put that in another JIRA.

> Decide which contents to retrieve and send back in response in TimelineReader
> -
>
> Key: YARN-3862
> URL: https://issues.apache.org/jira/browse/YARN-3862
> Project: Hadoop YARN
> Issue Type: Sub-task
> Components: timelineserver
> Affects Versions: YARN-2928
> Reporter: Varun Saxena
> Assignee: Varun Saxena
> Labels: yarn-2928-1st-milestone
> Attachments: YARN-3862-YARN-2928.wip.01.patch, 
> YARN-3862-YARN-2928.wip.02.patch, YARN-3862-YARN-2928.wip.03.patch, 
> YARN-3862-feature-YARN-2928.wip.03.patch
>
>
> Currently, we retrieve all the contents of a field if that field is 
> specified in the query API. For configs and metrics, this can be a lot of 
> data even though the user doesn't need it, so we need to provide a way to 
> query only a subset of configs or metrics.
> As a comma-separated list of configs/metrics to be returned would be quite 
> cumbersome to specify, we have to support one of the following options:
> # Prefix match
> # Regex
> # Group the configs/metrics and query that group.
> We also need a facility to specify a metric time window so that only metrics 
> in that window are returned. This may be useful in plotting graphs.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-3862) Decide which contents to retrieve and send back in response in TimelineReader

2015-11-24 Thread Varun Saxena (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-3862?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15026096#comment-15026096
 ] 

Varun Saxena commented on YARN-3862:


[~sjlee0], regarding ColumnPrefix.getColumnPrefixBytes(), I had misunderstood 
your comment earlier.
I agree with what you are saying. Will make the change accordingly.



[jira] [Commented] (YARN-3862) Decide which contents to retrieve and send back in response in TimelineReader

2015-11-24 Thread Varun Saxena (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-3862?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15024390#comment-15024390
 ] 

Varun Saxena commented on YARN-3862:


{quote}
(TimelineFilterUtils.java)
createHBaseColQualPrefixFilter(): this is still trying to compute the column 
prefix by hand. The main point of introducing getColumnPrefixBytes() on 
ColumnPrefix was to avoid doing this for confs and metrics. Can we rework the 
signatures of createHBaseFilterList() so that we can rely on 
ColumnPrefix.getColumnPrefixBytes()? Ideally all computations of qualifier 
bytes should go through ColumnPrefix.getColumnPrefixBytes().
{quote}
What we are trying to do here is take the prefixes that come in filters from 
the client and prefix-match them against column qualifiers.
We store config and metric names directly as column qualifiers without any 
prefix (except in the flow run prefix table), so there is no fixed column 
prefix for configs and metrics anyway.
Let us say we have column qualifiers (in the config column family) such as 
mapreduce.map.java.opts, mapreduce.map.memory.mb, mapreduce.reduce.memory.mb, 
etc.
A user may want to query all the map-related configurations and send 
{{mapreduce.map}} as the prefix, but they can just as well send an invalid 
prefix like {{mapreduce_map}}. So prefixes in createHBaseColQualPrefixFilter() 
can be anything and cannot be fetched via a call to 
ColumnPrefix.getColumnPrefixBytes().
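
To make the example concrete, here is a small sketch of prefix matching over the qualifiers named above. A sorted in-memory set stands in for the config column family; `QualifierPrefixSketch` and `matchPrefix` are hypothetical names and nothing here is HBase API.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.TreeSet;

class QualifierPrefixSketch {
  // Returns the qualifiers that start with the given prefix, in sorted order.
  static List<String> matchPrefix(TreeSet<String> qualifiers, String prefix) {
    List<String> matches = new ArrayList<>();
    // tailSet starts at the first qualifier >= prefix. In sorted order all
    // qualifiers sharing the prefix are contiguous, so stop at the first miss.
    for (String q : qualifiers.tailSet(prefix, true)) {
      if (!q.startsWith(prefix)) {
        break;
      }
      matches.add(q);
    }
    return matches;
  }

  public static void main(String[] args) {
    TreeSet<String> configs = new TreeSet<>(Arrays.asList(
        "mapreduce.map.java.opts",
        "mapreduce.map.memory.mb",
        "mapreduce.reduce.memory.mb"));

    // A valid prefix selects the two map-related configs.
    System.out.println(matchPrefix(configs, "mapreduce.map"));
    // An arbitrary prefix such as mapreduce_map simply matches nothing.
    System.out.println(matchPrefix(configs, "mapreduce_map"));
  }
}
```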

bq. I'm not too sure about the name; for other tests we basically combined the 
reader and writer tests. Thoughts on how to make this best fit into the 
existing tests?
Ok. Maybe we can move these tests to TestHBaseTimelineStorage. Let me see.

bq. I keep confusing configFilters and confs. 
Maybe confs and metrics can be renamed as configsToRetrieve and 
metricsToRetrieve respectively. Thoughts ?

bq. On a related note, this is probably outside the scope of this JIRA, but I 
see that the configFilter and metricFilter are applied on the client-side.
Yes. This will be handled in YARN-3863. However, event filters, relatesTo and 
isRelatedTo still need to be matched at client side because of the way these 
values are stored in our tables. We can discuss this though.

bq. l.156: Why do we need to check if configFilters == null? 
This will be removed in YARN-3863. The check is there because we need to fetch 
configs if we have to match them on the client side (as is the case until 
YARN-3863 goes in). However, we should probably fetch all configs irrespective 
of the confs field if the match has to be done on the client side; this patch 
misses that. This code will have to be removed in YARN-3863 anyway.

bq. Related to one of the points above, at least we should add javadoc that 
clearly explains confs and metrics
Agree. Will add.

bq. l.139: nit: typo: releated -> related
Ok.

Other comments are due to YARN-4053 going in. Will fix them in next version of 
the patch.




[jira] [Commented] (YARN-3862) Decide which contents to retrieve and send back in response in TimelineReader

2015-11-24 Thread Varun Saxena (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-3862?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15024477#comment-15024477
 ] 

Varun Saxena commented on YARN-3862:


bq. Whether we make TimelineFilter part of the object model or not, we'll still 
need to come up with a way to support filter queries on the URLs, no?
The decision about making it part of the object model was primarily about how 
much control we want to give the client.
Moreover, the thought behind making it part of the object model is that the 
client will create an object of type TimelineFilterList, which will be 
converted into a JSON string and sent in the query param. Something like below, 
where metricFilters is a query param. This can become quite complex, as a 
filter list can contain another filter list, but on the server side it will be 
easy to parse because the JSON converter will do it for us. It can make the URL 
quite big, though.
{{metricFilters=\{"operator": "AND", "filters": \[\{"type": 
"COMPARE","key":"metric1", "value": "12345", "compareop": 
"GREATER_THAN"\},\{"type": "COMPARE","key":"metric23", "value": "12", 
"compareop": "EQUALS"\}\]\}}}
Alternatively, we can define some other way to represent this, something like 
below for instance. Here we will have to do the parsing ourselves. We can go 
with abbreviations like gt for greater than, eq for equals, ge for greater than 
or equal to, and so on. As you can see below, it is exactly the same query as 
above, but as it is not a JSON representation, it is a lot shorter.
{{metricFilters=(metric1 gt 12345) AND (metric23 eq 12)}}
This is what I meant when I said we have to decide whether to keep it as part 
of the object model or not.
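
As a rough idea of what "doing the parsing ourselves" could involve for the compact form, here is a minimal sketch. It is deliberately limited: it only extracts flat parenthesised terms with the gt, ge and eq operators and treats them as ANDed, without validating the joining keyword or supporting nested lists or OR. `CompactFilterSketch` and `Compare` are hypothetical names and are not the TimelineFilter classes from the patch.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class CompactFilterSketch {
  // One comparison such as "metric1 gt 12345". Hypothetical type.
  static final class Compare {
    final String key;
    final String op;
    final long value;

    Compare(String key, String op, long value) {
      this.key = key;
      this.op = op;
      this.value = value;
    }

    // A missing metric never matches.
    boolean matches(Map<String, Long> metrics) {
      Long actual = metrics.get(key);
      if (actual == null) {
        return false;
      }
      switch (op) {
        case "gt": return actual > value;
        case "ge": return actual >= value;
        case "eq": return actual == value;
        default: throw new IllegalArgumentException("unknown operator: " + op);
      }
    }
  }

  // Matches one term of the form "(name op number)".
  private static final Pattern TERM =
      Pattern.compile("\\((\\w+)\\s+(gt|ge|eq)\\s+(\\d+)\\)");

  // Collects every parenthesised term; the caller treats the list as a conjunction.
  static List<Compare> parseAnd(String expr) {
    List<Compare> terms = new ArrayList<>();
    Matcher m = TERM.matcher(expr);
    while (m.find()) {
      terms.add(new Compare(m.group(1), m.group(2), Long.parseLong(m.group(3))));
    }
    return terms;
  }

  // True only if every term matches.
  static boolean matchesAll(List<Compare> terms, Map<String, Long> metrics) {
    for (Compare c : terms) {
      if (!c.matches(metrics)) {
        return false;
      }
    }
    return true;
  }

  public static void main(String[] args) {
    List<Compare> terms = parseAnd("(metric1 gt 12345) AND (metric23 eq 12)");
    Map<String, Long> metrics = new HashMap<>();
    metrics.put("metric1", 20000L);
    metrics.put("metric23", 12L);
    System.out.println(matchesAll(terms, metrics)); // true
  }
}
```

A real grammar would also need escaping rules for reserved symbols inside metric names, which is part of what makes the hand-rolled option more work than the JSON one.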

bq. I just wanted to understand whether we need to make that call as part of 
this JIRA. Did I understand this correctly, or did I miss something important?
Current code is not hooked up to the REST layer, so it won't work end to end. 
However, the current patch has already become quite big. So we can handle REST 
related changes in another JIRA. I am fine with that.




[jira] [Commented] (YARN-3862) Decide which contents to retrieve and send back in response in TimelineReader

2015-11-23 Thread Sangjin Lee (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-3862?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15023527#comment-15023527
 ] 

Sangjin Lee commented on YARN-3862:
---

Whether we make TimelineFilter part of the object model or not, we'll still 
need to come up with a way to support filter queries on the URLs, no?

While we're at it, today there are no reads done through the TimelineClient 
API, correct? Today there are only the REST-based queries. Of course this 
doesn't mean we won't support more programmatic reads via TimelineClient (and 
RPC?) in the future, and also there may be value in making TimelineFilter part 
of the common API. I just wanted to understand whether we need to make that 
call as part of this JIRA. Did I understand this correctly, or did I miss 
something important?



[jira] [Commented] (YARN-3862) Decide which contents to retrieve and send back in response in TimelineReader

2015-11-23 Thread Sangjin Lee (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-3862?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15023646#comment-15023646
 ] 

Sangjin Lee commented on YARN-3862:
---

I had a chance to go over the latest patch in a little more detail. I think 
this is now closer to being ready. I do have some comments and suggestions, 
some major and others minor.

(TimelineFilterUtils.java)
- createHBaseColQualPrefixFilter(): this is still trying to compute the column 
prefix by hand. The main point of introducing getColumnPrefixBytes() on 
ColumnPrefix was to avoid doing this for confs and metrics. Can we rework the 
signatures of createHBaseFilterList() so that we can rely on 
ColumnPrefix.getColumnPrefixBytes()? Ideally all computations of qualifier 
bytes should go through ColumnPrefix.getColumnPrefixBytes().

(TestHBaseTimelineReaderImpl.java)
- I'm not too sure about the name; for other tests we basically combined the 
reader and writer tests. Thoughts on how to make this best fit into the 
existing tests?

(GenericEntityReader.java)
- l.139: nit: typo: releated -> related
- I keep confusing configFilters and confs. The names are so similar that I 
have to go check the implementations to distinguish them (configFilters 
filters the rows we want to return, and confs filters the contents of the 
matching rows). Could there be a better way to name them so that their meanings 
are clearer? I don't have a great idea at the moment, and you might want to 
think about better names...
- On a related note, this is probably outside the scope of this JIRA, but I see 
that the configFilter and metricFilter are applied on the client-side. Probably 
on a separate JIRA, we should see if we can do this on the HBase side. This is 
just a reminder.
- l.156: Why do we need to check if configFilters == null? Is it because if 
configFilters are specified we implicitly assume we want the config columns 
returned in the content? Is that a valid assumption?

(TimelineReader.java)
- Related to one of the points above, at least we should add javadoc that 
clearly explains confs and metrics and how they are different from 
configFilters and metricFilters. That will help us a great deal in maintaining 
this.
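
For what such javadoc might need to convey, here is a minimal in-memory sketch of the distinction. It assumes exact key=value matching for configFilters and prefix matching for confs, and it models an entity as nothing more than its config map; `FilterVsRetrieveSketch` and its methods are hypothetical and do not mirror the reader code in the patch.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

class FilterVsRetrieveSketch {
  // configFilters: decides WHICH entities (rows) come back. Exact match here.
  static boolean rowMatches(Map<String, String> configs,
      Map<String, String> configFilters) {
    for (Map.Entry<String, String> f : configFilters.entrySet()) {
      if (!f.getValue().equals(configs.get(f.getKey()))) {
        return false;
      }
    }
    return true;
  }

  // confs: decides WHICH config columns of a matching entity come back.
  static Map<String, String> selectConfigs(Map<String, String> configs,
      List<String> confs) {
    Map<String, String> selected = new LinkedHashMap<>();
    for (Map.Entry<String, String> c : configs.entrySet()) {
      for (String prefix : confs) {
        if (c.getKey().startsWith(prefix)) {
          selected.put(c.getKey(), c.getValue());
          break;
        }
      }
    }
    return selected;
  }

  // Row selection runs on the full config map; only then are columns trimmed.
  static List<Map<String, String>> read(List<Map<String, String>> entities,
      Map<String, String> configFilters, List<String> confs) {
    List<Map<String, String>> result = new ArrayList<>();
    for (Map<String, String> configs : entities) {
      if (rowMatches(configs, configFilters)) {
        result.add(selectConfigs(configs, confs));
      }
    }
    return result;
  }

  public static void main(String[] args) {
    Map<String, String> e1 = new LinkedHashMap<>();
    e1.put("cfg.queue", "prod");
    e1.put("mapreduce.map.memory.mb", "1024");
    e1.put("mapreduce.reduce.memory.mb", "2048");
    Map<String, String> e2 = new LinkedHashMap<>();
    e2.put("cfg.queue", "dev");
    e2.put("mapreduce.map.memory.mb", "512");

    // Only e1 passes the row filter, and only its map-related config is returned.
    System.out.println(read(Arrays.asList(e1, e2),
        Collections.singletonMap("cfg.queue", "prod"),
        Arrays.asList("mapreduce.map")));
  }
}
```

Note that the filter key (`cfg.queue`, an invented name) is not among the returned columns: filtering on a config and returning that config are independent choices in this sketch.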

(FlowRunColumnPrefix.java)
- As a result of YARN-4053 being committed, getColumnPrefixBytes(String) 
already exists. It should be removed from this patch.

(TestHBaseStorageFlowRun.java)
- testWriteFlowRunMetricsPrefix() and testWriteFlowRunsMetricFields() are 
failing possibly due to changes in YARN-4053.





[jira] [Commented] (YARN-3862) Decide which contents to retrieve and send back in response in TimelineReader

2015-11-11 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-3862?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15001470#comment-15001470
 ] 

Hadoop QA commented on YARN-3862:
-

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 0m 14s {color} | {color:blue} docker + precommit patch detected. {color} |
| {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s {color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green} 0m 0s {color} | {color:green} The patch appears to include 5 new or modified test files. {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 9m 56s {color} | {color:green} feature-YARN-2928 passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 20s {color} | {color:green} feature-YARN-2928 passed with JDK v1.8.0_60 {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 20s {color} | {color:green} feature-YARN-2928 passed with JDK v1.7.0_79 {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 16s {color} | {color:green} feature-YARN-2928 passed {color} |
| {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 0m 23s {color} | {color:green} feature-YARN-2928 passed {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 0m 47s {color} | {color:green} feature-YARN-2928 passed {color} |
| {color:red}-1{color} | {color:red} javadoc {color} | {color:red} 0m 27s {color} | {color:red} hadoop-yarn-server-timelineservice in feature-YARN-2928 failed with JDK v1.8.0_60. {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 23s {color} | {color:green} feature-YARN-2928 passed with JDK v1.7.0_79 {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 0m 26s {color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 23s {color} | {color:green} the patch passed with JDK v1.8.0_60 {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green} 0m 23s {color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 20s {color} | {color:green} the patch passed with JDK v1.7.0_79 {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green} 0m 20s {color} | {color:green} the patch passed {color} |
| {color:red}-1{color} | {color:red} checkstyle {color} | {color:red} 0m 15s {color} | {color:red} Patch generated 43 new checkstyle issues in hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-timelineservice (total was 102, now 128). {color} |
| {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 0m 21s {color} | {color:green} the patch passed {color} |
| {color:red}-1{color} | {color:red} whitespace {color} | {color:red} 0m 0s {color} | {color:red} The patch has 1 line(s) that end in whitespace. Use git apply --whitespace=fix. {color} |
| {color:red}-1{color} | {color:red} whitespace {color} | {color:red} 0m 0s {color} | {color:red} The patch has 1 line(s) with tabs. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 0m 55s {color} | {color:green} the patch passed {color} |
| {color:red}-1{color} | {color:red} javadoc {color} | {color:red} 0m 24s {color} | {color:red} hadoop-yarn-server-timelineservice in the patch failed with JDK v1.8.0_60. {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 24s {color} | {color:green} the patch passed with JDK v1.7.0_79 {color} |
| {color:green}+1{color} | {color:green} unit {color} | {color:green} 4m 55s {color} | {color:green} hadoop-yarn-server-timelineservice in the patch passed with JDK v1.8.0_60. {color} |
| {color:green}+1{color} | {color:green} unit {color} | {color:green} 4m 22s {color} | {color:green} hadoop-yarn-server-timelineservice in the patch passed with JDK v1.7.0_79. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green} 0m 32s {color} | {color:green} Patch does not generate ASF License warnings. {color} |
| {color:black}{color} | {color:black} {color} | {color:black} 27m 49s {color} | {color:black} {color} |
\\
\\
|| Subsystem || Report/Notes ||
| Docker | Client=1.7.1 Server=1.7.1 Image:test-patch-base-hadoop-date2015-11-12 |
| JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12771888/YARN-3862-feature-YARN-2928.wip.03.patch |
| JIRA Issue | YARN-3862 |
| Optional Tests |  asflicense  javac  javadoc  mvninstall  unit  findbugs  checkstyle  compile  |
| uname | Linux 530c24bb2676 

[jira] [Commented] (YARN-3862) Decide which contents to retrieve and send back in response in TimelineReader

2015-11-11 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-3862?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15001484#comment-15001484
 ] 

Hadoop QA commented on YARN-3862:
-

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 0m 9s {color} | {color:blue} docker + precommit patch detected. {color} |
| {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s {color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green} 0m 0s {color} | {color:green} The patch appears to include 5 new or modified test files. {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 10m 5s {color} | {color:green} feature-YARN-2928 passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 24s {color} | {color:green} feature-YARN-2928 passed with JDK v1.8.0_60 {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 21s {color} | {color:green} feature-YARN-2928 passed with JDK v1.7.0_79 {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 16s {color} | {color:green} feature-YARN-2928 passed {color} |
| {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 0m 27s {color} | {color:green} feature-YARN-2928 passed {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 0m 50s {color} | {color:green} feature-YARN-2928 passed {color} |
| {color:red}-1{color} | {color:red} javadoc {color} | {color:red} 0m 27s {color} | {color:red} hadoop-yarn-server-timelineservice in feature-YARN-2928 failed with JDK v1.8.0_60. {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 24s {color} | {color:green} feature-YARN-2928 passed with JDK v1.7.0_79 {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 0m 27s {color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 25s {color} | {color:green} the patch passed with JDK v1.8.0_60 {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green} 0m 25s {color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 21s {color} | {color:green} the patch passed with JDK v1.7.0_79 {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green} 0m 21s {color} | {color:green} the patch passed {color} |
| {color:red}-1{color} | {color:red} checkstyle {color} | {color:red} 0m 15s {color} | {color:red} Patch generated 43 new checkstyle issues in hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-timelineservice (total was 102, now 128). {color} |
| {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 0m 20s {color} | {color:green} the patch passed {color} |
| {color:red}-1{color} | {color:red} whitespace {color} | {color:red} 0m 1s {color} | {color:red} The patch has 1 line(s) that end in whitespace. Use git apply --whitespace=fix. {color} |
| {color:red}-1{color} | {color:red} whitespace {color} | {color:red} 0m 1s {color} | {color:red} The patch has 1 line(s) with tabs. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 0m 55s {color} | {color:green} the patch passed {color} |
| {color:red}-1{color} | {color:red} javadoc {color} | {color:red} 0m 24s {color} | {color:red} hadoop-yarn-server-timelineservice in the patch failed with JDK v1.8.0_60. {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 24s {color} | {color:green} the patch passed with JDK v1.7.0_79 {color} |
| {color:green}+1{color} | {color:green} unit {color} | {color:green} 4m 49s {color} | {color:green} hadoop-yarn-server-timelineservice in the patch passed with JDK v1.8.0_60. {color} |
| {color:green}+1{color} | {color:green} unit {color} | {color:green} 4m 25s {color} | {color:green} hadoop-yarn-server-timelineservice in the patch passed with JDK v1.7.0_79. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green} 0m 33s {color} | {color:green} Patch does not generate ASF License warnings. {color} |
| {color:black}{color} | {color:black} {color} | {color:black} 28m 5s {color} | {color:black} {color} |
\\
\\
|| Subsystem || Report/Notes ||
| Docker | Client=1.7.1 Server=1.7.1 Image:test-patch-base-hadoop-date2015-11-12 |
| JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12771888/YARN-3862-feature-YARN-2928.wip.03.patch |
| JIRA Issue | YARN-3862 |
| Optional Tests |  asflicense  javac  javadoc  mvninstall  unit  findbugs  checkstyle  compile  |
| uname | Linux 54f9ecd19690 

[jira] [Commented] (YARN-3862) Decide which contents to retrieve and send back in response in TimelineReader

2015-11-09 Thread Varun Saxena (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-3862?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14997161#comment-14997161
 ] 

Varun Saxena commented on YARN-3862:


Attached a WIP patch.

This patch attempts to handle all the tables (creation of the filter list 
based on fields) and to do prefix matching for configs and metrics. The 
previous WIP patch only attempted to handle the entity table because of the 
implications of this patch on config and metric filter matching. This patch 
handles that scenario. 
When YARN-3863 is done, some changes will be warranted though (some conditions 
for passing config and metric filters will have to be removed). 
I have added a few tests for the change as well.

I have still not hooked up this code to REST API layer.
For that, we first need to decide as to whether the TimelineFilter code will be 
part of our object model or not.
For prefix matching of configs and metrics to return, at the REST layer this 
can simply come as a query param (a comma separated list)
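
As a small illustration of the comma-separated form, here is a sketch of turning such a query param value into a list of prefixes. `PrefixParamSketch` and `parsePrefixes` are hypothetical names; the actual param name and any escaping rules are not decided in this discussion.

```java
import java.util.ArrayList;
import java.util.List;

class PrefixParamSketch {
  // Splits a comma-separated value into trimmed, non-empty prefixes.
  static List<String> parsePrefixes(String param) {
    List<String> prefixes = new ArrayList<>();
    if (param == null) {
      return prefixes;
    }
    for (String part : param.split(",")) {
      String p = part.trim();
      if (!p.isEmpty()) {
        prefixes.add(p);
      }
    }
    return prefixes;
  }

  public static void main(String[] args) {
    // Stray spaces and empty items are dropped.
    System.out.println(parsePrefixes("mapreduce.map, yarn.nodemanager,,"));
  }
}
```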

But when we code for complex filters (especially metric filters) in YARN-3863, 
we will have to support SQL-like queries with ANDs, ORs, and operators such as 
greater than, less than, and equals.
If we make TimelineFilter part of our client object model and interpret filters 
as a JSON string associated with a query param, we might have to rethink a few 
of the classes and include additional checks (as this will be used by the 
client).
This can increase the size of the URL though.

If we do not include filters in our object model, we will have to decide how to 
specify complex config and metric filters containing ANDs, ORs, and different 
relational operators (because some of the symbols will be reserved) and reach a 
consensus on that.


> Decide which contents to retrieve and send back in response in TimelineReader
> -
>
> Key: YARN-3862
> URL: https://issues.apache.org/jira/browse/YARN-3862
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: timelineserver
>Affects Versions: YARN-2928
>Reporter: Varun Saxena
>Assignee: Varun Saxena
> Attachments: YARN-3862-YARN-2928.wip.01.patch, 
> YARN-3862-YARN-2928.wip.02.patch, YARN-3862-YARN-2928.wip.03.patch
>
>
> Currently, we will retrieve all the contents of the field if that field is 
> specified in the query API. In case of configs and metrics, this can become a 
> lot of data even though the user doesn't need it. So we need to provide a way 
> to query only a set of configs or metrics.
> As a comma-separated list of configs/metrics to be returned will be quite 
> cumbersome to specify, we have to support one of the following options:
> # Prefix match
> # Regex
> # Group the configs/metrics and query that group.
> We also need a facility to specify a metric time window to return metrics in 
> that window. This may be useful in plotting graphs.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-3862) Decide which contents to retrieve and send back in response in TimelineReader

2015-11-09 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-3862?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14997153#comment-14997153
 ] 

Hadoop QA commented on YARN-3862:
-

| (x) *-1 overall* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| 0 | reexec | 0m 7s | docker + precommit patch detected. |
| +1 | @author | 0m 0s | The patch does not contain any @author tags. |
| +1 | test4tests | 0m 0s | The patch appears to include 5 new or modified test files. |
| -1 | mvninstall | 6m 21s | root in YARN-2928 failed. |
| -1 | compile | 2m 32s | hadoop-yarn-server-timelineservice in YARN-2928 failed with JDK v1.8.0_60. |
| -1 | compile | 0m 12s | hadoop-yarn-server-timelineservice in YARN-2928 failed with JDK v1.7.0_79. |
| +1 | checkstyle | 0m 12s | YARN-2928 passed |
| +1 | mvneclipse | 0m 21s | YARN-2928 passed |
| -1 | findbugs | 0m 12s | hadoop-yarn-server-timelineservice in YARN-2928 failed. |
| -1 | javadoc | 0m 15s | hadoop-yarn-server-timelineservice in YARN-2928 failed with JDK v1.8.0_60. |
| +1 | javadoc | 0m 15s | YARN-2928 passed with JDK v1.7.0_79 |
| -1 | mvninstall | 0m 12s | hadoop-yarn-server-timelineservice in the patch failed. |
| -1 | compile | 0m 9s | hadoop-yarn-server-timelineservice in the patch failed with JDK v1.8.0_60. |
| -1 | javac | 0m 9s | hadoop-yarn-server-timelineservice in the patch failed with JDK v1.8.0_60. |
| -1 | compile | 0m 13s | hadoop-yarn-server-timelineservice in the patch failed with JDK v1.7.0_79. |
| -1 | javac | 0m 13s | hadoop-yarn-server-timelineservice in the patch failed with JDK v1.7.0_79. |
| -1 | checkstyle | 0m 10s | Patch generated 43 new checkstyle issues in hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-timelineservice (total was 102, now 128). |
| +1 | mvneclipse | 0m 12s | the patch passed |
| -1 | whitespace | 0m 0s | The patch has 1 line(s) that end in whitespace. Use git apply --whitespace=fix. |
| -1 | whitespace | 0m 0s | The patch has 1 line(s) with tabs. |
| -1 | findbugs | 0m 13s | hadoop-yarn-server-timelineservice in the patch failed. |
| -1 | javadoc | 0m 11s | hadoop-yarn-server-timelineservice in the patch failed with JDK v1.8.0_60. |
| +1 | javadoc | 0m 15s | the patch passed with JDK v1.7.0_79 |
| -1 | unit | 0m 9s | hadoop-yarn-server-timelineservice in the patch failed with JDK v1.8.0_60. |
| -1 | unit | 0m 12s | hadoop-yarn-server-timelineservice in the patch failed with JDK v1.7.0_79. |
| -1 | asflicense | 0m 18s | Patch generated 1 ASF License warnings. |
| | | 13m 24s | |
\\
\\
|| Subsystem || Report/Notes ||
| Docker | Client=1.7.1 Server=1.7.1 Image:test-patch-base-hadoop-date2015-11-09 |
| JIRA Patch URL | 

[jira] [Commented] (YARN-3862) Decide which contents to retrieve and send back in response in TimelineReader

2015-11-07 Thread Varun Saxena (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-3862?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14995553#comment-14995553
 ] 

Varun Saxena commented on YARN-3862:


Agree


[jira] [Commented] (YARN-3862) Decide which contents to retrieve and send back in response in TimelineReader

2015-11-06 Thread Sangjin Lee (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-3862?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14994758#comment-14994758
 ] 

Sangjin Lee commented on YARN-3862:
---

[~jrottinghuis] and I went over the patch in some more detail, and have a few 
high level suggestions.

I don't think the qualifiers that are being created currently in 
TimelineEntityReader.constructFilterListBasedOnFields() are quite right. We 
already talked about breaking down and pushing down the logic into its 
appropriate specific entity reader implementations. In addition, instead of 
trying to compute the byte arrays using the raw ingredients like Separators, we 
should rely on the \*ColumnPrefix classes to give you the byte arrays. That 
would lead to more properly encapsulated (and correct) code.

ColumnPrefix classes already do something like the following (see 
ApplicationColumnPrefix.store() for example):
{code}
byte[] columnQualifier =
    ColumnHelper.getColumnQualifier(this.columnPrefixBytes, qualifier);
{code}

We could expose a new method on ColumnPrefix like
{code}
public interface ColumnPrefix {
  ...
  byte[] getColumnQualifierBytes(String qualifier);
  ...
}
{code}
And specific implementations can implement that method. That way, all the 
proper column prefix handling is managed and encapsulated by ColumnPrefix 
classes.

When we move the logic of creating the filter list to its appropriate entity 
reader classes, those classes already know which column prefix they're dealing 
with, and they can simply call these methods to get the bytes back. That will 
make the implementation much cleaner.
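
To make the suggestion concrete, here is a minimal self-contained sketch of what such a method could look like. It does not use the real ColumnHelper or Separator classes: the "!" separator, the short "c"/"m" prefixes and all class names below are assumptions chosen for illustration only.

```java
import java.nio.charset.StandardCharsets;

// Illustrative stand-in for the real ColumnPrefix classes; names and the
// separator value are assumptions, not the actual timeline service code.
final class ColumnPrefixSketch {
  private ColumnPrefixSketch() {
  }

  /** Assumed separator between a column prefix and the qualifier. */
  static final String SEPARATOR = "!";

  interface ColumnPrefix {
    /** Returns the full column qualifier bytes for the given qualifier. */
    byte[] getColumnQualifierBytes(String qualifier);
  }

  enum EntityColumnPrefixSketch implements ColumnPrefix {
    CONFIG("c"),
    METRIC("m");

    private final String prefix;

    EntityColumnPrefixSketch(String prefix) {
      this.prefix = prefix;
    }

    @Override
    public byte[] getColumnQualifierBytes(String qualifier) {
      // The prefix handling lives here, so readers never build it by hand.
      return (prefix + SEPARATOR + qualifier).getBytes(StandardCharsets.UTF_8);
    }
  }
}
```

A reader that already knows it is dealing with configs would then call something like CONFIG.getColumnQualifierBytes("yarn.") and hand the result to its filter, without touching the separator logic itself.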

Hope this helps. Let me know if you have any questions... Thanks!


[jira] [Commented] (YARN-3862) Decide which contents to retrieve and send back in response in TimelineReader

2015-11-03 Thread Allen Wittenauer (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-3862?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14987829#comment-14987829
 ] 

Allen Wittenauer commented on YARN-3862:


Yetus is happily pointing out how many build fixes your branch is missing 
and/or that have yet to be committed (like HADOOP-12492).  So I'd recommend a 
rebase or some large-scale cherry-picking.


[jira] [Commented] (YARN-3862) Decide which contents to retrieve and send back in response in TimelineReader

2015-11-03 Thread Varun Saxena (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-3862?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14987669#comment-14987669
 ] 

Varun Saxena commented on YARN-3862:


Yes, you are correct. Will fix this in the next patch.


[jira] [Commented] (YARN-3862) Decide which contents to retrieve and send back in response in TimelineReader

2015-11-03 Thread Sangjin Lee (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-3862?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14987636#comment-14987636
 ] 

Sangjin Lee commented on YARN-3862:
---

[~aw], sorry to bother you here, but our latest jenkins run is an unmitigated 
failure full of incomprehensible errors (projects banned, unrecognizable 
dependencies showing up out of nowhere, etc.). Would you be able to share what 
is going on here and how we can restore this? Thanks.


[jira] [Commented] (YARN-3862) Decide which contents to retrieve and send back in response in TimelineReader

2015-11-03 Thread Sangjin Lee (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-3862?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14987902#comment-14987902
 ] 

Sangjin Lee commented on YARN-3862:
---

Thanks. We'll discuss when we should address the next rebase. Thanks for 
checking.


[jira] [Commented] (YARN-3862) Decide which contents to retrieve and send back in response in TimelineReader

2015-11-02 Thread Sangjin Lee (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-3862?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14986444#comment-14986444
 ] 

Sangjin Lee commented on YARN-3862:
---

I just kicked off a jenkins run on the latest patch.

Sorry [~varun_saxena] it took me a while to get around to looking at the patch. 
The overall approach seems pretty reasonable to me. I'll need to go over the 
patch in some detail, however.

One point I'd like to make is regarding 
{{TimelineEntityReader.constructFilterListBasedOnFields()}}. I see it using 
{{EntityColumnFamily}} and {{EntityColumnPrefix}}. I don't think that's quite 
right. In terms of the class hierarchy, {{TimelineEntityReader}} is the base 
class that {{GenericEntityReader}} (which deals with the generic entity table) 
extends. As such, it should be agnostic to the actual specific tables. The 
{{TimelineEntityReader.constructFilterListBasedOnFields()}} method should 
contain only the most generic implementation (which may well be returning 
null). Any logic that deals with the entity columns should belong in 
{{GenericEntityReader}}.

This also points to an issue with {{ApplicationEntityReader}}. Its 
{{constructFilterListBasedOnFields()}} method needs to be implemented in terms 
of application column family and application column prefix. So it needs to be 
properly overridden in that class.
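
The layering described above could look roughly like the sketch below. It keeps only the class names from this discussion; the filter list is modelled as plain strings rather than HBase filters, and the method bodies are placeholders, so treat it as an outline of the override structure and nothing more.

```java
import java.util.Arrays;
import java.util.List;

// Outline only: real readers would return an HBase FilterList, not strings.
final class ReaderHierarchySketch {
  private ReaderHierarchySketch() {
  }

  abstract static class TimelineEntityReader {
    /** Most generic implementation: knows nothing about specific tables. */
    public List<String> constructFilterListBasedOnFields() {
      return null;
    }
  }

  static class GenericEntityReader extends TimelineEntityReader {
    /** Entity-table logic belongs here, in terms of entity columns. */
    @Override
    public List<String> constructFilterListBasedOnFields() {
      return Arrays.asList("EntityColumnFamily", "EntityColumnPrefix");
    }
  }

  static class ApplicationEntityReader extends GenericEntityReader {
    /** Overridden in terms of the application column family and prefix. */
    @Override
    public List<String> constructFilterListBasedOnFields() {
      return Arrays.asList("ApplicationColumnFamily", "ApplicationColumnPrefix");
    }
  }
}
```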


[jira] [Commented] (YARN-3862) Decide which contents to retrieve and send back in response in TimelineReader

2015-11-02 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-3862?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14986522#comment-14986522
 ] 

Hadoop QA commented on YARN-3862:
-

| (x) *-1 overall* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| 0 | reexec | 0m 6s | docker + precommit patch detected. |
| +1 | @author | 0m 0s | The patch does not contain any @author tags. |
| +1 | test4tests | 0m 0s | The patch appears to include 5 new or modified test files. |
| -1 | mvninstall | 6m 13s | root in YARN-2928 failed. |
| -1 | compile | 1m 16s | hadoop-yarn-server-timelineservice in YARN-2928 failed with JDK v1.8.0_60. |
| -1 | compile | 0m 10s | hadoop-yarn-server-timelineservice in YARN-2928 failed with JDK v1.7.0_79. |
| +1 | checkstyle | 0m 7s | YARN-2928 passed |
| -1 | mvneclipse | 0m 13s | hadoop-yarn-server-timelineservice in YARN-2928 failed. |
| -1 | findbugs | 0m 9s | hadoop-yarn-server-timelineservice in YARN-2928 failed. |
| -1 | javadoc | 0m 7s | hadoop-yarn-server-timelineservice in YARN-2928 failed with JDK v1.8.0_60. |
| -1 | javadoc | 0m 9s | hadoop-yarn-server-timelineservice in YARN-2928 failed with JDK v1.7.0_79. |
| -1 | mvninstall | 0m 8s | hadoop-yarn-server-timelineservice in the patch failed. |
| -1 | compile | 0m 6s | hadoop-yarn-server-timelineservice in the patch failed with JDK v1.8.0_60. |
| -1 | javac | 0m 6s | hadoop-yarn-server-timelineservice in the patch failed with JDK v1.8.0_60. |
| -1 | compile | 0m 8s | hadoop-yarn-server-timelineservice in the patch failed with JDK v1.7.0_79. |
| -1 | javac | 0m 8s | hadoop-yarn-server-timelineservice in the patch failed with JDK v1.7.0_79. |
| +1 | checkstyle | 0m 6s | the patch passed |
| -1 | mvneclipse | 0m 9s | hadoop-yarn-server-timelineservice in the patch failed. |
| +1 | whitespace | 0m 0s | Patch has no whitespace issues. |
| -1 | findbugs | 0m 9s | hadoop-yarn-server-timelineservice in the patch failed. |
| -1 | javadoc | 0m 7s | hadoop-yarn-server-timelineservice in the patch failed with JDK v1.8.0_60. |
| -1 | javadoc | 0m 10s | hadoop-yarn-server-timelineservice in the patch failed with JDK v1.7.0_79. |
| -1 | unit | 0m 7s | hadoop-yarn-server-timelineservice in the patch failed with JDK v1.8.0_60. |
| -1 | unit | 0m 9s | hadoop-yarn-server-timelineservice in the patch failed with JDK v1.7.0_79. |
| -1 | asflicense | 0m 15s | Patch generated 1 ASF License warnings. |
| | | 10m 46s | |
\\
\\
|| Subsystem || Report/Notes ||
| Docker | Client=1.7.1 Server=1.7.1 Image:test-patch-base-hadoop-date2015-11-03 |
| JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12769279/YARN-3862-YARN-2928.wip.02.patch |
| JIRA Issue | YARN-3862 |
| Optional Tests | asflicense javac javadoc mvninstall unit findbugs checkstyle compile |
| uname | Linux 

[jira] [Commented] (YARN-3862) Decide which contents to retrieve and send back in response in TimelineReader

2015-08-28 Thread Joep Rottinghuis (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-3862?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14720667#comment-14720667
 ] 

Joep Rottinghuis commented on YARN-3862:


If we need to retrieve exactly known columns (and in addition we know if it is 
a metric, or a config value etc) then we can add these to the scan (or get) 
directly through
{code}
addColumn(byte [] family, byte [] qualifier)
{code}

The case for ColumnPrefixFilter is also clear. That is just restricting which 
rows are returned (it filters the keys).
The confusion starts with org.apache.hadoop.hbase.filter.QualifierFilter. That 
can be used to retrieve only some columns, specifically when combined with a 
WhileMatchFilter.

In addition we have the consideration whether we want to push these limits down 
to HBase (which is preferable) or whether we want to just pull back everything 
from HBase and restrict what we serialize in the result.

I think it would be cleaner to have a direct separate API (method argument) to 
be able to specify which columns to retrieve. Whether we then add specific 
values to the scan or prefix patterns to a filter is up to the implementation.
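
One possible shape for such a separate argument is sketched below. The class is hypothetical and HBase-free: it only records which columns the caller wants (exact qualifiers or prefixes) and answers whether a given qualifier is selected, leaving it to the storage implementation to translate that into addColumn() calls, a filter, or post-filtering.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical "which columns to retrieve" argument, independent of HBase.
final class ColumnSelectionSketch {
  private final List<String> exactQualifiers = new ArrayList<>();
  private final List<String> qualifierPrefixes = new ArrayList<>();

  /** Requests one exactly known column. */
  ColumnSelectionSketch addExact(String qualifier) {
    exactQualifiers.add(qualifier);
    return this;
  }

  /** Requests every column whose qualifier starts with the prefix. */
  ColumnSelectionSketch addPrefix(String prefix) {
    qualifierPrefixes.add(prefix);
    return this;
  }

  /** Returns true if a stored qualifier should be part of the result. */
  boolean selects(String qualifier) {
    if (exactQualifiers.isEmpty() && qualifierPrefixes.isEmpty()) {
      return true; // nothing requested explicitly: return everything
    }
    if (exactQualifiers.contains(qualifier)) {
      return true;
    }
    for (String prefix : qualifierPrefixes) {
      if (qualifier.startsWith(prefix)) {
        return true;
      }
    }
    return false;
  }
}
```

Keeping this object apart from the row-selecting filters leaves the choice between pushing the limits down to HBase and trimming the serialized result as an implementation detail.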


[jira] [Commented] (YARN-3862) Decide which contents to retrieve and send back in response in TimelineReader

2015-08-21 Thread Varun Saxena (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-3862?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14706566#comment-14706566
 ] 

Varun Saxena commented on YARN-3862:


bq. As for the timeline filters, strictly speaking these are filters that 
filter based on the column qualifiers, and not on the values, right? 
In this JIRA filters based on column qualifiers will be handled. In YARN-3863, 
based on values.

bq. I chatted with Joep on this, and I personally feel that it would be useful 
to maintain separation between limiting the contents that are returned (akin to 
contents of SELECT in SQL) and limiting the rows that are selected (akin to the 
WHERE clause in SQL).
Currently TimelineFilterList will accept any subclass of TimelineFilter. There 
is no strict segregation between column-based filters and value-based filters.
The idea was that, since we will be creating the filters inside ATS code, we 
would take care not to mix and match them.
But if we add filters as part of our object model, I think we can clearly 
segregate them. 
We can have 2 separate interfaces or abstract classes (sort of facades) 
extending the TimelineFilter interface which do nothing. The column-based 
filters can extend one and the value-based filters the other.
In TimelineFilterList, there can be 2 addFilter methods for these 2 classes. 
This way a list will contain only one class of filters.
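
A minimal sketch of that segregation follows. Apart from TimelineFilter and TimelineFilterList, every name is made up for illustration, and the runtime check that rejects mixing is one reading of "a list will contain only one class of filters" rather than something settled in this thread.

```java
import java.util.ArrayList;
import java.util.List;

// Illustration only: marker interfaces plus two addFilter overloads.
final class FilterModelSketch {
  private FilterModelSketch() {
  }

  interface TimelineFilter {
  }

  /** Marker for filters on column qualifiers (what to return). */
  interface ColumnFilter extends TimelineFilter {
  }

  /** Marker for filters on values (which rows to select). */
  interface ValueFilter extends TimelineFilter {
  }

  static final class PrefixFilter implements ColumnFilter {
    final String prefix;

    public PrefixFilter(String prefix) {
      this.prefix = prefix;
    }
  }

  static final class CompareFilter implements ValueFilter {
    final String key;
    final Object value;

    public CompareFilter(String key, Object value) {
      this.key = key;
      this.value = value;
    }
  }

  static final class TimelineFilterList {
    private final List<TimelineFilter> filters = new ArrayList<>();
    // Set on the first add; later adds must be of the same kind.
    private Class<? extends TimelineFilter> kind;

    public void addFilter(ColumnFilter filter) {
      add(ColumnFilter.class, filter);
    }

    public void addFilter(ValueFilter filter) {
      add(ValueFilter.class, filter);
    }

    private void add(Class<? extends TimelineFilter> newKind,
        TimelineFilter filter) {
      if (kind != null && kind != newKind) {
        throw new IllegalArgumentException(
            "Cannot mix column-based and value-based filters in one list");
      }
      kind = newKind;
      filters.add(filter);
    }

    public int size() {
      return filters.size();
    }
  }
}
```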



[jira] [Commented] (YARN-3862) Decide which contents to retrieve and send back in response in TimelineReader

2015-08-21 Thread Sangjin Lee (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-3862?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14707073#comment-14707073
 ] 

Sangjin Lee commented on YARN-3862:
---

I understand that the distinction between what selects rows and what selects 
columns in HBase is not as strong. I think it has more to do with the mental 
model of whoever is using the API than anything.

If this API is not exposed to the client, then this distinction is probably a 
bit less important because the consumer is basically us. But even for us, I 
think it would be good to have some separation so we can reason about this 
without much confusion.


[jira] [Commented] (YARN-3862) Decide which contents to retrieve and send back in response in TimelineReader

2015-08-20 Thread Li Lu (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-3862?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14706052#comment-14706052
 ] 

Li Lu commented on YARN-3862:
-

bq.  it would be useful to maintain separation between limiting the contents 
that are returned (akin to contents of SELECT in SQL) and limiting the rows 
that are selected (akin to the WHERE clause in SQL).
I agree we should distinguish those two use cases. Restricting our filters to 
be predicates on rows will work perfectly for relational databases (and for 
launching SQL queries), but if we store data in our current fashion, we may 
also need to dynamically filter some columns, I assume? For example, we may 
have a column filter that selects all configs that start with 
{{yarn.timelineservice.}}. I think most of these column filters will work on 
column qualifiers but not the values.
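
As a rough illustration of such a column filter, here is a minimal, self-contained sketch in plain Java with no HBase dependency. The class {{QualifierPrefixSketch}} and the method {{selectByQualifierPrefix}} are hypothetical names made up for this example and are not part of the patch. In the HBase reader the same effect would presumably come from a {{QualifierFilter}} combined with a {{BinaryPrefixComparator}}, but that wiring is an assumption here.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical illustration only: models the config columns of one row as a
// qualifier -> value map and keeps the cells whose qualifier has a given prefix.
final class QualifierPrefixSketch {
  private QualifierPrefixSketch() {}

  static Map<String, String> selectByQualifierPrefix(
      Map<String, String> cells, String prefix) {
    Map<String, String> selected = new LinkedHashMap<>();
    for (Map.Entry<String, String> cell : cells.entrySet()) {
      // Only the qualifier (column name) is inspected; the value is never read.
      if (cell.getKey().startsWith(prefix)) {
        selected.put(cell.getKey(), cell.getValue());
      }
    }
    return selected;
  }
}
```

Note that a value which happens to contain the prefix does not cause a match, since the filter looks at qualifiers only.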



[jira] [Commented] (YARN-3862) Decide which contents to retrieve and send back in response in TimelineReader

2015-08-20 Thread Sangjin Lee (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-3862?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14705986#comment-14705986
 ] 

Sangjin Lee commented on YARN-3862:
---

Sorry it took me a while to get around to looking at this.

As for the timeline filters, strictly speaking these are filters that filter 
based on the column qualifiers, and *not* on the values, right? Or are we 
combining both types of filtering here? IMO, it would be good to limit this to 
filtering on column qualifiers only, and not on the values. I think those 2 
things are conceptually separate, and it would cause confusion if they're 
mixed.

The reason I ask that is the patch has comparison filters 
({{TimelineCompareFilter}}) and operators that are related to comparisons. I'm 
not sure how they relate to the filtering based on the column qualifiers. So 
far we're talking about prefix match for the most part...

On a similar note, how about the filter based on the limit as suggested by 
[~gtCarrera9]? Are we also mixing concepts there? The filters that are 
mentioned here do not select rows but rather pick out *contents* to return 
(i.e. columns or cells), whereas the limit filter would be selecting rows. I 
chatted with Joep on this, and I personally feel that it would be useful to 
maintain separation between limiting the contents that are returned (akin to 
contents of SELECT in SQL) and limiting the rows that are selected (akin to the 
WHERE clause in SQL).

Thoughts?
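
To make the SELECT/WHERE analogy concrete, here is a minimal sketch in plain Java. Everything in it ({{SelectWhereSketch}}, {{query}}, the map-based row model) is hypothetical and only meant to show the two concerns as independent inputs; it is not how the reader is implemented.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.function.Predicate;

// Hypothetical illustration: row selection (WHERE) and content selection
// (SELECT) are passed in as two separate, independent inputs.
final class SelectWhereSketch {
  private SelectWhereSketch() {}

  // Each row is modeled as a map of column qualifier -> value.
  static List<Map<String, String>> query(
      List<Map<String, String>> rows,
      Predicate<Map<String, String>> rowFilter, // WHERE: which rows to select
      Predicate<String> columnFilter) {         // SELECT: which columns to return
    List<Map<String, String>> result = new ArrayList<>();
    for (Map<String, String> row : rows) {
      // The row predicate sees the full row, before any column is dropped.
      if (!rowFilter.test(row)) {
        continue;
      }
      Map<String, String> projected = new LinkedHashMap<>();
      for (Map.Entry<String, String> cell : row.entrySet()) {
        if (columnFilter.test(cell.getKey())) {
          projected.put(cell.getKey(), cell.getValue());
        }
      }
      result.add(projected);
    }
    return result;
  }
}
```

In this sketch the row predicate is evaluated against the unprojected row, so limiting the returned contents cannot accidentally change which rows match.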



[jira] [Commented] (YARN-3862) Decide which contents to retrieve and send back in response in TimelineReader

2015-08-14 Thread Varun Saxena (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-3862?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14697654#comment-14697654
 ] 

Varun Saxena commented on YARN-3862:


[~gtCarrera9], these 2 JIRAs were raised separately to address the following 
areas:
# Enhance the already supported filters (YARN-3863) to filter out rows of data, 
by adding support for OR in addition to AND, and relational ops for metrics. 
The scope for that JIRA is pretty clear.
# Restrict the amount of data retrieved (from columns) in this JIRA. In this 
JIRA, we actually wanted to have a discussion on what all we need to support: 
regex, prefix match, etc. Also whether we want to retrieve metrics by time 
windows as well.

I am open to realigning these JIRAs and distributing the work along the lines 
of the workflow you mentioned above.
My only concern with deciding on a filter object model, though, is that we may 
take a lot of time deciding on it so that it covers all the scenarios, because 
support for additional filters may come up during further discussion. 
Let's go with whatever the consensus is.



[jira] [Commented] (YARN-3862) Decide which contents to retrieve and send back in response in TimelineReader

2015-08-14 Thread Varun Saxena (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-3862?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14697663#comment-14697663
 ] 

Varun Saxena commented on YARN-3862:


bq. My feeling is that the concept of timeline filter may become a part of our 
object model, so that client users can easily communicate?
Do we want to expose it to the client? The suggestion sounds good. That wasn't 
the plan, but if everyone agrees, let's have it that way.

bq. are we treating our timeline filters as pure-data objects (models) 
Yes I am as of now treating them as pure data objects. That is why instead of 
using polymorphism and converting the filter to HBase Filter by providing a 
method for conversion in the filter class(es), I kept the conversion in util 
class. The intention was to decouple filters from storage implementation.

bq. is it easy, or possible, for us to implement a paging filter? 
Will look into it.



[jira] [Commented] (YARN-3862) Decide which contents to retrieve and send back in response in TimelineReader

2015-08-14 Thread Varun Saxena (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-3862?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14697667#comment-14697667
 ] 

Varun Saxena commented on YARN-3862:


BTW, I did not upload a WIP patch for YARN-3863 due to the issue raised in YARN-4053.



[jira] [Commented] (YARN-3862) Decide which contents to retrieve and send back in response in TimelineReader

2015-08-14 Thread Li Lu (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-3862?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14697840#comment-14697840
 ] 

Li Lu commented on YARN-3862:
-

bq. That is why instead of using polymorphism and converting the filter to 
HBase Filter by providing a method for conversion in the filter class(es), I 
kept the conversion in util class. The intention was to decouple filters from 
storage implementation.
I agree with this approach. Meanwhile we may also want to restrict the scope of 
the util class. Instead of putting the conversion in TimelineReaderUtils, feel 
free to add something like HBaseFilterConverter to model the filter conversion 
logic.
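
A minimal sketch of that separation, in plain Java so it runs without HBase on the classpath. All type names here ({{SketchFilter}}, {{SketchPrefixFilter}}, {{SketchFilterList}}, {{SketchFilterConverter}}) are hypothetical and do not correspond to classes in the patch. A {{String}} stands in for the backend filter; a real converter would presumably build HBase {{Filter}} objects instead.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical pure-data filter model: no storage logic lives in these classes.
interface SketchFilter {}

final class SketchPrefixFilter implements SketchFilter {
  final String prefix;

  SketchPrefixFilter(String prefix) {
    this.prefix = prefix;
  }
}

final class SketchFilterList implements SketchFilter {
  final List<SketchFilter> filters;

  SketchFilterList(SketchFilter... filters) {
    this.filters = Arrays.asList(filters);
  }
}

// Hypothetical storage-side converter. The conversion is dispatched on the
// filter type here, so the data classes above stay free of backend code.
final class SketchFilterConverter {
  private SketchFilterConverter() {}

  static String convert(SketchFilter filter) {
    if (filter instanceof SketchPrefixFilter) {
      return "qualifierPrefix(" + ((SketchPrefixFilter) filter).prefix + ")";
    }
    if (filter instanceof SketchFilterList) {
      List<String> parts = new ArrayList<>();
      for (SketchFilter child : ((SketchFilterList) filter).filters) {
        parts.add(convert(child));
      }
      return "anyOf(" + String.join(", ", parts) + ")";
    }
    throw new IllegalArgumentException(
        "Unsupported filter type: " + filter.getClass().getName());
  }
}
```

The trade-off is that every new filter type needs a new branch in the converter, but a different storage backend can add its own converter without touching the filter classes.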



[jira] [Commented] (YARN-3862) Decide which contents to retrieve and send back in response in TimelineReader

2015-08-14 Thread Li Lu (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-3862?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14697845#comment-14697845
 ] 

Li Lu commented on YARN-3862:
-

Oh and, BTW, I think it's pretty much fine on the code side, so please feel 
free to proceed with this JIRA as planned. Thanks! 



[jira] [Commented] (YARN-3862) Decide which contents to retrieve and send back in response in TimelineReader

2015-08-14 Thread Li Lu (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-3862?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14697829#comment-14697829
 ] 

Li Lu commented on YARN-3862:
-

I'm not worried that having a filter object model will slow everything down. 
Sure, we may not cover everything in the first draft, or even in the first 
JIRA. However, if we know we're on the right track we're making progress. If we 
realize any use case limitations we can always fix them later, but at this 
early stage let's first have the right framework and get our planned goals 
done. 



[jira] [Commented] (YARN-3862) Decide which contents to retrieve and send back in response in TimelineReader

2015-08-13 Thread Li Lu (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-3862?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14696131#comment-14696131
 ] 

Li Lu commented on YARN-3862:
-

Hi [~varun_saxena], thanks for working on this! After looking at the WIP patch, 
I have some general questions and think some discussions will definitely be 
helpful. 

My first confusion is about the relationship between this JIRA and YARN-3863. 
Both JIRAs appear to be related to timeline filter designs. However, if we'd 
like to support filters in timeline v2, I think a natural workflow may be:
# Filter object model and support them in FS storage implementation for tests. 
# Supporting timeline filters in our HBase reader
# Connecting the filter implementation to our existing web services interface
Each step may not necessarily be a separate JIRA. If I understand correctly 
we're including some changes in step 1 and 2 in this JIRA, and you're planning 
to add WS support later. So I think this will nicely address our current 
request on timeline filters. What is the plan for YARN-3863 then? Or am I 
missing some major piece? 

Secondly it's about the role of our planned filters. I notice you're putting 
filters in a package within timeline reader. My feeling is that the concept of 
timeline filter may become a part of our object model, so that client users can 
easily communicate? 

A follow-up question from this is, are we treating our timeline filters as 
pure-data objects (models) or something that includes both filter data and the 
way we actually filter (models+algorithm)? I slightly prefer the former because 
this may decouple the filter information (in timeline server land) and the 
storage implementations. I agree our filter design should at least slightly 
favor HBase filters, but converting a timeline filter to an HBase filter should 
be restricted in the land of the actual storage. 

Finally a quick question: is it easy, or possible, for us to implement a 
paging filter? We have a lot of web UI use cases that may need such a filter. 



[jira] [Commented] (YARN-3862) Decide which contents to retrieve and send back in response in TimelineReader

2015-08-12 Thread Varun Saxena (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-3862?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14693605#comment-14693605
 ] 

Varun Saxena commented on YARN-3862:


Attached a WIP patch so that we can further discuss.

The patch does the following:
# Adds some filters along the lines of HBase Filters. 
# Uses family and qualifier filters to reduce the amount of data fetched from 
the backend. We do not need to get data related to all the column families (and 
qualifiers) if the fields to retrieve do not contain them.
# Supports prefix filters to fetch specific metrics and configs. 

What we further need to decide is as follows:
# The filter interface added is primarily a placeholder and all the filter 
classes act as data classes. Currently the patch converts these filters into 
HBase Filters in a util class (based on filter type). But we can use 
polymorphism as well to let each filter class generate an analogous HBase 
Filter. However, I am not sure if we would want storage-related (HBase) code 
logic in these filter classes. Let us decide here and go with the consensus.
# We also need to decide what all we need to support with regard to the 
configs and metrics we want to retrieve. Do we support substring and regex as 
well?
# REST API related code will be added once YARN-3814 goes in.
# Please note that this needs to work in conjunction with the changes w.r.t. 
config and metric filters so that certain data is not missed. So, this JIRA is 
somewhat dependent on the work in YARN-3863 as well.
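
To illustrate the family-level pruning, here is a minimal sketch in plain Java. The class {{FamilySelectionSketch}}, the {{Field}} enum and the single-letter family names are assumptions made for this example, and the rule that the info family is always fetched is likewise assumed, not taken from the patch.

```java
import java.util.Set;
import java.util.TreeSet;

// Hypothetical illustration: derive the column families to read from the
// fields the caller asked for, so unrequested families are never fetched.
final class FamilySelectionSketch {
  private FamilySelectionSketch() {}

  enum Field { INFO, CONFIGS, METRICS }

  // Assumed mapping from a requested field to a column family name.
  static String familyOf(Field field) {
    switch (field) {
      case CONFIGS:
        return "c";
      case METRICS:
        return "m";
      default:
        return "i";
    }
  }

  static Set<String> familiesToFetch(Set<Field> fieldsToRetrieve) {
    Set<String> families = new TreeSet<>();
    // Assumption: basic entity info is always needed to build a response.
    families.add(familyOf(Field.INFO));
    for (Field field : fieldsToRetrieve) {
      families.add(familyOf(field));
    }
    return families;
  }
}
```

For example, requesting only {{Field.CONFIGS}} yields the info and config families, so the metrics family would be skipped entirely.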




[jira] [Commented] (YARN-3862) Decide which contents to retrieve and send back in response in TimelineReader

2015-08-12 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-3862?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14693646#comment-14693646
 ] 

Hadoop QA commented on YARN-3862:
-

\\
\\
| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:red}-1{color} | patch | 0m 0s | The patch command could not apply the patch during dryrun. |
\\
\\
|| Subsystem || Report/Notes ||
| Patch URL | http://issues.apache.org/jira/secure/attachment/12750090/YARN-3862-YARN-2928.wip.01.patch |
| Optional Tests | javadoc javac unit findbugs checkstyle |
| git revision | trunk / 1c12adb |
| Console output | https://builds.apache.org/job/PreCommit-YARN-Build/8835/console |


This message was automatically generated.



[jira] [Commented] (YARN-3862) Decide which contents to retrieve and send back in response in TimelineReader

2015-06-29 Thread Varun Saxena (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-3862?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14605401#comment-14605401
 ] 

Varun Saxena commented on YARN-3862:


As HBase supports prefix match, we will initially go with that. We need to see 
whether regex needs to be supported as well.
For configs, most of our configs are grouped together in such a manner that 
prefix match will be useful. For instance, all RM configs start with 
{{yarn.resourcemanager}}, task configs start with {{mapreduce.task}} and so on.
For metrics though, there is no logical grouping based on metric name. We need 
to see if we need it or not.



[jira] [Commented] (YARN-3862) Decide which contents to retrieve and send back in response in TimelineReader

2015-06-29 Thread Varun Saxena (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-3862?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14605406#comment-14605406
 ] 

Varun Saxena commented on YARN-3862:


For the metrics time window, the question is whether to specify this time 
window per metric or specify it globally for all the metrics. From the usage 
point of view of the API, we went with a single time window for all the metrics 
in the initial YARN-3051 patches. We can reach a conclusion on that as well.
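
As a sketch of the global option, the following plain Java applies one time window to every metric's time series. The class {{MetricWindowSketch}}, the method {{applyWindow}} and the representation of a metric as a timestamp-to-value map are assumptions for illustration only.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.NavigableMap;
import java.util.TreeMap;

// Hypothetical illustration: a single [startMs, endMs] window applied to the
// time series of every metric.
final class MetricWindowSketch {
  private MetricWindowSketch() {}

  // metrics: metric id -> (timestamp in ms -> value)
  static Map<String, NavigableMap<Long, Long>> applyWindow(
      Map<String, NavigableMap<Long, Long>> metrics, long startMs, long endMs) {
    Map<String, NavigableMap<Long, Long>> result = new LinkedHashMap<>();
    for (Map.Entry<String, NavigableMap<Long, Long>> metric : metrics.entrySet()) {
      // subMap with both bounds inclusive; it throws IllegalArgumentException
      // if startMs > endMs. The copy makes the result independent of the input.
      result.put(
          metric.getKey(),
          new TreeMap<>(metric.getValue().subMap(startMs, true, endMs, true)));
    }
    return result;
  }
}
```

A per-metric variant would take a map from metric id to window instead of the two {{long}} arguments, at the cost of a more verbose query.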
