[jira] [Comment Edited] (YARN-5739) Provide timeline reader API to list available timeline entity types for one application
[ https://issues.apache.org/jira/browse/YARN-5739?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15675693#comment-15675693 ]

Varun Saxena edited comment on YARN-5739 at 11/18/16 4:29 AM:

[~vrushalic], going by the description of each filter, FirstKeyOnlyFilter returns the first KV from each row and KeyOnlyFilter returns only the key. So shouldn't KeyOnlyFilter be enough? I removed FirstKeyOnlyFilter and then ran the tests Li had written, and they passed.

bq. This filter is used to limit the number of results to a specific page size. So it will terminate the scanning once the number of filter-passed rows is > the given page size on that particular Region Server.

That should be fine, I guess. We already apply the limit (which comes in as a query param on the reader side) with this filter elsewhere on the reader side, and here we only need one row. Even if we get one row per Region Server, the result is a superset; once the result set is created it is sorted so that the keys are in order, and we fetch only the first one.

However, setCaching should also be fine in our use case. I am not sure why we use PageFilter instead of setCaching to apply the limit. Do you know the pros and cons of one over the other?

was (Author: varun_saxena):

[~vrushalic], going by the description of each filter, FirstKeyOnlyFilter returns the first KV from each row and KeyOnlyFilter returns only the key. So shouldn't KeyOnlyFilter be enough?

bq. This filter is used to limit the number of results to a specific page size. So it will terminate the scanning once the number of filter-passed rows is > the given page size on that particular Region Server.

That should be fine, I guess. We already apply the limit (which comes in as a query param on the reader side) with this filter elsewhere on the reader side, and here we only need one row. Even if we get one row per Region Server, the result is a superset; once the result set is created it is sorted so that the keys are in order, and we fetch only the first one.

However, setCaching should also be fine in our use case. I am not sure why we use PageFilter instead of setCaching to apply the limit. Do you know the pros and cons of one over the other?

> Provide timeline reader API to list available timeline entity types for one application
> ---------------------------------------------------------------------------------------
>
>                 Key: YARN-5739
>                 URL: https://issues.apache.org/jira/browse/YARN-5739
>             Project: Hadoop YARN
>          Issue Type: Sub-task
>          Components: timelinereader
>            Reporter: Li Lu
>            Assignee: Li Lu
>         Attachments: YARN-5739-YARN-5355.001.patch, YARN-5739-YARN-5355.002.patch
>
> Right now we only show a part of the available timeline entity data in the new
> YARN UI. However, some data (especially library-specific data) cannot be
> queried through the web UI. It would be appealing for the UI to provide an
> "entity browser" for each YARN application. Actually, simply dumping out the
> available timeline entities (with proper pagination, of course) would be
> pretty helpful for UI users.
> On the timeline side, we're not far away from this goal. Right now I believe
> the only thing missing is a way to list all available entity types within one
> application. The challenge here is that we're not storing this data for each
> application, but given that this kind of call is relatively rare (compared to
> writes and updates), we can perform some scanning at read time.

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
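The "superset, then sort, then take the first" reasoning in the comment above can be illustrated with a small self-contained model. This is only a sketch in plain Java, not HBase code: the class PerRegionLimit, its methods, and the "app1!A!e1" style row keys are all hypothetical, and each "region" is just a sorted list of keys. It assumes, as the quoted filter description says, that the page limit is enforced separately on each Region Server.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

class PerRegionLimit {
    /** One region stops after pageSize rows, the way a per-region page limit would. */
    static List<String> scanRegion(List<String> sortedRowKeys, int pageSize) {
        int end = Math.min(pageSize, sortedRowKeys.size());
        return new ArrayList<>(sortedRowKeys.subList(0, end));
    }

    /** Merge the per-region results (a superset), sort them, keep the first `limit` keys. */
    static List<String> scanWithLimit(List<List<String>> regions, int limit) {
        List<String> merged = new ArrayList<>();
        for (List<String> region : regions) {
            merged.addAll(scanRegion(region, limit));
        }
        Collections.sort(merged);
        int end = Math.min(limit, merged.size());
        return new ArrayList<>(merged.subList(0, end));
    }

    public static void main(String[] args) {
        // Two hypothetical regions, each holding sorted row keys.
        List<List<String>> regions = Arrays.asList(
            Arrays.asList("app1!A!e1", "app1!A!e2"),
            Arrays.asList("app1!B!e1", "app1!C!e1"));
        // Each region contributes one row; the client keeps only the smallest key.
        System.out.println(scanWithLimit(regions, 1)); // prints [app1!A!e1]
    }
}
```

With a limit of 1 the model collects two candidate rows (one per region) and returns only the first after sorting, which is the behaviour the comment argues is acceptable.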
[jira] [Comment Edited] (YARN-5739) Provide timeline reader API to list available timeline entity types for one application
[ https://issues.apache.org/jira/browse/YARN-5739?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15650551#comment-15650551 ]

Varun Saxena edited comment on YARN-5739 at 11/9/16 10:26 AM:

bq. WRT caching, I am wondering, if there might be a query coming next for the details of these entity types?

Caching is set per Scan, right? Alternatively, PageFilter can be used to retrieve only one record per Region Server (RS) from the backend.

was (Author: varun_saxena):

bq. WRT caching, I am wondering, if there might be a query coming next for the details of these entity types?

Caching is set per Scan, right? Alternatively, PageFilter can be used to retrieve only one record from the backend.
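To show why a single record per lookup can be enough for listing entity types, here is a hedged sketch of a read-time "skip scan" over sorted row keys. It is not taken from the patch: the class EntityTypeScan, the "!" separator, and the appId!entityType!entityId key layout are assumptions made purely for illustration, and a TreeSet stands in for the sorted table.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.TreeSet;

class EntityTypeScan {
    // Hypothetical separator; row keys are assumed to look like appId!entityType!entityId.
    static final String SEP = "!";

    /** Collect distinct entity types for one app by reading one key per type, then seeking ahead. */
    static List<String> listEntityTypes(TreeSet<String> rowKeys, String appId) {
        String appPrefix = appId + SEP;
        List<String> types = new ArrayList<>();
        String key = rowKeys.ceiling(appPrefix);
        while (key != null && key.startsWith(appPrefix)) {
            int end = key.indexOf(SEP, appPrefix.length());
            if (end < 0) {
                break; // malformed key under the assumed layout
            }
            String type = key.substring(appPrefix.length(), end);
            types.add(type);
            // Seek past every key of this type: one row per type is all we read.
            key = rowKeys.ceiling(appPrefix + type + SEP + Character.MAX_VALUE);
        }
        return types;
    }

    public static void main(String[] args) {
        TreeSet<String> keys = new TreeSet<>(Arrays.asList(
            "app0!Z!e1", "app1!A!e1", "app1!A!e2", "app1!B!e1", "app2!C!e1"));
        System.out.println(listEntityTypes(keys, "app1")); // prints [A, B]
    }
}
```

Each loop iteration reads exactly one key and then jumps to the next entity type, so the cost grows with the number of types rather than the number of entities.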
[jira] [Comment Edited] (YARN-5739) Provide timeline reader API to list available timeline entity types for one application
[ https://issues.apache.org/jira/browse/YARN-5739?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15649165#comment-15649165 ]

Vrushali C edited comment on YARN-5739 at 11/8/16 11:21 PM:

Thanks [~gtCarrera9] for the patch. I wish to add to [~varun_saxena]'s review suggestions.

- I agree with the suggestion to rename the REST endpoint, but I think we should not use "-" in the REST endpoint string. So perhaps something like
{noformat}
/apps/{appid}/entitytypes
{noformat}
- Yes, we need not set max versions at L159 in EntityTypeReader.
- WRT caching, I am wondering if there might be a query coming next for the details of these entity types?

For the scan/filter, I have a different suggestion. It looks like we want to return only entity types. Entity types are part of the row keys, so we don't need the column qualifiers and values in that case, and we can consider using KeyOnlyFilter. This filter returns only the key component of each KV (the value is rewritten as empty), so it can be used to grab all of the keys without also grabbing the values. For a table scan where only the row keys are needed (no families, qualifiers, values or timestamps), add a FilterList with a MUST_PASS_ALL operator to the scanner using setFilter. The filter list should include both a FirstKeyOnlyFilter and a KeyOnlyFilter.

http://hbase.apache.org/apidocs/org/apache/hadoop/hbase/filter/KeyOnlyFilter.html
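The difference between the two filters named in the suggestion above can be seen in a small model. This is a plain-Java sketch of the described semantics only, not the HBase classes themselves: KeyFilterModel, its Cell type, and the sample data are hypothetical stand-ins.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

class KeyFilterModel {
    /** Simplified stand-in for an HBase KV: row key, column qualifier, value. */
    static final class Cell {
        final String row;
        final String qualifier;
        final String value;

        Cell(String row, String qualifier, String value) {
            this.row = row;
            this.qualifier = qualifier;
            this.value = value;
        }
    }

    /** Models FirstKeyOnlyFilter: keep only the first cell of each row (input sorted by row). */
    static List<Cell> firstKeyOnly(List<Cell> sortedCells) {
        List<Cell> out = new ArrayList<>();
        String lastRow = null;
        for (Cell c : sortedCells) {
            if (!c.row.equals(lastRow)) {
                out.add(c);
                lastRow = c.row;
            }
        }
        return out;
    }

    /** Models KeyOnlyFilter: keep every cell but rewrite its value as empty. */
    static List<Cell> keyOnly(List<Cell> cells) {
        List<Cell> out = new ArrayList<>();
        for (Cell c : cells) {
            out.add(new Cell(c.row, c.qualifier, ""));
        }
        return out;
    }

    /** Two rows: row1 has two columns, row2 has one. */
    static List<Cell> sample() {
        return Arrays.asList(
            new Cell("row1", "q1", "v1"),
            new Cell("row1", "q2", "v2"),
            new Cell("row2", "q1", "v3"));
    }

    public static void main(String[] args) {
        List<Cell> cells = sample();
        System.out.println(keyOnly(cells).size());               // prints 3
        System.out.println(firstKeyOnly(cells).size());          // prints 2
        System.out.println(firstKeyOnly(keyOnly(cells)).size()); // prints 2
    }
}
```

In this model the key-only step alone already yields every row key, just with one value-less cell per column; adding the first-key step trims that to one cell per row. That would be consistent with a scan still returning the right keys when only the key-only filter is applied, at the cost of carrying more cells per row.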