[jira] [Commented] (YARN-6645) Bug fix in ContainerImpl when calling the symLink of LinuxContainerExecutor

2017-05-25 Thread Bingxue Qiu (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-6645?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16024312#comment-16024312
 ] 

Bingxue Qiu commented on YARN-6645:
---

Hi [~cheersyang], we backported YARN-1503 to Hadoop 2.8 in our clusters. To 
avoid this exception, we create the nmPrivateDir in the writeScriptToNMPrivateDir 
method like this. Please feel free to give me suggestions. Thank you!

  private File writeScriptToNMPrivateDir(String nmPrivateDir, String command)
      throws IOException {
    // Create nmPrivateDir if it is missing. mkdirs() also returns false when
    // the directory already exists, so check exists() before logging.
    File dir = new File(nmPrivateDir);
    if (!dir.mkdirs()) {
      if (!dir.exists()) {
        LOG.error("Failed to create nmPrivate dir " + dir);
      }
    }

    File tmp = File.createTempFile("cmd_", "_tmp", dir);
    // try-with-resources closes the writer even if print() throws.
    try (PrintWriter printWriter = new PrintWriter(
        new OutputStreamWriter(new FileOutputStream(tmp), "UTF-8"))) {
      printWriter.print(command);
    }
    return tmp;
  }
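
For comparison, here is a minimal standalone sketch of the same idea using java.nio.file. It is not the YARN patch: the class name NmPrivateScripts and the method writeScript are hypothetical, and it assumes that failing fast is acceptable instead of only logging. Files.createDirectories does nothing when the directory already exists and throws an IOException when it cannot be created.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;

/** Hypothetical helper for illustration; not part of YARN. */
final class NmPrivateScripts {
  private NmPrivateScripts() {}

  /** Ensures nmPrivateDir exists, then writes command to a temp file in it. */
  static Path writeScript(Path nmPrivateDir, String command) throws IOException {
    // Creates missing parent directories too; a no-op if the directory
    // already exists, and an IOException if it cannot be created.
    Files.createDirectories(nmPrivateDir);
    Path tmp = Files.createTempFile(nmPrivateDir, "cmd_", "_tmp");
    Files.write(tmp, command.getBytes(StandardCharsets.UTF_8));
    return tmp;
  }

  public static void main(String[] args) throws IOException {
    // The directory names below are made up for the demo.
    Path base = Files.createTempDirectory("nm-demo");
    Path script = writeScript(
        base.resolve("nmPrivate").resolve("container_01"), "ln -s src dst");
    System.out.println(
        new String(Files.readAllBytes(script), StandardCharsets.UTF_8));
  }
}
```

With this variant the caller sees the directory-creation failure directly, rather than a later failure from the temp-file call.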

> Bug fix in ContainerImpl when calling the symLink of LinuxContainerExecutor
> ---
>
> Key: YARN-6645
> URL: https://issues.apache.org/jira/browse/YARN-6645
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: nodemanager
>Reporter: Bingxue Qiu
> Fix For: 2.9.0
>
> Attachments: error when creating symlink.png
>
>
> When creating a symlink after a resource is localized in our clusters, an 
> IOException is thrown because the nmPrivateDir doesn't exist. We added a 
> patch to fix it.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Updated] (YARN-6645) Bug fix in ContainerImpl when calling the symLink of LinuxContainerExecutor

2017-05-25 Thread Bingxue Qiu (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-6645?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Bingxue Qiu updated YARN-6645:
--
Attachment: error when creating symlink.png

Attached the error logs seen when creating the symlink.

> Bug fix in ContainerImpl when calling the symLink of LinuxContainerExecutor
> ---
>
> Key: YARN-6645
> URL: https://issues.apache.org/jira/browse/YARN-6645
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: nodemanager
>Reporter: Bingxue Qiu
> Fix For: 2.9.0
>
> Attachments: error when creating symlink.png
>
>






[jira] [Commented] (YARN-6645) Bug fix in ContainerImpl when calling the symLink of LinuxContainerExecutor

2017-05-25 Thread Bingxue Qiu (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-6645?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16024302#comment-16024302
 ] 

Bingxue Qiu commented on YARN-6645:
---

Hi [~cheersyang], I will upload the logs and patch later. Thank you!

> Bug fix in ContainerImpl when calling the symLink of LinuxContainerExecutor
> ---
>
> Key: YARN-6645
> URL: https://issues.apache.org/jira/browse/YARN-6645
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: nodemanager
>Reporter: Bingxue Qiu
> Fix For: 2.9.0
>
>






[jira] [Updated] (YARN-6645) Bug fix in ContainerImpl when calling the symLink of LinuxContainerExecutor

2017-05-25 Thread Bingxue Qiu (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-6645?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Bingxue Qiu updated YARN-6645:
--
Fix Version/s: 2.9.0

> Bug fix in ContainerImpl when calling the symLink of LinuxContainerExecutor
> ---
>
> Key: YARN-6645
> URL: https://issues.apache.org/jira/browse/YARN-6645
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: nodemanager
>Reporter: Bingxue Qiu
> Fix For: 2.9.0
>
>






[jira] [Updated] (YARN-6645) Bug fix in ContainerImpl when calling the symLink of LinuxContainerExecutor

2017-05-25 Thread Bingxue Qiu (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-6645?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Bingxue Qiu updated YARN-6645:
--
Description: When creating a symlink after a resource is localized in our 
clusters, an IOException is thrown because the nmPrivateDir doesn't 
exist. We added a patch to fix it.

> Bug fix in ContainerImpl when calling the symLink of LinuxContainerExecutor
> ---
>
> Key: YARN-6645
> URL: https://issues.apache.org/jira/browse/YARN-6645
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: nodemanager
>Reporter: Bingxue Qiu
>






[jira] [Created] (YARN-6645) Bug fix in ContainerImpl when calling the symLink of LinuxContainerExecutor

2017-05-24 Thread Bingxue Qiu (JIRA)
Bingxue Qiu created YARN-6645:
-

 Summary: Bug fix in ContainerImpl when calling the symLink of 
LinuxContainerExecutor
 Key: YARN-6645
 URL: https://issues.apache.org/jira/browse/YARN-6645
 Project: Hadoop YARN
  Issue Type: Bug
  Components: nodemanager
Reporter: Bingxue Qiu









[jira] [Updated] (YARN-6624) The implementation of getLocalizationStatus

2017-05-24 Thread Bingxue Qiu (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-6624?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Bingxue Qiu updated YARN-6624:
--
Attachment: YARN-6624.1.patch

Added YARN-6624.1.patch.

> The implementation of getLocalizationStatus
> ---
>
> Key: YARN-6624
> URL: https://issues.apache.org/jira/browse/YARN-6624
> Project: Hadoop YARN
>  Issue Type: Improvement
>  Components: nodemanager
>Affects Versions: 2.9.0
>Reporter: Bingxue Qiu
> Attachments: YARN-6624.1.patch
>
>
> We have a use case where the client needs to know the localization state of 
> resources. Following the design in [Continuous-resource-localization | 
> https://issues.apache.org/jira/secure/attachment/12825041/Continuous-resource-localization.pdf],
> we chose to include it as part of ContainerStatus.
> Proposal:
> When calling getContainerStatus, the client can check the state via 
> pendingResources and resourcesFailedToBeLocalized in ResourceSet.






[jira] [Updated] (YARN-6606) The implementation of LocalizationStatus in ContainerStatusProto

2017-05-24 Thread Bingxue Qiu (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-6606?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Bingxue Qiu updated YARN-6606:
--
Attachment: YARN-6606.2.patch

Added YARN-6606.2.patch.

> The implementation of LocalizationStatus in ContainerStatusProto
> 
>
> Key: YARN-6606
> URL: https://issues.apache.org/jira/browse/YARN-6606
> Project: Hadoop YARN
>  Issue Type: Task
>  Components: nodemanager
>Affects Versions: 2.9.0
>Reporter: Bingxue Qiu
> Fix For: 2.9.0
>
> Attachments: YARN-6606.1.patch, YARN-6606.2.patch
>
>
> We have a use case that requires the full implementation of localization 
> status in ContainerStatusProto, as described in 
> [Continuous-resource-localization|https://issues.apache.org/jira/secure/attachment/12825041/Continuous-resource-localization.pdf],
> so we implemented it.






[jira] [Created] (YARN-6624) The implementation of getLocalizationStatus

2017-05-18 Thread Bingxue Qiu (JIRA)
Bingxue Qiu created YARN-6624:
-

 Summary: The implementation of getLocalizationStatus
 Key: YARN-6624
 URL: https://issues.apache.org/jira/browse/YARN-6624
 Project: Hadoop YARN
  Issue Type: Improvement
  Components: nodemanager
Affects Versions: 2.9.0
Reporter: Bingxue Qiu
 Fix For: 2.9.0


We have a use case where the client needs to know the localization state of 
resources. Following the design in [Continuous-resource-localization | 
https://issues.apache.org/jira/secure/attachment/12825041/Continuous-resource-localization.pdf],
we chose to include it as part of ContainerStatus.
Proposal:
When calling getContainerStatus, the client can check the state via 
pendingResources and resourcesFailedToBeLocalized in ResourceSet.
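
The proposal can be illustrated with a small standalone sketch. This is not the YARN API or the attached patch: the class LocalizationStatusSketch, the State enum, and the statusOf method are hypothetical, and resources are identified by plain strings for simplicity. Only the names pendingResources and resourcesFailedToBeLocalized come from the description above. The assumption is that a requested resource that is neither pending nor failed has been localized.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.Set;

/** Hypothetical illustration; not the YARN ContainerStatus API. */
final class LocalizationStatusSketch {
  enum State { PENDING, FAILED, COMPLETED }

  private LocalizationStatusSketch() {}

  /** Derives a per-resource state from the two sets named in the proposal. */
  static Map<String, State> statusOf(List<String> requested,
      Set<String> pendingResources,
      Set<String> resourcesFailedToBeLocalized) {
    Map<String, State> result = new LinkedHashMap<>();
    for (String key : requested) {
      if (resourcesFailedToBeLocalized.contains(key)) {
        result.put(key, State.FAILED);
      } else if (pendingResources.contains(key)) {
        result.put(key, State.PENDING);
      } else {
        // Assumption: neither pending nor failed means localized.
        result.put(key, State.COMPLETED);
      }
    }
    return result;
  }

  public static void main(String[] args) {
    Map<String, State> status = statusOf(
        Arrays.asList("a.jar", "b.jar", "c.jar"),
        new HashSet<>(Arrays.asList("b.jar")),
        new HashSet<>(Arrays.asList("c.jar")));
    System.out.println(status);
  }
}
```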






[jira] [Commented] (YARN-1503) Support making additional 'LocalResources' available to running containers

2017-05-16 Thread Bingxue Qiu (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-1503?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16012024#comment-16012024
 ] 

Bingxue Qiu commented on YARN-1503:
---

Hi [~jianhe], we have a use case that requires the full implementation of 
localization status in ContainerStatusProto, so we implemented it. Please feel 
free to give some advice. Thanks.
[YARN-6606 |https://issues.apache.org/jira/browse/YARN-6606]

> Support making additional 'LocalResources' available to running containers
> --
>
> Key: YARN-1503
> URL: https://issues.apache.org/jira/browse/YARN-1503
> Project: Hadoop YARN
>  Issue Type: Improvement
>Reporter: Siddharth Seth
>Assignee: Jian He
> Attachments: Continuous-resource-localization.pdf
>
>
> We have a use case, where additional resources (jars, libraries etc) need to 
> be made available to an already running container. Ideally, we'd like this to 
> be done via YARN (instead of having potentially multiple containers per node 
> download resources on their own).
> Proposal:
>   NM to support an additional API where a list of resources can be specified, 
> something like "localiceResource(ContainerId, Map)".
>   NM would also require an additional API to get state for these resources - 
> "getLocalizationState(ContainerId)" - which returns the current state of all 
> local resources for the specified container(s).






[jira] [Updated] (YARN-6606) The implementation of LocalizationStatus in ContainerStatusProto

2017-05-16 Thread Bingxue Qiu (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-6606?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Bingxue Qiu updated YARN-6606:
--
Attachment: YARN-6606.1.patch

Added YARN-6606.1.patch.

> The implementation of LocalizationStatus in ContainerStatusProto
> 
>
> Key: YARN-6606
> URL: https://issues.apache.org/jira/browse/YARN-6606
> Project: Hadoop YARN
>  Issue Type: Task
>  Components: nodemanager
>Affects Versions: 2.9.0
>Reporter: Bingxue Qiu
> Fix For: 2.9.0
>
> Attachments: YARN-6606.1.patch
>
>






[jira] [Created] (YARN-6606) The implementation of LocalizationStatus in ContainerStatusProto

2017-05-16 Thread Bingxue Qiu (JIRA)
Bingxue Qiu created YARN-6606:
-

 Summary: The implementation of LocalizationStatus in 
ContainerStatusProto
 Key: YARN-6606
 URL: https://issues.apache.org/jira/browse/YARN-6606
 Project: Hadoop YARN
  Issue Type: Task
  Components: nodemanager
Affects Versions: 2.9.0
Reporter: Bingxue Qiu


We have a use case that requires the full implementation of localization 
status in ContainerStatusProto, as described in 
[Continuous-resource-localization|https://issues.apache.org/jira/secure/attachment/12825041/Continuous-resource-localization.pdf],
so we implemented it.






[jira] [Commented] (YARN-3881) Writing RM cluster-level metrics

2016-11-18 Thread Bingxue Qiu (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-3881?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15676313#comment-15676313
 ] 

Bingxue Qiu commented on YARN-3881:
---

Hi [~zjshen], I haven't found the totalVirtualCores / totalMB cluster 
metrics in metrics.json. Maybe it's necessary to show how the water line 
trends when the set of nodes changes, for example when nodes are added or fail?

> Writing RM cluster-level metrics
> 
>
> Key: YARN-3881
> URL: https://issues.apache.org/jira/browse/YARN-3881
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: timelineserver
>Reporter: Zhijie Shen
>Assignee: Zhijie Shen
>  Labels: YARN-5355
> Attachments: metrics.json
>
>
> RM has a bunch of metrics that we may want to write into the timeline backend 
> too. I attached the metrics.json that I've crawled via 
> {{http://localhost:8088/jmx?qry=Hadoop:*}}. IMHO, we need to pay attention to 
> three groups of metrics:
> 1. QueueMetrics
> 2. JvmMetrics
> 3. ClusterMetrics
> The problem is that, unlike other metrics that belong to a single application, 
> these belong to the RM or are cluster-wide. Therefore, the current write path is 
> not going to work for these metrics because they don't have the associated 
> user/flow/app context info. We need to rethink how to model cross-app metrics 
> and the API to handle them.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-5814) Add druid as storage backend in YARN Timeline Service

2016-11-17 Thread Bingxue Qiu (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-5814?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15676080#comment-15676080
 ] 

Bingxue Qiu commented on YARN-5814:
---

Thanks [~sjlee0] for your suggestions!

On the Druid reader side, queries are based on Drill, so conditions such as 
filter lists can be supported by a self-join or left join. For example:

{code}
select F.* FROM druid.timeline_service_app F, druid.timeline_service_app S 
WHERE F.appId = S.appId AND F.startTime > 1479440083000 AND S.finishTime > 0 
AND F.appId = 'application_1476875405903_49989';
{code}

I'm also grateful that you reminded me of the new issues. Druid supports 
ordering by column, so maybe adding a column named "idPrefix" makes sense?
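
As a rough sketch of how a reader might assemble the self-join query shown above for an arbitrary application, the following standalone Java builds the SQL text. The class DrillSelfJoinQuery and its build method are hypothetical and not part of the proposed DruidReader. The table and column names are copied from the example query. The appId pattern check is an added assumption, there to avoid concatenating unvalidated input into SQL.

```java
import java.util.regex.Pattern;

/** Hypothetical query builder for illustration only. */
final class DrillSelfJoinQuery {
  // Assumed shape of a YARN application id, as in the example above.
  private static final Pattern APP_ID = Pattern.compile("application_\\d+_\\d+");

  private DrillSelfJoinQuery() {}

  /** Builds the self-join query for one application started after a timestamp. */
  static String build(String appId, long startedAfterMillis) {
    if (appId == null || !APP_ID.matcher(appId).matches()) {
      throw new IllegalArgumentException("Unexpected application id: " + appId);
    }
    return "SELECT F.* FROM druid.timeline_service_app F,"
        + " druid.timeline_service_app S"
        + " WHERE F.appId = S.appId"
        + " AND F.startTime > " + startedAfterMillis
        + " AND S.finishTime > 0"
        + " AND F.appId = '" + appId + "'";
  }

  public static void main(String[] args) {
    System.out.println(build("application_1476875405903_49989", 1479440083000L));
  }
}
```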

>  Add druid as storage backend in YARN Timeline Service
> --
>
> Key: YARN-5814
> URL: https://issues.apache.org/jira/browse/YARN-5814
> Project: Hadoop YARN
>  Issue Type: New Feature
>  Components: ATSv2
>Affects Versions: 3.0.0-alpha2
>Reporter: Bingxue Qiu
> Attachments: Add-Druid-in-YARN-Timeline-Service.pdf
>
>
> h3. Introduction
> I propose adding Druid as a storage backend in YARN Timeline Service.
> We run more than 6000 applications and generate 450 million metrics daily in 
> Alibaba clusters with thousands of nodes. We need to collect and store 
> meta/events/metrics data, analyze utilization reports across various 
> dimensions online, and display trends in allocated/used cluster resources 
> by joining and aggregating data. This helps us manage and optimize the 
> cluster by tracking resource utilization.
> To achieve this, we switched from HBase to Druid as the storage and have 
> had sub-second OLAP performance in our production environment for a few 
> months.
> h3. Analysis
> Currently YARN Timeline Service only supports aggregating metrics a) at the 
> flow level by FlowRunCoprocessor and b) at the application level by 
> AppLevelTimelineCollector. Offline (time-based periodic) aggregation for 
> flows/users/queues for reporting and analysis is planned but not yet 
> implemented. YARN Timeline Service uses Apache HBase as the primary 
> storage backend, and HBase is not a good fit for OLAP.
> For arbitrary exploration of data, such as online analysis of utilization 
> reports across various dimensions (Queue, Flow, Users, Application, CPU, 
> Memory) by joining and aggregating data, Druid's custom column format enables 
> ad-hoc queries without pre-computation. The format also enables fast scans on 
> columns, which is important for good aggregation performance.
> To support online analysis of utilization reports across various dimensions, 
> display of trends in allocated/used cluster resources, and arbitrary 
> exploration of data, we propose adding Druid storage and implementing 
> DruidWriter/DruidReader in YARN Timeline Service.






[jira] [Commented] (YARN-5814) Add druid as storage backend in YARN Timeline Service

2016-11-16 Thread Bingxue Qiu (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-5814?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15670054#comment-15670054
 ] 

Bingxue Qiu commented on YARN-5814:
---

Thanks [~gtCarrera9] for your suggestions:

1. On the writer design: we have implemented the writer using Kafka and an MR 
job (for HA) that pull data into the realtime nodes of Druid. But I'm not sure 
this method also fits others; after all, Tranquility is simpler. I will share 
the design of both later, and we can choose to implement one or both of them.

2. On the table design: using the timeline.entity table to hold general 
timeline entities, including container data, may not fit the Druid 
implementation. In the HBase implementation, we can store general timeline 
entities in a column family of the entity table and scan them by rowkey. But 
Druid is fixed-schema columnar storage, so if we need ad-hoc queries or 
aggregation in real time, the timeline.entity table may become a wide table 
with many columns. That would introduce data redundancy, generate many rows, 
and increase cache misses. That's why we are considering adding these tables 
rather than timeline.entity.
Please feel free to give your suggestions. Thanks!


>  Add druid as storage backend in YARN Timeline Service
> --
>
> Key: YARN-5814
> URL: https://issues.apache.org/jira/browse/YARN-5814
> Project: Hadoop YARN
>  Issue Type: New Feature
>  Components: ATSv2
>Affects Versions: 3.0.0-alpha2
>Reporter: Bingxue Qiu
> Attachments: Add-Druid-in-YARN-Timeline-Service.pdf
>
>






[jira] [Commented] (YARN-5814) Add druid as storage backend in YARN Timeline Service

2016-11-11 Thread Bingxue Qiu (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-5814?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15656598#comment-15656598
 ] 

Bingxue Qiu commented on YARN-5814:
---

I have uploaded the design. It contains our ideas about the Druid writer, 
reader, and schema. Please feel free to give your suggestions. Thanks.

>  Add druid as storage backend in YARN Timeline Service
> --
>
> Key: YARN-5814
> URL: https://issues.apache.org/jira/browse/YARN-5814
> Project: Hadoop YARN
>  Issue Type: New Feature
>  Components: ATSv2
>Affects Versions: 3.0.0-alpha2
>Reporter: Bingxue Qiu
> Attachments: Add-Druid-in-YARN-Timeline-Service.pdf
>
>






[jira] [Updated] (YARN-5814) Add druid as storage backend in YARN Timeline Service

2016-11-10 Thread Bingxue Qiu (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-5814?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Bingxue Qiu updated YARN-5814:
--
Attachment: Add-Druid-in-YARN-Timeline-Service.pdf

>  Add druid as storage backend in YARN Timeline Service
> --
>
> Key: YARN-5814
> URL: https://issues.apache.org/jira/browse/YARN-5814
> Project: Hadoop YARN
>  Issue Type: New Feature
>  Components: ATSv2
>Affects Versions: 3.0.0-alpha2
>Reporter: Bingxue Qiu
> Attachments: Add-Druid-in-YARN-Timeline-Service.pdf
>
>






[jira] [Updated] (YARN-5814) Add druid as storage backend in YARN Timeline Service

2016-11-10 Thread Bingxue Qiu (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-5814?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Bingxue Qiu updated YARN-5814:
--
Attachment: (was: Add-Druid-in-YARN-Timeline-Service.pdf)

>  Add druid as storage backend in YARN Timeline Service
> --
>
> Key: YARN-5814
> URL: https://issues.apache.org/jira/browse/YARN-5814
> Project: Hadoop YARN
>  Issue Type: New Feature
>  Components: ATSv2
>Affects Versions: 3.0.0-alpha2
>Reporter: Bingxue Qiu
>






[jira] [Updated] (YARN-5814) Add druid as storage backend in YARN Timeline Service

2016-11-10 Thread Bingxue Qiu (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-5814?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Bingxue Qiu updated YARN-5814:
--
Attachment: Add-Druid-in-YARN-Timeline-Service.pdf

>  Add druid as storage backend in YARN Timeline Service
> --
>
> Key: YARN-5814
> URL: https://issues.apache.org/jira/browse/YARN-5814
> Project: Hadoop YARN
>  Issue Type: New Feature
>  Components: ATSv2
>Affects Versions: 3.0.0-alpha2
>Reporter: Bingxue Qiu
> Attachments: Add-Druid-in-YARN-Timeline-Service.pdf
>
>
> h3. Introduction
> I propose adding Druid as a storage backend in YARN Timeline Service.
> We run more than 6000 applications and generate 450 million metrics daily in 
> Alibaba clusters with thousands of nodes. We need to collect and store 
> meta/events/metrics data, analyze utilization reports across various 
> dimensions online, and display trends in resource allocation and usage for 
> the cluster by joining and aggregating data. This helps us manage and 
> optimize the cluster by tracking resource utilization.
> To achieve this, we switched our storage from HBase to Druid and have had 
> sub-second OLAP performance in our production environment for a few months.
> h3. Analysis
> Currently, YARN Timeline Service only supports aggregating metrics at a) the 
> flow level, by FlowRunCoprocessor, and b) the application level, by 
> AppLevelTimelineCollector. Offline (time-based periodic) aggregation for 
> flows/users/queues for reporting and analysis is planned but not yet 
> implemented. YARN Timeline Service uses Apache HBase as its primary 
> storage backend. HBase, however, is not a good fit for OLAP.
> For arbitrary exploration of data, such as online analysis of utilization 
> reports across various dimensions (Queue, Flow, Users, Application, CPU, 
> Memory) by joining and aggregating data, Druid's custom column format enables 
> ad-hoc queries without pre-computation. The format also enables fast scans on 
> columns, which is important for good aggregation performance.
> To support online analysis of utilization reports across various 
> dimensions, display of trends in resource allocation and usage for the 
> cluster, and arbitrary exploration of data, we propose adding Druid storage 
> and implementing DruidWriter/DruidReader in YARN Timeline Service.
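
A minimal sketch of what "ad-hoc queries without pre-computation" in the Analysis above can look like, written as plain in-memory Java rather than against Druid or the Timeline Service API. The MetricRow class, its fields, and the sumBy helper are hypothetical names chosen for illustration. The only point is that the grouping dimension is picked at query time instead of being fixed by a pre-built aggregation.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;
import java.util.function.Function;
import java.util.stream.Collectors;

public class AdHocAggregationSketch {

  /** Hypothetical flattened metric row: dimension columns plus one numeric value. */
  public static final class MetricRow {
    public final String queue;
    public final String user;
    public final String application;
    public final long memoryMb;

    public MetricRow(String queue, String user, String application, long memoryMb) {
      this.queue = queue;
      this.user = user;
      this.application = application;
      this.memoryMb = memoryMb;
    }
  }

  /** Sums memoryMb per value of whichever dimension the caller selects at query time. */
  public static Map<String, Long> sumBy(List<MetricRow> rows,
                                        Function<MetricRow, String> dimension) {
    // TreeMap keeps the keys sorted so the printed result is deterministic.
    return rows.stream().collect(
        Collectors.groupingBy(dimension, TreeMap::new,
            Collectors.summingLong(r -> r.memoryMb)));
  }

  public static void main(String[] args) {
    List<MetricRow> rows = Arrays.asList(
        new MetricRow("prod", "alice", "app_1", 2048),
        new MetricRow("prod", "bob", "app_2", 1024),
        new MetricRow("dev", "alice", "app_3", 512));

    // The same rows are aggregated along two different dimensions,
    // with no per-dimension table prepared in advance.
    System.out.println(sumBy(rows, r -> r.queue)); // {dev=512, prod=3072}
    System.out.println(sumBy(rows, r -> r.user));  // {alice=2560, bob=1024}
  }
}
```

A real backend would scan only the columns a query touches instead of whole rows; the sketch leaves that out and shows only the query-time choice of dimension.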






[jira] [Commented] (YARN-5814) Add druid as storage backend in YARN Timeline Service

2016-11-04 Thread Bingxue Qiu (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-5814?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15635734#comment-15635734
 ] 

Bingxue Qiu commented on YARN-5814:
---

Thanks [~djp], [~sjlee0] for your support!
I will share a more concrete design next week.


> Add druid as storage backend in YARN Timeline Service
> --
>
> Key: YARN-5814
> URL: https://issues.apache.org/jira/browse/YARN-5814
> Project: Hadoop YARN
> Issue Type: New Feature
> Components: ATSv2
> Affects Versions: 3.0.0-alpha2
> Reporter: Bingxue Qiu
>






[jira] [Issue Comment Deleted] (YARN-5814) Add druid as storage backend in YARN Timeline Service

2016-11-04 Thread Bingxue Qiu (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-5814?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Bingxue Qiu updated YARN-5814:
--
Comment: was deleted

(was: Thanks [~djp], [~sjlee0] for your support!
I will give a more concrete design next week



)

> Add druid as storage backend in YARN Timeline Service
> --
>
> Key: YARN-5814
> URL: https://issues.apache.org/jira/browse/YARN-5814
> Project: Hadoop YARN
> Issue Type: New Feature
> Components: ATSv2
> Affects Versions: 3.0.0-alpha2
> Reporter: Bingxue Qiu
>






[jira] [Commented] (YARN-5814) Add druid as storage backend in YARN Timeline Service

2016-11-04 Thread Bingxue Qiu (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-5814?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15635730#comment-15635730
 ] 

Bingxue Qiu commented on YARN-5814:
---

Thanks [~djp], [~sjlee0] for your support!
I will share a more concrete design next week.





> Add druid as storage backend in YARN Timeline Service
> --
>
> Key: YARN-5814
> URL: https://issues.apache.org/jira/browse/YARN-5814
> Project: Hadoop YARN
> Issue Type: New Feature
> Components: ATSv2
> Affects Versions: 3.0.0-alpha2
> Reporter: Bingxue Qiu
>


