[jira] [Work logged] (TS-4870) Storage can be marked offline multiple times which breaks related metrics

2016-09-20 Thread ASF GitHub Bot (JIRA)

 [ 
https://issues.apache.org/jira/browse/TS-4870?focusedWorklogId=29409&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-29409
 ]

ASF GitHub Bot logged work on TS-4870:
--

Author: ASF GitHub Bot
Created on: 20/Sep/16 17:31
Start Date: 20/Sep/16 17:31
Worklog Time Spent: 10m 
  Work Description: Github user atsci commented on the issue:

https://github.com/apache/trafficserver/pull/1028
  
FreeBSD build *successful*! See 
https://ci.trafficserver.apache.org/job/Github-FreeBSD/841/ for details.
 



Issue Time Tracking
---

Worklog Id: (was: 29409)
Time Spent: 2h 50m  (was: 2h 40m)

> Storage can be marked offline multiple times which breaks related metrics
> -
>
> Key: TS-4870
> URL: https://issues.apache.org/jira/browse/TS-4870
> Project: Traffic Server
>  Issue Type: Bug
>  Components: Cache, Metrics
>Reporter: Gancho Tenev
>Assignee: Gancho Tenev
> Fix For: 7.1.0
>
>  Time Spent: 2h 50m
>  Remaining Estimate: 0h
>
> Let us say Traffic Server is running with 2 disks:
> {code}
> $ cat etc/trafficserver/storage.config
> /dev/sdb
> /dev/sdc
> $ sudo fdisk -l|grep 'Disk /dev/sd[b|c]'
> Disk /dev/sdb: 134 MB, 134217728 bytes
> Disk /dev/sdc: 134 MB, 134217728 bytes
> {code}
> Let us see what happens when we mark the same disk offline 3 times in a row 
> ({{/dev/sdb}}) and check {{proxy.node.cache.bytes_total}}.
> {code}
> # Initial cache size (when using both disks).
> $ ./bin/traffic_ctl metric get proxy.node.cache.bytes_total
> proxy.node.cache.bytes_total 268025856
> # Take 1st disk offline. Cache size changes as expected.
> $ sudo ./bin/traffic_ctl storage offline /dev/sdb
> $ ./bin/traffic_ctl metric get proxy.node.cache.bytes_total
> proxy.node.cache.bytes_total 134012928
> # Take same disk offline again. Not good!
> $ sudo ./bin/traffic_ctl storage offline /dev/sdb
> $ ./bin/traffic_ctl metric get proxy.node.cache.bytes_total
> proxy.node.cache.bytes_total 0
> # Take same disk offline again. Negative value.
> $ sudo ./bin/traffic_ctl storage offline /dev/sdb
> $ ./bin/traffic_ctl metric get proxy.node.cache.bytes_total
> proxy.node.cache.bytes_total -134012928
> {code}
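
The numbers in the transcript above are consistent with every `storage offline` call subtracting the same per-disk amount from the total, with nothing preventing the subtraction from repeating. A minimal C++ sketch of that arithmetic follows. `CacheTotals` and `mark_offline_unguarded` are hypothetical names used only for illustration; they are not Traffic Server identifiers, and the real accounting code is more involved.

```cpp
#include <cstdint>

// Hypothetical model of the cache-wide byte counter (not Traffic Server code).
struct CacheTotals {
  int64_t bytes_total;
};

// Hypothetical unguarded handler: subtracts the disk's share on every call,
// whether or not the disk was already taken offline.
inline void mark_offline_unguarded(CacheTotals &totals, int64_t disk_bytes) {
  totals.bytes_total -= disk_bytes;
}
```

Starting from 268025856 and calling `mark_offline_unguarded(totals, 134012928)` three times walks the counter through 134012928, 0, and -134012928, which matches the values in the report.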



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Work logged] (TS-4870) Storage can be marked offline multiple times which breaks related metrics

2016-09-20 Thread ASF GitHub Bot (JIRA)

 [ 
https://issues.apache.org/jira/browse/TS-4870?focusedWorklogId=29408&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-29408
 ]

ASF GitHub Bot logged work on TS-4870:
--

Author: ASF GitHub Bot
Created on: 20/Sep/16 17:21
Start Date: 20/Sep/16 17:21
Worklog Time Spent: 10m 
  Work Description: Github user jpeach closed the pull request at:

https://github.com/apache/trafficserver/pull/1028


Issue Time Tracking
---

Worklog Id: (was: 29408)
Time Spent: 2h 40m  (was: 2.5h)



[jira] [Work logged] (TS-4870) Storage can be marked offline multiple times which breaks related metrics

2016-09-20 Thread ASF GitHub Bot (JIRA)

 [ 
https://issues.apache.org/jira/browse/TS-4870?focusedWorklogId=29402&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-29402
 ]

ASF GitHub Bot logged work on TS-4870:
--

Author: ASF GitHub Bot
Created on: 20/Sep/16 15:24
Start Date: 20/Sep/16 15:24
Worklog Time Spent: 10m 
  Work Description: Github user jpeach commented on the issue:

https://github.com/apache/trafficserver/pull/1028
  
@gtenev This looks good. Can you please squash the branch?


Issue Time Tracking
---

Worklog Id: (was: 29402)
Time Spent: 2.5h  (was: 2h 20m)



[jira] [Work logged] (TS-4870) Storage can be marked offline multiple times which breaks related metrics

2016-09-20 Thread ASF GitHub Bot (JIRA)

 [ 
https://issues.apache.org/jira/browse/TS-4870?focusedWorklogId=29390&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-29390
 ]

ASF GitHub Bot logged work on TS-4870:
--

Author: ASF GitHub Bot
Created on: 20/Sep/16 13:36
Start Date: 20/Sep/16 13:36
Worklog Time Spent: 10m 
  Work Description: Github user zwoop commented on the issue:

https://github.com/apache/trafficserver/pull/1028
  
@jpeach we ok to land this now?


Issue Time Tracking
---

Worklog Id: (was: 29390)
Time Spent: 2h 20m  (was: 2h 10m)



[jira] [Work logged] (TS-4870) Storage can be marked offline multiple times which breaks related metrics

2016-09-19 Thread ASF GitHub Bot (JIRA)

 [ 
https://issues.apache.org/jira/browse/TS-4870?focusedWorklogId=29353&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-29353
 ]

ASF GitHub Bot logged work on TS-4870:
--

Author: ASF GitHub Bot
Created on: 19/Sep/16 22:39
Start Date: 19/Sep/16 22:39
Worklog Time Spent: 10m 
  Work Description: Github user atsci commented on the issue:

https://github.com/apache/trafficserver/pull/1028
  
Linux build *successful*! See 
https://ci.trafficserver.apache.org/job/Github-Linux/731/ for details.
 



Issue Time Tracking
---

Worklog Id: (was: 29353)
Time Spent: 2h 10m  (was: 2h)



[jira] [Work logged] (TS-4870) Storage can be marked offline multiple times which breaks related metrics

2016-09-19 Thread ASF GitHub Bot (JIRA)

 [ 
https://issues.apache.org/jira/browse/TS-4870?focusedWorklogId=29352&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-29352
 ]

ASF GitHub Bot logged work on TS-4870:
--

Author: ASF GitHub Bot
Created on: 19/Sep/16 22:38
Start Date: 19/Sep/16 22:38
Worklog Time Spent: 10m 
  Work Description: Github user atsci commented on the issue:

https://github.com/apache/trafficserver/pull/1028
  
FreeBSD build *successful*! See 
https://ci.trafficserver.apache.org/job/Github-FreeBSD/835/ for details.
 



Issue Time Tracking
---

Worklog Id: (was: 29352)
Time Spent: 2h  (was: 1h 50m)



[jira] [Work logged] (TS-4870) Storage can be marked offline multiple times which breaks related metrics

2016-09-19 Thread ASF GitHub Bot (JIRA)

 [ 
https://issues.apache.org/jira/browse/TS-4870?focusedWorklogId=29351&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-29351
 ]

ASF GitHub Bot logged work on TS-4870:
--

Author: ASF GitHub Bot
Created on: 19/Sep/16 22:29
Start Date: 19/Sep/16 22:29
Worklog Time Spent: 10m 
  Work Description: Github user gtenev commented on the issue:

https://github.com/apache/trafficserver/pull/1028
  
@jpeach, renamed "offline" flag to "online", added some reasoning about why 
the flag was necessary in the last commit description.


Issue Time Tracking
---

Worklog Id: (was: 29351)
Time Spent: 1h 50m  (was: 1h 40m)



[jira] [Work logged] (TS-4870) Storage can be marked offline multiple times which breaks related metrics

2016-09-19 Thread ASF GitHub Bot (JIRA)

 [ 
https://issues.apache.org/jira/browse/TS-4870?focusedWorklogId=29345&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-29345
 ]

ASF GitHub Bot logged work on TS-4870:
--

Author: ASF GitHub Bot
Created on: 19/Sep/16 20:59
Start Date: 19/Sep/16 20:59
Worklog Time Spent: 10m 
  Work Description: Github user gtenev commented on a diff in the pull 
request:

https://github.com/apache/trafficserver/pull/1028#discussion_r79487888
  
--- Diff: iocore/cache/Cache.cc ---
@@ -2000,6 +2000,12 @@ CacheProcessor::mark_storage_offline(CacheDisk *d ///< Target disk
   uint64_t total_dir_delete   = 0;
   uint64_t used_dir_delete    = 0;
 
+  /* Don't mark it again, it will invalidate the stats! */
+  if (d->offline) {
+    return this->has_online_storage();
+  }
+  d->offline = true;
--- End diff --

@jpeach, great! sure, I can rename the flag to "online" :)


Issue Time Tracking
---

Worklog Id: (was: 29345)
Time Spent: 1h 40m  (was: 1.5h)



[jira] [Work logged] (TS-4870) Storage can be marked offline multiple times which breaks related metrics

2016-09-19 Thread ASF GitHub Bot (JIRA)

 [ 
https://issues.apache.org/jira/browse/TS-4870?focusedWorklogId=29344&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-29344
 ]

ASF GitHub Bot logged work on TS-4870:
--

Author: ASF GitHub Bot
Created on: 19/Sep/16 20:52
Start Date: 19/Sep/16 20:52
Worklog Time Spent: 10m 
  Work Description: Github user jpeach commented on a diff in the pull 
request:

https://github.com/apache/trafficserver/pull/1028#discussion_r79486346
  
--- Diff: iocore/cache/Cache.cc ---
@@ -2000,6 +2000,12 @@ CacheProcessor::mark_storage_offline(CacheDisk *d ///< Target disk
   uint64_t total_dir_delete   = 0;
   uint64_t used_dir_delete    = 0;
 
+  /* Don't mark it again, it will invalidate the stats! */
+  if (d->offline) {
+    return this->has_online_storage();
+  }
+  d->offline = true;
--- End diff --

@gtenev and I discussed this. The problem is that in the common case, the 
disk is already bad when ``mark_storage_offline`` is called, so we can't depend 
on the good->bad state transition to know when to update the accounting.

@gtenev This looks fine to me, but I'd suggest calling the flag ``online`` 
so that we avoid the double negatives.
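
One way to picture the guard under discussion, with the flag renamed to `online` as suggested: the accounting runs only on the first call for a given disk, regardless of whether the disk was already considered bad. This is a simplified sketch and not the patch itself. `Disk`, `Totals`, and `mark_storage_offline_sketch` are hypothetical stand-ins for `CacheDisk` and `CacheProcessor::mark_storage_offline`, and the `has_online_storage()` shown here is an assumed, reduced version of the real one.

```cpp
#include <cstdint>
#include <vector>

// Hypothetical, simplified stand-in for CacheDisk.
struct Disk {
  int64_t bytes = 0;
  bool online   = true; // cleared once, the first time the disk goes offline
};

// Hypothetical stand-in for the processor-level state.
struct Totals {
  int64_t bytes_total = 0;
  std::vector<Disk *> disks;

  // Assumed behavior: true while at least one disk is still online.
  bool has_online_storage() const {
    for (const Disk *d : disks) {
      if (d->online) {
        return true;
      }
    }
    return false;
  }
};

// Returns whether any storage is still online after the call.
inline bool mark_storage_offline_sketch(Totals &t, Disk &d) {
  // Don't account for the same disk twice; that would corrupt the stats.
  if (!d.online) {
    return t.has_online_storage();
  }
  d.online = false;
  t.bytes_total -= d.bytes;
  return t.has_online_storage();
}
```

With this shape, a second or third call for the same disk leaves `bytes_total` untouched, and the positive name lets the guard read as `!d.online` instead of stacking negatives.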


Issue Time Tracking
---

Worklog Id: (was: 29344)
Time Spent: 1.5h  (was: 1h 20m)



[jira] [Work logged] (TS-4870) Storage can be marked offline multiple times which breaks related metrics

2016-09-19 Thread ASF GitHub Bot (JIRA)

 [ 
https://issues.apache.org/jira/browse/TS-4870?focusedWorklogId=29336&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-29336
 ]

ASF GitHub Bot logged work on TS-4870:
--

Author: ASF GitHub Bot
Created on: 19/Sep/16 20:09
Start Date: 19/Sep/16 20:09
Worklog Time Spent: 10m 
  Work Description: Github user jpeach commented on the issue:

https://github.com/apache/trafficserver/pull/1028
  
@gtenev If I'm reading your patch correctly, it adds the ``offline`` flag 
such that disks are marked bad *and* offline. That doesn't sound like what you 
intended from the description above.


Issue Time Tracking
---

Worklog Id: (was: 29336)
Time Spent: 1h 20m  (was: 1h 10m)



[jira] [Work logged] (TS-4870) Storage can be marked offline multiple times which breaks related metrics

2016-09-19 Thread ASF GitHub Bot (JIRA)

 [ 
https://issues.apache.org/jira/browse/TS-4870?focusedWorklogId=29325&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-29325
 ]

ASF GitHub Bot logged work on TS-4870:
--

Author: ASF GitHub Bot
Created on: 19/Sep/16 18:51
Start Date: 19/Sep/16 18:51
Worklog Time Spent: 10m 
  Work Description: Github user gtenev commented on the issue:

https://github.com/apache/trafficserver/pull/1028
  
@jpeach, appreciate your feedback!

It felt that "disk being offline" (might be an operator's decision) and 
"disk being bad" (number of IO errors reached a threshold) are better kept 
separate in general.
 
IMHO using `CacheDisk::num_errors` to mark the disk offline could be 
error-prone; here is an example.

Let us say ``proxy.config.cache.max_disk_errors=5`` and a disk keeps 
failing, so ``handle_disk_failure()`` is called repeatedly. At some point 
``CacheDisk::num_errors`` reaches ``5``, which causes 
``mark_storage_offline()`` to be called. 

At this point ``CacheDisk::num_errors`` is ``5``, so ``DISK_BAD(d)`` is already true.

It seems that if I did ``if(!DISK_BAD(d)) {...}`` (as suggested above) it 
would not execute the code in ``mark_storage_offline()`` at all, for instance 
``proxy.process.cache.bytes_total_stat`` would not get updated as it should.

This is one of my first adventures in the "cache" component, so I hope I am 
not missing something. Please let me know what you think and I will gladly 
look/test/change as necessary. 
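
The scenario above can be sketched in a few lines. This is an illustrative model only: `Disk`, `kMaxDiskErrors`, `disk_bad`, and the two `mark_offline_*` functions are hypothetical, and `disk_bad` is assumed to behave like a "number of errors has reached the threshold" check, as described in the comment. The line modelling `SET_DISK_BAD(d)` is likewise an assumption about what that macro does.

```cpp
#include <cstdint>

constexpr int kMaxDiskErrors = 5; // models proxy.config.cache.max_disk_errors=5

// Hypothetical, simplified stand-in for CacheDisk.
struct Disk {
  int64_t bytes  = 0;
  int num_errors = 0;
  bool online    = true;
};

// Assumed shape of the DISK_BAD check: error count has reached the threshold.
inline bool disk_bad(const Disk &d) { return d.num_errors >= kMaxDiskErrors; }

// Guard on the bad state: the body is skipped when the disk is already bad on entry.
inline void mark_offline_guard_on_bad(Disk &d, int64_t &bytes_total) {
  if (!disk_bad(d)) {
    d.num_errors = kMaxDiskErrors; // assumed effect of SET_DISK_BAD(d)
    bytes_total -= d.bytes;
  }
}

// Guard on a separate flag: the body runs exactly once, bad or not.
inline void mark_offline_guard_on_flag(Disk &d, int64_t &bytes_total) {
  if (!d.online) {
    return;
  }
  d.online = false;
  bytes_total -= d.bytes;
}
```

For a disk that arrives with `num_errors == 5`, the first function leaves the total unchanged, while the second subtracts the disk's bytes once and then ignores repeat calls.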







Issue Time Tracking
---

Worklog Id: (was: 29325)
Time Spent: 1h 10m  (was: 1h)



[jira] [Work logged] (TS-4870) Storage can be marked offline multiple times which breaks related metrics

2016-09-19 Thread ASF GitHub Bot (JIRA)

 [ 
https://issues.apache.org/jira/browse/TS-4870?focusedWorklogId=29318&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-29318
 ]

ASF GitHub Bot logged work on TS-4870:
--

Author: ASF GitHub Bot
Created on: 19/Sep/16 15:59
Start Date: 19/Sep/16 15:59
Worklog Time Spent: 10m 
  Work Description: Github user jpeach commented on a diff in the pull 
request:

https://github.com/apache/trafficserver/pull/1028#discussion_r79424573
  
--- Diff: iocore/cache/Cache.cc ---
@@ -2000,6 +2000,12 @@ CacheProcessor::mark_storage_offline(CacheDisk *d ///< Target disk
   uint64_t total_dir_delete   = 0;
   uint64_t used_dir_delete    = 0;
 
+  /* Don't mark it again, it will invalidate the stats! */
+  if (d->offline) {
+    return this->has_online_storage();
+  }
+  d->offline = true;
--- End diff --

Why do you introduce a new flag rather than making the code conditional on 
the ``DISK_BAD`` check? e.g.
```C
if (!DISK_BAD(d)) {
  SET_DISK_BAD(d);

  // Do all the other stuff ...
}
```


Issue Time Tracking
---

Worklog Id: (was: 29318)
Time Spent: 1h  (was: 50m)



[jira] [Work logged] (TS-4870) Storage can be marked offline multiple times which breaks related metrics

2016-09-15 Thread ASF GitHub Bot (JIRA)

 [ 
https://issues.apache.org/jira/browse/TS-4870?focusedWorklogId=29222&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-29222
 ]

ASF GitHub Bot logged work on TS-4870:
--

Author: ASF GitHub Bot
Created on: 15/Sep/16 22:27
Start Date: 15/Sep/16 22:27
Worklog Time Spent: 10m 
  Work Description: Github user gtenev commented on a diff in the pull 
request:

https://github.com/apache/trafficserver/pull/1028#discussion_r79075679
  
--- Diff: iocore/cache/P_CacheDisk.h ---
@@ -97,6 +97,7 @@ struct CacheDisk : public Continuation {
   int num_errors;
   int cleared;
   bool read_only_p;
+  bool offline; /* flag marking cache disk offline (because of too many failures or by the operator). */
--- End diff --

This is another review test (per jpeach's request). "Start a review"



Issue Time Tracking
---

Worklog Id: (was: 29222)
Time Spent: 50m  (was: 40m)

> Storage can be marked offline multiple times which breaks related metrics
> -
>
> Key: TS-4870
> URL: https://issues.apache.org/jira/browse/TS-4870
> Project: Traffic Server
>  Issue Type: Bug
>  Components: Cache, Metrics
>Reporter: Gancho Tenev
>Assignee: Gancho Tenev
> Fix For: 7.1.0
>
>  Time Spent: 50m
>  Remaining Estimate: 0h
>
> Let us say traffic server is running with 2 disks
> {code}
> $ cat etc/trafficserver/storage.config
> /dev/sdb
> /dev/sdc
> $ sudo fdisk -l|grep 'Disk /dev/sd[b|c]'
> Disk /dev/sdb: 134 MB, 134217728 bytes
> Disk /dev/sdc: 134 MB, 134217728 bytes
> {code}
> Let us see what happens when we mark the same disk ({{/dev/sdb}}) offline 
> 3 times in a row and check {{proxy.node.cache.bytes_total}}.
> {code}
> # Initial cache size (when using both disks).
> $ ./bin/traffic_ctl metric get proxy.node.cache.bytes_total
> proxy.node.cache.bytes_total 268025856
> # Take 1st disk offline. Cache size changes as expected.
> $ sudo ./bin/traffic_ctl storage offline /dev/sdb
> $ ./bin/traffic_ctl metric get proxy.node.cache.bytes_total
> proxy.node.cache.bytes_total 134012928
> # Take same disk offline again. Not good!
> $ sudo ./bin/traffic_ctl storage offline /dev/sdb
> $ ./bin/traffic_ctl metric get proxy.node.cache.bytes_total
> proxy.node.cache.bytes_total 0
> # Take same disk offline again. Negative value.
> $ sudo ./bin/traffic_ctl storage offline /dev/sdb
> $ ./bin/traffic_ctl metric get proxy.node.cache.bytes_total
> proxy.node.cache.bytes_total -134012928
> {code}





[jira] [Work logged] (TS-4870) Storage can be marked offline multiple times which breaks related metrics

2016-09-15 Thread ASF GitHub Bot (JIRA)

 [ 
https://issues.apache.org/jira/browse/TS-4870?focusedWorklogId=29221&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-29221
 ]

ASF GitHub Bot logged work on TS-4870:
--

Author: ASF GitHub Bot
Created on: 15/Sep/16 22:26
Start Date: 15/Sep/16 22:26
Worklog Time Spent: 10m 
  Work Description: Github user gtenev commented on a diff in the pull request:

https://github.com/apache/trafficserver/pull/1028#discussion_r79075616
  
--- Diff: iocore/cache/P_CacheDisk.h ---
@@ -97,6 +97,7 @@ struct CacheDisk : public Continuation {
   int num_errors;
   int cleared;
   bool read_only_p;
+  bool offline; /* flag marking cache disk offline (because of too many failures or by the operator). */
--- End diff --

This is another review test (per jpeach's request): "Add single comment".


Issue Time Tracking
---

Worklog Id: (was: 29221)
Time Spent: 40m  (was: 0.5h)

> Storage can be marked offline multiple times which breaks related metrics
> -
>
> Key: TS-4870
> URL: https://issues.apache.org/jira/browse/TS-4870
> Project: Traffic Server
>  Issue Type: Bug
>  Components: Cache, Metrics
>Reporter: Gancho Tenev
>Assignee: Gancho Tenev
> Fix For: 7.1.0
>
>  Time Spent: 40m
>  Remaining Estimate: 0h
>
> Let us say traffic server is running with 2 disks
> {code}
> $ cat etc/trafficserver/storage.config
> /dev/sdb
> /dev/sdc
> $ sudo fdisk -l|grep 'Disk /dev/sd[b|c]'
> Disk /dev/sdb: 134 MB, 134217728 bytes
> Disk /dev/sdc: 134 MB, 134217728 bytes
> {code}
> Let us see what happens when we mark the same disk ({{/dev/sdb}}) offline 
> 3 times in a row and check {{proxy.node.cache.bytes_total}}.
> {code}
> # Initial cache size (when using both disks).
> $ ./bin/traffic_ctl metric get proxy.node.cache.bytes_total
> proxy.node.cache.bytes_total 268025856
> # Take 1st disk offline. Cache size changes as expected.
> $ sudo ./bin/traffic_ctl storage offline /dev/sdb
> $ ./bin/traffic_ctl metric get proxy.node.cache.bytes_total
> proxy.node.cache.bytes_total 134012928
> # Take same disk offline again. Not good!
> $ sudo ./bin/traffic_ctl storage offline /dev/sdb
> $ ./bin/traffic_ctl metric get proxy.node.cache.bytes_total
> proxy.node.cache.bytes_total 0
> # Take same disk offline again. Negative value.
> $ sudo ./bin/traffic_ctl storage offline /dev/sdb
> $ ./bin/traffic_ctl metric get proxy.node.cache.bytes_total
> proxy.node.cache.bytes_total -134012928
> {code}





[jira] [Work logged] (TS-4870) Storage can be marked offline multiple times which breaks related metrics

2016-09-15 Thread ASF GitHub Bot (JIRA)

 [ 
https://issues.apache.org/jira/browse/TS-4870?focusedWorklogId=29204&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-29204
 ]

ASF GitHub Bot logged work on TS-4870:
--

Author: ASF GitHub Bot
Created on: 15/Sep/16 21:04
Start Date: 15/Sep/16 21:04
Worklog Time Spent: 10m 
  Work Description: Github user atsci commented on the issue:

https://github.com/apache/trafficserver/pull/1028
  
Linux build *successful*! See 
https://ci.trafficserver.apache.org/job/Github-Linux/713/ for details.
 



Issue Time Tracking
---

Worklog Id: (was: 29204)
Time Spent: 0.5h  (was: 20m)

> Storage can be marked offline multiple times which breaks related metrics
> -
>
> Key: TS-4870
> URL: https://issues.apache.org/jira/browse/TS-4870
> Project: Traffic Server
>  Issue Type: Bug
>  Components: Cache, Metrics
>Reporter: Gancho Tenev
>Assignee: Gancho Tenev
> Fix For: 7.0.0
>
>  Time Spent: 0.5h
>  Remaining Estimate: 0h
>
> Let us say traffic server is running with 2 disks
> {code}
> $ cat etc/trafficserver/storage.config
> /dev/sdb
> /dev/sdc
> $ sudo fdisk -l|grep 'Disk /dev/sd[b|c]'
> Disk /dev/sdb: 134 MB, 134217728 bytes
> Disk /dev/sdc: 134 MB, 134217728 bytes
> {code}
> Let us see what happens when we mark the same disk ({{/dev/sdb}}) offline 
> 3 times in a row and check {{proxy.node.cache.bytes_total}}.
> {code}
> # Initial cache size (when using both disks).
> $ ./bin/traffic_ctl metric get proxy.node.cache.bytes_total
> proxy.node.cache.bytes_total 268025856
> # Take 1st disk offline. Cache size changes as expected.
> $ sudo ./bin/traffic_ctl storage offline /dev/sdb
> $ ./bin/traffic_ctl metric get proxy.node.cache.bytes_total
> proxy.node.cache.bytes_total 134012928
> # Take same disk offline again. Not good!
> $ sudo ./bin/traffic_ctl storage offline /dev/sdb
> $ ./bin/traffic_ctl metric get proxy.node.cache.bytes_total
> proxy.node.cache.bytes_total 0
> # Take same disk offline again. Negative value.
> $ sudo ./bin/traffic_ctl storage offline /dev/sdb
> $ ./bin/traffic_ctl metric get proxy.node.cache.bytes_total
> proxy.node.cache.bytes_total -134012928
> {code}





[jira] [Work logged] (TS-4870) Storage can be marked offline multiple times which breaks related metrics

2016-09-15 Thread ASF GitHub Bot (JIRA)

 [ 
https://issues.apache.org/jira/browse/TS-4870?focusedWorklogId=29203&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-29203
 ]

ASF GitHub Bot logged work on TS-4870:
--

Author: ASF GitHub Bot
Created on: 15/Sep/16 21:03
Start Date: 15/Sep/16 21:03
Worklog Time Spent: 10m 
  Work Description: Github user atsci commented on the issue:

https://github.com/apache/trafficserver/pull/1028
  
FreeBSD build *successful*! See 
https://ci.trafficserver.apache.org/job/Github-FreeBSD/817/ for details.
 



Issue Time Tracking
---

Worklog Id: (was: 29203)
Time Spent: 20m  (was: 10m)

> Storage can be marked offline multiple times which breaks related metrics
> -
>
> Key: TS-4870
> URL: https://issues.apache.org/jira/browse/TS-4870
> Project: Traffic Server
>  Issue Type: Bug
>  Components: Cache, Metrics
>Reporter: Gancho Tenev
>Assignee: Gancho Tenev
> Fix For: 7.0.0
>
>  Time Spent: 20m
>  Remaining Estimate: 0h
>
> Let us say traffic server is running with 2 disks
> {code}
> $ cat etc/trafficserver/storage.config
> /dev/sdb
> /dev/sdc
> $ sudo fdisk -l|grep 'Disk /dev/sd[b|c]'
> Disk /dev/sdb: 134 MB, 134217728 bytes
> Disk /dev/sdc: 134 MB, 134217728 bytes
> {code}
> Let us see what happens when we mark the same disk ({{/dev/sdb}}) offline 
> 3 times in a row and check {{proxy.node.cache.bytes_total}}.
> {code}
> # Initial cache size (when using both disks).
> $ ./bin/traffic_ctl metric get proxy.node.cache.bytes_total
> proxy.node.cache.bytes_total 268025856
> # Take 1st disk offline. Cache size changes as expected.
> $ sudo ./bin/traffic_ctl storage offline /dev/sdb
> $ ./bin/traffic_ctl metric get proxy.node.cache.bytes_total
> proxy.node.cache.bytes_total 134012928
> # Take same disk offline again. Not good!
> $ sudo ./bin/traffic_ctl storage offline /dev/sdb
> $ ./bin/traffic_ctl metric get proxy.node.cache.bytes_total
> proxy.node.cache.bytes_total 0
> # Take same disk offline again. Negative value.
> $ sudo ./bin/traffic_ctl storage offline /dev/sdb
> $ ./bin/traffic_ctl metric get proxy.node.cache.bytes_total
> proxy.node.cache.bytes_total -134012928
> {code}





[jira] [Work logged] (TS-4870) Storage can be marked offline multiple times which breaks related metrics

2016-09-15 Thread ASF GitHub Bot (JIRA)

 [ 
https://issues.apache.org/jira/browse/TS-4870?focusedWorklogId=29201&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-29201
 ]

ASF GitHub Bot logged work on TS-4870:
--

Author: ASF GitHub Bot
Created on: 15/Sep/16 20:50
Start Date: 15/Sep/16 20:50
Worklog Time Spent: 10m 
  Work Description: GitHub user gtenev opened a pull request:

https://github.com/apache/trafficserver/pull/1028

TS-4870 Avoid marking storage offline multiple times

Currently storage can be marked offline multiple times which breaks related 
metrics.
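
A minimal sketch of the idea, not the actual patch: guard the metric update with the `offline` flag so that marking the same disk offline again is a no-op. The `Disk` and `Metrics` types and both function names below are hypothetical stand-ins, not Traffic Server code; only the `offline` flag corresponds to the field added to `CacheDisk` in this pull request. The per-disk size of 134012928 bytes is taken from the metric values reported in the issue.

```cpp
#include <cstdint>

// Hypothetical stand-in for a cache disk. Only the `offline` flag mirrors
// the field added to CacheDisk in the diff; everything else is assumed.
struct Disk {
  int64_t bytes;
  bool offline;
};

// Hypothetical stand-in for the proxy.node.cache.bytes_total metric.
struct Metrics {
  int64_t bytes_total;
};

// Marks the disk offline and subtracts its size from the total exactly once.
// Returns false (and leaves the metric untouched) if it was already offline.
bool mark_offline(Disk &disk, Metrics &metrics)
{
  if (disk.offline) {
    return false;
  }
  disk.offline = true;
  metrics.bytes_total -= disk.bytes;
  return true;
}

// Replays the scenario from the issue: two disks of equal size, with the
// first one marked offline `times` times in a row.
int64_t total_after_repeated_offline(int times)
{
  Disk sdb{134012928, false};
  Disk sdc{134012928, false};
  Metrics metrics{sdb.bytes + sdc.bytes};
  for (int i = 0; i < times; ++i) {
    mark_offline(sdb, metrics);
  }
  return metrics.bytes_total;
}
```

With this guard, `total_after_repeated_offline(3)` stays at the size of the one remaining disk instead of dropping to zero and then going negative as in the reported output.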

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/gtenev/trafficserver TS-4870

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/trafficserver/pull/1028.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #1028


commit b1389f36936bbfcee6ee645e9954eeae92d4e7ed
Author: Gancho Tenev 
Date:   2016-09-15T13:44:44Z

TS-4870 Avoid marking storage offline multiple times

Currently storage can be marked offline multiple times which breaks related 
metrics.




Issue Time Tracking
---

Worklog Id: (was: 29201)
Time Spent: 10m
Remaining Estimate: 0h

> Storage can be marked offline multiple times which breaks related metrics
> -
>
> Key: TS-4870
> URL: https://issues.apache.org/jira/browse/TS-4870
> Project: Traffic Server
>  Issue Type: Bug
>  Components: Cache, Metrics
>Reporter: Gancho Tenev
>Assignee: Gancho Tenev
> Fix For: 7.0.0
>
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> Let us say traffic server is running with 2 disks
> {code}
> $ cat etc/trafficserver/storage.config
> /dev/sdb
> /dev/sdc
> $ sudo fdisk -l|grep 'Disk /dev/sd[b|c]'
> Disk /dev/sdb: 134 MB, 134217728 bytes
> Disk /dev/sdc: 134 MB, 134217728 bytes
> {code}
> Let us see what happens when we mark the same disk ({{/dev/sdb}}) offline 
> 3 times in a row and check {{proxy.node.cache.bytes_total}}.
> {code}
> # Initial cache size (when using both disks).
> $ ./bin/traffic_ctl metric get proxy.node.cache.bytes_total
> proxy.node.cache.bytes_total 268025856
> # Take 1st disk offline. Cache size changes as expected.
> $ sudo ./bin/traffic_ctl storage offline /dev/sdb
> $ ./bin/traffic_ctl metric get proxy.node.cache.bytes_total
> proxy.node.cache.bytes_total 134012928
> # Take same disk offline again. Not good!
> $ sudo ./bin/traffic_ctl storage offline /dev/sdb
> $ ./bin/traffic_ctl metric get proxy.node.cache.bytes_total
> proxy.node.cache.bytes_total 0
> # Take same disk offline again. Negative value.
> $ sudo ./bin/traffic_ctl storage offline /dev/sdb
> $ ./bin/traffic_ctl metric get proxy.node.cache.bytes_total
> proxy.node.cache.bytes_total -134012928
> {code}


