[jira] [Resolved] (STORM-3875) ThroughputVsLatency does not run on JDK11 due to specified TOPOLOGY_WORKER_GC_CHILDOPTS

2022-07-18 Thread Aaron Gresch (Jira)


 [ 
https://issues.apache.org/jira/browse/STORM-3875?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Aaron Gresch resolved STORM-3875.
-
Fix Version/s: 2.5.0
   Resolution: Fixed

> ThroughputVsLatency does not run on JDK11 due to specified 
> TOPOLOGY_WORKER_GC_CHILDOPTS
> ---
>
> Key: STORM-3875
> URL: https://issues.apache.org/jira/browse/STORM-3875
> Project: Apache Storm
>  Issue Type: Bug
>Reporter: Aaron Gresch
>Assignee: Aaron Gresch
>Priority: Minor
> Fix For: 2.5.0
>
>  Time Spent: 20m
>  Remaining Estimate: 0h
>




--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Created] (STORM-3875) ThroughputVsLatency does not run on JDK11 due to specified TOPOLOGY_WORKER_GC_CHILDOPTS

2022-07-12 Thread Aaron Gresch (Jira)
Aaron Gresch created STORM-3875:
---

 Summary: ThroughputVsLatency does not run on JDK11 due to 
specified TOPOLOGY_WORKER_GC_CHILDOPTS
 Key: STORM-3875
 URL: https://issues.apache.org/jira/browse/STORM-3875
 Project: Apache Storm
  Issue Type: Bug
Reporter: Aaron Gresch
Assignee: Aaron Gresch








[jira] [Closed] (STORM-3863) tirupathi trip from chennai

2022-05-16 Thread Aaron Gresch (Jira)


 [ 
https://issues.apache.org/jira/browse/STORM-3863?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Aaron Gresch closed STORM-3863.
---
Resolution: Invalid

> tirupathi trip from chennai
> ---
>
> Key: STORM-3863
> URL: https://issues.apache.org/jira/browse/STORM-3863
> Project: Apache Storm
>  Issue Type: Bug
>  Components: trident
>Affects Versions: 1.2.1
>Reporter: Padmavathi Travels
>Priority: Trivial
> Fix For: 2.2.0
>
>





[jira] [Resolved] (STORM-3862) HdfsBlobStoreImpl should check permission after mkdirs

2022-05-16 Thread Aaron Gresch (Jira)


 [ 
https://issues.apache.org/jira/browse/STORM-3862?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Aaron Gresch resolved STORM-3862.
-
Fix Version/s: 2.5.0
   Resolution: Fixed

> HdfsBlobStoreImpl should check permission after mkdirs
> --
>
> Key: STORM-3862
> URL: https://issues.apache.org/jira/browse/STORM-3862
> Project: Apache Storm
>  Issue Type: Bug
>  Components: blobstore
>Affects Versions: 2.4.0
>Reporter: Zhang Dongsheng
>Priority: Major
> Fix For: 2.5.0
>
>  Time Spent: 50m
>  Remaining Estimate: 0h
>
> HdfsBlobStoreImpl and HdfsBlobStoreFile create directories with 700 
> permissions. Because settings such as umask can change the resulting mode, 
> we need to check after mkdirs whether the permissions were set as expected. 
> If not, we should set the correct permissions so that subsequent operations 
> work normally.
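
The check described above can be sketched roughly as follows. This is only an illustration under stated assumptions: it uses `java.nio.file` on a local POSIX filesystem as a stand-in for HDFS, and the class and method names (`DirPermissionCheck`, `mkdirsWithCheck`, `demo`) are hypothetical, not taken from Storm or Hadoop.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.attribute.PosixFilePermission;
import java.nio.file.attribute.PosixFilePermissions;
import java.util.Set;

// Hypothetical helper; a local-filesystem analogue, not the HDFS blobstore code.
class DirPermissionCheck {
    // Owner-only access, i.e. mode 700.
    static final Set<PosixFilePermission> EXPECTED =
            PosixFilePermissions.fromString("rwx------");

    // Create the directory, then verify its mode and repair it if it differs.
    static Path mkdirsWithCheck(Path dir) {
        try {
            Files.createDirectories(dir);
            Set<PosixFilePermission> actual = Files.getPosixFilePermissions(dir);
            if (!actual.equals(EXPECTED)) {
                // The umask (or a pre-existing directory) produced another mode.
                Files.setPosixFilePermissions(dir, EXPECTED);
            }
            return dir;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    // Creates a directory under a fresh temp dir and returns its final mode string.
    static String demo() {
        try {
            Path base = Files.createTempDirectory("blobstore-sketch");
            Path dir = mkdirsWithCheck(base.resolve("blobs"));
            return PosixFilePermissions.toString(Files.getPosixFilePermissions(dir));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(demo());
    }
}
```

The point of checking after creation rather than trusting the requested mode is that the effective mode depends on the process umask, which the creating code does not control.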





[jira] [Resolved] (STORM-3861) Upgrade clojure-maven-plugin

2022-05-04 Thread Aaron Gresch (Jira)


 [ 
https://issues.apache.org/jira/browse/STORM-3861?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Aaron Gresch resolved STORM-3861.
-
Fix Version/s: 2.5.0
   Resolution: Fixed

> Upgrade clojure-maven-plugin
> 
>
> Key: STORM-3861
> URL: https://issues.apache.org/jira/browse/STORM-3861
> Project: Apache Storm
>  Issue Type: Improvement
>Reporter: Aaron Gresch
>Assignee: Aaron Gresch
>Priority: Minor
> Fix For: 2.5.0
>
>  Time Spent: 20m
>  Remaining Estimate: 0h
>
> I wasted a lot of time trying to figure out a build failure on a new 
> environment (on two separate occasions) because the clojure plugin swallowed 
> an exception.  I submitted the improvement linked below, which makes the 
> errors debuggable.  It should be available now.
> [https://github.com/talios/clojure-maven-plugin/pull/112]





[jira] [Created] (STORM-3861) Upgrade clojure-maven-plugin

2022-04-28 Thread Aaron Gresch (Jira)
Aaron Gresch created STORM-3861:
---

 Summary: Upgrade clojure-maven-plugin
 Key: STORM-3861
 URL: https://issues.apache.org/jira/browse/STORM-3861
 Project: Apache Storm
  Issue Type: Improvement
Reporter: Aaron Gresch
Assignee: Aaron Gresch


I wasted a lot of time trying to figure out a build failure on a new 
environment (on two separate occasions) because the clojure plugin swallowed 
an exception.  I submitted the improvement linked below, which makes the 
errors debuggable.  It should be available now.

[https://github.com/talios/clojure-maven-plugin/pull/112]





[jira] [Created] (STORM-3838) prevent topology from overriding STORM_WORKERS_ARTIFACTS_DIR

2022-03-29 Thread Aaron Gresch (Jira)
Aaron Gresch created STORM-3838:
---

 Summary: prevent topology from overriding 
STORM_WORKERS_ARTIFACTS_DIR
 Key: STORM-3838
 URL: https://issues.apache.org/jira/browse/STORM-3838
 Project: Apache Storm
  Issue Type: Bug
Reporter: Aaron Gresch
Assignee: Aaron Gresch


A user overrode this setting, which caused EventLoggerBolt to throw an 
exception and prevented workers from coming up.  Users should have no reason 
to set this value.
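
One possible way to enforce this is to drop protected keys when merging a topology's configuration over the cluster configuration. The sketch below is an assumption about the approach, not the actual fix: `ProtectedConfigFilter` and `merge` are hypothetical names, and the key string `storm.workers.artifacts.dir` is assumed to be the value behind STORM_WORKERS_ARTIFACTS_DIR.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Set;

// Hypothetical helper; not part of Storm.
class ProtectedConfigFilter {
    // Keys a topology must not override (key string assumed, see lead-in).
    static final Set<String> PROTECTED_KEYS = Set.of("storm.workers.artifacts.dir");

    // Overlay topology settings on cluster settings, skipping protected keys.
    static Map<String, Object> merge(Map<String, Object> clusterConf,
                                     Map<String, Object> topoConf) {
        Map<String, Object> merged = new HashMap<>(clusterConf);
        for (Map.Entry<String, Object> e : topoConf.entrySet()) {
            if (PROTECTED_KEYS.contains(e.getKey())) {
                continue; // keep the cluster-level value
            }
            merged.put(e.getKey(), e.getValue());
        }
        return merged;
    }

    public static void main(String[] args) {
        Map<String, Object> cluster = new HashMap<>();
        cluster.put("storm.workers.artifacts.dir", "workers-artifacts");
        Map<String, Object> topo = new HashMap<>();
        topo.put("storm.workers.artifacts.dir", "/tmp/elsewhere");
        topo.put("topology.workers", 2);
        System.out.println(merge(cluster, topo));
    }
}
```

Rejecting the submission outright would be an equally reasonable design; silently ignoring the key is shown here only because it is the shorter sketch.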





[jira] [Closed] (STORM-3830) exclude all old log4j

2022-03-28 Thread Aaron Gresch (Jira)


 [ 
https://issues.apache.org/jira/browse/STORM-3830?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Aaron Gresch closed STORM-3830.
---
Resolution: Duplicate

> exclude all old log4j
> -
>
> Key: STORM-3830
> URL: https://issues.apache.org/jira/browse/STORM-3830
> Project: Apache Storm
>  Issue Type: Improvement
>Reporter: Aaron Gresch
>Priority: Major
>






[jira] [Created] (STORM-3835) Log when shell command exceptions occur

2022-03-28 Thread Aaron Gresch (Jira)
Aaron Gresch created STORM-3835:
---

 Summary: Log when shell command exceptions occur
 Key: STORM-3835
 URL: https://issues.apache.org/jira/browse/STORM-3835
 Project: Apache Storm
  Issue Type: Improvement
Reporter: Aaron Gresch
Assignee: Aaron Gresch


When the numShellExceptions meter increments, it would be nice to see what 
command failed and what exception caused the problem.  

 

We saw this internally trigger when LDAP servers were having issues.  Knowing 
the command would help narrow down the problem faster.





[jira] [Resolved] (STORM-3831) exclude all old log4j

2022-03-09 Thread Aaron Gresch (Jira)


 [ 
https://issues.apache.org/jira/browse/STORM-3831?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Aaron Gresch resolved STORM-3831.
-
Fix Version/s: 2.4.0
   Resolution: Fixed

> exclude all old log4j
> -
>
> Key: STORM-3831
> URL: https://issues.apache.org/jira/browse/STORM-3831
> Project: Apache Storm
>  Issue Type: Improvement
>Reporter: Aaron Gresch
>Assignee: Aaron Gresch
>Priority: Major
> Fix For: 2.4.0
>
>  Time Spent: 1h 10m
>  Remaining Estimate: 0h
>






[jira] [Resolved] (STORM-3828) upgrade org/glassfish/javax.el due to build problems

2022-03-07 Thread Aaron Gresch (Jira)


 [ 
https://issues.apache.org/jira/browse/STORM-3828?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Aaron Gresch resolved STORM-3828.
-
Fix Version/s: 2.4.0
   Resolution: Fixed

> upgrade org/glassfish/javax.el due to build problems
> 
>
> Key: STORM-3828
> URL: https://issues.apache.org/jira/browse/STORM-3828
> Project: Apache Storm
>  Issue Type: Improvement
>Reporter: PJ Fanning
>Priority: Major
> Fix For: 2.4.0
>
>  Time Spent: 50m
>  Remaining Estimate: 0h
>
> {code:java}
> [ERROR] Failed to execute goal on project storm-autocreds: Could not resolve 
> dependencies for project org.apache.storm:storm-autocreds:jar:2.4.1-SNAPSHOT: 
> Failed to collect dependencies at org.apache.hbase:hbase-server:jar:2.1.3 -> 
> org.glassfish.web:javax.servlet.jsp:jar:2.3.2 -> 
> org.glassfish:javax.el:jar:3.0.1-b06-SNAPSHOT: Failed to read artifact 
> descriptor for org.glassfish:javax.el:jar:3.0.1-b06-SNAPSHOT: Failure to 
> transfer org.glassfish:javax.el:pom:3.0.1-b06-SNAPSHOT from 
> https://maven.java.net/content/repositories/snapshots was cached in the local 
> repository, resolution will not be reattempted until the update interval of 
> jvnet-nexus-snapshots has elapsed or updates are forced. Original error: 
> Could not transfer artifact org.glassfish:javax.el:pom:3.0.1-b06-SNAPSHOT 
> from/to jvnet-nexus-snapshots 
> (https://maven.java.net/content/repositories/snapshots): Transfer failed for 
> https://maven.java.net/content/repositories/snapshots/org/glassfish/javax.el/3.0.1-b06-SNAPSHOT/javax.el-3.0.1-b06-SNAPSHOT.pom
>  -> [Help 1] {code}
> [https://app.travis-ci.com/github/apache/storm/jobs/561567903]
> Also seems like a bad idea to be relying on 3.0.1-b06-SNAPSHOT
> Similar issue - https://issues.apache.org/jira/browse/JCR-4626
> My workaround is based on this





[jira] [Created] (STORM-3831) exclude all old log4j

2022-03-04 Thread Aaron Gresch (Jira)
Aaron Gresch created STORM-3831:
---

 Summary: exclude all old log4j
 Key: STORM-3831
 URL: https://issues.apache.org/jira/browse/STORM-3831
 Project: Apache Storm
  Issue Type: Improvement
Reporter: Aaron Gresch
Assignee: Aaron Gresch








[jira] [Created] (STORM-3830) exclude all old log4j

2022-03-04 Thread Aaron Gresch (Jira)
Aaron Gresch created STORM-3830:
---

 Summary: exclude all old log4j
 Key: STORM-3830
 URL: https://issues.apache.org/jira/browse/STORM-3830
 Project: Apache Storm
  Issue Type: Improvement
Reporter: Aaron Gresch








[jira] [Resolved] (STORM-3821) use commons-compress 1.21 due to security issues

2022-02-28 Thread Aaron Gresch (Jira)


 [ 
https://issues.apache.org/jira/browse/STORM-3821?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Aaron Gresch resolved STORM-3821.
-
Fix Version/s: 2.4.0
   Resolution: Fixed

> use commons-compress 1.21 due to security issues
> 
>
> Key: STORM-3821
> URL: https://issues.apache.org/jira/browse/STORM-3821
> Project: Apache Storm
>  Issue Type: Dependency upgrade
>Reporter: PJ Fanning
>Priority: Major
> Fix For: 2.4.0
>
>  Time Spent: 20m
>  Remaining Estimate: 0h
>
> Part of https://issues.apache.org/jira/browse/STORM-3592
> See vulnerabilities in 
> https://mvnrepository.com/artifact/org.apache.commons/commons-compress/1.18





[jira] [Resolved] (STORM-3824) upgrade httpclient due to security issues

2022-02-28 Thread Aaron Gresch (Jira)


 [ 
https://issues.apache.org/jira/browse/STORM-3824?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Aaron Gresch resolved STORM-3824.
-
Fix Version/s: 2.4.0
   Resolution: Fixed

> upgrade httpclient due to security issues
> -
>
> Key: STORM-3824
> URL: https://issues.apache.org/jira/browse/STORM-3824
> Project: Apache Storm
>  Issue Type: Improvement
>Reporter: PJ Fanning
>Priority: Major
> Fix For: 2.4.0
>
>  Time Spent: 20m
>  Remaining Estimate: 0h
>
> Relates to https://issues.apache.org/jira/browse/STORM-3592





[jira] [Resolved] (STORM-3817) Upgrading to Zookeeper 3.5.x, 3.6.x or 3.7.x

2022-02-07 Thread Aaron Gresch (Jira)


 [ 
https://issues.apache.org/jira/browse/STORM-3817?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Aaron Gresch resolved STORM-3817.
-
Fix Version/s: 2.4.0
   Resolution: Fixed

> Upgrading to Zookeeper 3.5.x, 3.6.x or 3.7.x
> 
>
> Key: STORM-3817
> URL: https://issues.apache.org/jira/browse/STORM-3817
> Project: Apache Storm
>  Issue Type: Dependency upgrade
>Affects Versions: 2.3.0, 2.2.1
>Reporter: Richard Zowalla
>Priority: Major
> Fix For: 2.4.0
>
>  Time Spent: 1h 50m
>  Remaining Estimate: 0h
>
> Is there any possibility to upgrade the [shaded zookeeper version 
> |https://github.com/apache/storm/blob/master/storm-shaded-deps/pom.xml#L64] 
> from 3.4.14 to a newer version? Or are there any reasons for not doing an 
> upgrade right now?
> I am doing some testing with Storm in a Java 17 environment and it looks like 
> I am suffering from this Zookeeper specific issue present in 3.4.14: 
> https://issues.apache.org/jira/browse/ZOOKEEPER-3779
> If necessary I can also provide a PR for an upgrade to 3.5.x, 3.6.x or 3.7.x
> UPDATE: Looks like curator depends on 3.5.x - so probably 3.5.x should be an 
> option.





[jira] [Resolved] (STORM-3815) allow option to disable sending of __send-iconnection metrics

2022-01-04 Thread Aaron Gresch (Jira)


 [ 
https://issues.apache.org/jira/browse/STORM-3815?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Aaron Gresch resolved STORM-3815.
-
Fix Version/s: 2.4.0
   Resolution: Fixed

> allow option to disable sending of __send-iconnection metrics
> -
>
> Key: STORM-3815
> URL: https://issues.apache.org/jira/browse/STORM-3815
> Project: Apache Storm
>  Issue Type: Improvement
>Reporter: Aaron Gresch
>Assignee: Aaron Gresch
>Priority: Minor
> Fix For: 2.4.0
>
>  Time Spent: 0.5h
>  Remaining Estimate: 0h
>
> The __send-iconnection metrics can be substantial for large topologies and 
> users may not care about them.  Add an option to disable their reporting.





[jira] [Created] (STORM-3815) allow option to disable sending of __send-iconnection metrics

2022-01-03 Thread Aaron Gresch (Jira)
Aaron Gresch created STORM-3815:
---

 Summary: allow option to disable sending of __send-iconnection 
metrics
 Key: STORM-3815
 URL: https://issues.apache.org/jira/browse/STORM-3815
 Project: Apache Storm
  Issue Type: Improvement
Reporter: Aaron Gresch
Assignee: Aaron Gresch


The __send-iconnection metrics can be substantial for large topologies and 
users may not care about them.  Add an option to disable their reporting.





[jira] [Created] (STORM-3811) update log4j

2021-12-16 Thread Aaron Gresch (Jira)
Aaron Gresch created STORM-3811:
---

 Summary: update log4j
 Key: STORM-3811
 URL: https://issues.apache.org/jira/browse/STORM-3811
 Project: Apache Storm
  Issue Type: Improvement
Reporter: Aaron Gresch








[jira] [Resolved] (STORM-3804) Don't allow deleting blobs if they are required for an active topology

2021-11-03 Thread Aaron Gresch (Jira)


 [ 
https://issues.apache.org/jira/browse/STORM-3804?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Aaron Gresch resolved STORM-3804.
-
Fix Version/s: 2.4.0
   Resolution: Fixed

> Don't allow deleting blobs if they are required for an active topology
> --
>
> Key: STORM-3804
> URL: https://issues.apache.org/jira/browse/STORM-3804
> Project: Apache Storm
>  Issue Type: Improvement
>  Components: blobstore
>Reporter: Nikhil Singh
>Assignee: Nikhil Singh
>Priority: Minor
> Fix For: 2.4.0
>
>  Time Spent: 1h
>  Remaining Estimate: 0h
>
> Don't allow deleting blobs if they are required for an active topology. Throw 
> an exception.
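
A minimal sketch of such a guard, assuming an in-memory model in which each active topology lists the blob keys it depends on. The class `BlobDeleteGuard` and its methods are hypothetical and do not reflect Storm's blobstore API.

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

// Hypothetical in-memory model; not Storm's blobstore implementation.
class BlobDeleteGuard {
    private final Set<String> blobs = new HashSet<>();
    // Active topology id -> blob keys that topology depends on.
    private final Map<String, Set<String>> requiredByTopology = new HashMap<>();

    void addBlob(String key) {
        blobs.add(key);
    }

    boolean hasBlob(String key) {
        return blobs.contains(key);
    }

    void activateTopology(String topologyId, Set<String> blobKeys) {
        requiredByTopology.put(topologyId, new HashSet<>(blobKeys));
    }

    void deactivateTopology(String topologyId) {
        requiredByTopology.remove(topologyId);
    }

    // Refuse to delete a blob that any active topology still requires.
    void deleteBlob(String key) {
        for (Map.Entry<String, Set<String>> e : requiredByTopology.entrySet()) {
            if (e.getValue().contains(key)) {
                throw new IllegalStateException(
                        "Blob " + key + " is required by active topology " + e.getKey());
            }
        }
        blobs.remove(key);
    }

    public static void main(String[] args) {
        BlobDeleteGuard guard = new BlobDeleteGuard();
        guard.addBlob("dict.txt");
        guard.activateTopology("topo-1", Set.of("dict.txt"));
        try {
            guard.deleteBlob("dict.txt");
        } catch (IllegalStateException e) {
            System.out.println("rejected: " + e.getMessage());
        }
    }
}
```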





[jira] [Resolved] (STORM-3802) Allow adding metrics reporters to all topologies

2021-10-25 Thread Aaron Gresch (Jira)


 [ 
https://issues.apache.org/jira/browse/STORM-3802?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Aaron Gresch resolved STORM-3802.
-
Fix Version/s: 2.4.0
   Resolution: Fixed

> Allow adding metrics reporters to all topologies
> 
>
> Key: STORM-3802
> URL: https://issues.apache.org/jira/browse/STORM-3802
> Project: Apache Storm
>  Issue Type: Improvement
>Reporter: Aaron Gresch
>Assignee: Aaron Gresch
>Priority: Minor
> Fix For: 2.4.0
>
>  Time Spent: 40m
>  Remaining Estimate: 0h
>
> We would like to be able to track some topology-specific metrics for all 
> topologies, regardless of how a topology configures its metrics reporters.





[jira] [Created] (STORM-3802) Allow adding metrics reporters to all topologies

2021-10-22 Thread Aaron Gresch (Jira)
Aaron Gresch created STORM-3802:
---

 Summary: Allow adding metrics reporters to all topologies
 Key: STORM-3802
 URL: https://issues.apache.org/jira/browse/STORM-3802
 Project: Apache Storm
  Issue Type: Improvement
Reporter: Aaron Gresch
Assignee: Aaron Gresch


We would like to be able to track some topology-specific metrics for all 
topologies, regardless of how a topology configures its metrics reporters.





[jira] [Resolved] (STORM-3801) newWorkerEvent doesn't report properly for multiple reporters

2021-10-20 Thread Aaron Gresch (Jira)


 [ 
https://issues.apache.org/jira/browse/STORM-3801?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Aaron Gresch resolved STORM-3801.
-
Fix Version/s: 2.4.0
   Resolution: Fixed

> newWorkerEvent doesn't report properly for multiple reporters
> -
>
> Key: STORM-3801
> URL: https://issues.apache.org/jira/browse/STORM-3801
> Project: Apache Storm
>  Issue Type: Improvement
>Reporter: Aaron Gresch
>Assignee: Aaron Gresch
>Priority: Minor
> Fix For: 2.4.0
>
>  Time Spent: 40m
>  Remaining Estimate: 0h
>
> Add get-and-reset functionality that works for multiple reporters.
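
The issue does not say how this was implemented. One way to read it: a single shared counter that is zeroed on read hands the whole count to whichever reporter reads first, so each reporter needs its own view. The sketch below keeps a monotonically increasing total plus a per-reporter "last seen" value. `PerReporterCounter`, `getAndReset`, and the reporter ids are hypothetical, and each reporter is assumed to read from a single thread.

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicLong;

// Hypothetical sketch; not Storm's metrics code.
class PerReporterCounter {
    // Never reset, so no reporter can steal another reporter's events.
    private final AtomicLong total = new AtomicLong();
    // Reporter id -> value of total at that reporter's previous read.
    private final ConcurrentHashMap<String, Long> lastSeen = new ConcurrentHashMap<>();

    void increment() {
        total.incrementAndGet();
    }

    // Returns the number of events since this reporter's previous call.
    long getAndReset(String reporterId) {
        long now = total.get();
        Long previous = lastSeen.put(reporterId, now);
        return now - (previous == null ? 0L : previous);
    }

    public static void main(String[] args) {
        PerReporterCounter newWorkerEvents = new PerReporterCounter();
        newWorkerEvents.increment();
        newWorkerEvents.increment();
        System.out.println(newWorkerEvents.getAndReset("reporter-a"));
        System.out.println(newWorkerEvents.getAndReset("reporter-b"));
    }
}
```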





[jira] [Created] (STORM-3801) newWorkerEvent doesn't report properly for multiple reporters

2021-10-19 Thread Aaron Gresch (Jira)
Aaron Gresch created STORM-3801:
---

 Summary: newWorkerEvent doesn't report properly for multiple 
reporters
 Key: STORM-3801
 URL: https://issues.apache.org/jira/browse/STORM-3801
 Project: Apache Storm
  Issue Type: Improvement
Reporter: Aaron Gresch
Assignee: Aaron Gresch


Add get-and-reset functionality that works for multiple reporters.





[jira] [Created] (STORM-3793) Add metric to track backpressure status for a task

2021-09-08 Thread Aaron Gresch (Jira)
Aaron Gresch created STORM-3793:
---

 Summary: Add metric to track backpressure status for a task
 Key: STORM-3793
 URL: https://issues.apache.org/jira/browse/STORM-3793
 Project: Apache Storm
  Issue Type: Improvement
Reporter: Aaron Gresch
Assignee: Aaron Gresch








[jira] [Resolved] (STORM-3791) update metric documentation

2021-08-18 Thread Aaron Gresch (Jira)


 [ 
https://issues.apache.org/jira/browse/STORM-3791?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Aaron Gresch resolved STORM-3791.
-
Fix Version/s: 2.3.0
   Resolution: Fixed

> update metric documentation
> ---
>
> Key: STORM-3791
> URL: https://issues.apache.org/jira/browse/STORM-3791
> Project: Apache Storm
>  Issue Type: Improvement
>Reporter: Aaron Gresch
>Assignee: Aaron Gresch
>Priority: Minor
> Fix For: 2.3.0
>
>  Time Spent: 50m
>  Remaining Estimate: 0h
>
> Various changes to move metrics to V2 have occurred.  Make a sweep and 
> update the documentation.





[jira] [Created] (STORM-3791) update metric documentation

2021-08-13 Thread Aaron Gresch (Jira)
Aaron Gresch created STORM-3791:
---

 Summary: update metric documentation
 Key: STORM-3791
 URL: https://issues.apache.org/jira/browse/STORM-3791
 Project: Apache Storm
  Issue Type: Improvement
Reporter: Aaron Gresch
Assignee: Aaron Gresch


Various changes to move metrics to V2 have occurred.  Make a sweep and 
update the documentation.





[jira] [Resolved] (STORM-3790) Add meter to track failures WorkerTokenAuthorizer getPassword

2021-08-13 Thread Aaron Gresch (Jira)


 [ 
https://issues.apache.org/jira/browse/STORM-3790?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Aaron Gresch resolved STORM-3790.
-
Fix Version/s: 2.3.0
   Resolution: Fixed

> Add meter to track failures WorkerTokenAuthorizer getPassword
> -
>
> Key: STORM-3790
> URL: https://issues.apache.org/jira/browse/STORM-3790
> Project: Apache Storm
>  Issue Type: Improvement
>Reporter: Aaron Gresch
>Assignee: Aaron Gresch
>Priority: Minor
> Fix For: 2.3.0
>
>  Time Spent: 0.5h
>  Remaining Estimate: 0h
>






[jira] [Created] (STORM-3790) Add meter to track failures WorkerTokenAuthorizer getPassword

2021-08-11 Thread Aaron Gresch (Jira)
Aaron Gresch created STORM-3790:
---

 Summary: Add meter to track failures WorkerTokenAuthorizer 
getPassword
 Key: STORM-3790
 URL: https://issues.apache.org/jira/browse/STORM-3790
 Project: Apache Storm
  Issue Type: Improvement
Reporter: Aaron Gresch
Assignee: Aaron Gresch








[jira] [Resolved] (STORM-3786) V2 metrics tick may overreport or not report at all

2021-07-30 Thread Aaron Gresch (Jira)


 [ 
https://issues.apache.org/jira/browse/STORM-3786?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Aaron Gresch resolved STORM-3786.
-
Resolution: Fixed

> V2 metrics tick may overreport or not report at all
> ---
>
> Key: STORM-3786
> URL: https://issues.apache.org/jira/browse/STORM-3786
> Project: Apache Storm
>  Issue Type: Improvement
>Reporter: Aaron Gresch
>Assignee: Aaron Gresch
>Priority: Minor
>  Time Spent: 50m
>  Remaining Estimate: 0h
>
> V2 metrics tick should report only at a specific interval.  It also may not 
> be triggered if no v1 metrics exist.





[jira] [Updated] (STORM-3786) V2 metrics tick may overreport or not report at all

2021-07-30 Thread Aaron Gresch (Jira)


 [ 
https://issues.apache.org/jira/browse/STORM-3786?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Aaron Gresch updated STORM-3786:

Fix Version/s: 2.3.0

> V2 metrics tick may overreport or not report at all
> ---
>
> Key: STORM-3786
> URL: https://issues.apache.org/jira/browse/STORM-3786
> Project: Apache Storm
>  Issue Type: Improvement
>Reporter: Aaron Gresch
>Assignee: Aaron Gresch
>Priority: Minor
> Fix For: 2.3.0
>
>  Time Spent: 50m
>  Remaining Estimate: 0h
>
> V2 metrics tick should report only at a specific interval.  It also may not 
> be triggered if no v1 metrics exist.





[jira] [Commented] (STORM-3784) my supervisor will shut down on 2:00 am everyday

2021-07-29 Thread Aaron Gresch (Jira)


[ 
https://issues.apache.org/jira/browse/STORM-3784?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17390102#comment-17390102
 ] 

Aaron Gresch commented on STORM-3784:
-

Something is deleting the file 
/data/apache-storm-2.1.0/status/supervisor/stormdist/PradarLinkTopology-4-1626751925/stormconf.ser

> my supervisor will shut down on 2:00 am everyday
> 
>
> Key: STORM-3784
> URL: https://issues.apache.org/jira/browse/STORM-3784
> Project: Apache Storm
>  Issue Type: Bug
>  Components: storm-server
>Affects Versions: 2.1.0
> Environment: centos 7 x64
>Reporter: Sunsy Sun
>Priority: Major
> Attachments: supervisor(1).log
>
>
> The cluster has one nimbus and two supervisors. One of the supervisors runs 
> alongside nimbus.
> I deployed two topologies, PradarLinkTopology and PradarLogTopology.
> PradarLogTopology runs with 4 workers. PradarLinkTopology runs with 1 worker.
> At 2:00 am every day, all supervisors shut down. I haven't found out the 
> reason.
> I tried cleaning up the status directory, but the problem still exists.
> This is my supervisor.log:
> {code:java}
> // code placeholder
> 2021-07-21 02:03:42.070 o.a.s.u.Utils Thread-17 [INFO] Worker Process dcae9231-4be4-4842-9ed0-988e1b8a2b28:Error occurred during initialization of VM
> 2021-07-21 02:03:42.070 o.a.s.u.Utils Thread-17 [INFO] Worker Process dcae9231-4be4-4842-9ed0-988e1b8a2b28:Error occurred during initialization of VM
> 2021-07-21 02:03:42.071 o.a.s.u.Utils Thread-17 [INFO] Worker Process dcae9231-4be4-4842-9ed0-988e1b8a2b28:java.lang.Error: Properties init: Could not determine current working directory.
> 2021-07-21 02:03:42.071 o.a.s.u.Utils Thread-17 [INFO] Worker Process dcae9231-4be4-4842-9ed0-988e1b8a2b28: at java.lang.System.initProperties(Native Method)
> 2021-07-21 02:03:42.071 o.a.s.u.Utils Thread-17 [INFO] Worker Process dcae9231-4be4-4842-9ed0-988e1b8a2b28: at java.lang.System.initializeSystemClass(System.java:1166)
> 2021-07-21 02:03:42.071 o.a.s.u.Utils Thread-17 [INFO] Worker Process dcae9231-4be4-4842-9ed0-988e1b8a2b28:
> 2021-07-21 02:03:42.323 o.a.s.d.s.BasicContainer SLOT_6702 [INFO] Removed Worker ID dcae9231-4be4-4842-9ed0-988e1b8a2b28
> 2021-07-21 02:03:42.329 o.a.s.d.s.Slot SLOT_6702 [INFO] STATE kill msInState: 68588 topo:PradarLogTopology-3-1626751922 worker:null -> empty msInState: 3
> 2021-07-21 02:03:42.329 o.a.s.d.s.Slot SLOT_6702 [INFO] SLOT 6702: Changing current assignment from LocalAssignment(topology_id:PradarLogTopology-3-1626751922, executors:[ExecutorInfo(task_start:4, task_end:4), ExecutorInfo(task_start:1, task_end:1)], resources:WorkerResources(mem_on_heap:256.0, mem_off_heap:0.0, cpu:20.0, shared_mem_on_heap:0.0, shared_mem_off_heap:0.0, resources:{offheap.memory.mb=0.0, onheap.memory.mb=256.0, cpu.pcore.percent=20.0}, shared_resources:{}), owner:root) to null
> 2021-07-21 02:03:42.353 o.a.s.d.s.Supervisor pool-10-thread-1 [WARN] Topology config is not localized yet...
> 2021-07-21 02:03:42.449 o.a.s.d.s.Slot SLOT_6700 [INFO] SLOT 6700 all processes are dead...
> 2021-07-21 02:03:42.449 o.a.s.d.s.Container SLOT_6700 [INFO] Cleaning up 8cbbfd6c-961b-482d-9175-cf9b79473808-172.26.137.86:b7963273-452a-43af-bc00-d814e0629f96
> 2021-07-21 02:03:42.450 o.a.s.d.s.Container SLOT_6700 [INFO] GET worker-user for b7963273-452a-43af-bc00-d814e0629f96
> 2021-07-21 02:03:42.450 o.a.s.d.s.AdvancedFSOps SLOT_6700 [INFO] Deleting path /data/apache-storm-2.1.0/status/workers/b7963273-452a-43af-bc00-d814e0629f96/pids/16326
> 2021-07-21 02:03:43.322 o.a.s.d.s.AdvancedFSOps SLOT_6701 [INFO] Deleting path /data/apache-storm-2.1.0/status/workers/26b5ffbd-08b6-46df-aa04-6b86f78b8ad8/pids
> 2021-07-21 02:03:43.322 o.a.s.d.s.AdvancedFSOps SLOT_6701 [INFO] Deleting path /data/apache-storm-2.1.0/status/workers/26b5ffbd-08b6-46df-aa04-6b86f78b8ad8/tmp
> 2021-07-21 02:03:45.209 o.a.s.d.s.BasicContainer Thread-17 [INFO] Worker Process dcae9231-4be4-4842-9ed0-988e1b8a2b28 exited with code: 1
> 2021-07-21 02:03:45.224 o.a.s.d.s.AdvancedFSOps SLOT_6701 [INFO] Deleting path /data/apache-storm-2.1.0/status/workers/26b5ffbd-08b6-46df-aa04-6b86f78b8ad8
> 2021-07-21 02:03:45.224 o.a.s.d.s.Supervisor pool-10-thread-7 [WARN] Topology config is not localized yet...
> 2021-07-21 02:03:45.224 o.a.s.d.s.Container SLOT_6701 [INFO] REMOVE worker-user 26b5ffbd-08b6-46df-aa04-6b86f78b8ad8
> 2021-07-21 02:03:45.224 o.a.s.d.s.AdvancedFSOps SLOT_6700 [INFO] Deleting path /data/apache-storm-2.1.0/status/workers/b7963273-452a-43af-bc00-d814e0629f96/heartbeats
> 2021-07-21 02:03:45.224 o.a.s.d.s.AdvancedFSOps SLOT_6701 [INFO] Deleting path /data/apache-storm-2.1.0/status/workers-users/26b5ffbd-08b6-46df-aa04-6b86f78b8ad8
> 2021-07-21 02:03:45.224 o.a.s.t.ProcessFunction 

[jira] [Created] (STORM-3786) V2 metrics tick may overreport or not report at all

2021-07-29 Thread Aaron Gresch (Jira)
Aaron Gresch created STORM-3786:
---

 Summary: V2 metrics tick may overreport or not report at all
 Key: STORM-3786
 URL: https://issues.apache.org/jira/browse/STORM-3786
 Project: Apache Storm
  Issue Type: Improvement
Reporter: Aaron Gresch
Assignee: Aaron Gresch


V2 metrics tick should report only at a specific interval.  It also may not be 
triggered if no v1 metrics exist.





[jira] [Resolved] (STORM-3737) Share Worker Metric Registry For Guice AOP Based Metrics Integeration

2021-07-29 Thread Aaron Gresch (Jira)


 [ 
https://issues.apache.org/jira/browse/STORM-3737?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Aaron Gresch resolved STORM-3737.
-
Resolution: Fixed

Thanks for the PR.

> Share Worker Metric Registry For Guice AOP Based Metrics Integeration
> -
>
> Key: STORM-3737
> URL: https://issues.apache.org/jira/browse/STORM-3737
> Project: Apache Storm
>  Issue Type: Improvement
>  Components: storm-client
>Affects Versions: 2.1.0
>Reporter: Lakshman Sai
>Priority: Minor
> Fix For: 2.3.0
>
>   Original Estimate: 1h
>  Time Spent: 0.5h
>  Remaining Estimate: 0.5h
>
> Metric Registry has been made private, which makes it harder to integrate 
> with Guice-based AOP metrics.
> The proposed solution is to add the metric registry created in the worker to 
> SharedMetricRegistries, so that Guice-based AOP metrics can be initialized 
> in a worker hook.
>  [https://github.com/palominolabs/metrics-guice]
>  
> PR:
> https://github.com/apache/storm/pull/3373





[jira] [Resolved] (STORM-3780) switch ErrorReportingMetrics to V2 API

2021-07-14 Thread Aaron Gresch (Jira)


 [ 
https://issues.apache.org/jira/browse/STORM-3780?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Aaron Gresch resolved STORM-3780.
-
Fix Version/s: 2.3.0
   Resolution: Fixed

> switch ErrorReportingMetrics to V2 API
> --
>
> Key: STORM-3780
> URL: https://issues.apache.org/jira/browse/STORM-3780
> Project: Apache Storm
>  Issue Type: Improvement
>Reporter: Aaron Gresch
>Assignee: Aaron Gresch
>Priority: Minor
> Fix For: 2.3.0
>
>  Time Spent: 40m
>  Remaining Estimate: 0h
>






[jira] [Created] (STORM-3781) switch recv-iconnection to v2 metric api

2021-07-02 Thread Aaron Gresch (Jira)
Aaron Gresch created STORM-3781:
---

 Summary: switch recv-iconnection to v2 metric api
 Key: STORM-3781
 URL: https://issues.apache.org/jira/browse/STORM-3781
 Project: Apache Storm
  Issue Type: Improvement
Reporter: Aaron Gresch
Assignee: Aaron Gresch








[jira] [Created] (STORM-3780) switch ErrorReportingMetrics to V2 API

2021-07-01 Thread Aaron Gresch (Jira)
Aaron Gresch created STORM-3780:
---

 Summary: switch ErrorReportingMetrics to V2 API
 Key: STORM-3780
 URL: https://issues.apache.org/jira/browse/STORM-3780
 Project: Apache Storm
  Issue Type: Improvement
Reporter: Aaron Gresch
Assignee: Aaron Gresch








[jira] [Resolved] (STORM-3778) convert SpoutThrottlingMetrics to V2 API

2021-07-01 Thread Aaron Gresch (Jira)


 [ 
https://issues.apache.org/jira/browse/STORM-3778?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Aaron Gresch resolved STORM-3778.
-
Fix Version/s: 2.3.0
   Resolution: Fixed

> convert SpoutThrottlingMetrics to V2 API
> 
>
> Key: STORM-3778
> URL: https://issues.apache.org/jira/browse/STORM-3778
> Project: Apache Storm
>  Issue Type: Improvement
>Reporter: Aaron Gresch
>Assignee: Aaron Gresch
>Priority: Minor
> Fix For: 2.3.0
>
>  Time Spent: 40m
>  Remaining Estimate: 0h
>






[jira] [Created] (STORM-3778) convert SpoutThrottlingMetrics to V2 API

2021-06-29 Thread Aaron Gresch (Jira)
Aaron Gresch created STORM-3778:
---

 Summary: convert SpoutThrottlingMetrics to V2 API
 Key: STORM-3778
 URL: https://issues.apache.org/jira/browse/STORM-3778
 Project: Apache Storm
  Issue Type: Improvement
Reporter: Aaron Gresch
Assignee: Aaron Gresch








[jira] [Resolved] (STORM-3775) topology.blobstore.map can cause supervisor restarts

2021-06-21 Thread Aaron Gresch (Jira)


 [ 
https://issues.apache.org/jira/browse/STORM-3775?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Aaron Gresch resolved STORM-3775.
-
Fix Version/s: 2.3.0
   Resolution: Fixed

> topology.blobstore.map can cause supervisor restarts
> 
>
> Key: STORM-3775
> URL: https://issues.apache.org/jira/browse/STORM-3775
> Project: Apache Storm
>  Issue Type: Bug
>Reporter: Aaron Gresch
>Assignee: Aaron Gresch
>Priority: Minor
> Fix For: 2.3.0
>
>  Time Spent: 1h 50m
>  Remaining Estimate: 0h
>
> I noticed that a blobstore map config with booleans formatted as strings is 
> accepted at submission, then causes the AsyncLocalizer to throw an exception 
> and the supervisor to restart.  Such a config should be rejected as invalid 
> when the topology is submitted.
>  
> topology.blobstore.map
> {  "blob1": {"localname": "test.tgz", "uncompress": "false"  }
> }





[jira] [Resolved] (STORM-3774) Migrate Cgroup metrics to V2

2021-06-14 Thread Aaron Gresch (Jira)


 [ 
https://issues.apache.org/jira/browse/STORM-3774?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Aaron Gresch resolved STORM-3774.
-
Fix Version/s: 2.3.0
   Resolution: Fixed

> Migrate Cgroup metrics to V2 
> -
>
> Key: STORM-3774
> URL: https://issues.apache.org/jira/browse/STORM-3774
> Project: Apache Storm
>  Issue Type: Improvement
>Reporter: Aaron Gresch
>Assignee: Aaron Gresch
>Priority: Minor
> Fix For: 2.3.0
>
>  Time Spent: 20m
>  Remaining Estimate: 0h
>






[jira] [Created] (STORM-3775) topology.blobstore.map can cause supervisor restarts

2021-06-03 Thread Aaron Gresch (Jira)
Aaron Gresch created STORM-3775:
---

 Summary: topology.blobstore.map can cause supervisor restarts
 Key: STORM-3775
 URL: https://issues.apache.org/jira/browse/STORM-3775
 Project: Apache Storm
  Issue Type: Bug
Reporter: Aaron Gresch
Assignee: Aaron Gresch


I noticed that a blobstore map config with booleans formatted as strings is 
accepted at submission, then causes the AsyncLocalizer to throw an exception 
and the supervisor to restart.  Such a config should be rejected as invalid 
when the topology is submitted.

 
topology.blobstore.map
{  "blob1": {"localname": "test.tgz", "uncompress": "false"  }
}
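
The check asked for above could look roughly like the sketch below. It is 
not Storm's actual validator: the class and method names are hypothetical, and 
it only assumes that "uncompress" must be a real boolean rather than a string.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper, not part of Storm: rejects a topology.blobstore.map
// entry whose "uncompress" value is not a real Boolean.
final class BlobstoreMapCheck {
    private BlobstoreMapCheck() {}

    /** Returns null when the map is acceptable, otherwise a short error message. */
    static String validate(Map<String, Map<String, Object>> blobstoreMap) {
        for (Map.Entry<String, Map<String, Object>> e : blobstoreMap.entrySet()) {
            Object uncompress = e.getValue().get("uncompress");
            // A missing value is allowed in this sketch; a String such as "false" is not.
            if (uncompress != null && !(uncompress instanceof Boolean)) {
                return "blob '" + e.getKey() + "': uncompress must be a boolean, got "
                        + uncompress.getClass().getSimpleName();
            }
        }
        return null;
    }

    public static void main(String[] args) {
        Map<String, Object> settings = new LinkedHashMap<>();
        settings.put("localname", "test.tgz");
        settings.put("uncompress", "false"); // a string, as in the report
        Map<String, Map<String, Object>> conf = new LinkedHashMap<>();
        conf.put("blob1", settings);
        System.out.println(validate(conf));
    }
}
```

Running the check at submission time would turn the later AsyncLocalizer 
exception into an early, readable rejection.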





[jira] [Commented] (STORM-3773) Worker Reassignment - Difference between Storm 2.x and Storm 1.x

2021-06-03 Thread Aaron Gresch (Jira)


[ 
https://issues.apache.org/jira/browse/STORM-3773?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17356518#comment-17356518
 ] 

Aaron Gresch commented on STORM-3773:
-

This sounds like it could be a dupe of STORM-3677.

> Worker Reassignment - Difference between Storm 2.x  and Storm 1.x
> -
>
> Key: STORM-3773
> URL: https://issues.apache.org/jira/browse/STORM-3773
> Project: Apache Storm
>  Issue Type: Bug
>Affects Versions: 2.2.0
>Reporter: Surajeet
>Priority: Major
>
> We are currently on Storm 1.2.1 and are in the process of upgrading to 
> Storm 2.2.0.
>  We observed the following while upgrading to 2.2.0:
> 1) In a Storm cluster (4 nodes) with 8 topologies running (with a 1-1 
> mapping between workers and topologies), when I bring down nimbus and 
> supervisor on one of the nodes (say Node 1, which is not the nimbus leader), 
> the workers running on that node get reassigned to the other 3 nodes, even 
> though they are still running on Node 1. So I have 2 worker processes for the 
> same topology running at the same time (I saw this behaviour with and 
> without Pacemaker). The worker process does get killed when nimbus and 
> supervisor are brought back up on Node 1.
> 2) Worker logs show that the worker sends heartbeats to the local supervisor 
> and the nimbus leader, which in 1.2.1 used to happen through Zookeeper (I saw 
> this behaviour in 2.2.0 with and without Pacemaker). 
>  If I bring down nimbus and supervisor on the node where nimbus is the 
> leader, worker processes are reassigned, and in some cases this leads to 
> zombie worker processes (they are not killed when storm kill is executed).
> The behaviour above (reassignment of workers) doesn't happen with Storm 1.2.1.
> Since this is a fundamental design change between 1.x and 2.x, is there any 
> documentation that describes it in detail? (I couldn't find any in the 
> Release Notes.)
> (I am raising this as a bug because the issue mentioned in 2) is preventing 
> us from moving to 2.2.0.)
>  





[jira] [Commented] (STORM-3760) Storm 2.2.0 not reporting newWorkerEvents metric

2021-06-01 Thread Aaron Gresch (Jira)


[ 
https://issues.apache.org/jira/browse/STORM-3760?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17355260#comment-17355260
 ] 

Aaron Gresch commented on STORM-3760:
-

newWorkerEvent was converted to the V2 metrics API.  You should be able to see 
it by using the V2 reporters or by setting topology.enable.v2.metrics.tick to 
true.
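
As a rough illustration of the second option, the setting can be added to the 
topology config map before submission. The sketch uses a plain java.util.Map 
so that it stands alone; the helper class is hypothetical, and only the key 
name comes from the comment above.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical helper: returns a copy of a topology config with the
// V2 metrics tick enabled. Only the key name is taken from the comment.
final class V2MetricsTickConf {
    private V2MetricsTickConf() {}

    static Map<String, Object> withV2Tick(Map<String, Object> base) {
        Map<String, Object> conf = new HashMap<>(base);
        // Use a real Boolean, not the string "true".
        conf.put("topology.enable.v2.metrics.tick", Boolean.TRUE);
        return conf;
    }

    public static void main(String[] args) {
        System.out.println(withV2Tick(new HashMap<String, Object>()));
    }
}
```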

> Storm 2.2.0 not reporting newWorkerEvents metric
> 
>
> Key: STORM-3760
> URL: https://issues.apache.org/jira/browse/STORM-3760
> Project: Apache Storm
>  Issue Type: Bug
>  Components: storm-metrics
>Affects Versions: 2.2.0
>Reporter: Cristian Rojas
>Priority: Major
>
> Hi everyone,
>   
>  We have recently migrated from Storm 0.10.0 to Storm 2.2.0. We have a custom 
> _StatsdMetricConsumer_ which implements the _IMetricsConsumer_ interface. 
>   
>  Storm is still reporting some metrics (__transfer-count,_ _ack-count,_ 
> _metrics, etc)_ but it seems that after the migration it stopped reporting 
> the _*newWorkerEvents*_ metric. I made sure this is not a problem in our 
> implementation by logging all the metrics received in the _handleDataPoints_ 
> method.
>   
>  Is this a known issue? Is there any way to get that metric fixed?
> Best regards, thanks.





[jira] [Created] (STORM-3774) Migrate Cgroup metrics to V2

2021-06-01 Thread Aaron Gresch (Jira)
Aaron Gresch created STORM-3774:
---

 Summary: Migrate Cgroup metrics to V2 
 Key: STORM-3774
 URL: https://issues.apache.org/jira/browse/STORM-3774
 Project: Apache Storm
  Issue Type: Improvement
Reporter: Aaron Gresch
Assignee: Aaron Gresch








[jira] [Resolved] (STORM-3769) Failed adding references to blobs: FileNotFoundException

2021-05-03 Thread Aaron Gresch (Jira)


 [ 
https://issues.apache.org/jira/browse/STORM-3769?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Aaron Gresch resolved STORM-3769.
-
Fix Version/s: 2.3.0
   Resolution: Fixed

> Failed adding references to blobs: FileNotFoundException
> 
>
> Key: STORM-3769
> URL: https://issues.apache.org/jira/browse/STORM-3769
> Project: Apache Storm
>  Issue Type: Bug
>Reporter: Aaron Gresch
>Assignee: Aaron Gresch
>Priority: Major
> Fix For: 2.3.0
>
>  Time Spent: 20m
>  Remaining Estimate: 0h
>
> We hit a file not found exception with AsyncLocalizer:
> {code:java}
> 2021-04-23 17:39:13.380 o.a.s.l.AsyncLocalizer 
> ForkJoinPool.commonPool-worker-23 [ERROR] Failed adding references to blobs 
> for TimePortAndAssignment{xxx-1-15-1616201755 on 6708}
> java.io.FileNotFoundException: File 
> '/home/y/var/storm/supervisor/stormdist/xxx-1-15-1616201755/stormconf.ser' 
> does not exist
>         at 
> org.apache.storm.shade.org.apache.commons.io.FileUtils.openInputStream(FileUtils.java:297)
>  ~[storm-shaded-deps-2.3.0.y.jar:2.3.0.y]
>         at 
> org.apache.storm.shade.org.apache.commons.io.FileUtils.readFileToByteArray(FileUtils.java:1851)
>  ~[storm-shaded-deps-2.3.0.y.jar:2.3.0.y]
>         at 
> org.apache.storm.utils.ConfigUtils.readSupervisorStormConfGivenPath(ConfigUtils.java:311)
>  ~[storm-client-2.3.0.y.jar:2.3.0.y]
>         at 
> org.apache.storm.utils.ConfigUtils.readSupervisorStormConfImpl(ConfigUtils.java:472)
>  ~[storm-client-2.3.0.y.jar:2.3.0.y]
>         at 
> org.apache.storm.utils.ConfigUtils.readSupervisorStormConf(ConfigUtils.java:306)
>  ~[storm-client-2.3.0.y.jar:2.3.0.y]
>         at 
> org.apache.storm.localizer.AsyncLocalizer.getLocalResources(AsyncLocalizer.java:368)
>  ~[storm-server-2.3.0.y.jar:2.3.0.y]
>         at 
> org.apache.storm.localizer.AsyncLocalizer.addReferencesToBlobs(AsyncLocalizer.java:398)
>  ~[storm-server-2.3.0.y.jar:2.3.0.y]
>         at 
> org.apache.storm.localizer.AsyncLocalizer.lambda$null$7(AsyncLocalizer.java:235)
>  ~[storm-server-2.3.0.y.jar:2.3.0.y]
>         at 
> java.util.concurrent.ConcurrentHashMap.compute(ConcurrentHashMap.java:1877) 
> ~[?:1.8.0_262]
>         at 
> org.apache.storm.localizer.AsyncLocalizer.lambda$requestDownloadTopologyBlobs$8(AsyncLocalizer.java:229)
>  ~[storm-server-2.3.0.y.jar:2.3.0.y]
>         at 
> java.util.concurrent.CompletableFuture.uniCompose(CompletableFuture.java:966) 
> [?:1.8.0_262]
>         at 
> java.util.concurrent.CompletableFuture$UniCompose.tryFire(CompletableFuture.java:940)
>  [?:1.8.0_262]
>         at 
> java.util.concurrent.CompletableFuture$Completion.exec(CompletableFuture.java:457)
>  [?:1.8.0_262]
>         at java.util.concurrent.ForkJoinTask.doExec(ForkJoinTask.java:289) 
> [?:1.8.0_262]
>         at 
> java.util.concurrent.ForkJoinPool$WorkQueue.runTask(ForkJoinPool.java:1056) 
> [?:1.8.0_262]
>         at 
> java.util.concurrent.ForkJoinPool.runWorker(ForkJoinPool.java:1692) 
> [?:1.8.0_262]
>         at 
> java.util.concurrent.ForkJoinWorkerThread.run(ForkJoinWorkerThread.java:163) 
> [?:1.8.0_262]
> {code}





[jira] [Commented] (STORM-3769) Failed adding references to blobs: FileNotFoundException

2021-04-29 Thread Aaron Gresch (Jira)


[ 
https://issues.apache.org/jira/browse/STORM-3769?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17335791#comment-17335791
 ] 

Aaron Gresch commented on STORM-3769:
-

In this case, a worker for topology A was no longer assigned on a supervisor.  
At some point the blob cache exceeded its size limit.  When cleanup ran, 1 or 
2 of topology A's blobs were deleted, which was enough to get under the cache 
size limit.  Because some of the topology's blobs remain, the code currently 
assumes that all of them are still downloaded.  When a worker for topology A is 
later reassigned to the node, this exception occurs and workers are unable to 
start.
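
One way to avoid the bad assumption is to verify every required file instead 
of inferring the full set from a partial one. The sketch below is not the fix 
that was committed: the class name is hypothetical, and "stormcode.ser" is an 
assumed second file name (only stormconf.ser appears in the trace).

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Arrays;
import java.util.List;

// Hypothetical check: a topology counts as downloaded only if every
// required blob file is present, not merely some of them.
final class TopologyBlobPresence {
    private TopologyBlobPresence() {}

    static boolean allPresent(Path topologyDir, List<String> requiredNames) {
        for (String name : requiredNames) {
            if (!Files.isRegularFile(topologyDir.resolve(name))) {
                return false; // one missing blob means a re-download is needed
            }
        }
        return true;
    }

    public static void main(String[] args) throws IOException {
        Path dir = Files.createTempDirectory("stormdist-demo");
        Files.createFile(dir.resolve("stormcode.ser"));
        // stormconf.ser was "cleaned up", so the topology is not fully present.
        System.out.println(allPresent(dir, Arrays.asList("stormconf.ser", "stormcode.ser")));
    }
}
```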

> Failed adding references to blobs: FileNotFoundException
> 
>
> Key: STORM-3769
> URL: https://issues.apache.org/jira/browse/STORM-3769
> Project: Apache Storm
>  Issue Type: Bug
>Reporter: Aaron Gresch
>Assignee: Aaron Gresch
>Priority: Major
>
> We hit a file not found exception with AsyncLocalizer:
> {code:java}
> 2021-04-23 17:39:13.380 o.a.s.l.AsyncLocalizer 
> ForkJoinPool.commonPool-worker-23 [ERROR] Failed adding references to blobs 
> for TimePortAndAssignment{xxx-1-15-1616201755 on 6708}
> java.io.FileNotFoundException: File 
> '/home/y/var/storm/supervisor/stormdist/xxx-1-15-1616201755/stormconf.ser' 
> does not exist
>         at 
> org.apache.storm.shade.org.apache.commons.io.FileUtils.openInputStream(FileUtils.java:297)
>  ~[storm-shaded-deps-2.3.0.y.jar:2.3.0.y]
>         at 
> org.apache.storm.shade.org.apache.commons.io.FileUtils.readFileToByteArray(FileUtils.java:1851)
>  ~[storm-shaded-deps-2.3.0.y.jar:2.3.0.y]
>         at 
> org.apache.storm.utils.ConfigUtils.readSupervisorStormConfGivenPath(ConfigUtils.java:311)
>  ~[storm-client-2.3.0.y.jar:2.3.0.y]
>         at 
> org.apache.storm.utils.ConfigUtils.readSupervisorStormConfImpl(ConfigUtils.java:472)
>  ~[storm-client-2.3.0.y.jar:2.3.0.y]
>         at 
> org.apache.storm.utils.ConfigUtils.readSupervisorStormConf(ConfigUtils.java:306)
>  ~[storm-client-2.3.0.y.jar:2.3.0.y]
>         at 
> org.apache.storm.localizer.AsyncLocalizer.getLocalResources(AsyncLocalizer.java:368)
>  ~[storm-server-2.3.0.y.jar:2.3.0.y]
>         at 
> org.apache.storm.localizer.AsyncLocalizer.addReferencesToBlobs(AsyncLocalizer.java:398)
>  ~[storm-server-2.3.0.y.jar:2.3.0.y]
>         at 
> org.apache.storm.localizer.AsyncLocalizer.lambda$null$7(AsyncLocalizer.java:235)
>  ~[storm-server-2.3.0.y.jar:2.3.0.y]
>         at 
> java.util.concurrent.ConcurrentHashMap.compute(ConcurrentHashMap.java:1877) 
> ~[?:1.8.0_262]
>         at 
> org.apache.storm.localizer.AsyncLocalizer.lambda$requestDownloadTopologyBlobs$8(AsyncLocalizer.java:229)
>  ~[storm-server-2.3.0.y.jar:2.3.0.y]
>         at 
> java.util.concurrent.CompletableFuture.uniCompose(CompletableFuture.java:966) 
> [?:1.8.0_262]
>         at 
> java.util.concurrent.CompletableFuture$UniCompose.tryFire(CompletableFuture.java:940)
>  [?:1.8.0_262]
>         at 
> java.util.concurrent.CompletableFuture$Completion.exec(CompletableFuture.java:457)
>  [?:1.8.0_262]
>         at java.util.concurrent.ForkJoinTask.doExec(ForkJoinTask.java:289) 
> [?:1.8.0_262]
>         at 
> java.util.concurrent.ForkJoinPool$WorkQueue.runTask(ForkJoinPool.java:1056) 
> [?:1.8.0_262]
>         at 
> java.util.concurrent.ForkJoinPool.runWorker(ForkJoinPool.java:1692) 
> [?:1.8.0_262]
>         at 
> java.util.concurrent.ForkJoinWorkerThread.run(ForkJoinWorkerThread.java:163) 
> [?:1.8.0_262]
> {code}





[jira] [Created] (STORM-3769) Failed adding references to blobs: FileNotFoundException

2021-04-29 Thread Aaron Gresch (Jira)
Aaron Gresch created STORM-3769:
---

 Summary: Failed adding references to blobs: FileNotFoundException
 Key: STORM-3769
 URL: https://issues.apache.org/jira/browse/STORM-3769
 Project: Apache Storm
  Issue Type: Bug
Reporter: Aaron Gresch
Assignee: Aaron Gresch


We hit a file not found exception with AsyncLocalizer:
{code:java}
2021-04-23 17:39:13.380 o.a.s.l.AsyncLocalizer 
ForkJoinPool.commonPool-worker-23 [ERROR] Failed adding references to blobs for 
TimePortAndAssignment{xxx-1-15-1616201755 on 6708}
java.io.FileNotFoundException: File 
'/home/y/var/storm/supervisor/stormdist/xxx-1-15-1616201755/stormconf.ser' does 
not exist
        at 
org.apache.storm.shade.org.apache.commons.io.FileUtils.openInputStream(FileUtils.java:297)
 ~[storm-shaded-deps-2.3.0.y.jar:2.3.0.y]
        at 
org.apache.storm.shade.org.apache.commons.io.FileUtils.readFileToByteArray(FileUtils.java:1851)
 ~[storm-shaded-deps-2.3.0.y.jar:2.3.0.y]
        at 
org.apache.storm.utils.ConfigUtils.readSupervisorStormConfGivenPath(ConfigUtils.java:311)
 ~[storm-client-2.3.0.y.jar:2.3.0.y]
        at 
org.apache.storm.utils.ConfigUtils.readSupervisorStormConfImpl(ConfigUtils.java:472)
 ~[storm-client-2.3.0.y.jar:2.3.0.y]
        at 
org.apache.storm.utils.ConfigUtils.readSupervisorStormConf(ConfigUtils.java:306)
 ~[storm-client-2.3.0.y.jar:2.3.0.y]
        at 
org.apache.storm.localizer.AsyncLocalizer.getLocalResources(AsyncLocalizer.java:368)
 ~[storm-server-2.3.0.y.jar:2.3.0.y]
        at 
org.apache.storm.localizer.AsyncLocalizer.addReferencesToBlobs(AsyncLocalizer.java:398)
 ~[storm-server-2.3.0.y.jar:2.3.0.y]
        at 
org.apache.storm.localizer.AsyncLocalizer.lambda$null$7(AsyncLocalizer.java:235)
 ~[storm-server-2.3.0.y.jar:2.3.0.y]
        at 
java.util.concurrent.ConcurrentHashMap.compute(ConcurrentHashMap.java:1877) 
~[?:1.8.0_262]
        at 
org.apache.storm.localizer.AsyncLocalizer.lambda$requestDownloadTopologyBlobs$8(AsyncLocalizer.java:229)
 ~[storm-server-2.3.0.y.jar:2.3.0.y]
        at 
java.util.concurrent.CompletableFuture.uniCompose(CompletableFuture.java:966) 
[?:1.8.0_262]
        at 
java.util.concurrent.CompletableFuture$UniCompose.tryFire(CompletableFuture.java:940)
 [?:1.8.0_262]
        at 
java.util.concurrent.CompletableFuture$Completion.exec(CompletableFuture.java:457)
 [?:1.8.0_262]
        at java.util.concurrent.ForkJoinTask.doExec(ForkJoinTask.java:289) 
[?:1.8.0_262]
        at 
java.util.concurrent.ForkJoinPool$WorkQueue.runTask(ForkJoinPool.java:1056) 
[?:1.8.0_262]
        at java.util.concurrent.ForkJoinPool.runWorker(ForkJoinPool.java:1692) 
[?:1.8.0_262]
        at 
java.util.concurrent.ForkJoinWorkerThread.run(ForkJoinWorkerThread.java:163) 
[?:1.8.0_262]
{code}





[jira] [Resolved] (STORM-3749) improve logging on server error in StormServerHandler

2021-03-08 Thread Aaron Gresch (Jira)


 [ 
https://issues.apache.org/jira/browse/STORM-3749?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Aaron Gresch resolved STORM-3749.
-
Fix Version/s: 2.3.0
   Resolution: Fixed

> improve logging on server error in StormServerHandler
> -
>
> Key: STORM-3749
> URL: https://issues.apache.org/jira/browse/STORM-3749
> Project: Apache Storm
>  Issue Type: Improvement
>Reporter: Aaron Gresch
>Assignee: Aaron Gresch
>Priority: Minor
> Fix For: 2.3.0
>
>  Time Spent: 1h 10m
>  Remaining Estimate: 0h
>






[jira] [Resolved] (STORM-3748) prevent concurrent modification when fetching v2 metrics

2021-03-05 Thread Aaron Gresch (Jira)


 [ 
https://issues.apache.org/jira/browse/STORM-3748?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Aaron Gresch resolved STORM-3748.
-
Fix Version/s: 2.3.0
   Resolution: Fixed

> prevent concurrent modification when fetching v2 metrics
> 
>
> Key: STORM-3748
> URL: https://issues.apache.org/jira/browse/STORM-3748
> Project: Apache Storm
>  Issue Type: Bug
>Reporter: Aaron Gresch
>Assignee: Aaron Gresch
>Priority: Major
> Fix For: 2.3.0
>
>  Time Spent: 0.5h
>  Remaining Estimate: 0h
>
> A user reported a ConcurrentModificationException when retrieving metric 
> names in StormMetricRegistry.getMetricNameMap().





[jira] [Created] (STORM-3749) improve logging on server error in StormServerHandler

2021-03-04 Thread Aaron Gresch (Jira)
Aaron Gresch created STORM-3749:
---

 Summary: improve logging on server error in StormServerHandler
 Key: STORM-3749
 URL: https://issues.apache.org/jira/browse/STORM-3749
 Project: Apache Storm
  Issue Type: Improvement
Reporter: Aaron Gresch
Assignee: Aaron Gresch








[jira] [Created] (STORM-3748) prevent concurrent modification when fetching v2 metrics

2021-03-02 Thread Aaron Gresch (Jira)
Aaron Gresch created STORM-3748:
---

 Summary: prevent concurrent modification when fetching v2 metrics
 Key: STORM-3748
 URL: https://issues.apache.org/jira/browse/STORM-3748
 Project: Apache Storm
  Issue Type: Bug
Reporter: Aaron Gresch
Assignee: Aaron Gresch


A user reported a ConcurrentModificationException when retrieving metric names 
in StormMetricRegistry.getMetricNameMap().
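
A common way to avoid this kind of exception is to keep the names in a 
concurrent map and hand callers a point-in-time copy. The class below is a 
hypothetical stand-in, not Storm's StormMetricRegistry, and the committed fix 
may differ.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical stand-in for a metric name registry.
final class MetricNameRegistry {
    // ConcurrentHashMap iterators are weakly consistent and never throw
    // ConcurrentModificationException.
    private final Map<String, Integer> nameToId = new ConcurrentHashMap<>();

    void register(String name, int id) {
        nameToId.put(name, id);
    }

    /** Returns a copy, so callers can iterate while other threads keep registering. */
    Map<String, Integer> getMetricNameMap() {
        return new HashMap<>(nameToId);
    }

    public static void main(String[] args) {
        MetricNameRegistry registry = new MetricNameRegistry();
        registry.register("emitted", 1);
        for (String name : registry.getMetricNameMap().keySet()) {
            registry.register(name + "-rate", 2); // safe: we iterate over a copy
        }
        System.out.println(registry.getMetricNameMap().size());
    }
}
```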





[jira] [Resolved] (STORM-3740) Asynchronous background blob download can cause orphaned blob references

2021-02-05 Thread Aaron Gresch (Jira)


 [ 
https://issues.apache.org/jira/browse/STORM-3740?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Aaron Gresch resolved STORM-3740.
-
Fix Version/s: 2.3.0
   Resolution: Fixed

> Asynchronous background blob download can cause orphaned blob references
> 
>
> Key: STORM-3740
> URL: https://issues.apache.org/jira/browse/STORM-3740
> Project: Apache Storm
>  Issue Type: Bug
>Reporter: Aaron Gresch
>Assignee: Aaron Gresch
>Priority: Major
> Fix For: 2.3.0
>
>  Time Spent: 2h
>  Remaining Estimate: 0h
>
>  
> We hit a path in the AsyncLocalizer where we found blob references being 
> added after a worker slot was removed.  Asynchronous blob downloads were not 
> canceled before removing the blob references.





[jira] [Created] (STORM-3740) Asynchronous background blob download can cause orphaned blob references

2021-01-29 Thread Aaron Gresch (Jira)
Aaron Gresch created STORM-3740:
---

 Summary: Asynchronous background blob download can cause orphaned 
blob references
 Key: STORM-3740
 URL: https://issues.apache.org/jira/browse/STORM-3740
 Project: Apache Storm
  Issue Type: Bug
Reporter: Aaron Gresch
Assignee: Aaron Gresch


 

We hit a path in the AsyncLocalizer where we found blob references being added 
after a worker slot was removed.  Asynchronous blob downloads were not canceled 
before removing the blob references.
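
The ordering problem can be sketched as follows: cancel the pending download 
first, then drop the references, so a late completion cannot add a reference 
back. All names here are hypothetical, this is not the AsyncLocalizer code, 
and the sketch does not cover a callback that is already running when the 
slot is released.

```java
import java.util.Map;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical model of per-slot downloads and blob reference counts.
final class SlotDownloads {
    private final Map<Integer, CompletableFuture<Void>> pending = new ConcurrentHashMap<>();
    private final Map<Integer, Integer> references = new ConcurrentHashMap<>();

    /** Adds a reference for the slot once the download future completes. */
    CompletableFuture<Void> requestDownload(int port, CompletableFuture<Void> download) {
        CompletableFuture<Void> withRef =
                download.thenRun(() -> references.merge(port, 1, Integer::sum));
        pending.put(port, withRef);
        return withRef;
    }

    /** Cancels the pending stage before removing references for the slot. */
    void releaseSlot(int port) {
        CompletableFuture<Void> stage = pending.remove(port);
        if (stage != null) {
            stage.cancel(true); // a cancelled dependent stage no longer runs its action
        }
        references.remove(port);
    }

    int referenceCount(int port) {
        return references.getOrDefault(port, 0);
    }

    public static void main(String[] args) {
        SlotDownloads slots = new SlotDownloads();
        CompletableFuture<Void> download = new CompletableFuture<>();
        slots.requestDownload(6708, download);
        slots.releaseSlot(6708);   // slot removed while the download is in flight
        download.complete(null);   // late completion must not add a reference
        System.out.println(slots.referenceCount(6708));
    }
}
```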





[jira] [Resolved] (STORM-3682) Upgrade netty client metrics to use V2 API

2021-01-19 Thread Aaron Gresch (Jira)


 [ 
https://issues.apache.org/jira/browse/STORM-3682?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Aaron Gresch resolved STORM-3682.
-
Fix Version/s: 2.3.0
   Resolution: Fixed

> Upgrade netty client metrics to use V2 API
> --
>
> Key: STORM-3682
> URL: https://issues.apache.org/jira/browse/STORM-3682
> Project: Apache Storm
>  Issue Type: Improvement
>Reporter: Aaron Gresch
>Assignee: Aaron Gresch
>Priority: Major
> Fix For: 2.3.0
>
>  Time Spent: 3h 50m
>  Remaining Estimate: 0h
>






[jira] [Created] (STORM-3736) remove topologyId and worker port from V2 metrics API

2021-01-15 Thread Aaron Gresch (Jira)
Aaron Gresch created STORM-3736:
---

 Summary: remove topologyId and worker port from V2 metrics API
 Key: STORM-3736
 URL: https://issues.apache.org/jira/browse/STORM-3736
 Project: Apache Storm
  Issue Type: Improvement
Affects Versions: 2.3.0
Reporter: Aaron Gresch


The topologyId and port are now available to the StormMetricsRegistry and 
should be removed from the existing metric API.





[jira] [Resolved] (STORM-3724) Use blobstore dir modtime to avoid update lookups by HDFSBlobstore

2021-01-14 Thread Aaron Gresch (Jira)


 [ 
https://issues.apache.org/jira/browse/STORM-3724?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Aaron Gresch resolved STORM-3724.
-
Fix Version/s: 2.3.0
   Resolution: Fixed

> Use blobstore dir modtime to avoid update lookups by HDFSBlobstore
> --
>
> Key: STORM-3724
> URL: https://issues.apache.org/jira/browse/STORM-3724
> Project: Apache Storm
>  Issue Type: Improvement
>Reporter: Aaron Gresch
>Assignee: Aaron Gresch
>Priority: Major
> Fix For: 2.3.0
>
>  Time Spent: 4.5h
>  Remaining Estimate: 0h
>
> We have multiple Storm clusters with hundreds of supervisors polling for blob 
> updates.  This causes high load on our Hadoop namenodes, which are also used 
> by multiple other clusters.
>  
> An improvement would be for the AsyncLocalizer to check the remote blobstore's 
> last modification time once and then skip checking each individual blob if it 
> was already checked at the same modification time.
>  
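
The proposed skip could be modelled as a small gate keyed by blob. This is a 
sketch only: the class and method names are hypothetical, and it assumes the 
blobstore directory's modification time changes whenever any blob in it 
changes.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical gate: remembers the directory modification time at which each
// blob was last checked, so unchanged directories cost one remote lookup.
final class BlobUpdateGate {
    private final Map<String, Long> checkedAt = new ConcurrentHashMap<>();

    /** True if the blob has not been checked at this directory modification time. */
    boolean needsCheck(String blobKey, long remoteDirModTime) {
        Long last = checkedAt.get(blobKey);
        return last == null || last != remoteDirModTime;
    }

    void markChecked(String blobKey, long remoteDirModTime) {
        checkedAt.put(blobKey, remoteDirModTime);
    }

    public static void main(String[] args) {
        BlobUpdateGate gate = new BlobUpdateGate();
        long modTime = 1000L; // fetched once per polling round
        if (gate.needsCheck("blob1", modTime)) {
            // the per-blob remote version check would go here
            gate.markChecked("blob1", modTime);
        }
        System.out.println(gate.needsCheck("blob1", modTime));
    }
}
```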





[jira] [Resolved] (STORM-3733) AsyncLocalizer stuck looking for missing topology

2021-01-12 Thread Aaron Gresch (Jira)


 [ 
https://issues.apache.org/jira/browse/STORM-3733?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Aaron Gresch resolved STORM-3733.
-
Fix Version/s: 2.3.0
   Resolution: Fixed

> AsyncLocalizer stuck looking for missing topology
> -
>
> Key: STORM-3733
> URL: https://issues.apache.org/jira/browse/STORM-3733
> Project: Apache Storm
>  Issue Type: Bug
>Reporter: Aaron Gresch
>Assignee: Aaron Gresch
>Priority: Major
> Fix For: 2.3.0
>
>  Time Spent: 50m
>  Remaining Estimate: 0h
>
> {code:java}
> 2020-11-09 20:18:12.325 o.a.s.l.AsyncLocalizer AsyncLocalizer Task Executor - 
> 2 [ERROR] Could not update blob, will retry again later
> 2020-11-09 20:18:43.744 o.a.s.l.AsyncLocalizer AsyncLocalizer Task Executor - 
> 2 [ERROR] Could not update blob, will retry again later
> 2020-11-09 20:19:14.726 o.a.s.l.AsyncLocalizer AsyncLocalizer Task Executor - 
> 2 [ERROR] Could not update blob, will retry again later
> 2020-11-09 20:19:46.148 o.a.s.l.AsyncLocalizer AsyncLocalizer Task Executor - 
> 2 [ERROR] Could not update blob, will retry again later
> 2020-11-09 20:20:16.560 o.a.s.l.AsyncLocalizer AsyncLocalizer Task Executor - 
> 0 [ERROR] Could not update blob, will retry again later
> 2020-11-09 20:20:47.990 o.a.s.l.AsyncLocalizer AsyncLocalizer Task Executor - 
> 0 [ERROR] Could not update blob, will retry again later
> 2020-11-09 20:21:19.403 o.a.s.l.AsyncLocalizer AsyncLocalizer Task Executor - 
> 0 [ERROR] Could not update blob, will retry again later
> 2020-11-09 20:21:50.818 o.a.s.l.AsyncLocalizer AsyncLocalizer Task Executor - 
> 0 [ERROR] Could not update blob, will retry again later
> 2020-11-09 20:22:21.257 o.a.s.l.AsyncLocalizer AsyncLocalizer Task Executor - 
> 1 [ERROR] Could not update blob, will retry again later
> 2020-11-09 20:22:52.668 o.a.s.l.AsyncLocalizer AsyncLocalizer Task Executor - 
> 1 [ERROR] Could not update blob, will retry again later
> 2020-11-09 20:23:24.082 o.a.s.l.AsyncLocalizer AsyncLocalizer Task Executor - 
> 1 [ERROR] Could not update blob, will retry again later
> 2020-11-09 20:23:55.512 o.a.s.l.AsyncLocalizer AsyncLocalizer Task Executor - 
> 1 [ERROR] Could not update blob, will retry again later
> 2020-11-09 20:24:25.919 o.a.s.l.AsyncLocalizer AsyncLocalizer Task Executor - 
> 2 [ERROR] Could not update blob, will retry again later
> 2020-11-09 20:24:57.343 o.a.s.l.AsyncLocalizer AsyncLocalizer Task Executor - 
> 2 [ERROR] Could not update blob, will retry again later
> 2020-11-09 20:25:28.773 o.a.s.l.AsyncLocalizer AsyncLocalizer Task Executor - 
> 2 [ERROR] Could not update blob, will retry again later
> 2020-11-09 20:16:09.659 o.a.s.l.AsyncLocalizer AsyncLocalizer Task Executor - 
> 1 [ERROR] Could not update blob, will retry again later
> java.util.concurrent.ExecutionException: java.lang.RuntimeException: Could 
> not download...
> at 
> java.util.concurrent.CompletableFuture.reportGet(CompletableFuture.java:357) 
> ~[?:1.8.0_262]
> at 
> java.util.concurrent.CompletableFuture.get(CompletableFuture.java:1908) 
> ~[?:1.8.0_262]
> at 
> org.apache.storm.localizer.AsyncLocalizer.updateBlobs(AsyncLocalizer.java:333)
>  ~[storm-server-2.3.0.y.jar:2.3.0.y]
> at 
> java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511) 
> [?:1.8.0_262]
> at java.util.concurrent.FutureTask.runAndReset(FutureTask.java:308) 
> [?:1.8.0_262]
> at 
> java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$301(ScheduledThreadPoolExecutor.java:180)
>  [?:1.8.0_262]
> at 
> java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:294)
>  [?:1.8.0_262]
> at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
>  [?:1.8.0_262]
> at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
>  [?:1.8.0_262]
> at java.lang.Thread.run(Thread.java:748) [?:1.8.0_262]
> Caused by: java.lang.RuntimeException: Could not download...
> at 
> org.apache.storm.localizer.AsyncLocalizer.lambda$downloadOrUpdate$10(AsyncLocalizer.java:297)
>  ~[storm-server-2.3.0.y.jar:2.3.0.y]
> at 
> java.util.concurrent.CompletableFuture$AsyncRun.run(CompletableFuture.java:1640)
>  ~[?:1.8.0_262]
> at 
> java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511) 
> ~[?:1.8.0_262]
> at java.util.concurrent.FutureTask.run(FutureTask.java:266) 
> ~[?:1.8.0_262]
> at 
> java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$201(ScheduledThreadPoolExecutor.java:180)
>  ~[?:1.8.0_262]
> at 
> java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:293)
>  ~[?:1.8.0_262]

[jira] [Resolved] (STORM-3714) Add rate information for TaskMetrics

2021-01-05 Thread Aaron Gresch (Jira)


 [ 
https://issues.apache.org/jira/browse/STORM-3714?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Aaron Gresch resolved STORM-3714.
-
Fix Version/s: 2.3.0
   Resolution: Fixed

> Add rate information for TaskMetrics
> 
>
> Key: STORM-3714
> URL: https://issues.apache.org/jira/browse/STORM-3714
> Project: Apache Storm
>  Issue Type: Improvement
>Reporter: Aaron Gresch
>Assignee: Aaron Gresch
>Priority: Minor
> Fix For: 2.3.0
>
>  Time Spent: 3h
>  Remaining Estimate: 0h
>
> While converting TaskMetrics to use V2 API, we used Counters over Meters due 
> to performance implications.  We have found we would like to add rate 
> information as well.
>  
> Ideally we would add some kind of metric that supports a count and rate 
> without the full performance overhead of the Meter.
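
One shape such a metric could take is a counter whose rate is computed only 
when a reporter reads it, so the hot path stays a single add. The class below 
is hypothetical and is not the metric that was eventually added to Storm; the 
caller supplies the clock value, for example from System.nanoTime().

```java
import java.util.concurrent.atomic.LongAdder;

// Hypothetical count-plus-rate metric: increments are cheap, and the rate is
// derived from the count delta between two reporter reads.
final class RateCounter {
    private final LongAdder count = new LongAdder();
    private long lastCount;
    private long lastNanos;

    RateCounter(long nowNanos) {
        this.lastNanos = nowNanos;
    }

    void inc(long n) {
        count.add(n);
    }

    long getCount() {
        return count.sum();
    }

    /** Events per second since the previous call; meant for a single reporter thread. */
    synchronized double ratePerSecond(long nowNanos) {
        long current = count.sum();
        long elapsed = nowNanos - lastNanos;
        double rate = elapsed <= 0 ? 0.0 : (current - lastCount) * 1e9 / elapsed;
        lastCount = current;
        lastNanos = nowNanos;
        return rate;
    }

    public static void main(String[] args) {
        RateCounter emitted = new RateCounter(0L);
        emitted.inc(10);
        System.out.println(emitted.ratePerSecond(2_000_000_000L)); // 10 events over 2 seconds
    }
}
```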



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Created] (STORM-3733) AsyncLocalizer stuck looking for missing topology

2021-01-05 Thread Aaron Gresch (Jira)
Aaron Gresch created STORM-3733:
---

 Summary: AsyncLocalizer stuck looking for missing topology
 Key: STORM-3733
 URL: https://issues.apache.org/jira/browse/STORM-3733
 Project: Apache Storm
  Issue Type: Bug
Reporter: Aaron Gresch
Assignee: Aaron Gresch


{code:java}
2020-11-09 20:18:12.325 o.a.s.l.AsyncLocalizer AsyncLocalizer Task Executor - 2 
[ERROR] Could not update blob, will retry again later
2020-11-09 20:18:43.744 o.a.s.l.AsyncLocalizer AsyncLocalizer Task Executor - 2 
[ERROR] Could not update blob, will retry again later
2020-11-09 20:19:14.726 o.a.s.l.AsyncLocalizer AsyncLocalizer Task Executor - 2 
[ERROR] Could not update blob, will retry again later
2020-11-09 20:19:46.148 o.a.s.l.AsyncLocalizer AsyncLocalizer Task Executor - 2 
[ERROR] Could not update blob, will retry again later
2020-11-09 20:20:16.560 o.a.s.l.AsyncLocalizer AsyncLocalizer Task Executor - 0 
[ERROR] Could not update blob, will retry again later
2020-11-09 20:20:47.990 o.a.s.l.AsyncLocalizer AsyncLocalizer Task Executor - 0 
[ERROR] Could not update blob, will retry again later
2020-11-09 20:21:19.403 o.a.s.l.AsyncLocalizer AsyncLocalizer Task Executor - 0 
[ERROR] Could not update blob, will retry again later
2020-11-09 20:21:50.818 o.a.s.l.AsyncLocalizer AsyncLocalizer Task Executor - 0 
[ERROR] Could not update blob, will retry again later
2020-11-09 20:22:21.257 o.a.s.l.AsyncLocalizer AsyncLocalizer Task Executor - 1 
[ERROR] Could not update blob, will retry again later
2020-11-09 20:22:52.668 o.a.s.l.AsyncLocalizer AsyncLocalizer Task Executor - 1 
[ERROR] Could not update blob, will retry again later
2020-11-09 20:23:24.082 o.a.s.l.AsyncLocalizer AsyncLocalizer Task Executor - 1 
[ERROR] Could not update blob, will retry again later
2020-11-09 20:23:55.512 o.a.s.l.AsyncLocalizer AsyncLocalizer Task Executor - 1 
[ERROR] Could not update blob, will retry again later
2020-11-09 20:24:25.919 o.a.s.l.AsyncLocalizer AsyncLocalizer Task Executor - 2 
[ERROR] Could not update blob, will retry again later
2020-11-09 20:24:57.343 o.a.s.l.AsyncLocalizer AsyncLocalizer Task Executor - 2 
[ERROR] Could not update blob, will retry again later
2020-11-09 20:25:28.773 o.a.s.l.AsyncLocalizer AsyncLocalizer Task Executor - 2 
[ERROR] Could not update blob, will retry again later

2020-11-09 20:16:09.659 o.a.s.l.AsyncLocalizer AsyncLocalizer Task Executor - 1 
[ERROR] Could not update blob, will retry again later
java.util.concurrent.ExecutionException: java.lang.RuntimeException: Could not 
download...
at 
java.util.concurrent.CompletableFuture.reportGet(CompletableFuture.java:357) 
~[?:1.8.0_262]
at 
java.util.concurrent.CompletableFuture.get(CompletableFuture.java:1908) 
~[?:1.8.0_262]
at 
org.apache.storm.localizer.AsyncLocalizer.updateBlobs(AsyncLocalizer.java:333) 
~[storm-server-2.3.0.y.jar:2.3.0.y]
at 
java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511) 
[?:1.8.0_262]
at java.util.concurrent.FutureTask.runAndReset(FutureTask.java:308) 
[?:1.8.0_262]
at 
java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$301(ScheduledThreadPoolExecutor.java:180)
 [?:1.8.0_262]
at 
java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:294)
 [?:1.8.0_262]
at 
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149) 
[?:1.8.0_262]
at 
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624) 
[?:1.8.0_262]
at java.lang.Thread.run(Thread.java:748) [?:1.8.0_262]
Caused by: java.lang.RuntimeException: Could not download...
at 
org.apache.storm.localizer.AsyncLocalizer.lambda$downloadOrUpdate$10(AsyncLocalizer.java:297)
 ~[storm-server-2.3.0.y.jar:2.3.0.y]
at 
java.util.concurrent.CompletableFuture$AsyncRun.run(CompletableFuture.java:1640)
 ~[?:1.8.0_262]
at 
java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511) 
~[?:1.8.0_262]
at java.util.concurrent.FutureTask.run(FutureTask.java:266) 
~[?:1.8.0_262]
at 
java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$201(ScheduledThreadPoolExecutor.java:180)
 ~[?:1.8.0_262]
at 
java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:293)
 ~[?:1.8.0_262]
... 3 more
Caused by: org.apache.storm.utils.WrappedKeyNotFoundException: 
testTopology-743-1603821880-stormconf.ser
at 
org.apache.storm.hdfs.blobstore.HdfsBlobStore.getStoredBlobMeta(HdfsBlobStore.java:192)
 ~[storm-hdfs-blobstore-2.3.0.y.jar:2.3.0.y]
at 
org.apache.storm.hdfs.blobstore.HdfsBlobStore.getBlobMeta(HdfsBlobStore.java:221)
 ~[storm-hdfs-blobstore-2.3.0.y.jar:2.3.0.y]
at 

[jira] [Created] (STORM-3727) SUPERVISOR_SLOTS_PORTS could be list of Longs

2020-12-21 Thread Aaron Gresch (Jira)
Aaron Gresch created STORM-3727:
---

 Summary: SUPERVISOR_SLOTS_PORTS could be list of Longs
 Key: STORM-3727
 URL: https://issues.apache.org/jira/browse/STORM-3727
 Project: Apache Storm
  Issue Type: Bug
Affects Versions: 2.2.0
Reporter: Aaron Gresch
Assignee: Aaron Gresch


 

A user reported:

There's no guarantee that the value returned by {{supervisorConf.getOrDefault}} 
will be a List of Integers.
Additionally, in ReadClusterState.java, the {{.intValue()}} conversion has been 
removed. The overall result is:

 

{{java.lang.ClassCastException: java.lang.Long cannot be cast to 
java.lang.Integer
at 
org.apache.storm.daemon.supervisor.ReadClusterState.<init>(ReadClusterState.java:101)
 ~[storm-server-2.2.0.jar:2.2.0]
at 
org.apache.storm.daemon.supervisor.Supervisor.launch(Supervisor.java:310) 
~[storm-server-2.2.0.jar:2.2.0]}}
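
The cast failure above can be avoided by treating each configured port as a Number and converting it explicitly. The following is only a minimal sketch: the class and method names (SlotPorts, toIntPorts) are hypothetical and are not taken from the Storm code base.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

final class SlotPorts {
    // Hypothetical helper (not Storm's actual code): accepts any Number
    // (Integer, Long, ...) and converts each element with intValue(),
    // so a config parser that yields Longs no longer causes a
    // ClassCastException.
    static List<Integer> toIntPorts(List<? extends Number> rawPorts) {
        List<Integer> ports = new ArrayList<>();
        for (Number port : rawPorts) {
            ports.add(port.intValue());
        }
        return ports;
    }

    public static void main(String[] args) {
        // A mixed list, as a YAML/JSON parser might produce.
        List<Number> raw = Arrays.<Number>asList(6700L, 6701, 6702L);
        System.out.println(toIntPorts(raw));
    }
}
```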



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Created] (STORM-3724) Use blobstore dir modtime to avoid update lookups by HDFSBlobstore

2020-12-16 Thread Aaron Gresch (Jira)
Aaron Gresch created STORM-3724:
---

 Summary: Use blobstore dir modtime to avoid update lookups by 
HDFSBlobstore
 Key: STORM-3724
 URL: https://issues.apache.org/jira/browse/STORM-3724
 Project: Apache Storm
  Issue Type: Improvement
Reporter: Aaron Gresch
Assignee: Aaron Gresch


We have multiple Storm clusters with hundreds of supervisors polling for blob 
updates.  This causes high load on our Hadoop namenodes that are also used by 
multiple other clusters.

 

An improvement would be for the AsyncLocalizer to check the remote blobstore 
last mod time once and then skip checking each individual blob if it was 
already checked for the same mod time.
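
The idea can be sketched as a small gate that remembers the last store-wide modification time it acted on. This is an illustration under assumptions, not the AsyncLocalizer implementation: ModTimeGate, updateIfChanged, and the way the mod time and blob keys are supplied are all hypothetical.

```java
import java.util.List;
import java.util.function.Consumer;
import java.util.function.LongSupplier;

final class ModTimeGate {
    // Store-wide modification time seen on the previous pass (-1 = none yet).
    private long lastCheckedModTime = -1L;

    // Runs checkBlob for every key only when the store-wide modification
    // time differs from the one recorded on the previous pass.
    // Returns true if the per-blob checks were performed.
    boolean updateIfChanged(LongSupplier storeModTime,
                            List<String> blobKeys,
                            Consumer<String> checkBlob) {
        long current = storeModTime.getAsLong(); // one remote lookup
        if (current == lastCheckedModTime) {
            return false; // nothing changed remotely; skip per-blob lookups
        }
        for (String key : blobKeys) {
            checkBlob.accept(key);
        }
        lastCheckedModTime = current;
        return true;
    }
}
```

With this shape, a polling pass costs a single lookup whenever the blobstore directory has not changed, instead of one lookup per blob.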

 



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Resolved] (STORM-3720) BlobStoreFile getModTime() never updates after first call

2020-12-10 Thread Aaron Gresch (Jira)


 [ 
https://issues.apache.org/jira/browse/STORM-3720?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Aaron Gresch resolved STORM-3720.
-
Fix Version/s: 2.3.0
   Resolution: Fixed

> BlobStoreFile getModTime() never updates after first call
> -
>
> Key: STORM-3720
> URL: https://issues.apache.org/jira/browse/STORM-3720
> Project: Apache Storm
>  Issue Type: Bug
>Reporter: Aaron Gresch
>Assignee: Aaron Gresch
>Priority: Minor
> Fix For: 2.3.0
>
>  Time Spent: 0.5h
>  Remaining Estimate: 0h
>
> If a blobstore file gets updated after a call to getModTime(), subsequent 
> calls still return the old modification time.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Resolved] (STORM-3719) Add configuration for AsyncLocalizer updateBlobs frequency

2020-12-08 Thread Aaron Gresch (Jira)


 [ 
https://issues.apache.org/jira/browse/STORM-3719?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Aaron Gresch resolved STORM-3719.
-
Fix Version/s: 2.3.0
   Resolution: Fixed

> Add configuration for AsyncLocalizer updateBlobs frequency
> --
>
> Key: STORM-3719
> URL: https://issues.apache.org/jira/browse/STORM-3719
> Project: Apache Storm
>  Issue Type: Improvement
>Reporter: Aaron Gresch
>Assignee: Aaron Gresch
>Priority: Minor
> Fix For: 2.3.0
>
>  Time Spent: 20m
>  Remaining Estimate: 0h
>




--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Resolved] (STORM-3718) Updating the dropwizard dependency

2020-12-08 Thread Aaron Gresch (Jira)


 [ 
https://issues.apache.org/jira/browse/STORM-3718?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Aaron Gresch resolved STORM-3718.
-
Fix Version/s: 2.3.0
   Resolution: Fixed

> Updating the dropwizard dependency
> --
>
> Key: STORM-3718
> URL: https://issues.apache.org/jira/browse/STORM-3718
> Project: Apache Storm
>  Issue Type: Dependency upgrade
>Reporter: Fannyu Chien
>Priority: Major
> Fix For: 2.3.0
>
>  Time Spent: 20m
>  Remaining Estimate: 0h
>
> I currently work at JPMC and we are using Storm. When we used Aqua to scan 
> the image, it detected critical vulnerabilities regarding dropwizard 1.3.5 and 
> recommended patching it to 1.3.19. I have submitted a pull request already, 
> but I opened the issue on Jira just in case.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Created] (STORM-3720) BlobStoreFile getModTime() never updates after first call

2020-12-04 Thread Aaron Gresch (Jira)
Aaron Gresch created STORM-3720:
---

 Summary: BlobStoreFile getModTime() never updates after first call
 Key: STORM-3720
 URL: https://issues.apache.org/jira/browse/STORM-3720
 Project: Apache Storm
  Issue Type: Bug
Reporter: Aaron Gresch
Assignee: Aaron Gresch


If a blobstore file gets updated after a call to getModTime(), subsequent calls 
still return the old modification time.
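
A behavior like this typically arises when the first value read is cached. The sketch below contrasts that pattern with one that re-reads the source on every call. It is an assumption about the shape of the bug, not the BlobStoreFile code; the class names and the LongSupplier standing in for the file system are hypothetical.

```java
import java.util.function.LongSupplier;

final class ModTimeReaders {
    // Problematic pattern: the first value is cached and returned forever.
    static final class Cached {
        private final LongSupplier source;
        private Long modTime; // null until the first call

        Cached(LongSupplier source) {
            this.source = source;
        }

        long getModTime() {
            if (modTime == null) {
                modTime = source.getAsLong();
            }
            return modTime;
        }
    }

    // Corrected pattern: ask the underlying source on every call.
    static final class Fresh {
        private final LongSupplier source;

        Fresh(LongSupplier source) {
            this.source = source;
        }

        long getModTime() {
            return source.getAsLong();
        }
    }
}
```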



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Created] (STORM-3719) Add configuration for AsyncLocalizer updateBlobs frequency

2020-12-02 Thread Aaron Gresch (Jira)
Aaron Gresch created STORM-3719:
---

 Summary: Add configuration for AsyncLocalizer updateBlobs frequency
 Key: STORM-3719
 URL: https://issues.apache.org/jira/browse/STORM-3719
 Project: Apache Storm
  Issue Type: Improvement
Reporter: Aaron Gresch
Assignee: Aaron Gresch






--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Created] (STORM-3714) Add rate information for TaskMetrics

2020-11-10 Thread Aaron Gresch (Jira)
Aaron Gresch created STORM-3714:
---

 Summary: Add rate information for TaskMetrics
 Key: STORM-3714
 URL: https://issues.apache.org/jira/browse/STORM-3714
 Project: Apache Storm
  Issue Type: Improvement
Reporter: Aaron Gresch
Assignee: Aaron Gresch


While converting TaskMetrics to use V2 API, we used Counters over Meters due to 
performance implications.  We have found we would like to add rate information 
as well.

 

Ideally we would add some kind of metric that supports a count and rate without 
the full performance overhead of the Meter.
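
One possible shape for such a metric is a plain counter whose rate is derived only at report time from the change in count. This is a sketch of the idea, not a proposal for the actual Storm class: RateCounter and its methods are hypothetical, and it assumes a single reporter thread calls rateSinceLastCall.

```java
import java.util.concurrent.atomic.LongAdder;

final class RateCounter {
    private final LongAdder count = new LongAdder();
    // State touched only by the reporting thread.
    private long lastCount = 0L;
    private long lastNanos;

    RateCounter(long startNanos) {
        this.lastNanos = startNanos;
    }

    // Hot path: a single LongAdder update, no clock reads or decay math.
    void inc(long n) {
        count.add(n);
    }

    long getCount() {
        return count.sum();
    }

    // Events per second since the previous call. Intended to be called
    // from one reporter thread on each reporting interval.
    double rateSinceLastCall(long nowNanos) {
        long current = count.sum();
        long elapsed = nowNanos - lastNanos;
        double rate = elapsed <= 0 ? 0.0 : (current - lastCount) * 1e9 / elapsed;
        lastCount = current;
        lastNanos = nowNanos;
        return rate;
    }
}
```

The trade-off is that this yields only an average over the reporting interval, not the exponentially weighted moving averages a Meter provides.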



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Resolved] (STORM-3707) Add meter to track update blob failures

2020-10-29 Thread Aaron Gresch (Jira)


 [ 
https://issues.apache.org/jira/browse/STORM-3707?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Aaron Gresch resolved STORM-3707.
-
Fix Version/s: 2.3.0
   Resolution: Fixed

> Add meter to track update blob failures
> ---
>
> Key: STORM-3707
> URL: https://issues.apache.org/jira/browse/STORM-3707
> Project: Apache Storm
>  Issue Type: Improvement
>Reporter: Aaron Gresch
>Assignee: Aaron Gresch
>Priority: Minor
> Fix For: 2.3.0
>
>  Time Spent: 20m
>  Remaining Estimate: 0h
>




--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Created] (STORM-3707) Add meter to track update blob failures

2020-10-27 Thread Aaron Gresch (Jira)
Aaron Gresch created STORM-3707:
---

 Summary: Add meter to track update blob failures
 Key: STORM-3707
 URL: https://issues.apache.org/jira/browse/STORM-3707
 Project: Apache Storm
  Issue Type: Improvement
Reporter: Aaron Gresch
Assignee: Aaron Gresch






--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Resolved] (STORM-3696) ClientSupervisorUtils.processLauncherAndWait ignores InterruptedException

2020-09-09 Thread Aaron Gresch (Jira)


 [ 
https://issues.apache.org/jira/browse/STORM-3696?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Aaron Gresch resolved STORM-3696.
-
Fix Version/s: 2.3.0
   Resolution: Fixed

> ClientSupervisorUtils.processLauncherAndWait ignores InterruptedException
> -
>
> Key: STORM-3696
> URL: https://issues.apache.org/jira/browse/STORM-3696
> Project: Apache Storm
>  Issue Type: Bug
>Reporter: Aaron Gresch
>Assignee: Aaron Gresch
>Priority: Minor
> Fix For: 2.3.0
>
>  Time Spent: 50m
>  Remaining Estimate: 0h
>
> This code ignores the interrupted exception, going on and causing 
> IllegalThreadStateException.  We should not ignore the exception.
>  
> {code:java}
> 2020-09-04 17:36:02.766 o.a.s.d.s.ClientSupervisorUtils SLOT_6700 [INFO] 
> Worker Process 2e82e6e8-6a97-45eb-950f-2ca68ff793f4 interrupted.
> 2020-09-04 17:36:02.767 o.a.s.d.s.Slot SLOT_6700 [ERROR] Failed launching 
> container
> java.lang.IllegalThreadStateException: process hasn't exited
> at java.lang.UNIXProcess.exitValue(UNIXProcess.java:422) 
> ~[?:1.8.0_242]
> at 
> org.apache.storm.daemon.supervisor.ClientSupervisorUtils.processLauncherAndWait(ClientSupervisorUtils.java:82)
>  ~[storm-client-2.3.0.y.jar:2.3.0.y]
> {code}



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Created] (STORM-3697) Add metric for capacity

2020-09-09 Thread Aaron Gresch (Jira)
Aaron Gresch created STORM-3697:
---

 Summary: Add metric for capacity
 Key: STORM-3697
 URL: https://issues.apache.org/jira/browse/STORM-3697
 Project: Apache Storm
  Issue Type: Improvement
Reporter: Aaron Gresch
Assignee: Aaron Gresch


We don't report metrics for capacity except on the UI.  It would be nice for 
users to have a reported metric for this.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Created] (STORM-3696) ClientSupervisorUtils.processLauncherAndWait ignores InterruptedException

2020-09-08 Thread Aaron Gresch (Jira)
Aaron Gresch created STORM-3696:
---

 Summary: ClientSupervisorUtils.processLauncherAndWait ignores 
InterruptedException
 Key: STORM-3696
 URL: https://issues.apache.org/jira/browse/STORM-3696
 Project: Apache Storm
  Issue Type: Bug
Reporter: Aaron Gresch
Assignee: Aaron Gresch


This code ignores the interrupted exception, going on and causing 
IllegalThreadStateException.  We should not ignore the exception.

 
{code:java}
2020-09-04 17:36:02.766 o.a.s.d.s.ClientSupervisorUtils SLOT_6700 [INFO] Worker 
Process 2e82e6e8-6a97-45eb-950f-2ca68ff793f4 interrupted.


2020-09-04 17:36:02.767 o.a.s.d.s.Slot SLOT_6700 [ERROR] Failed launching 
container
java.lang.IllegalThreadStateException: process hasn't exited
at java.lang.UNIXProcess.exitValue(UNIXProcess.java:422) ~[?:1.8.0_242]
at 
org.apache.storm.daemon.supervisor.ClientSupervisorUtils.processLauncherAndWait(ClientSupervisorUtils.java:82)
 ~[storm-client-2.3.0.y.jar:2.3.0.y]
{code}
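
The failure mode in the log above can be reproduced with a few lines. The sketch below is a guess at the general pattern and does not reproduce ClientSupervisorUtils; WaitExamples and both method names are hypothetical.

```java
final class WaitExamples {
    // Problematic pattern: the interrupt is swallowed, so exitValue() is
    // called on a process that may still be running and throws
    // IllegalThreadStateException.
    static int ignoringInterrupt(Process process) {
        try {
            process.waitFor();
        } catch (InterruptedException e) {
            // ignored
        }
        return process.exitValue();
    }

    // Safer pattern: waitFor() already returns the exit code; if the
    // thread is interrupted, the exception propagates to the caller and
    // exitValue() is never reached.
    static int propagatingInterrupt(Process process) throws InterruptedException {
        return process.waitFor();
    }
}
```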



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Created] (STORM-3695) Timer rates not added to datapoints for V2 metrics tick

2020-09-03 Thread Aaron Gresch (Jira)
Aaron Gresch created STORM-3695:
---

 Summary: Timer rates not added to datapoints for V2 metrics tick
 Key: STORM-3695
 URL: https://issues.apache.org/jira/browse/STORM-3695
 Project: Apache Storm
  Issue Type: Bug
Reporter: Aaron Gresch
Assignee: Aaron Gresch






--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Created] (STORM-3694) all V2 metric reporters to report metric short names with dimensions

2020-09-02 Thread Aaron Gresch (Jira)
Aaron Gresch created STORM-3694:
---

 Summary: all V2 metric reporters to report metric short names with 
dimensions
 Key: STORM-3694
 URL: https://issues.apache.org/jira/browse/STORM-3694
 Project: Apache Storm
  Issue Type: Improvement
Reporter: Aaron Gresch
Assignee: Aaron Gresch


 

Given a metric such as:

storm.topology.mytopologyname-17-1595349167.hostname.__system.-1.6700-memory.pools.Code-Cache.max

It would be nice to instead be able to report it as:

memory.pools.Code-Cache.max with dimensions task Id of -1 and component Id of 
__system
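
As an illustration of the mapping, the long name in the example can be split into a short name plus dimensions. The layout assumed here (storm.topology.<topologyId>.<host>.<componentId>.<taskId>.<port>-<name>) is inferred from the single example above, and the sketch further assumes that the host and component contain no dots. MetricNameSplitter is a hypothetical class.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

final class MetricNameSplitter {
    // Assumed layout, inferred from one example:
    // storm.topology.<topologyId>.<host>.<componentId>.<taskId>.<port>-<name>
    private static final Pattern LONG_NAME = Pattern.compile(
        "^storm\\.topology\\.([^.]+)\\.([^.]+)\\.([^.]+)\\.(-?\\d+)\\.(\\d+)-(.+)$");

    private static Matcher match(String longName) {
        Matcher m = LONG_NAME.matcher(longName);
        if (!m.matches()) {
            throw new IllegalArgumentException("Unrecognized metric name: " + longName);
        }
        return m;
    }

    // The trailing part of the long name, e.g. memory.pools.Code-Cache.max.
    static String shortName(String longName) {
        return match(longName).group(6);
    }

    // Everything before the short name, exposed as named dimensions.
    static Map<String, String> dimensions(String longName) {
        Matcher m = match(longName);
        Map<String, String> dims = new LinkedHashMap<>();
        dims.put("topologyId", m.group(1));
        dims.put("hostname", m.group(2));
        dims.put("componentId", m.group(3));
        dims.put("taskId", m.group(4));
        dims.put("port", m.group(5));
        return dims;
    }
}
```

A reporter would presumably build the dimensions directly from the values it already has rather than parse them back out of a string; the parsing here only makes the correspondence explicit.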

 

 



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Created] (STORM-3682) Upgrade netty client metrics to use V2 API

2020-08-03 Thread Aaron Gresch (Jira)
Aaron Gresch created STORM-3682:
---

 Summary: Upgrade netty client metrics to use V2 API
 Key: STORM-3682
 URL: https://issues.apache.org/jira/browse/STORM-3682
 Project: Apache Storm
  Issue Type: Improvement
Reporter: Aaron Gresch
Assignee: Aaron Gresch






--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Resolved] (STORM-3676) Reduce debug spew to scheduler log

2020-07-20 Thread Aaron Gresch (Jira)


 [ 
https://issues.apache.org/jira/browse/STORM-3676?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Aaron Gresch resolved STORM-3676.
-
Fix Version/s: 2.3.0
   Resolution: Fixed

> Reduce debug spew to scheduler log
> --
>
> Key: STORM-3676
> URL: https://issues.apache.org/jira/browse/STORM-3676
> Project: Apache Storm
>  Issue Type: Improvement
>Reporter: Aaron Gresch
>Assignee: Aaron Gresch
>Priority: Minor
> Fix For: 2.3.0
>
>  Time Spent: 20m
>  Remaining Estimate: 0h
>
> This line accounted for 30% of the lines in our scheduler log.  Seems 
> unnecessary.  We already log if there's an error parsing.
> 2020-07-11 21:21:46.730 o.a.s.s.r.n.NormalizedResourceRequest timer [DEBUG] 
> Input to parseResources 
> {"topology.tasks":1,"topology.tick.tuple.freq.secs":5}



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Created] (STORM-3676) Reduce debug spew to scheduler log

2020-07-16 Thread Aaron Gresch (Jira)
Aaron Gresch created STORM-3676:
---

 Summary: Reduce debug spew to scheduler log
 Key: STORM-3676
 URL: https://issues.apache.org/jira/browse/STORM-3676
 Project: Apache Storm
  Issue Type: Improvement
Reporter: Aaron Gresch
Assignee: Aaron Gresch


This line accounted for 30% of the lines in our scheduler log.  Seems 
unnecessary.  We already log if there's an error parsing.

2020-07-11 21:21:46.730 o.a.s.s.r.n.NormalizedResourceRequest timer [DEBUG] 
Input to parseResources {"topology.tasks":1,"topology.tick.tuple.freq.secs":5}



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Created] (STORM-3673) update BuiltinMetrics to use v2 Metrics API

2020-07-14 Thread Aaron Gresch (Jira)
Aaron Gresch created STORM-3673:
---

 Summary: update BuiltinMetrics to use v2 Metrics API
 Key: STORM-3673
 URL: https://issues.apache.org/jira/browse/STORM-3673
 Project: Apache Storm
  Issue Type: Improvement
Reporter: Aaron Gresch
Assignee: Aaron Gresch






--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Resolved] (STORM-3656) Change handling of Hadoop TGT renewal exception

2020-06-25 Thread Aaron Gresch (Jira)


 [ 
https://issues.apache.org/jira/browse/STORM-3656?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Aaron Gresch resolved STORM-3656.
-
Fix Version/s: 2.3.0
   Resolution: Fixed

> Change handling of Hadoop TGT renewal exception
> ---
>
> Key: STORM-3656
> URL: https://issues.apache.org/jira/browse/STORM-3656
> Project: Apache Storm
>  Issue Type: Improvement
>Affects Versions: 2.2.0
>Reporter: Aaron Gresch
>Assignee: Aaron Gresch
>Priority: Major
> Fix For: 2.3.0
>
>  Time Spent: 1h
>  Remaining Estimate: 0h
>
> STORM-3606 identified an issue where Hadoop's TGT auto renewal thread causes 
> an exception and worker restart.  The fix involved a lot of reflection calls 
> to emulate the Hadoop code while avoiding launching this thread.
>  
> It's possible Hadoop could change their code, causing this reflection to 
> fail.   To handle this case, we could instead allow Hadoop to launch this 
> autorenewal thread, and have the worker catch the specific NPE from Hadoop in 
> the exception handler.  



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Resolved] (STORM-3653) Document treatment of common nodes in favored/unfavored nodes in scheduling

2020-06-24 Thread Aaron Gresch (Jira)


 [ 
https://issues.apache.org/jira/browse/STORM-3653?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Aaron Gresch resolved STORM-3653.
-
Fix Version/s: 2.3.0
   Resolution: Fixed

> Document treatment of common nodes in favored/unfavored nodes in scheduling
> ---
>
> Key: STORM-3653
> URL: https://issues.apache.org/jira/browse/STORM-3653
> Project: Apache Storm
>  Issue Type: Improvement
>  Components: storm-server
>Affects Versions: 2.1.0
>Reporter: Bipin Prasad
>Assignee: Bipin Prasad
>Priority: Minor
> Fix For: 2.3.0
>
>  Time Spent: 20m
>  Remaining Estimate: 0h
>
> If the same node is specified in favored as well as unfavored nodes, the 
> current undocumented behavior is to remove the node from the unfavored list. 
> Update javadoc to reflect this behavior.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Updated] (STORM-3656) Change handling of Hadoop TGT renewal exception

2020-06-24 Thread Aaron Gresch (Jira)


 [ 
https://issues.apache.org/jira/browse/STORM-3656?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Aaron Gresch updated STORM-3656:

Affects Version/s: 2.2.0

> Change handling of Hadoop TGT renewal exception
> ---
>
> Key: STORM-3656
> URL: https://issues.apache.org/jira/browse/STORM-3656
> Project: Apache Storm
>  Issue Type: Improvement
>Affects Versions: 2.2.0
>Reporter: Aaron Gresch
>Assignee: Aaron Gresch
>Priority: Major
>
> STORM-3606 identified an issue where Hadoop's TGT auto renewal thread causes 
> an exception and worker restart.  The fix involved a lot of reflection calls 
> to emulate the Hadoop code while avoiding launching this thread.
>  
> It's possible Hadoop could change their code, causing this reflection to 
> fail.   To handle this case, we could instead allow Hadoop to launch this 
> autorenewal thread, and have the worker catch the specific NPE from Hadoop in 
> the exception handler.  



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Created] (STORM-3656) Change handling of Hadoop TGT renewal exception

2020-06-24 Thread Aaron Gresch (Jira)
Aaron Gresch created STORM-3656:
---

 Summary: Change handling of Hadoop TGT renewal exception
 Key: STORM-3656
 URL: https://issues.apache.org/jira/browse/STORM-3656
 Project: Apache Storm
  Issue Type: Improvement
Reporter: Aaron Gresch
Assignee: Aaron Gresch


STORM-3606 identified an issue where Hadoop's TGT auto renewal thread causes an 
exception and worker restart.  The fix involved a lot of reflection calls to 
emulate the Hadoop code while avoiding launching this thread.

 

It's possible Hadoop could change their code, causing this reflection to fail.  
 To handle this case, we could instead allow Hadoop to launch this autorenewal 
thread, and have the worker catch the specific NPE from Hadoop in the exception 
handler.  



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Created] (STORM-3648) Add meter to track worker heartbeat rate

2020-06-05 Thread Aaron Gresch (Jira)
Aaron Gresch created STORM-3648:
---

 Summary: Add meter to track worker heartbeat rate
 Key: STORM-3648
 URL: https://issues.apache.org/jira/browse/STORM-3648
 Project: Apache Storm
  Issue Type: Improvement
Reporter: Aaron Gresch
Assignee: Aaron Gresch


Users could track and alert if heartbeat rate starts dropping due to GC, etc.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Created] (STORM-3645) worker launcher should consistently use ERRORFILE for error messages

2020-06-04 Thread Aaron Gresch (Jira)
Aaron Gresch created STORM-3645:
---

 Summary: worker launcher should consistently use ERRORFILE for 
error messages
 Key: STORM-3645
 URL: https://issues.apache.org/jira/browse/STORM-3645
 Project: Apache Storm
  Issue Type: Improvement
Reporter: Aaron Gresch






--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Resolved] (STORM-3641) switch JCQueue metrics to new metrics API

2020-06-03 Thread Aaron Gresch (Jira)


 [ 
https://issues.apache.org/jira/browse/STORM-3641?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Aaron Gresch resolved STORM-3641.
-
Fix Version/s: 2.3.0
   Resolution: Fixed

> switch JCQueue metrics to new metrics API
> -
>
> Key: STORM-3641
> URL: https://issues.apache.org/jira/browse/STORM-3641
> Project: Apache Storm
>  Issue Type: Improvement
>Reporter: Aaron Gresch
>Assignee: Aaron Gresch
>Priority: Major
> Fix For: 2.3.0
>
>  Time Spent: 1h
>  Remaining Estimate: 0h
>




--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Resolved] (STORM-3642) update AutoTGT metric to new API

2020-06-03 Thread Aaron Gresch (Jira)


 [ 
https://issues.apache.org/jira/browse/STORM-3642?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Aaron Gresch resolved STORM-3642.
-
Fix Version/s: 2.3.0
   Resolution: Fixed

> update AutoTGT metric to new API
> 
>
> Key: STORM-3642
> URL: https://issues.apache.org/jira/browse/STORM-3642
> Project: Apache Storm
>  Issue Type: Improvement
>Reporter: Aaron Gresch
>Assignee: Aaron Gresch
>Priority: Minor
> Fix For: 2.3.0
>
>  Time Spent: 20m
>  Remaining Estimate: 0h
>




--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Created] (STORM-3644) improve PacemakerClient error messaging

2020-06-02 Thread Aaron Gresch (Jira)
Aaron Gresch created STORM-3644:
---

 Summary: improve PacemakerClient error messaging
 Key: STORM-3644
 URL: https://issues.apache.org/jira/browse/STORM-3644
 Project: Apache Storm
  Issue Type: Improvement
Reporter: Aaron Gresch
Assignee: Aaron Gresch


LOG.error("error attempting to write to a channel {}.", e.getMessage());

 

This line could add the hostname and properly log the message.

 



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Created] (STORM-3643) Bring queue metrics documentation up to date

2020-06-02 Thread Aaron Gresch (Jira)
Aaron Gresch created STORM-3643:
---

 Summary: Bring queue metrics documentation up to date
 Key: STORM-3643
 URL: https://issues.apache.org/jira/browse/STORM-3643
 Project: Apache Storm
  Issue Type: Improvement
Affects Versions: 2.2.0
Reporter: Aaron Gresch


[https://github.com/apache/storm/blob/master/docs/Metrics.md#queue-metrics] 
should be updated 



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Created] (STORM-3642) update AutoTGT metric to new API

2020-06-01 Thread Aaron Gresch (Jira)
Aaron Gresch created STORM-3642:
---

 Summary: update AutoTGT metric to new API
 Key: STORM-3642
 URL: https://issues.apache.org/jira/browse/STORM-3642
 Project: Apache Storm
  Issue Type: Improvement
Reporter: Aaron Gresch
Assignee: Aaron Gresch






--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Resolved] (STORM-3640) timed out health check processes should be killed

2020-06-01 Thread Aaron Gresch (Jira)


 [ 
https://issues.apache.org/jira/browse/STORM-3640?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Aaron Gresch resolved STORM-3640.
-
Resolution: Fixed

> timed out health check processes should be killed
> -
>
> Key: STORM-3640
> URL: https://issues.apache.org/jira/browse/STORM-3640
> Project: Apache Storm
>  Issue Type: Improvement
>Reporter: Aaron Gresch
>Assignee: Aaron Gresch
>Priority: Major
>  Time Spent: 20m
>  Remaining Estimate: 0h
>
> We noticed some hung health check scripts that had timed out but were still 
> eating CPU.  We should make sure they are killed on timeout.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Created] (STORM-3641) switch JCQueue metrics to new metrics API

2020-06-01 Thread Aaron Gresch (Jira)
Aaron Gresch created STORM-3641:
---

 Summary: switch JCQueue metrics to new metrics API
 Key: STORM-3641
 URL: https://issues.apache.org/jira/browse/STORM-3641
 Project: Apache Storm
  Issue Type: Improvement
Reporter: Aaron Gresch
Assignee: Aaron Gresch






--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Created] (STORM-3640) timed out health check processes should be killed

2020-05-28 Thread Aaron Gresch (Jira)
Aaron Gresch created STORM-3640:
---

 Summary: timed out health check processes should be killed
 Key: STORM-3640
 URL: https://issues.apache.org/jira/browse/STORM-3640
 Project: Apache Storm
  Issue Type: Improvement
Reporter: Aaron Gresch
Assignee: Aaron Gresch


We noticed some hung health check scripts that had timed out but were still 
eating CPU.  We should make sure they are killed on timeout.
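
A minimal sketch of the kill-on-timeout behavior, using only the standard java.lang.Process API. HealthCheckTimeout and waitOrKill are hypothetical names, and how Storm actually launches and tracks the health check scripts is not shown here.

```java
import java.util.concurrent.TimeUnit;

final class HealthCheckTimeout {
    // Waits up to timeoutMs for the process to exit. If it is still
    // running after that, it is forcibly destroyed so it cannot linger
    // and consume CPU. Returns true if the process finished in time.
    static boolean waitOrKill(Process process, long timeoutMs) throws InterruptedException {
        if (process.waitFor(timeoutMs, TimeUnit.MILLISECONDS)) {
            return true;
        }
        process.destroyForcibly();
        return false;
    }
}
```

Note that destroyForcibly() targets the launched process itself; a script that spawns its own children may need additional cleanup.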



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Commented] (STORM-3634) validate numa ports are contained in supervisor.slots.ports

2020-05-14 Thread Aaron Gresch (Jira)


[ 
https://issues.apache.org/jira/browse/STORM-3634?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17107531#comment-17107531
 ] 

Aaron Gresch commented on STORM-3634:
-

Decision was made to instead use the superset of numa ports and slot ports.

> validate numa ports are contained in supervisor.slots.ports
> ---
>
> Key: STORM-3634
> URL: https://issues.apache.org/jira/browse/STORM-3634
> Project: Apache Storm
>  Issue Type: Improvement
>Reporter: Aaron Gresch
>Assignee: Aaron Gresch
>Priority: Major
> Fix For: 2.2.0
>
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> It's currently possible to have a numa port configured that is not in 
> supervisor.slots.ports.  When a supervisor restarts, any worker running on 
> numa ports not in supervisor.slots.ports will be killed.  We should consider 
> this an invalid configuration.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Resolved] (STORM-3634) validate numa ports are contained in supervisor.slots.ports

2020-05-14 Thread Aaron Gresch (Jira)


 [ 
https://issues.apache.org/jira/browse/STORM-3634?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Aaron Gresch resolved STORM-3634.
-
Fix Version/s: 2.2.0
   Resolution: Fixed

> validate numa ports are contained in supervisor.slots.ports
> ---
>
> Key: STORM-3634
> URL: https://issues.apache.org/jira/browse/STORM-3634
> Project: Apache Storm
>  Issue Type: Improvement
>Reporter: Aaron Gresch
>Assignee: Aaron Gresch
>Priority: Major
> Fix For: 2.2.0
>
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> It's currently possible to have a numa port configured that is not in 
> supervisor.slots.ports.  When a supervisor restarts, any worker running on 
> numa ports not in supervisor.slots.ports will be killed.  We should consider 
> this an invalid configuration.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Resolved] (STORM-3633) Add message that supervisor is killing detached workers

2020-05-13 Thread Aaron Gresch (Jira)


 [ 
https://issues.apache.org/jira/browse/STORM-3633?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Aaron Gresch resolved STORM-3633.
-
Fix Version/s: 2.2.0
   Resolution: Fixed

> Add message that supervisor is killing detached workers
> ---
>
> Key: STORM-3633
> URL: https://issues.apache.org/jira/browse/STORM-3633
> Project: Apache Storm
>  Issue Type: Improvement
>Reporter: Aaron Gresch
>Assignee: Aaron Gresch
>Priority: Minor
> Fix For: 2.2.0
>
>
> [https://github.com/apache/storm/blob/master/storm-server/src/main/java/org/apache/storm/daemon/supervisor/ReadClusterState.java#L116]
> From this code we will see messages that workers are killed, but not the 
> reason why.  We should add a message.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Created] (STORM-3635) update LocalityAwareness documentation due to STORM-3602

2020-05-12 Thread Aaron Gresch (Jira)
Aaron Gresch created STORM-3635:
---

 Summary: update LocalityAwareness documentation due to STORM-3602
 Key: STORM-3635
 URL: https://issues.apache.org/jira/browse/STORM-3635
 Project: Apache Storm
  Issue Type: Improvement
Reporter: Aaron Gresch
Assignee: Aaron Gresch


The behavior for switching to the lower bound changed in STORM-3602, so the 
documentation should change as well.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Created] (STORM-3634) validate numa ports are contained in supervisor.slots.ports

2020-05-12 Thread Aaron Gresch (Jira)
Aaron Gresch created STORM-3634:
---

 Summary: validate numa ports are contained in 
supervisor.slots.ports
 Key: STORM-3634
 URL: https://issues.apache.org/jira/browse/STORM-3634
 Project: Apache Storm
  Issue Type: Improvement
Reporter: Aaron Gresch
Assignee: Aaron Gresch


It's currently possible to have a numa port configured that is not in 
supervisor.slots.ports.  When a supervisor restarts, any worker running on numa 
ports not in supervisor.slots.ports will be killed.  We should consider this an 
invalid configuration.
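
The check described above amounts to a set difference, and the alternative of honoring every configured port amounts to a set union. Both are sketched below with hypothetical names (NumaPortCheck and its methods); how the numa ports are actually read from the supervisor configuration is not shown.

```java
import java.util.Collection;
import java.util.Set;
import java.util.TreeSet;

final class NumaPortCheck {
    // Numa ports that do not appear in supervisor.slots.ports. A non-empty
    // result corresponds to the misconfiguration described in this issue.
    static Set<Integer> numaPortsMissingFromSlots(Collection<Integer> slotPorts,
                                                  Collection<Integer> numaPorts) {
        Set<Integer> missing = new TreeSet<>(numaPorts);
        missing.removeAll(slotPorts);
        return missing;
    }

    // Alternative to rejecting the configuration: treat the union of both
    // port lists as the set of ports the supervisor manages.
    static Set<Integer> allPorts(Collection<Integer> slotPorts,
                                 Collection<Integer> numaPorts) {
        Set<Integer> all = new TreeSet<>(slotPorts);
        all.addAll(numaPorts);
        return all;
    }
}
```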



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Created] (STORM-3633) Add message that supervisor is killing detached workers

2020-05-12 Thread Aaron Gresch (Jira)
Aaron Gresch created STORM-3633:
---

 Summary: Add message that supervisor is killing detached workers
 Key: STORM-3633
 URL: https://issues.apache.org/jira/browse/STORM-3633
 Project: Apache Storm
  Issue Type: Improvement
Reporter: Aaron Gresch
Assignee: Aaron Gresch


[https://github.com/apache/storm/blob/master/storm-server/src/main/java/org/apache/storm/daemon/supervisor/ReadClusterState.java#L116]


From this code we will see messages that workers are killed, but not the 
reason why.  We should add a message.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Resolved] (STORM-3632) Reduce SimpleSaslServerCallbackHandler supervisor logging

2020-05-08 Thread Aaron Gresch (Jira)


 [ 
https://issues.apache.org/jira/browse/STORM-3632?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Aaron Gresch resolved STORM-3632.
-
Fix Version/s: 2.2.0
   Resolution: Fixed

> Reduce SimpleSaslServerCallbackHandler supervisor logging
> -
>
> Key: STORM-3632
> URL: https://issues.apache.org/jira/browse/STORM-3632
> Project: Apache Storm
>  Issue Type: Improvement
>Reporter: Aaron Gresch
>Assignee: Aaron Gresch
>Priority: Minor
> Fix For: 2.2.0
>
>  Time Spent: 20m
>  Remaining Estimate: 0h
>
> This message floods our logs and seems to provide little use:
> LOG.info("Successfully authenticated client: authenticationID = {} 
> authorizationID = {}",
>  nid, zid);



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

