[GitHub] [samza] ajothomas merged pull request #1649: [Docs] Samza 1.8.0 release website updates

2023-01-18 Thread GitBox


ajothomas merged PR #1649:
URL: https://github.com/apache/samza/pull/1649


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscr...@samza.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



[GitHub] [samza] ajothomas opened a new pull request, #1649: [Docs] Samza 1.8.0 release website updates

2023-01-17 Thread GitBox


ajothomas opened a new pull request, #1649:
URL: https://github.com/apache/samza/pull/1649

   This PR has all website/doc related changes w.r.t to samza 1.8.0 release. 
Since we did not add anything for 1.7.0 release, this PR adds a release.md and 
download links for 1.7.0 release.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscr...@samza.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



[GitHub] [samza] li-afaris commented on pull request #1647: [SAMZA-2772] Manage Jekyll dependencies with Gradle

2023-01-14 Thread GitBox


li-afaris commented on PR #1647:
URL: https://github.com/apache/samza/pull/1647#issuecomment-1382703530

   Good catch.  I updated publish-site.sh and tested the best that I could as I 
don't have SVN write access.  Let me know if it doesn't work correctly.  Thanks
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscr...@samza.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



[GitHub] [samza] li-afaris commented on pull request #1204: Bump jekyll from 3.4.5 to 3.6.3 in /docs

2023-01-12 Thread GitBox


li-afaris commented on PR #1204:
URL: https://github.com/apache/samza/pull/1204#issuecomment-1381196091

   This can probably be closed "won't-fix" as Jekyll 4.3.1 is in the dependency 
tree from pull request #1647.  The following is from `./gradlew 
docs:dependencies`
   
   ```
   gems
   +--- rubygems:jekyll-sass-converter:2.2.0
   |\--- rubygems:sassc:]2.0.1,3.0[ -> 2.4.0
   | \--- rubygems:ffi:[1.9.0,2.0[ -> 1.15.5
   +--- rubygems:jekyll:4.3.1
   |+--- rubygems:addressable:[2.4.0,3.0[ -> 2.8.1
   ||\--- rubygems:public_suffix:[2.0.2,6.0[ -> 5.0.1
   |+--- rubygems:colorator:[1.0.0,2.0[ -> 1.1.0
   |+--- rubygems:em-websocket:[0.5.0,1.0[ -> 0.5.3
   ||+--- rubygems:eventmachine:[0.12.9,) -> 1.2.7
   ||\--- rubygems:http_parser.rb:[0,) -> 0.6.0
   |+--- rubygems:i18n:[1.0.0,2.0[ -> 1.12.0
   ||\--- rubygems:concurrent-ruby:[1.0.0,2.0[ -> 1.1.10
   |+--- rubygems:jekyll-sass-converter:[2.0,4.0[ -> 2.2.0 (*)
   |+--- rubygems:jekyll-watch:[2.0.0,3.0[ -> 2.2.1
   ||\--- rubygems:listen:[3.0.0,4.0[ -> 3.8.0
   || +--- rubygems:rb-fsevent:[0.10.3,1.0[ -> 0.11.2
   || \--- rubygems:rb-inotify:[0.9.10,1.0[ -> 0.10.1
   ||  \--- rubygems:ffi:[1.0.0,2.0[ -> 1.15.5
   |+--- rubygems:kramdown:[2.3.1,3.0[ -> 2.4.0
   ||\--- rubygems:rexml:[0,) -> 3.2.5
   |+--- rubygems:kramdown-parser-gfm:[1.0.0,2.0[ -> 1.1.0
   ||\--- rubygems:kramdown:[2.0.0,3.0[ -> 2.4.0 (*)
   |+--- rubygems:liquid:[4.0.0,5.0[ -> 4.0.4
   |+--- rubygems:mercenary:[0.3.6,0.5[ -> 0.4.0
   |+--- rubygems:pathutil:[0.9.0,1.0[ -> 0.16.2
   ||\--- rubygems:forwardable-extended:[2.6.0,3.0[ -> 2.6.0
   |+--- rubygems:rouge:[3.0,5.0[ -> 3.30.0
   |+--- rubygems:safe_yaml:[1.0.0,2.0[ -> 1.0.5
   |+--- rubygems:terminal-table:[1.8,4.0[ -> 3.0.2
   ||\--- rubygems:unicode-display_width:[1.1.1,3[ -> 2.4.2
   |\--- rubygems:webrick:[1.7.0,2.0[ -> 1.7.0
   +--- rubygems:http_parser.rb:0.6.0
   +--- rubygems:webrick:1.7.0
   +--- rubygems:json:2.6.3
   +--- rubygems:rouge:3.30.0
   +--- rubygems:kramdown:2.4.0 (*)
   \--- rubygems:kramdown-parser-gfm:1.1.0 (*)
   ```
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscr...@samza.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



[GitHub] [samza] li-afaris commented on pull request #1621: Bump ffi from 1.9.23 to 1.15.5 in /docs

2023-01-12 Thread GitBox


li-afaris commented on PR #1621:
URL: https://github.com/apache/samza/pull/1621#issuecomment-1381195077

   This can probably be closed "won't-fix" as ffi was upgraded to 1.15.5 from 
pull request #1647.  The following is from `./gradlew docs:dependencies`
   
   ```
   gems
   +--- rubygems:jekyll-sass-converter:2.2.0
   |\--- rubygems:sassc:]2.0.1,3.0[ -> 2.4.0
   | \--- rubygems:ffi:[1.9.0,2.0[ -> 1.15.5
   +--- rubygems:jekyll:4.3.1
   |+--- rubygems:addressable:[2.4.0,3.0[ -> 2.8.1
   ||\--- rubygems:public_suffix:[2.0.2,6.0[ -> 5.0.1
   |+--- rubygems:colorator:[1.0.0,2.0[ -> 1.1.0
   |+--- rubygems:em-websocket:[0.5.0,1.0[ -> 0.5.3
   ||+--- rubygems:eventmachine:[0.12.9,) -> 1.2.7
   ||\--- rubygems:http_parser.rb:[0,) -> 0.6.0
   |+--- rubygems:i18n:[1.0.0,2.0[ -> 1.12.0
   ||\--- rubygems:concurrent-ruby:[1.0.0,2.0[ -> 1.1.10
   |+--- rubygems:jekyll-sass-converter:[2.0,4.0[ -> 2.2.0 (*)
   |+--- rubygems:jekyll-watch:[2.0.0,3.0[ -> 2.2.1
   ||\--- rubygems:listen:[3.0.0,4.0[ -> 3.8.0
   || +--- rubygems:rb-fsevent:[0.10.3,1.0[ -> 0.11.2
   || \--- rubygems:rb-inotify:[0.9.10,1.0[ -> 0.10.1
   ||  \--- rubygems:ffi:[1.0.0,2.0[ -> 1.15.5
   |+--- rubygems:kramdown:[2.3.1,3.0[ -> 2.4.0
   ||\--- rubygems:rexml:[0,) -> 3.2.5
   |+--- rubygems:kramdown-parser-gfm:[1.0.0,2.0[ -> 1.1.0
   ||\--- rubygems:kramdown:[2.0.0,3.0[ -> 2.4.0 (*)
   |+--- rubygems:liquid:[4.0.0,5.0[ -> 4.0.4
   |+--- rubygems:mercenary:[0.3.6,0.5[ -> 0.4.0
   |+--- rubygems:pathutil:[0.9.0,1.0[ -> 0.16.2
   ||\--- rubygems:forwardable-extended:[2.6.0,3.0[ -> 2.6.0
   |+--- rubygems:rouge:[3.0,5.0[ -> 3.30.0
   |+--- rubygems:safe_yaml:[1.0.0,2.0[ -> 1.0.5
   |+--- rubygems:terminal-table:[1.8,4.0[ -> 3.0.2
   ||\--- rubygems:unicode-display_width:[1.1.1,3[ -> 2.4.2
   |\--- rubygems:webrick:[1.7.0,2.0[ -> 1.7.0
   +--- rubygems:http_parser.rb:0.6.0
   +--- rubygems:webrick:1.7.0
   +--- rubygems:json:2.6.3
   +--- rubygems:rouge:3.30.0
   +--- rubygems:kramdown:2.4.0 (*)
   \--- rubygems:kramdown-parser-gfm:1.1.0 (*)
   ```
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscr...@samza.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



[GitHub] [samza] li-afaris commented on pull request #1623: Bump json from 2.1.0 to 2.3.0 in /docs

2023-01-12 Thread GitBox


li-afaris commented on PR #1623:
URL: https://github.com/apache/samza/pull/1623#issuecomment-1381194149

   This can probably be closed "won't-fix" as JSON 2.6.3 is in the dependency 
tree from pull request #1647.  The following is from `./gradlew 
docs:dependencies`
   
   ```
   gems
   +--- rubygems:jekyll-sass-converter:2.2.0
   |\--- rubygems:sassc:]2.0.1,3.0[ -> 2.4.0
   | \--- rubygems:ffi:[1.9.0,2.0[ -> 1.15.5
   +--- rubygems:jekyll:4.3.1
   |+--- rubygems:addressable:[2.4.0,3.0[ -> 2.8.1
   ||\--- rubygems:public_suffix:[2.0.2,6.0[ -> 5.0.1
   |+--- rubygems:colorator:[1.0.0,2.0[ -> 1.1.0
   |+--- rubygems:em-websocket:[0.5.0,1.0[ -> 0.5.3
   ||+--- rubygems:eventmachine:[0.12.9,) -> 1.2.7
   ||\--- rubygems:http_parser.rb:[0,) -> 0.6.0
   |+--- rubygems:i18n:[1.0.0,2.0[ -> 1.12.0
   ||\--- rubygems:concurrent-ruby:[1.0.0,2.0[ -> 1.1.10
   |+--- rubygems:jekyll-sass-converter:[2.0,4.0[ -> 2.2.0 (*)
   |+--- rubygems:jekyll-watch:[2.0.0,3.0[ -> 2.2.1
   ||\--- rubygems:listen:[3.0.0,4.0[ -> 3.8.0
   || +--- rubygems:rb-fsevent:[0.10.3,1.0[ -> 0.11.2
   || \--- rubygems:rb-inotify:[0.9.10,1.0[ -> 0.10.1
   ||  \--- rubygems:ffi:[1.0.0,2.0[ -> 1.15.5
   |+--- rubygems:kramdown:[2.3.1,3.0[ -> 2.4.0
   ||\--- rubygems:rexml:[0,) -> 3.2.5
   |+--- rubygems:kramdown-parser-gfm:[1.0.0,2.0[ -> 1.1.0
   ||\--- rubygems:kramdown:[2.0.0,3.0[ -> 2.4.0 (*)
   |+--- rubygems:liquid:[4.0.0,5.0[ -> 4.0.4
   |+--- rubygems:mercenary:[0.3.6,0.5[ -> 0.4.0
   |+--- rubygems:pathutil:[0.9.0,1.0[ -> 0.16.2
   ||\--- rubygems:forwardable-extended:[2.6.0,3.0[ -> 2.6.0
   |+--- rubygems:rouge:[3.0,5.0[ -> 3.30.0
   |+--- rubygems:safe_yaml:[1.0.0,2.0[ -> 1.0.5
   |+--- rubygems:terminal-table:[1.8,4.0[ -> 3.0.2
   ||\--- rubygems:unicode-display_width:[1.1.1,3[ -> 2.4.2
   |\--- rubygems:webrick:[1.7.0,2.0[ -> 1.7.0
   +--- rubygems:http_parser.rb:0.6.0
   +--- rubygems:webrick:1.7.0
   +--- rubygems:json:2.6.3
   +--- rubygems:rouge:3.30.0
   +--- rubygems:kramdown:2.4.0 (*)
   \--- rubygems:kramdown-parser-gfm:1.1.0 (*)
   ```
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscr...@samza.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



[GitHub] [samza] li-afaris commented on pull request #1624: Bump redcarpet from 3.4.0 to 3.5.1 in /docs

2023-01-12 Thread GitBox


li-afaris commented on PR #1624:
URL: https://github.com/apache/samza/pull/1624#issuecomment-1381193456

   This can probably be closed "won't-fix" as redcarpet was replaced with 
Kramdown from pull request #1647 and is no longer used.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscr...@samza.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



[GitHub] [samza] li-afaris commented on pull request #1632: Bump addressable from 2.5.2 to 2.8.1 in /docs

2023-01-12 Thread GitBox


li-afaris commented on PR #1632:
URL: https://github.com/apache/samza/pull/1632#issuecomment-1381192497

   This can probably be closed "won't-fix" as addressable 2.8.1 is in the 
dependency tree from pull request #1647.  The following is from `./gradlew 
docs:dependencies`
   
   ```
   gems
   +--- rubygems:jekyll-sass-converter:2.2.0
   |\--- rubygems:sassc:]2.0.1,3.0[ -> 2.4.0
   | \--- rubygems:ffi:[1.9.0,2.0[ -> 1.15.5
   +--- rubygems:jekyll:4.3.1
   |+--- rubygems:addressable:[2.4.0,3.0[ -> 2.8.1
   ||\--- rubygems:public_suffix:[2.0.2,6.0[ -> 5.0.1
   |+--- rubygems:colorator:[1.0.0,2.0[ -> 1.1.0
   |+--- rubygems:em-websocket:[0.5.0,1.0[ -> 0.5.3
   ||+--- rubygems:eventmachine:[0.12.9,) -> 1.2.7
   ||\--- rubygems:http_parser.rb:[0,) -> 0.6.0
   |+--- rubygems:i18n:[1.0.0,2.0[ -> 1.12.0
   ||\--- rubygems:concurrent-ruby:[1.0.0,2.0[ -> 1.1.10
   |+--- rubygems:jekyll-sass-converter:[2.0,4.0[ -> 2.2.0 (*)
   |+--- rubygems:jekyll-watch:[2.0.0,3.0[ -> 2.2.1
   ||\--- rubygems:listen:[3.0.0,4.0[ -> 3.8.0
   || +--- rubygems:rb-fsevent:[0.10.3,1.0[ -> 0.11.2
   || \--- rubygems:rb-inotify:[0.9.10,1.0[ -> 0.10.1
   ||  \--- rubygems:ffi:[1.0.0,2.0[ -> 1.15.5
   |+--- rubygems:kramdown:[2.3.1,3.0[ -> 2.4.0
   ||\--- rubygems:rexml:[0,) -> 3.2.5
   |+--- rubygems:kramdown-parser-gfm:[1.0.0,2.0[ -> 1.1.0
   ||\--- rubygems:kramdown:[2.0.0,3.0[ -> 2.4.0 (*)
   |+--- rubygems:liquid:[4.0.0,5.0[ -> 4.0.4
   |+--- rubygems:mercenary:[0.3.6,0.5[ -> 0.4.0
   |+--- rubygems:pathutil:[0.9.0,1.0[ -> 0.16.2
   ||\--- rubygems:forwardable-extended:[2.6.0,3.0[ -> 2.6.0
   |+--- rubygems:rouge:[3.0,5.0[ -> 3.30.0
   |+--- rubygems:safe_yaml:[1.0.0,2.0[ -> 1.0.5
   |+--- rubygems:terminal-table:[1.8,4.0[ -> 3.0.2
   ||\--- rubygems:unicode-display_width:[1.1.1,3[ -> 2.4.2
   |\--- rubygems:webrick:[1.7.0,2.0[ -> 1.7.0
   +--- rubygems:http_parser.rb:0.6.0
   +--- rubygems:webrick:1.7.0
   +--- rubygems:json:2.6.3
   +--- rubygems:rouge:3.30.0
   +--- rubygems:kramdown:2.4.0 (*)
   \--- rubygems:kramdown-parser-gfm:1.1.0 (*)
   ```
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscr...@samza.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



[GitHub] [samza] li-afaris commented on a diff in pull request #1647: [SAMZA-2772] Manage Jekyll dependencies with Gradle

2023-01-12 Thread GitBox


li-afaris commented on code in PR #1647:
URL: https://github.com/apache/samza/pull/1647#discussion_r1068741225


##
docs/meetups/index.html:
##
@@ -21,7 +21,7 @@
 -->
 
 
-{% assign sorted = (site.meetups | sort: 'date') | reverse %}
+{% assign sorted = site.meetups | sort: 'date' | reverse %}

Review Comment:
   Yes this is due to the change from redcarpet to kramdown



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscr...@samza.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



[GitHub] [samza] li-afaris commented on a diff in pull request #1647: [SAMZA-2772] Manage Jekyll dependencies with Gradle

2023-01-12 Thread GitBox


li-afaris commented on code in PR #1647:
URL: https://github.com/apache/samza/pull/1647#discussion_r1068741065


##
docs/talks/index.html:
##
@@ -21,7 +21,7 @@
 -->
 
 
-{% assign sorted = (site.talks | sort: 'date') | reverse %}
+{% assign sorted = site.talks | sort: 'date' | reverse %}

Review Comment:
   Yes this is due to the change from redcarpet to kramdown



##
docs/_blog/index.md:
##
@@ -22,7 +22,7 @@ exclude_from_loop: true
 
 
 
-  {% assign sorted = (site.blog | sort: 'date') | reverse %}
+  {% assign sorted = site.blog | sort: 'date' | reverse %}

Review Comment:
   Yes this is due to the change from redcarpet to kramdown



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscr...@samza.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



[GitHub] [samza] mynameborat commented on a diff in pull request #1647: [SAMZA-2772] Manage Jekyll dependencies with Gradle

2023-01-12 Thread GitBox


mynameborat commented on code in PR #1647:
URL: https://github.com/apache/samza/pull/1647#discussion_r1068483167


##
docs/meetups/index.html:
##
@@ -21,7 +21,7 @@
 -->
 
 
-{% assign sorted = (site.meetups | sort: 'date') | reverse %}
+{% assign sorted = site.meetups | sort: 'date' | reverse %}

Review Comment:
   why do we need these changes? 



##
docs/talks/index.html:
##
@@ -21,7 +21,7 @@
 -->
 
 
-{% assign sorted = (site.talks | sort: 'date') | reverse %}
+{% assign sorted = site.talks | sort: 'date' | reverse %}

Review Comment:
   same as above. can we exclude these changes if they are not part of the 
original issue this RB is solving?



##
docs/_blog/index.md:
##
@@ -22,7 +22,7 @@ exclude_from_loop: true
 
 
 
-  {% assign sorted = (site.blog | sort: 'date') | reverse %}
+  {% assign sorted = site.blog | sort: 'date' | reverse %}

Review Comment:
   revert unless its related to this PR



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscr...@samza.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



[GitHub] [samza] mynameborat merged pull request #1648: Update RocksDB to version 7.8.3 to reduce write amplification

2023-01-09 Thread GitBox


mynameborat merged PR #1648:
URL: https://github.com/apache/samza/pull/1648


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscr...@samza.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



[GitHub] [samza] mynameborat commented on pull request #1648: Update RocksDB to version 7.8.3 to reduce write amplification

2023-01-09 Thread GitBox


mynameborat commented on PR #1648:
URL: https://github.com/apache/samza/pull/1648#issuecomment-1376380697

   @shekhars-li can you run the build locally/publish another dummy commit to 
trigger the checks again? 


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscr...@samza.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



[GitHub] [samza] shekhars-li opened a new pull request, #1648: Update RocksDB to version 7.8.3 to reduce write amplification

2023-01-06 Thread GitBox


shekhars-li opened a new pull request, #1648:
URL: https://github.com/apache/samza/pull/1648

   What: Updated RocksDB to version 7.8.3
   Why: This version reduces the write compaction by aligning compaction output 
file boundaries. More details can be found in [this blog 
post](http://rocksdb.org/blog/2022/10/31/align-compaction-output-file.html). 
TL;DR: Write amplification was observed to be reduced by 10% by optimizing 
cutting the compaction output file earlier. 
   How: Feature is enabled by default in version 7.8.0 and above of RocksDB. 
   Tests: None. Past tests were re-run to verify they are still passing.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscr...@samza.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



[GitHub] [samza] li-afaris commented on pull request #1647: [SAMZA-2772] Manage Jekyll dependencies with Gradle

2023-01-05 Thread GitBox


li-afaris commented on PR #1647:
URL: https://github.com/apache/samza/pull/1647#issuecomment-1372953087

   FYI: Removing the Jekyll incremental build flag resolved the file name 
differences I found in the previous comment.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscr...@samza.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



[GitHub] [samza] amf3 commented on pull request #1647: [SAMZA-2772] Manage Jekyll dependencies with Gradle

2023-01-04 Thread GitBox


amf3 commented on PR #1647:
URL: https://github.com/apache/samza/pull/1647#issuecomment-1371765244

   I found the following differences after generating documents with Jekyll 
using both MRI Ruby & Jruby.  Input was generated by running find on the _site 
directory and saving the output to a sorted text file.
   
   ```
   $ sdiff -Ws -w 220 ~/mri_ruby_docs ~/jruby_docs  
   ./Gemfile
<
   ./Gemfile.lock   
 <
   ./index.md.bak   
   <
   ./meetups/aug-2016.md
  <
   ```
   
   The missing Gemfile, Gemfile.lock, & index.md.bak files I'm not concerned 
with as they were removed as part of this change.  The meetups/aug-2016.md file 
was a surprise.  FWIW: it's not referenced in the MRI Ruby build or at 
https://samza.apache.org/meetups so I'm not concerned about it right now.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscr...@samza.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



[GitHub] [samza] li-afaris commented on pull request #1647: [SAMZA-2772] Manage Jekyll dependencies with Gradle

2022-12-21 Thread GitBox


li-afaris commented on PR #1647:
URL: https://github.com/apache/samza/pull/1647#issuecomment-1361904668

   > I'm open to suggestions for improving confidence relating to the Jekyll 
plugin changes.
   
   One thing I thought of was to list & compare the generated file names in 
_site.  I found some of the javadoc files were missing with the jruby generated 
site.  I think I need to revisit the Jekyll plugin changes.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscr...@samza.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



[GitHub] [samza] li-afaris commented on pull request #1647: [SAMZA-2772] Manage Jekyll dependencies with Gradle

2022-12-20 Thread GitBox


li-afaris commented on PR #1647:
URL: https://github.com/apache/samza/pull/1647#issuecomment-1360645861

   I'm not sure why the last commit (ccb932d) failed testing when the previous 
one (651a1bc) worked.  You can see the diff by clicking the ccb932d link above.
   
   Is the samza-kafka_2.12:test task stable? I ran `./gradlew clean build test` 
locally and the build was successful.  Let me know if I need to trigger another 
build with an empty commit or if it's something else.  Thanks


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscr...@samza.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



[GitHub] [samza] li-afaris opened a new pull request, #1647: [SAMZA-2772] Manage Jekyll dependencies with Gradle

2022-12-20 Thread GitBox


li-afaris opened a new pull request, #1647:
URL: https://github.com/apache/samza/pull/1647

   # Summary
   This change moves Jekyll dependency management into Gradle. Contributors 
will no longer be required to install ruby & Jekyll by hand.
   
   # Changes
   * Gemfiles no longer manage dependencies.  This is done with the 
jruby-gradle-plugin Gradle plugin.  
* rubygems.org is still the source of Ruby libraries
   * The locally install MRI Ruby interpreter is replaced with JRuby.  
* This allows the Ruby interpreter to be a part of the Samza project.
   * Jekyll plugin changes
* Rouge replaces pygments.rb. Rouge is native Ruby and does not have a 
python requirement.
* Kramdown replaces Redcarpet because Redcarpet does not work with 
JRuby.
* Both Kramdown & Rouge can run inside the JRuby JVM.
   * buildscript.gradle
* jcenter.bintray.com is no longer active.
* plugins.gradle.org has the required plugins
   
   # Testing
   
   Testing was done with JDK 1.8 and on Linux.  I visually confirmed that 
content generated with JRuby looks like content generated by MRI Ruby by 
running two Jekyll instances at the same time.  Both Jekyll instances threw the 
same errors for missing content at `/learn/documentation/latest`.   I'm open to 
suggestions for improving confidence relating to the Jekyll plugin changes.
   
   I also ran `./gradlew test` which built successfully. 


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscr...@samza.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



[GitHub] [samza] xinyuiscool merged pull request #1646: [Docs] Add Ajo Thomas to committers list and GPG keys

2022-12-16 Thread GitBox


xinyuiscool merged PR #1646:
URL: https://github.com/apache/samza/pull/1646


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscr...@samza.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



[GitHub] [samza] ajothomas opened a new pull request, #1646: [DOC] Add Ajo Thomas to committers list and GPG keys

2022-12-16 Thread GitBox


ajothomas opened a new pull request, #1646:
URL: https://github.com/apache/samza/pull/1646

   Add Ajo Thomas to committers list and GPG keys.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscr...@samza.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



[GitHub] [samza] dxichen merged pull request #1645: Fix restore factory not configured log

2022-12-15 Thread GitBox


dxichen merged PR #1645:
URL: https://github.com/apache/samza/pull/1645


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscr...@samza.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



[GitHub] [samza] dxichen opened a new pull request, #1645: Fix restore factory not configured log

2022-12-14 Thread GitBox


dxichen opened a new pull request, #1645:
URL: https://github.com/apache/samza/pull/1645

   Currently if a restore factory is missing due to a misconfiguration we exit 
due to a NPE
   
   ```
   2022-12-13 18:23:07.794 [main] SamzaContainer [ERROR] Caught exception/error 
while initializing container.
   java.lang.NullPointerException: null
at 
org.apache.samza.storage.ContainerStorageManager.lambda$createTaskRestoreManagers$15(ContainerStorageManager.java:485)
 ~[samza-core_2.12-320.1081.0.4.jar:?]
at java.util.HashMap.forEach(HashMap.java:1289) ~[?:1.8.0_282]
at 
org.apache.samza.storage.ContainerStorageManager.createTaskRestoreManagers(ContainerStorageManager.java:480)
 ~[samza-core_2.12-320.1081.0.4.jar:?]
at 
org.apache.samza.storage.ContainerStorageManager.lambda$restoreStores$27(ContainerStorageManager.java:857)
 ~[samza-core_2.12-320.1081.0.4.jar:?]
at java.util.LinkedHashMap.forEach(LinkedHashMap.java:684) 
~[?:1.8.0_282]
   ```
   Changed the behavior to output a more meaningful log


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscr...@samza.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



[GitHub] [samza] dxichen merged pull request #1644: Fix performance bug in kafka/blobstore commit lifecycle.

2022-11-28 Thread GitBox


dxichen merged PR #1644:
URL: https://github.com/apache/samza/pull/1644


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscr...@samza.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



[GitHub] [samza] shekhars-li opened a new pull request, #1644: Fix perfomance bug

2022-11-28 Thread GitBox


shekhars-li opened a new pull request, #1644:
URL: https://github.com/apache/samza/pull/1644

   Bug
   - We introduced blob store backed for state backup and restore in [this 
commit](https://github.com/apache/samza/commit/7cc4eaa96fff244f6dce9c18af804917db7c3b2b).
   - `StateBackendFactory` implementations like for Kafka and BlobStore create 
systemAdmins every time `getBackupFactory` is called from `SamzaContainer`. 
This leads to creation of duplicate kafka admin threads. 
   - This has impact on performance of large samza jobs as we have thousands of 
duplicate threads spawned.
   
   Fix
   - Pass `systemAdminsMap` to `getBackendFactory` method. This initializes the 
SystemAdmins with the system admin map rather than creating system admins every 
time the method is called. 
   
   Test
   - Tested with local jobs to verify the kafka admin threads are not 
duplicated. 


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscr...@samza.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



[GitHub] [samza] shanthoosh merged pull request #1641: LISAMZA-27395 removing the current recursive call prevention logic

2022-11-16 Thread GitBox


shanthoosh merged PR #1641:
URL: https://github.com/apache/samza/pull/1641


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscr...@samza.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



[GitHub] [samza] shanthoosh commented on pull request #1641: LISAMZA-27395 removing the current recursive call prevention logic

2022-11-16 Thread GitBox


shanthoosh commented on PR #1641:
URL: https://github.com/apache/samza/pull/1641#issuecomment-1317938599

   Thanks for adding the tests @jia-gao . After going through the linkedin 
documents and JIRA tickets, it seems like we are just proceeding with the 
approach finalized as a part of the investigation. 
   
   We had just removed the recursive call check from the log4j2-appender 
implementation and we're relying upon native-recursive check from log4j2. 
Mostly the changes look good to me, let us wait for the build to pass and we 
can merge this in!


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscr...@samza.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



[GitHub] [samza] jia-gao commented on pull request #1641: LISAMZA-27395 removing the current recursive call prevention logic

2022-11-16 Thread GitBox


jia-gao commented on PR #1641:
URL: https://github.com/apache/samza/pull/1641#issuecomment-1317936984

   > validating
   
   
   
   > @jia-gao As discussed offline, it would be great if we can add a unit-test 
for validating these changes in this patch.
   
   Added unit tests 


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscr...@samza.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



[GitHub] [samza] xinyuiscool merged pull request #1642: SAMZA-2769: [PipelineDrain] Add drainMode to DrainNotification

2022-11-16 Thread GitBox


xinyuiscool merged PR #1642:
URL: https://github.com/apache/samza/pull/1642


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscr...@samza.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



[GitHub] [samza] ajothomas commented on a diff in pull request #1642: SAMZA-2769: [PipelineDrain] Add drainMode to DrainNotification

2022-11-16 Thread GitBox


ajothomas commented on code in PR #1642:
URL: https://github.com/apache/samza/pull/1642#discussion_r1024611620


##
samza-core/src/main/java/org/apache/samza/drain/DrainUtils.java:
##
@@ -54,44 +74,34 @@ private DrainUtils() {
* @return generated uuid for the DrainNotification
*/
   @VisibleForTesting
-  public static UUID writeDrainNotification(MetadataStore metadataStore, 
String runId) {
+  public static UUID writeSimpleDrainNotification(MetadataStore metadataStore, 
String runId) {

Review Comment:
   Merged this with `writeDrainNotification(metadataStore)`



##
samza-core/src/main/java/org/apache/samza/drain/DrainUtils.java:
##
@@ -54,44 +74,34 @@ private DrainUtils() {
* @return generated uuid for the DrainNotification
*/
   @VisibleForTesting
-  public static UUID writeDrainNotification(MetadataStore metadataStore, 
String runId) {
+  public static UUID writeSimpleDrainNotification(MetadataStore metadataStore, 
String runId) {
 Preconditions.checkArgument(metadataStore != null, "MetadataStore cannot 
be null.");
 Preconditions.checkArgument(!Strings.isNullOrEmpty(runId), "runId should 
be non-null.");
 LOG.info("Attempting to write DrainNotification to metadata-store for the 
deployment ID {}", runId);
+final UUID uuid = UUID.randomUUID();
+final DrainNotification message = DrainNotification.simple(uuid, runId);
+return writeDrainNotification(metadataStore, message);
+  }
+
+  /**
+   * Writes a {@link DrainNotification} to an underlying metadata store.
+   * */
+  @VisibleForTesting
+  public static UUID writeDrainNotification(MetadataStore metadataStore, 
DrainNotification drainNotification) {

Review Comment:
   This is package private now.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscr...@samza.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



[GitHub] [samza] shanthoosh commented on pull request #1641: LISAMZA-27395 removing the current recursive call prevention logic

2022-11-16 Thread GitBox


shanthoosh commented on PR #1641:
URL: https://github.com/apache/samza/pull/1641#issuecomment-1317817165

   @jia-gao As discussed offline, it would be great if we can add a unit-test 
for validating these changes in this patch.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscr...@samza.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



[GitHub] [samza] ajothomas commented on a diff in pull request #1642: SAMZA-2769: [PipelineDrain] Add drainMode to DrainNotification

2022-11-16 Thread GitBox


ajothomas commented on code in PR #1642:
URL: https://github.com/apache/samza/pull/1642#discussion_r1024598787


##
samza-api/src/main/java/org/apache/samza/drain/DrainMode.java:
##
@@ -0,0 +1,31 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ *   http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing,
+ * software distributed under the License is distributed on an
+ * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+ * KIND, either express or implied.  See the License for the
+ * specific language governing permissions and limitations
+ * under the License.
+ */
+package org.apache.samza.drain;
+
+/**
+ * Defines the type of drain operation.
+ * */
+public enum DrainMode {
+  /**
+   * Simple drain mode is the default behavior of the drain operation.
+   * All intermediate streams and any samza managed state will be drained and 
the pipeline will be gracefully shutdown
+   * as a result. User state will not be drained.
+   * */
+  SIMPLE;

Review Comment:
   Done



##
samza-api/src/main/java/org/apache/samza/drain/DrainNotification.java:
##
@@ -35,9 +35,23 @@ public class DrainNotification {
*/
   private final String runId;
 
-  public DrainNotification(UUID uuid, String runId) {
+  /***/
+  private final DrainMode drainMode;
+
+  public DrainNotification(UUID uuid, String runId, DrainMode drainMode) {
+Preconditions.checkNotNull(uuid);
+Preconditions.checkNotNull(runId);
+Preconditions.checkNotNull(drainMode);
 this.uuid = uuid;
 this.runId = runId;
+this.drainMode = drainMode;
+  }
+
+  /**
+   * Creates a DrainNotification in {@link DrainMode#SIMPLE} mode.
+   * */
+  public static DrainNotification simple(UUID uuid, String runId) {

Review Comment:
   Done



##
samza-core/src/main/java/org/apache/samza/drain/DrainUtils.java:
##
@@ -45,6 +45,26 @@ public class DrainUtils {
   private DrainUtils() {
   }
 
+  /**
+   * Writes a {@link DrainNotification} to the underlying metastore. This 
method should be used by external controllers
+   * to issue a DrainNotification to the JobCoordinator and Samza Containers.
+   * @param metadataStore Metadata store to write drain notification to.
+   *
+   * @return generated uuid for the DrainNotification
+   */
+  public static UUID writeSimpleDrainNotification(MetadataStore metadataStore) 
{

Review Comment:
   Changed it



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscr...@samza.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



[GitHub] [samza] xinyuiscool commented on a diff in pull request #1642: SAMZA-2769: [PipelineDrain] Add drainMode to DrainNotification

2022-11-16 Thread GitBox


xinyuiscool commented on code in PR #1642:
URL: https://github.com/apache/samza/pull/1642#discussion_r1024576429


##
samza-api/src/main/java/org/apache/samza/drain/DrainMode.java:
##
@@ -0,0 +1,31 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ *   http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing,
+ * software distributed under the License is distributed on an
+ * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+ * KIND, either express or implied.  See the License for the
+ * specific language governing permissions and limitations
+ * under the License.
+ */
+package org.apache.samza.drain;
+
+/**
+ * Defines the type of drain operation.
+ * */
+public enum DrainMode {
+  /**
+   * Simple drain mode is the default behavior of the drain operation.
+   * All intermediate streams and any samza managed state will be drained and 
the pipeline will be gracefully shutdown
+   * as a result. User state will not be drained.
+   * */
+  SIMPLE;

Review Comment:
   As your javadoc suggests, let's call it default?



##
samza-api/src/main/java/org/apache/samza/drain/DrainNotification.java:
##
@@ -35,9 +35,23 @@ public class DrainNotification {
*/
   private final String runId;
 
-  public DrainNotification(UUID uuid, String runId) {
+  /***/
+  private final DrainMode drainMode;
+
+  public DrainNotification(UUID uuid, String runId, DrainMode drainMode) {
+Preconditions.checkNotNull(uuid);
+Preconditions.checkNotNull(runId);
+Preconditions.checkNotNull(drainMode);
 this.uuid = uuid;
 this.runId = runId;
+this.drainMode = drainMode;
+  }
+
+  /**
+   * Creates a DrainNotification in {@link DrainMode#SIMPLE} mode.
+   * */
+  public static DrainNotification simple(UUID uuid, String runId) {

Review Comment:
   let's just call it create(). We can have another variant factory method for 
create with other mode.



##
samza-core/src/main/java/org/apache/samza/drain/DrainUtils.java:
##
@@ -54,44 +74,34 @@ private DrainUtils() {
* @return generated uuid for the DrainNotification
*/
   @VisibleForTesting
-  public static UUID writeDrainNotification(MetadataStore metadataStore, 
String runId) {
+  public static UUID writeSimpleDrainNotification(MetadataStore metadataStore, 
String runId) {

Review Comment:
   If this is only used for testing, we should be able to test the variant of 
using DrainNotification. If not user-facing, let's get rid of it.



##
samza-core/src/main/java/org/apache/samza/drain/DrainUtils.java:
##
@@ -54,44 +74,34 @@ private DrainUtils() {
* @return generated uuid for the DrainNotification
*/
   @VisibleForTesting
-  public static UUID writeDrainNotification(MetadataStore metadataStore, 
String runId) {
+  public static UUID writeSimpleDrainNotification(MetadataStore metadataStore, 
String runId) {
 Preconditions.checkArgument(metadataStore != null, "MetadataStore cannot 
be null.");
 Preconditions.checkArgument(!Strings.isNullOrEmpty(runId), "runId should 
be non-null.");
 LOG.info("Attempting to write DrainNotification to metadata-store for the 
deployment ID {}", runId);
+final UUID uuid = UUID.randomUUID();
+final DrainNotification message = DrainNotification.simple(uuid, runId);
+return writeDrainNotification(metadataStore, message);
+  }
+
+  /**
+   * Writes a {@link DrainNotification} to an underlying metadata store.
+   * */
+  @VisibleForTesting
+  public static UUID writeDrainNotification(MetadataStore metadataStore, 
DrainNotification drainNotification) {

Review Comment:
   If it's only used for testing, please make it package private.



##
samza-core/src/main/java/org/apache/samza/drain/DrainUtils.java:
##
@@ -45,6 +45,26 @@ public class DrainUtils {
   private DrainUtils() {
   }
 
+  /**
+   * Writes a {@link DrainNotification} to the underlying metastore. This 
method should be used by external controllers
+   * to issue a DrainNotification to the JobCoordinator and Samza Containers.
+   * @param metadataStore Metadata store to write drain notification to.
+   *
+   * @return generated uuid for the DrainNotification
+   */
+  public static UUID writeSimpleDrainNotification(MetadataStore metadataStore) 
{

Review Comment:
   Let's just use writeDrainNotification(). We can overload it with other 
parameters later.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To 

[GitHub] [samza] ajothomas closed pull request #1643: SAMZA-2770: [Pipeline Drain] Enable drain monitor by default

2022-11-16 Thread GitBox


ajothomas closed pull request #1643: SAMZA-2770: [Pipeline Drain] Enable drain 
monitor by default
URL: https://github.com/apache/samza/pull/1643


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscr...@samza.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



[GitHub] [samza] ajothomas opened a new pull request, #1643: SAMZA-2770: [Pipeline Drain] Enable drain monitor by default

2022-11-16 Thread GitBox


ajothomas opened a new pull request, #1643:
URL: https://github.com/apache/samza/pull/1643

   # Summary
   Drain monitor had been turned off by default. The idea was to turn it on 
after testing the pipeline drain feature.
   
   # Changes
   - Set `DRAIN_MONITOR_ENABLED_DEFAULT` to true in `JobConfig`
   - Alter Drain integration tests which had explicitly set the 
`job.drain-monitor.enabled` to true
   
   # Tests
   - No new tests. Re-run existing integration tests
   
   # API Changes
   None


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscr...@samza.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



[GitHub] [samza] ajothomas opened a new pull request, #1642: SAMZA-2769: [PipelineDrain] Abstract DrainNotification to make it extensible

2022-11-16 Thread GitBox


ajothomas opened a new pull request, #1642:
URL: https://github.com/apache/samza/pull/1642

   # Summary
   The current `DrainNotification` is a concrete class tasked with signaling 
samza engine to clear intermediate state. It is currently inextensible and any 
changes to the class in the future to accommodate new features would make the 
class backward incompatible. As DrainNotifications are stored in the metadata 
store, we need to be make sure that it can be extended in a backward compatible 
way. 
   
   # Changes
   - Convert `DrainNotification` to an abstract class. Add 
`BasicDrainNotification` to represent a standard drain operation which involves 
samza clearing intermediate streams and any system managed state. Treat 
`DrainNotification` and `BasicDrainNotification` as api by putting in under 
`samza-api`
   - Change `DrainNotificationObjectMapper` 
   - Alter `DrainUtils` and `DrainMonitor` to accommodate the above changes
   
   # Tests
   - Tests for `DrainNotificationObjectMapper` 
   - Tests for `DrainUtils`


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscr...@samza.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



[GitHub] [samza] jia-gao commented on a diff in pull request #1641: LISAMZA-27395 removing the current recursive call prevention logic

2022-11-16 Thread GitBox


jia-gao commented on code in PR #1641:
URL: https://github.com/apache/samza/pull/1641#discussion_r1024297945


##
samza-log4j2/src/main/java/org/apache/samza/logging/log4j2/StreamAppenderMetrics.java:
##
@@ -46,7 +43,6 @@ public class StreamAppenderMetrics extends MetricsBase {
   public StreamAppenderMetrics(String prefix, MetricsRegistry registry) {
 super(prefix + "-", registry);
 bufferFillPct = newGauge("buffer-fill-percent", 0);
-recursiveCalls = newCounter("recursive-calls");

Review Comment:
   Didn't find StreamAppenderMetrics published in any oss samza doc
   
https://samza.apache.org/learn/documentation/latest/container/metrics-table.html



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscr...@samza.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



[GitHub] [samza] alnzng commented on a diff in pull request #1641: LISAMZA-27395 removing the current recursive call prevention logic

2022-11-14 Thread GitBox


alnzng commented on code in PR #1641:
URL: https://github.com/apache/samza/pull/1641#discussion_r1021917783


##
samza-log4j2/src/main/java/org/apache/samza/logging/log4j2/StreamAppenderMetrics.java:
##
@@ -46,7 +43,6 @@ public class StreamAppenderMetrics extends MetricsBase {
   public StreamAppenderMetrics(String prefix, MetricsRegistry registry) {
 super(prefix + "-", registry);
 bufferFillPct = newGauge("buffer-fill-percent", 0);
-recursiveCalls = newCounter("recursive-calls");

Review Comment:
   is this metric published in some Samza doc? If yes, let's remove it from 
there as well.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscr...@samza.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



[GitHub] [samza] jia-gao commented on pull request #1641: LISAMZA-27395 removing the current recursive call prevention logic

2022-11-14 Thread GitBox


jia-gao commented on PR #1641:
URL: https://github.com/apache/samza/pull/1641#issuecomment-1314166098

   > Is the default support to prevent recursion not available when this change 
was introduced or did we upgrade to newer version of log4j that adds this 
prevention at the framework layer?
   
   log4j1 doesn't have this prevention that's when we added the change in 
samza. Later we upgraded to log4j2 which has he prevention at the framework 
layer. 


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscr...@samza.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



[GitHub] [samza] mynameborat merged pull request #1637: SAMZA-2765: [Pipeline Drain] Adding config for task callback timeout during drain

2022-11-09 Thread GitBox


mynameborat merged PR #1637:
URL: https://github.com/apache/samza/pull/1637


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscr...@samza.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



[GitHub] [samza] ajothomas commented on a diff in pull request #1637: SAMZA-2765: [Pipeline Drain] Adding config for task callback timeout during drain

2022-11-09 Thread GitBox


ajothomas commented on code in PR #1637:
URL: https://github.com/apache/samza/pull/1637#discussion_r1018537087


##
samza-core/src/main/java/org/apache/samza/task/TaskCallbackManager.java:
##
@@ -106,7 +110,10 @@ public void run() {
   callback.failure(new SamzaException(msg));
 }
   };
-  ScheduledFuture scheduledFuture = timer.schedule(timerTask, timeout, 
TimeUnit.MILLISECONDS);
+
+  final ScheduledFuture scheduledFuture = isDraining && 
(envelope.isDrain() || envelope.isWatermark())
+  ? timer.schedule(timerTask, drainCallbackTimeout, 
TimeUnit.MILLISECONDS)
+  : timer.schedule(timerTask, timeout, TimeUnit.MILLISECONDS);

Review Comment:
   Added



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscr...@samza.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



[GitHub] [samza] ajothomas commented on a diff in pull request #1637: SAMZA-2765: [Pipeline Drain] Adding config for task callback timeout during drain

2022-11-09 Thread GitBox


ajothomas commented on code in PR #1637:
URL: https://github.com/apache/samza/pull/1637#discussion_r1018502824


##
samza-core/src/main/java/org/apache/samza/config/TaskConfig.java:
##
@@ -90,6 +90,13 @@ public class TaskConfig extends MapConfig {
   // timeout for triggering a callback
   public static final String CALLBACK_TIMEOUT_MS = "task.callback.timeout.ms";
   static final long DEFAULT_CALLBACK_TIMEOUT_MS = -1L;
+
+  // timeout for triggering a callback during drain
+  public static final String DRAIN_CALLBACK_TIMEOUT_MS = 
"task.callback.drain.timeout.ms";
+
+  // default timeout for triggering a callback during drain
+  static final long DEFAULT_DRAIN_CALLBACK_TIMEOUT_MS = -1L;

Review Comment:
   The defaults for the process callback timeout and drain callback timeout are 
both -1 here. 



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscr...@samza.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



[GitHub] [samza] ajothomas commented on a diff in pull request #1637: SAMZA-2765: [Pipeline Drain] Adding config for task callback timeout during drain

2022-11-09 Thread GitBox


ajothomas commented on code in PR #1637:
URL: https://github.com/apache/samza/pull/1637#discussion_r1018501997


##
samza-core/src/main/java/org/apache/samza/task/TaskCallbackManager.java:
##
@@ -106,7 +110,10 @@ public void run() {
   callback.failure(new SamzaException(msg));
 }
   };
-  ScheduledFuture scheduledFuture = timer.schedule(timerTask, timeout, 
TimeUnit.MILLISECONDS);
+
+  final ScheduledFuture scheduledFuture = isDraining && 
(envelope.isDrain() || envelope.isWatermark())

Review Comment:
   Thanks, refactored the `TaskCallbackManager`'s `createCallback` method to 
incorporate timeout and moved the logic to decide what timeout to use to the 
`process()` method inside `AsyncTaskWorker`



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscr...@samza.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



[GitHub] [samza] ajothomas commented on a diff in pull request #1637: SAMZA-2765: [Pipeline Drain] Adding config for task callback timeout during drain

2022-11-09 Thread GitBox


ajothomas commented on code in PR #1637:
URL: https://github.com/apache/samza/pull/1637#discussion_r1018500682


##
samza-core/src/main/java/org/apache/samza/task/TaskCallbackManager.java:
##
@@ -127,4 +127,14 @@ public List 
updateCallback(TaskCallbackImpl callback) {
   return ImmutableList.of(callback);
 }
   }
+
+  /**
+   * Override the timeout set in the callback manager with the given new 
timeout.
+   * This is intended to be used with pipeline drain as we want to override 
the existing timeout with a higher timeout.
+   *
+   * @param timeout new timeout for process callbacks
+   * */
+  public void updateTaskCallbackTimeout(long timeout) {

Review Comment:
   Changed this.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscr...@samza.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



[GitHub] [samza] xinyuiscool merged pull request #1640: SAMZA-2741: [Pipeline Drain] Fix ApplicationUtil.isHighLevelApiJob to work for anonymous and lambda `SamzaApplication` classes

2022-11-08 Thread GitBox


xinyuiscool merged PR #1640:
URL: https://github.com/apache/samza/pull/1640


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscr...@samza.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



[GitHub] [samza] mynameborat commented on a diff in pull request #1640: SAMZA-2741: [Pipeline Drain] Fix ApplicationUtil.isHighLevelApiJob to work for anonymous and lambda `SamzaApplication` classes

2022-11-08 Thread GitBox


mynameborat commented on code in PR #1640:
URL: https://github.com/apache/samza/pull/1640#discussion_r1017221725


##
samza-core/src/main/java/org/apache/samza/config/ApplicationConfig.java:
##
@@ -46,6 +46,11 @@ public enum ApplicationMode {
 BATCH
   }
 
+  public enum ApplicationApiType {
+LEGACY,
+LOW_LEVEL,
+HIGH_LEVEL
+  }

Review Comment:
   The reason to do so even if its internally used is to make sure any changes 
to this enum should be treated as API changes and would need to be treated as 
backward compatible as this will be used in the coordinator stream 
configuration which are persisted and read across different versions of samza.
   
   



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscr...@samza.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



[GitHub] [samza] mynameborat commented on pull request #1641: LISAMZA-27395 removing the current recursive call prevention logic

2022-11-08 Thread GitBox


mynameborat commented on PR #1641:
URL: https://github.com/apache/samza/pull/1641#issuecomment-1307762089

   Is the default support to prevent recursion not available when this change 
was introduced or did we upgrade to newer version of log4j that adds this 
prevention at the framework layer?


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscr...@samza.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



[GitHub] [samza] jia-gao opened a new pull request, #1641: LISAMZA-27395 removing the current recursive call prevention logic

2022-11-08 Thread GitBox


jia-gao opened a new pull request, #1641:
URL: https://github.com/apache/samza/pull/1641

   Issue: 
   We have observed logging data loss happen in StreamAppender
   The RC is that current implementation of the 
“[append()](https://github.com/apache/samza/blob/master/samza-log4j/src/main/java/org/apache/samza/logging/log4j/StreamAppender.java#L146)”
 method uses one AtomicBoolean variable to detect if the thread is called 
recursively. 
   
   However, it won’t work as expected since Log4j2 framework already applies 
recursive call prevention and it has the side effect that data loss might occur 
in the parallel logging scenario
   
   Change:
   Remove current recursive call prevention logic, Leverage [log4j2 
framework](https://github.com/apache/logging-log4j2/blob/release-2.x/log4j-core/src/main/java/org/apache/logging/log4j/core/config/AppenderControl.java#L122)
 to prevent recursive calls
   
   API Changes:
   Remove a metric recursivecall from StreamAppenderMetrics since it is no 
longer used
   
   Test Done:
   ./gradlew build
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscr...@samza.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



[GitHub] [samza] mynameborat commented on a diff in pull request #1640: SAMZA-2741: [Pipeline Drain] Fix ApplicationUtil.isHighLevelApiJob to work for anonymous and lambda `SamzaApplication` classes

2022-11-07 Thread GitBox


mynameborat commented on code in PR #1640:
URL: https://github.com/apache/samza/pull/1640#discussion_r1016065159


##
samza-core/src/main/java/org/apache/samza/container/RunLoopFactory.java:
##
@@ -69,6 +69,8 @@ public static Runnable 
createRunLoop(scala.collection.immutable.Map

[GitHub] [samza] ajothomas opened a new pull request, #1640: SAMZA-2741: [Pipeline Drain] Fix ApplicationUtil.isHighLevelApiJob to work for anonymous and lambda `SamzaApplication` classes

2022-11-07 Thread GitBox


ajothomas opened a new pull request, #1640:
URL: https://github.com/apache/samza/pull/1640

   # Summary
   `ApplicationUtil.isHighLevelApiJob` doesn't work for cases where 
`SamzaApplication` could have been created as an anonymous class and lambdas
   
   # Cause
   Anonymous class names and lambdas classes cannot be created using `Class 
forName()`. As a result, the current logic currently assumes that the class 
name was incorrect. 
   
   # Fix 
   Introduce an internal config `app.api.type` and we set the config using 
`AppDescriptor` in `JobPlanner`.
   
   # Test
   - Unit test for `ApplicationUtil.isHighLevelApiJob`
   - Tested with a sample application


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscr...@samza.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



[GitHub] [samza] mynameborat merged pull request #1636: SAMZA-2762: new cpu usage metric which counts child processes usage

2022-11-07 Thread GitBox


mynameborat merged PR #1636:
URL: https://github.com/apache/samza/pull/1636


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscr...@samza.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



[GitHub] [samza] alnzng commented on pull request #1636: SAMZA-2762: new cpu usage metric which counts child processes usage

2022-11-07 Thread GitBox


alnzng commented on PR #1636:
URL: https://github.com/apache/samza/pull/1636#issuecomment-1306372678

   @mynameborat I have fixed the check issue(checksytle issue) and resolved all 
the comments. Can you please check? Thanks.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscr...@samza.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



[GitHub] [samza] alnzng commented on a diff in pull request #1636: SAMZA-2762: new cpu usage metric which counts child processes usage

2022-11-07 Thread GitBox


alnzng commented on code in PR #1636:
URL: https://github.com/apache/samza/pull/1636#discussion_r1015983431


##
samza-core/src/main/java/org/apache/samza/container/host/ProcessCPUStatistics.java:
##
@@ -0,0 +1,63 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ *   http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing,
+ * software distributed under the License is distributed on an
+ * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+ * KIND, either express or implied.  See the License for the
+ * specific language governing permissions and limitations
+ * under the License.
+ */
+package org.apache.samza.container.host;
+
+import java.util.Objects;
+
+
+/**
+ * A {@link ProcessCPUStatistics} object represents recent CPU usage 
percentage about the container process(including its child processes)
+ */
+public class ProcessCPUStatistics {

Review Comment:
   This is for the consistency with the memory metric implementation: 
`SystemMemoryStatistics`



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscr...@samza.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



[GitHub] [samza] mynameborat commented on pull request #1636: SAMZA-2762: new cpu usage metric which counts child processes usage

2022-11-07 Thread GitBox


mynameborat commented on PR #1636:
URL: https://github.com/apache/samza/pull/1636#issuecomment-1306336433

   Looks good to me mostly. Can you close out the conversations that have been 
resolved and respond to the comments that are pending.
   
   Also, fix the latest checks so that I can merge them.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscr...@samza.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



[GitHub] [samza] mynameborat commented on a diff in pull request #1636: SAMZA-2762: new cpu usage metric which counts child processes usage

2022-11-07 Thread GitBox


mynameborat commented on code in PR #1636:
URL: https://github.com/apache/samza/pull/1636#discussion_r1015978318


##
samza-core/src/main/java/org/apache/samza/container/host/ProcessCPUStatistics.java:
##
@@ -0,0 +1,63 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ *   http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing,
+ * software distributed under the License is distributed on an
+ * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+ * KIND, either express or implied.  See the License for the
+ * specific language governing permissions and limitations
+ * under the License.
+ */
+package org.apache.samza.container.host;
+
+import java.util.Objects;
+
+
+/**
+ * A {@link ProcessCPUStatistics} object represents recent CPU usage 
percentage about the container process(including its child processes)
+ */
+public class ProcessCPUStatistics {

Review Comment:
   as in why not use primitive and have a class to represent CPU statistics? Do 
you plan to expand the PCU stats?



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscr...@samza.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



[GitHub] [samza] alnzng commented on a diff in pull request #1636: SAMZA-2762: new cpu usage metric which counts child processes usage

2022-11-07 Thread GitBox


alnzng commented on code in PR #1636:
URL: https://github.com/apache/samza/pull/1636#discussion_r1015958371


##
samza-core/src/main/java/org/apache/samza/container/host/OshiBasedStatisticsGetter.java:
##
@@ -0,0 +1,90 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ *   http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing,
+ * software distributed under the License is distributed on an
+ * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+ * KIND, either express or implied.  See the License for the
+ * specific language governing permissions and limitations
+ * under the License.
+ */
+package org.apache.samza.container.host;
+
+import com.google.common.annotations.VisibleForTesting;
+import java.util.ArrayList;
+import java.util.HashMap;
+import java.util.List;
+import java.util.Map;
+import org.apache.commons.lang3.NotImplementedException;
+import oshi.SystemInfo;
+import oshi.software.os.OSProcess;
+import oshi.software.os.OperatingSystem;
+
+
+/**
+ * An implementation of {@link SystemStatisticsGetter} that relies on using 
oshi framework(https://www.oshi.ooo/)
+ */
+public class OshiBasedStatisticsGetter implements SystemStatisticsGetter {
+  // the snapshots of current JVM process and its child processes
+  private final Map previousProcessSnapshots = new 
HashMap<>();
+
+  private final OperatingSystem os;
+  private final int cpuCount;
+
+  public OshiBasedStatisticsGetter() {
+this(new SystemInfo());
+  }
+
+  @VisibleForTesting
+  OshiBasedStatisticsGetter(SystemInfo si) {
+this(si.getOperatingSystem(), 
si.getHardware().getProcessor().getPhysicalProcessorCount());
+  }
+
+  @VisibleForTesting
+  OshiBasedStatisticsGetter(OperatingSystem os, int cpuCount) {
+this.os = os;
+this.cpuCount = cpuCount;
+  }
+
+  @Override
+  public SystemMemoryStatistics getSystemMemoryStatistics() {
+throw new NotImplementedException("Not implemented");
+  }
+
+  @Override
+  public ProcessCPUStatistics getProcessCPUStatistics() {
+final List currentProcessAndChildProcesses = 
getCurrentProcessAndChildProcesses();

Review Comment:
   Yes, we can. I can update it and return null for consistent with 
`PosxCommandBasedStatsGetter ` implementation.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscr...@samza.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



[GitHub] [samza] alnzng commented on a diff in pull request #1636: SAMZA-2762: new cpu usage metric which counts child processes usage

2022-11-07 Thread GitBox


alnzng commented on code in PR #1636:
URL: https://github.com/apache/samza/pull/1636#discussion_r1015949373


##
samza-core/src/main/java/org/apache/samza/container/host/OshiBasedStatisticsGetter.java:
##
@@ -0,0 +1,90 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ *   http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing,
+ * software distributed under the License is distributed on an
+ * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+ * KIND, either express or implied.  See the License for the
+ * specific language governing permissions and limitations
+ * under the License.
+ */
+package org.apache.samza.container.host;
+
+import com.google.common.annotations.VisibleForTesting;
+import java.util.ArrayList;
+import java.util.HashMap;
+import java.util.List;
+import java.util.Map;
+import org.apache.commons.lang3.NotImplementedException;
+import oshi.SystemInfo;
+import oshi.software.os.OSProcess;
+import oshi.software.os.OperatingSystem;
+
+
+/**
+ * An implementation of {@link SystemStatisticsGetter} that relies on using 
oshi framework(https://www.oshi.ooo/)
+ */
+public class OshiBasedStatisticsGetter implements SystemStatisticsGetter {
+  // the snapshots of current JVM process and its child processes
+  private final Map previousProcessSnapshots = new 
HashMap<>();
+
+  private final OperatingSystem os;
+  private final int cpuCount;
+
+  public OshiBasedStatisticsGetter() {
+this(new SystemInfo());
+  }
+
+  @VisibleForTesting
+  OshiBasedStatisticsGetter(SystemInfo si) {
+this(si.getOperatingSystem(), 
si.getHardware().getProcessor().getPhysicalProcessorCount());
+  }
+
+  @VisibleForTesting
+  OshiBasedStatisticsGetter(OperatingSystem os, int cpuCount) {
+this.os = os;
+this.cpuCount = cpuCount;
+  }
+
+  @Override
+  public SystemMemoryStatistics getSystemMemoryStatistics() {
+throw new NotImplementedException("Not implemented");
+  }
+
+  @Override
+  public ProcessCPUStatistics getProcessCPUStatistics() {
+final List currentProcessAndChildProcesses = 
getCurrentProcessAndChildProcesses();
+final double totalCPUUsage = 
getTotalCPUUsage(currentProcessAndChildProcesses);
+refreshProcessSnapshots(currentProcessAndChildProcesses);
+return new ProcessCPUStatistics(100d * totalCPUUsage / cpuCount);

Review Comment:
   Oh, I see the gap here: Your point is that host CPU num != JVM available CPU 
num, and I thought they were the same. I think your point is correct in some 
scenarios. 
   
   I did some research/test on the `OperatingSystemMXBean. 
getAvailableProcessors()`, the count is for logical CPU count instead of 
phsical CPU count. For non-containerization or VM use cases, the available 
logic/phsical CPU count in JVM could be the same as in host, IIUC. I will 
update this oshi implementation based on logical CPU count anyway since we 
should be consistent with JVM. 
   
   Note: I didn't verify if oshi shows the consistent behavior/result regarding 
the CPU count in the containerizaiton or VM land(e.g. docker, kubernetes).
   
   
   
   
   



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscr...@samza.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



[GitHub] [samza] alnzng commented on a diff in pull request #1636: SAMZA-2762: new cpu usage metric which counts child processes usage

2022-11-07 Thread GitBox


alnzng commented on code in PR #1636:
URL: https://github.com/apache/samza/pull/1636#discussion_r1015949373


##
samza-core/src/main/java/org/apache/samza/container/host/OshiBasedStatisticsGetter.java:
##
@@ -0,0 +1,90 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ *   http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing,
+ * software distributed under the License is distributed on an
+ * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+ * KIND, either express or implied.  See the License for the
+ * specific language governing permissions and limitations
+ * under the License.
+ */
+package org.apache.samza.container.host;
+
+import com.google.common.annotations.VisibleForTesting;
+import java.util.ArrayList;
+import java.util.HashMap;
+import java.util.List;
+import java.util.Map;
+import org.apache.commons.lang3.NotImplementedException;
+import oshi.SystemInfo;
+import oshi.software.os.OSProcess;
+import oshi.software.os.OperatingSystem;
+
+
+/**
+ * An implementation of {@link SystemStatisticsGetter} that relies on using 
oshi framework(https://www.oshi.ooo/)
+ */
+public class OshiBasedStatisticsGetter implements SystemStatisticsGetter {
+  // the snapshots of current JVM process and its child processes
+  private final Map previousProcessSnapshots = new 
HashMap<>();
+
+  private final OperatingSystem os;
+  private final int cpuCount;
+
+  public OshiBasedStatisticsGetter() {
+this(new SystemInfo());
+  }
+
+  @VisibleForTesting
+  OshiBasedStatisticsGetter(SystemInfo si) {
+this(si.getOperatingSystem(), 
si.getHardware().getProcessor().getPhysicalProcessorCount());
+  }
+
+  @VisibleForTesting
+  OshiBasedStatisticsGetter(OperatingSystem os, int cpuCount) {
+this.os = os;
+this.cpuCount = cpuCount;
+  }
+
+  @Override
+  public SystemMemoryStatistics getSystemMemoryStatistics() {
+throw new NotImplementedException("Not implemented");
+  }
+
+  @Override
+  public ProcessCPUStatistics getProcessCPUStatistics() {
+final List currentProcessAndChildProcesses = 
getCurrentProcessAndChildProcesses();
+final double totalCPUUsage = 
getTotalCPUUsage(currentProcessAndChildProcesses);
+refreshProcessSnapshots(currentProcessAndChildProcesses);
+return new ProcessCPUStatistics(100d * totalCPUUsage / cpuCount);

Review Comment:
   Oh, I see the gap here: Your point is that host CPU num != JVM available CPU 
num, and I thought they were the same. I think your point is correct in some 
scenarios. 
   
   I did some research/test on the `OperatingSystemMXBean. 
getAvailableProcessors()`, the count is for logical CPU count instead of 
phsical CPU count. For non-containerization or VM use cases, the available 
logic/phsical CPU count in JVM should be same as in host, IIUC. I will update 
this oshi implementation based on logical CPU count anyway since we should be 
consistent with JVM. 
   
   Note: I didn't verify if oshi shows the consistent behavior/result regarding 
the CPU count in the containerizaiton or VM land(e.g. docker, kubernetes).
   
   
   
   
   



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscr...@samza.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



[GitHub] [samza] alnzng commented on a diff in pull request #1636: SAMZA-2762: new cpu usage metric which counts child processes usage

2022-11-07 Thread GitBox


alnzng commented on code in PR #1636:
URL: https://github.com/apache/samza/pull/1636#discussion_r1015949373


##
samza-core/src/main/java/org/apache/samza/container/host/OshiBasedStatisticsGetter.java:
##
@@ -0,0 +1,90 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ *   http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing,
+ * software distributed under the License is distributed on an
+ * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+ * KIND, either express or implied.  See the License for the
+ * specific language governing permissions and limitations
+ * under the License.
+ */
+package org.apache.samza.container.host;
+
+import com.google.common.annotations.VisibleForTesting;
+import java.util.ArrayList;
+import java.util.HashMap;
+import java.util.List;
+import java.util.Map;
+import org.apache.commons.lang3.NotImplementedException;
+import oshi.SystemInfo;
+import oshi.software.os.OSProcess;
+import oshi.software.os.OperatingSystem;
+
+
+/**
+ * An implementation of {@link SystemStatisticsGetter} that relies on using 
oshi framework(https://www.oshi.ooo/)
+ */
+public class OshiBasedStatisticsGetter implements SystemStatisticsGetter {
+  // the snapshots of current JVM process and its child processes
+  private final Map previousProcessSnapshots = new 
HashMap<>();
+
+  private final OperatingSystem os;
+  private final int cpuCount;
+
+  public OshiBasedStatisticsGetter() {
+this(new SystemInfo());
+  }
+
+  @VisibleForTesting
+  OshiBasedStatisticsGetter(SystemInfo si) {
+this(si.getOperatingSystem(), 
si.getHardware().getProcessor().getPhysicalProcessorCount());
+  }
+
+  @VisibleForTesting
+  OshiBasedStatisticsGetter(OperatingSystem os, int cpuCount) {
+this.os = os;
+this.cpuCount = cpuCount;
+  }
+
+  @Override
+  public SystemMemoryStatistics getSystemMemoryStatistics() {
+throw new NotImplementedException("Not implemented");
+  }
+
+  @Override
+  public ProcessCPUStatistics getProcessCPUStatistics() {
+final List currentProcessAndChildProcesses = 
getCurrentProcessAndChildProcesses();
+final double totalCPUUsage = 
getTotalCPUUsage(currentProcessAndChildProcesses);
+refreshProcessSnapshots(currentProcessAndChildProcesses);
+return new ProcessCPUStatistics(100d * totalCPUUsage / cpuCount);

Review Comment:
   Oh, I see the gap here: Your point is that host CPU num != JVM available CPU 
num, and I thought they were the same. I think your point is correct. 
   
   I did some research/test on the `OperatingSystemMXBean. 
getAvailableProcessors()`, the count is for logical CPU count instead of 
phsical CPU count. For non-containerization or VM use cases, the available 
logic/phsical CPU count in JVM should be same as in host, IIUC. I will update 
this oshi implementation based on logical CPU count anyway since we should be 
consistent with JVM. 
   
   Note: I didn't verify if oshi shows the consistent behavior/result regarding 
the CPU count in the containerizaiton or VM land(e.g. docker, kubernetes).
   
   
   
   
   



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscr...@samza.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



[GitHub] [samza] alnzng commented on a diff in pull request #1636: SAMZA-2762: new cpu usage metric which counts child processes usage

2022-11-07 Thread GitBox


alnzng commented on code in PR #1636:
URL: https://github.com/apache/samza/pull/1636#discussion_r1015949373


##
samza-core/src/main/java/org/apache/samza/container/host/OshiBasedStatisticsGetter.java:
##
@@ -0,0 +1,90 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ *   http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing,
+ * software distributed under the License is distributed on an
+ * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+ * KIND, either express or implied.  See the License for the
+ * specific language governing permissions and limitations
+ * under the License.
+ */
+package org.apache.samza.container.host;
+
+import com.google.common.annotations.VisibleForTesting;
+import java.util.ArrayList;
+import java.util.HashMap;
+import java.util.List;
+import java.util.Map;
+import org.apache.commons.lang3.NotImplementedException;
+import oshi.SystemInfo;
+import oshi.software.os.OSProcess;
+import oshi.software.os.OperatingSystem;
+
+
+/**
+ * An implementation of {@link SystemStatisticsGetter} that relies on using 
oshi framework(https://www.oshi.ooo/)
+ */
+public class OshiBasedStatisticsGetter implements SystemStatisticsGetter {
+  // the snapshots of current JVM process and its child processes
+  private final Map previousProcessSnapshots = new 
HashMap<>();
+
+  private final OperatingSystem os;
+  private final int cpuCount;
+
+  public OshiBasedStatisticsGetter() {
+this(new SystemInfo());
+  }
+
+  @VisibleForTesting
+  OshiBasedStatisticsGetter(SystemInfo si) {
+this(si.getOperatingSystem(), 
si.getHardware().getProcessor().getPhysicalProcessorCount());
+  }
+
+  @VisibleForTesting
+  OshiBasedStatisticsGetter(OperatingSystem os, int cpuCount) {
+this.os = os;
+this.cpuCount = cpuCount;
+  }
+
+  @Override
+  public SystemMemoryStatistics getSystemMemoryStatistics() {
+throw new NotImplementedException("Not implemented");
+  }
+
+  @Override
+  public ProcessCPUStatistics getProcessCPUStatistics() {
+final List currentProcessAndChildProcesses = 
getCurrentProcessAndChildProcesses();
+final double totalCPUUsage = 
getTotalCPUUsage(currentProcessAndChildProcesses);
+refreshProcessSnapshots(currentProcessAndChildProcesses);
+return new ProcessCPUStatistics(100d * totalCPUUsage / cpuCount);

Review Comment:
   Oh, I see the gap here: Your point is that host CPU num != JVM available CPU 
num, and I thought they were the same. I think your point is correct. 
   
   I did some research/test on the `OperatingSystemMXBean. 
getAvailableProcessors()`, the count is for logical CPU count instead of 
phsical CPU count. For non-containerization or VM use cases, the available 
logic/phsical CPU count in JVM should be same with host, IIUC. I will update 
this oshi implementation based on logical CPU count anyway since we should be 
consistent with JVM. 
   
   Note: I didn't verify if oshi shows the consistent behavior/result regarding 
the CPU count in the containerizaiton or VM land(e.g. docker, kubernetes).
   
   
   
   
   



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscr...@samza.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



[GitHub] [samza] lakshmi-manasa-g commented on a diff in pull request #1636: SAMZA-2762: new cpu usage metric which counts child processes usage

2022-11-07 Thread GitBox


lakshmi-manasa-g commented on code in PR #1636:
URL: https://github.com/apache/samza/pull/1636#discussion_r1015754291


##
samza-core/src/main/java/org/apache/samza/container/host/OshiBasedStatisticsGetter.java:
##
@@ -0,0 +1,90 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ *   http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing,
+ * software distributed under the License is distributed on an
+ * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+ * KIND, either express or implied.  See the License for the
+ * specific language governing permissions and limitations
+ * under the License.
+ */
+package org.apache.samza.container.host;
+
+import com.google.common.annotations.VisibleForTesting;
+import java.util.ArrayList;
+import java.util.HashMap;
+import java.util.List;
+import java.util.Map;
+import org.apache.commons.lang3.NotImplementedException;
+import oshi.SystemInfo;
+import oshi.software.os.OSProcess;
+import oshi.software.os.OperatingSystem;
+
+
+/**
+ * An implementation of {@link SystemStatisticsGetter} that relies on using 
oshi framework(https://www.oshi.ooo/)
+ */
+public class OshiBasedStatisticsGetter implements SystemStatisticsGetter {
+  // the snapshots of current JVM process and its child processes
+  private final Map previousProcessSnapshots = new 
HashMap<>();
+
+  private final OperatingSystem os;
+  private final int cpuCount;
+
+  public OshiBasedStatisticsGetter() {
+this(new SystemInfo());
+  }
+
+  @VisibleForTesting
+  OshiBasedStatisticsGetter(SystemInfo si) {
+this(si.getOperatingSystem(), 
si.getHardware().getProcessor().getPhysicalProcessorCount());
+  }
+
+  @VisibleForTesting
+  OshiBasedStatisticsGetter(OperatingSystem os, int cpuCount) {
+this.os = os;
+this.cpuCount = cpuCount;
+  }
+
+  @Override
+  public SystemMemoryStatistics getSystemMemoryStatistics() {
+throw new NotImplementedException("Not implemented");
+  }
+
+  @Override
+  public ProcessCPUStatistics getProcessCPUStatistics() {
+final List currentProcessAndChildProcesses = 
getCurrentProcessAndChildProcesses();
+final double totalCPUUsage = 
getTotalCPUUsage(currentProcessAndChildProcesses);
+refreshProcessSnapshots(currentProcessAndChildProcesses);
+return new ProcessCPUStatistics(100d * totalCPUUsage / cpuCount);

Review Comment:
   no, i am saying that the current samza process-cpu usage does NOT take into 
account the num cpu cores on the host. 
   
   OperatingSystemMXBean is actually giving out the fraction of the total 
available cpu that the jvm process is using. hence it is between [0,1] AND NOT 
the actual cpu tick count which would need to take into account cpu core count
   
   but the proposed change in the pr takes into account num cores and gives the 
actual cpu tick count 
   



##
samza-core/src/main/java/org/apache/samza/container/host/OshiBasedStatisticsGetter.java:
##
@@ -0,0 +1,90 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ *   http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing,
+ * software distributed under the License is distributed on an
+ * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+ * KIND, either express or implied.  See the License for the
+ * specific language governing permissions and limitations
+ * under the License.
+ */
+package org.apache.samza.container.host;
+
+import com.google.common.annotations.VisibleForTesting;
+import java.util.ArrayList;
+import java.util.HashMap;
+import java.util.List;
+import java.util.Map;
+import org.apache.commons.lang3.NotImplementedException;
+import oshi.SystemInfo;
+import oshi.software.os.OSProcess;
+import oshi.software.os.OperatingSystem;
+
+
+/**
+ * An implementation of {@link SystemStatisticsGetter} that relies on using 
oshi framework(https://www.oshi.ooo/)
+ */
+public class OshiBasedStatisticsGetter implements SystemStatisticsGetter {
+  // the snapshots of current JVM process and its child processes
+  private final Map previousProcessSnapshots = new 
HashMap<>();
+
+  private final OperatingSystem os;
+  private final int cpuCount;
+
+  public 

[GitHub] [samza] alnzng commented on a diff in pull request #1636: SAMZA-2762: new cpu usage metric which counts child processes usage

2022-11-07 Thread GitBox


alnzng commented on code in PR #1636:
URL: https://github.com/apache/samza/pull/1636#discussion_r1015751569


##
samza-core/src/main/java/org/apache/samza/container/host/ProcessCPUStatistics.java:
##
@@ -0,0 +1,63 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ *   http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing,
+ * software distributed under the License is distributed on an
+ * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+ * KIND, either express or implied.  See the License for the
+ * specific language governing permissions and limitations
+ * under the License.
+ */
+package org.apache.samza.container.host;
+
+import java.util.Objects;
+
+
+/**
+ * A {@link ProcessCPUStatistics} object represents recent CPU usage 
percentage about the container process(including its child processes)
+ */
+public class ProcessCPUStatistics {

Review Comment:
   Can you please clarify `wrap this into a container object`?



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscr...@samza.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



[GitHub] [samza] alnzng commented on a diff in pull request #1636: SAMZA-2762: new cpu usage metric which counts child processes usage

2022-11-07 Thread GitBox


alnzng commented on code in PR #1636:
URL: https://github.com/apache/samza/pull/1636#discussion_r1015714131


##
samza-core/src/main/java/org/apache/samza/container/host/OshiBasedStatisticsGetter.java:
##
@@ -0,0 +1,90 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ *   http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing,
+ * software distributed under the License is distributed on an
+ * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+ * KIND, either express or implied.  See the License for the
+ * specific language governing permissions and limitations
+ * under the License.
+ */
+package org.apache.samza.container.host;
+
+import com.google.common.annotations.VisibleForTesting;
+import java.util.ArrayList;
+import java.util.HashMap;
+import java.util.List;
+import java.util.Map;
+import org.apache.commons.lang3.NotImplementedException;
+import oshi.SystemInfo;
+import oshi.software.os.OSProcess;
+import oshi.software.os.OperatingSystem;
+
+
+/**
+ * An implementation of {@link SystemStatisticsGetter} that relies on using 
oshi framework(https://www.oshi.ooo/)
+ */
+public class OshiBasedStatisticsGetter implements SystemStatisticsGetter {
+  // the snapshots of current JVM process and its child processes
+  private final Map previousProcessSnapshots = new 
HashMap<>();
+
+  private final OperatingSystem os;
+  private final int cpuCount;
+
+  public OshiBasedStatisticsGetter() {
+this(new SystemInfo());
+  }
+
+  @VisibleForTesting
+  OshiBasedStatisticsGetter(SystemInfo si) {
+this(si.getOperatingSystem(), 
si.getHardware().getProcessor().getPhysicalProcessorCount());
+  }
+
+  @VisibleForTesting
+  OshiBasedStatisticsGetter(OperatingSystem os, int cpuCount) {
+this.os = os;
+this.cpuCount = cpuCount;
+  }
+
+  @Override
+  public SystemMemoryStatistics getSystemMemoryStatistics() {
+throw new NotImplementedException("Not implemented");
+  }
+
+  @Override
+  public ProcessCPUStatistics getProcessCPUStatistics() {
+final List currentProcessAndChildProcesses = 
getCurrentProcessAndChildProcesses();

Review Comment:
   >but in general, an interface impl should have a clear contract instead of 
bubbling up exceptions that are internal to its impl and except the caller/user 
of the interface to deal with it.
   
   Make sense. But in this case, I didn't see any checked/unchecked/errors that 
can be thrown in the current oshi implementation.  Will you still suggest that 
we should catch all Exceptions explicitly or throw Samza its own exception?



##
samza-core/src/main/java/org/apache/samza/container/host/OshiBasedStatisticsGetter.java:
##
@@ -0,0 +1,90 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ *   http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing,
+ * software distributed under the License is distributed on an
+ * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+ * KIND, either express or implied.  See the License for the
+ * specific language governing permissions and limitations
+ * under the License.
+ */
+package org.apache.samza.container.host;
+
+import com.google.common.annotations.VisibleForTesting;
+import java.util.ArrayList;
+import java.util.HashMap;
+import java.util.List;
+import java.util.Map;
+import org.apache.commons.lang3.NotImplementedException;
+import oshi.SystemInfo;
+import oshi.software.os.OSProcess;
+import oshi.software.os.OperatingSystem;
+
+
+/**
+ * An implementation of {@link SystemStatisticsGetter} that relies on using 
oshi framework(https://www.oshi.ooo/)
+ */
+public class OshiBasedStatisticsGetter implements SystemStatisticsGetter {
+  // the snapshots of current JVM process and its child processes
+  private final Map previousProcessSnapshots = new 
HashMap<>();
+
+  private final OperatingSystem os;
+  private final int cpuCount;
+
+  public OshiBasedStatisticsGetter() {
+this(new SystemInfo());
+  }
+
+  @VisibleForTesting
+  OshiBasedStatisticsGetter(SystemInfo si) {
+this(si.getOperatingSystem(), 
si.getHardware().getProcessor().getPhysicalProcessorCount());
+  }
+
+  @VisibleForTesting
+  

[GitHub] [samza] alnzng commented on a diff in pull request #1636: SAMZA-2762: new cpu usage metric which counts child processes usage

2022-11-07 Thread GitBox


alnzng commented on code in PR #1636:
URL: https://github.com/apache/samza/pull/1636#discussion_r1015713429


##
samza-core/src/main/java/org/apache/samza/container/host/OshiBasedStatisticsGetter.java:
##
@@ -0,0 +1,90 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ *   http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing,
+ * software distributed under the License is distributed on an
+ * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+ * KIND, either express or implied.  See the License for the
+ * specific language governing permissions and limitations
+ * under the License.
+ */
+package org.apache.samza.container.host;
+
+import com.google.common.annotations.VisibleForTesting;
+import java.util.ArrayList;
+import java.util.HashMap;
+import java.util.List;
+import java.util.Map;
+import org.apache.commons.lang3.NotImplementedException;
+import oshi.SystemInfo;
+import oshi.software.os.OSProcess;
+import oshi.software.os.OperatingSystem;
+
+
+/**
+ * An implementation of {@link SystemStatisticsGetter} that relies on using 
oshi framework(https://www.oshi.ooo/)
+ */
+public class OshiBasedStatisticsGetter implements SystemStatisticsGetter {
+  // the snapshots of current JVM process and its child processes
+  private final Map previousProcessSnapshots = new 
HashMap<>();
+
+  private final OperatingSystem os;
+  private final int cpuCount;
+
+  public OshiBasedStatisticsGetter() {
+this(new SystemInfo());
+  }
+
+  @VisibleForTesting
+  OshiBasedStatisticsGetter(SystemInfo si) {
+this(si.getOperatingSystem(), 
si.getHardware().getProcessor().getPhysicalProcessorCount());
+  }
+
+  @VisibleForTesting
+  OshiBasedStatisticsGetter(OperatingSystem os, int cpuCount) {
+this.os = os;
+this.cpuCount = cpuCount;
+  }
+
+  @Override
+  public SystemMemoryStatistics getSystemMemoryStatistics() {
+throw new NotImplementedException("Not implemented");
+  }
+
+  @Override
+  public ProcessCPUStatistics getProcessCPUStatistics() {
+final List currentProcessAndChildProcesses = 
getCurrentProcessAndChildProcesses();
+final double totalCPUUsage = 
getTotalCPUUsage(currentProcessAndChildProcesses);
+refreshProcessSnapshots(currentProcessAndChildProcesses);
+return new ProcessCPUStatistics(100d * totalCPUUsage / cpuCount);

Review Comment:
   Yeah, your understanding is the same as mine. So you are good with this PR's 
implementation, right?



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscr...@samza.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



[GitHub] [samza] lakshmi-manasa-g commented on a diff in pull request #1636: SAMZA-2762: new cpu usage metric which counts child processes usage

2022-11-07 Thread GitBox


lakshmi-manasa-g commented on code in PR #1636:
URL: https://github.com/apache/samza/pull/1636#discussion_r1015642158


##
samza-core/src/main/java/org/apache/samza/container/host/SystemStatisticsGetter.java:
##
@@ -30,4 +30,11 @@ public interface SystemStatisticsGetter {
* @return {@link SystemMemoryStatistics} for the Samza container
*/
   SystemMemoryStatistics getSystemMemoryStatistics();
+
+  /**
+   * Returns the {@link ProcessCPUStatistics} for the current Samza container 
process(includes its child processes)
+   *
+   * @return {@link ProcessCPUStatistics} for the Samza container process
+   */
+  ProcessCPUStatistics getProcessCPUStatistics();

Review Comment:
   got it. that makes sense to me.



##
samza-core/src/main/java/org/apache/samza/container/host/OshiBasedStatisticsGetter.java:
##
@@ -0,0 +1,90 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ *   http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing,
+ * software distributed under the License is distributed on an
+ * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+ * KIND, either express or implied.  See the License for the
+ * specific language governing permissions and limitations
+ * under the License.
+ */
+package org.apache.samza.container.host;
+
+import com.google.common.annotations.VisibleForTesting;
+import java.util.ArrayList;
+import java.util.HashMap;
+import java.util.List;
+import java.util.Map;
+import org.apache.commons.lang3.NotImplementedException;
+import oshi.SystemInfo;
+import oshi.software.os.OSProcess;
+import oshi.software.os.OperatingSystem;
+
+
+/**
+ * An implementation of {@link SystemStatisticsGetter} that relies on using 
oshi framework(https://www.oshi.ooo/)
+ */
+public class OshiBasedStatisticsGetter implements SystemStatisticsGetter {
+  // the snapshots of current JVM process and its child processes
+  private final Map previousProcessSnapshots = new 
HashMap<>();
+
+  private final OperatingSystem os;
+  private final int cpuCount;
+
+  public OshiBasedStatisticsGetter() {
+this(new SystemInfo());
+  }
+
+  @VisibleForTesting
+  OshiBasedStatisticsGetter(SystemInfo si) {
+this(si.getOperatingSystem(), 
si.getHardware().getProcessor().getPhysicalProcessorCount());
+  }
+
+  @VisibleForTesting
+  OshiBasedStatisticsGetter(OperatingSystem os, int cpuCount) {
+this.os = os;
+this.cpuCount = cpuCount;
+  }
+
+  @Override
+  public SystemMemoryStatistics getSystemMemoryStatistics() {
+throw new NotImplementedException("Not implemented");
+  }
+
+  @Override
+  public ProcessCPUStatistics getProcessCPUStatistics() {
+final List currentProcessAndChildProcesses = 
getCurrentProcessAndChildProcesses();
+final double totalCPUUsage = 
getTotalCPUUsage(currentProcessAndChildProcesses);
+refreshProcessSnapshots(currentProcessAndChildProcesses);
+return new ProcessCPUStatistics(100d * totalCPUUsage / cpuCount);
+  }
+
+  private List getCurrentProcessAndChildProcesses() {
+final List processes = new ArrayList<>();
+// get current process
+processes.add(os.getProcess(os.getProcessId()));
+// get all child processes of current process
+processes.addAll(os.getChildProcesses(os.getProcessId(), 
OperatingSystem.ProcessFiltering.ALL_PROCESSES,
+OperatingSystem.ProcessSorting.NO_SORTING, 0));
+return processes;
+  }
+
+  private double getTotalCPUUsage(List processes) {
+return processes.stream()
+.mapToDouble(p -> 
p.getProcessCpuLoadBetweenTicks(previousProcessSnapshots.get(p.getProcessID(

Review Comment:
   actually this brings out an important point:
   the impl is 
   ```
   public ProcessCPUStatistics getProcessCPUStatistics() {
   final List currentProcessAndChildProcesses = 
getCurrentProcessAndChildProcesses();
   final double totalCPUUsage = 
getTotalCPUUsage(currentProcessAndChildProcesses);
   refreshProcessSnapshots(currentProcessAndChildProcesses);
   return new ProcessCPUStatistics(100d * totalCPUUsage / cpuCount);
   ```
   so the first time this is called, `previousProcessSnapshots` is an empty 
hash map right. so `get` from it will return null. seems in this case the cpu 
usage returned cumulative in that case. which seems okay. 




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscr...@samza.apache.org

For queries about this service, please contact 

[GitHub] [samza] lakshmi-manasa-g commented on a diff in pull request #1636: SAMZA-2762: new cpu usage metric which counts child processes usage

2022-11-07 Thread GitBox


lakshmi-manasa-g commented on code in PR #1636:
URL: https://github.com/apache/samza/pull/1636#discussion_r1015641830


##
samza-core/src/main/java/org/apache/samza/container/host/OshiBasedStatisticsGetter.java:
##
@@ -0,0 +1,90 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ *   http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing,
+ * software distributed under the License is distributed on an
+ * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+ * KIND, either express or implied.  See the License for the
+ * specific language governing permissions and limitations
+ * under the License.
+ */
+package org.apache.samza.container.host;
+
+import com.google.common.annotations.VisibleForTesting;
+import java.util.ArrayList;
+import java.util.HashMap;
+import java.util.List;
+import java.util.Map;
+import org.apache.commons.lang3.NotImplementedException;
+import oshi.SystemInfo;
+import oshi.software.os.OSProcess;
+import oshi.software.os.OperatingSystem;
+
+
+/**
+ * An implementation of {@link SystemStatisticsGetter} that relies on using 
oshi framework(https://www.oshi.ooo/)
+ */
+public class OshiBasedStatisticsGetter implements SystemStatisticsGetter {
+  // the snapshots of current JVM process and its child processes
+  private final Map previousProcessSnapshots = new 
HashMap<>();
+
+  private final OperatingSystem os;
+  private final int cpuCount;
+
+  public OshiBasedStatisticsGetter() {
+this(new SystemInfo());
+  }
+
+  @VisibleForTesting
+  OshiBasedStatisticsGetter(SystemInfo si) {
+this(si.getOperatingSystem(), 
si.getHardware().getProcessor().getPhysicalProcessorCount());
+  }
+
+  @VisibleForTesting
+  OshiBasedStatisticsGetter(OperatingSystem os, int cpuCount) {
+this.os = os;
+this.cpuCount = cpuCount;
+  }
+
+  @Override
+  public SystemMemoryStatistics getSystemMemoryStatistics() {
+throw new NotImplementedException("Not implemented");
+  }
+
+  @Override
+  public ProcessCPUStatistics getProcessCPUStatistics() {
+final List currentProcessAndChildProcesses = 
getCurrentProcessAndChildProcesses();
+final double totalCPUUsage = 
getTotalCPUUsage(currentProcessAndChildProcesses);
+refreshProcessSnapshots(currentProcessAndChildProcesses);
+return new ProcessCPUStatistics(100d * totalCPUUsage / cpuCount);

Review Comment:
   from the desc above, it seems the OperatingSystemMXBean is actually giving 
out the **fraction** of the total available cpu that the jvm process is using. 
hence it is between [0,1]. My understanding is based on the following excerpt 
pasted above
   
   >A value of 0.0 means that none of the CPUs were running threads from the 
JVM process during the recent period of time observed, while a value of 1.0 
means that all CPUs were actively running threads from the JVM 100% of the time 
during the recent period being observed.
   
   this to me indicates fraction of the num CPUs being used by jvm process. 
   Also matches the way `getProccessCpuLoad` is used within Samza's JVMMetrics
   
   ```
 val processCpuLoad = operatingSystemMXBean.getProcessCpuLoad
 gProcessCpuUsage.set(processCpuLoad * PCT)
 gProcessCpuUsageProcessors.set(processCpuLoad * 
operatingSystemMXBean.getAvailableProcessors)
 gSystemCpuUsage.set(operatingSystemMXBean.getSystemCpuLoad * PCT)
   ```
   
   Note how we multiply processCpuLoad with 100 to get the percent of total cpu 
being used by the process. 
   
   
   
   



##
samza-core/src/main/java/org/apache/samza/container/host/OshiBasedStatisticsGetter.java:
##
@@ -0,0 +1,90 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ *   http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing,
+ * software distributed under the License is distributed on an
+ * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+ * KIND, either express or implied.  See the License for the
+ * specific language governing permissions and limitations
+ * under the License.
+ */
+package org.apache.samza.container.host;
+
+import com.google.common.annotations.VisibleForTesting;
+import 

[GitHub] [samza] alnzng commented on a diff in pull request #1636: SAMZA-2762: new cpu usage metric which counts child processes usage

2022-11-04 Thread GitBox


alnzng commented on code in PR #1636:
URL: https://github.com/apache/samza/pull/1636#discussion_r1014527935


##
samza-core/src/main/java/org/apache/samza/container/host/PosixCommandBasedStatisticsGetter.java:
##
@@ -85,4 +86,10 @@ public SystemMemoryStatistics getSystemMemoryStatistics() {
   return null;
 }
   }
+
+  @Override
+  public ProcessCPUStatistics getProcessCPUStatistics() {
+throw new NotImplementedException(
+"No appropriate Posix command available for getting recent CPU usage 
information. For example, the CPU information exposed by ps command 'ps -o 
%cpu= -p ' represents the percentage of time spent running during the 
entire lifetime of a process not for the recent CPU usage");
+  }

Review Comment:
   Yes, `top` can give the instant/recent CPU usage information for a given 
process, but it has below cons so that I didn't choose it:
   
   - Need to have operating system specific implementation, since the top 
command is not a POSIX command and its input and output format in Linux and 
macOS are different.
   - Slower than ps/oshi command. Especially, on macOS the top command needs to 
get at least two samples to get the right CPU usage, which takes longer time(~ 
2 seconds) than 1 sample.
 - not clear the reason, but it shows the consistent behavior that one 
sample(-l1) always return 0.0 on my macOS
   



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscr...@samza.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



[GitHub] [samza] alnzng commented on a diff in pull request #1636: SAMZA-2762: new cpu usage metric which counts child processes usage

2022-11-04 Thread GitBox


alnzng commented on code in PR #1636:
URL: https://github.com/apache/samza/pull/1636#discussion_r1014528695


##
samza-core/src/main/scala/org/apache/samza/container/SamzaContainerMetrics.scala:
##
@@ -49,6 +49,7 @@ class SamzaContainerMetrics(
   val executorWorkFactor = newGauge("executor-work-factor", 1.0)
   val physicalMemoryMb = newGauge("physical-memory-mb", 0.0F)
   val physicalMemoryUtilization = newGauge("physical-memory-utilization", 0.0F)
+  val totalProcessCpuUsage = newGauge("total-process-cpu-usage", 0.0)

Review Comment:
   Prefer to create a new metric `total-process-cpu-usage` to represent the cpu 
usage for the Samza container’s JVM process and all its child/descendant 
processes because of
   - I view it as a non-backward compatible change if let the existing 
“process-cpu-usage” metric includes child/descendant processes especially for 
non-JVM processes.
 - Samza official doc clearly states that the `process-cpu-usage` returns 
the “recent cpu usage” for the Java Virtual Machine process: 
https://samza.apache.org/learn/documentation/1.6.0/operations/monitoring.html
   - Having a new metric could be consistent with how Samza exposes 
`physical-memory-mb`. This memory related metric which includes child process 
is available for Samza container only, the AM doesn’t have it.
   - Implementation/Maintainance complexity. If keep using the existing 
`process-cpu-usage` we will have to deprecate the JVMMetrics implementation and 
add a new hook into AM to expose the metric.
   
   Good point on the doc, let me update the document.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscr...@samza.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



[GitHub] [samza] alnzng commented on a diff in pull request #1636: SAMZA-2762: new cpu usage metric which counts child processes usage

2022-11-04 Thread GitBox


alnzng commented on code in PR #1636:
URL: https://github.com/apache/samza/pull/1636#discussion_r1014528695


##
samza-core/src/main/scala/org/apache/samza/container/SamzaContainerMetrics.scala:
##
@@ -49,6 +49,7 @@ class SamzaContainerMetrics(
   val executorWorkFactor = newGauge("executor-work-factor", 1.0)
   val physicalMemoryMb = newGauge("physical-memory-mb", 0.0F)
   val physicalMemoryUtilization = newGauge("physical-memory-utilization", 0.0F)
+  val totalProcessCpuUsage = newGauge("total-process-cpu-usage", 0.0)

Review Comment:
   Prefer to create a new metric(e.g. total-process-cpu-usage) to represent the 
cpu usage for the Samza container’s JVM process and all its child/descendant 
processes because of
   - I view it as a non-backward compatible change if let the existing 
“process-cpu-usage” metric to include child/descendant processes especially for 
non-JVM processes.
 - Samza official doc clearly states that the `process-cpu-usage` returns 
the “recent cpu usage” for the Java Virtual Machine process: 
https://samza.apache.org/learn/documentation/1.6.0/operations/monitoring.html
   - Having a new metric could be consistent with how Samza exposes 
physical-memory-mb . This memory related metric which includes child process is 
available for Samza container only, the AM doesn’t have it.
   - Implementation/Maintainance complexity. If keep using the existing 
`process-cpu-usage` we will have to deprecate the JVMMetrics implementation and 
add a new hook into AM to expose the metric.
   
   Good point on the doc, let me update the document.



##
samza-core/src/main/scala/org/apache/samza/container/SamzaContainerMetrics.scala:
##
@@ -49,6 +49,7 @@ class SamzaContainerMetrics(
   val executorWorkFactor = newGauge("executor-work-factor", 1.0)
   val physicalMemoryMb = newGauge("physical-memory-mb", 0.0F)
   val physicalMemoryUtilization = newGauge("physical-memory-utilization", 0.0F)
+  val totalProcessCpuUsage = newGauge("total-process-cpu-usage", 0.0)

Review Comment:
   Prefer to create a new metric(e.g. total-process-cpu-usage) to represent the 
cpu usage for the Samza container’s JVM process and all its child/descendant 
processes because of
   - I view it as a non-backward compatible change if let the existing 
“process-cpu-usage” metric to include child/descendant processes especially for 
non-JVM processes.
 - Samza official doc clearly states that the `process-cpu-usage` returns 
the “recent cpu usage” for the Java Virtual Machine process: 
https://samza.apache.org/learn/documentation/1.6.0/operations/monitoring.html
   - Having a new metric could be consistent with how Samza exposes 
`physical-memory-mb`. This memory related metric which includes child process 
is available for Samza container only, the AM doesn’t have it.
   - Implementation/Maintainance complexity. If keep using the existing 
`process-cpu-usage` we will have to deprecate the JVMMetrics implementation and 
add a new hook into AM to expose the metric.
   
   Good point on the doc, let me update the document.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscr...@samza.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



[GitHub] [samza] alnzng commented on a diff in pull request #1636: SAMZA-2762: new cpu usage metric which counts child processes usage

2022-11-04 Thread GitBox


alnzng commented on code in PR #1636:
URL: https://github.com/apache/samza/pull/1636#discussion_r1014527935


##
samza-core/src/main/java/org/apache/samza/container/host/PosixCommandBasedStatisticsGetter.java:
##
@@ -85,4 +86,10 @@ public SystemMemoryStatistics getSystemMemoryStatistics() {
   return null;
 }
   }
+
+  @Override
+  public ProcessCPUStatistics getProcessCPUStatistics() {
+throw new NotImplementedException(
+"No appropriate Posix command available for getting recent CPU usage 
information. For example, the CPU information exposed by ps command 'ps -o 
%cpu= -p ' represents the percentage of time spent running during the 
entire lifetime of a process not for the recent CPU usage");
+  }

Review Comment:
   Yes, `top` can give the instant/recent CPU usage information for a given 
process, but it has below cons so that I didn't choose it:
   
   - Need to have operating system specific implementation, since the top 
command is not a POSIX command and its input and output format in Linux and 
macOS are different.
   - Slower than ps command. Especially, on macOS the top command needs to get 
at least two samples to get the right CPU usage, which takes longer time(~ 2 
seconds) than 1 sample.
 - not clear the reason, but it shows the consistent behavior that one 
sample(-l1) always return 0.0 on my macOS
   



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscr...@samza.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



[GitHub] [samza] alnzng commented on a diff in pull request #1636: SAMZA-2762: new cpu usage metric which counts child processes usage

2022-11-04 Thread GitBox


alnzng commented on code in PR #1636:
URL: https://github.com/apache/samza/pull/1636#discussion_r1014527343


##
samza-core/src/main/java/org/apache/samza/container/host/OshiBasedStatisticsGetter.java:
##
@@ -0,0 +1,90 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ *   http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing,
+ * software distributed under the License is distributed on an
+ * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+ * KIND, either express or implied.  See the License for the
+ * specific language governing permissions and limitations
+ * under the License.
+ */
+package org.apache.samza.container.host;
+
+import com.google.common.annotations.VisibleForTesting;
+import java.util.ArrayList;
+import java.util.HashMap;
+import java.util.List;
+import java.util.Map;
+import org.apache.commons.lang3.NotImplementedException;
+import oshi.SystemInfo;
+import oshi.software.os.OSProcess;
+import oshi.software.os.OperatingSystem;
+
+
+/**
+ * An implementation of {@link SystemStatisticsGetter} that relies on using 
oshi framework(https://www.oshi.ooo/)
+ */
+public class OshiBasedStatisticsGetter implements SystemStatisticsGetter {
+  // the snapshots of current JVM process and its child processes
+  private final Map previousProcessSnapshots = new 
HashMap<>();
+
+  private final OperatingSystem os;
+  private final int cpuCount;
+
+  public OshiBasedStatisticsGetter() {
+this(new SystemInfo());
+  }
+
+  @VisibleForTesting
+  OshiBasedStatisticsGetter(SystemInfo si) {
+this(si.getOperatingSystem(), 
si.getHardware().getProcessor().getPhysicalProcessorCount());
+  }
+
+  @VisibleForTesting
+  OshiBasedStatisticsGetter(OperatingSystem os, int cpuCount) {
+this.os = os;
+this.cpuCount = cpuCount;
+  }
+
+  @Override
+  public SystemMemoryStatistics getSystemMemoryStatistics() {
+throw new NotImplementedException("Not implemented");
+  }
+
+  @Override
+  public ProcessCPUStatistics getProcessCPUStatistics() {
+final List currentProcessAndChildProcesses = 
getCurrentProcessAndChildProcesses();
+final double totalCPUUsage = 
getTotalCPUUsage(currentProcessAndChildProcesses);
+refreshProcessSnapshots(currentProcessAndChildProcesses);
+return new ProcessCPUStatistics(100d * totalCPUUsage / cpuCount);
+  }
+
+  private List getCurrentProcessAndChildProcesses() {
+final List processes = new ArrayList<>();
+// get current process
+processes.add(os.getProcess(os.getProcessId()));
+// get all child processes of current process
+processes.addAll(os.getChildProcesses(os.getProcessId(), 
OperatingSystem.ProcessFiltering.ALL_PROCESSES,
+OperatingSystem.ProcessSorting.NO_SORTING, 0));
+return processes;
+  }
+
+  private double getTotalCPUUsage(List processes) {
+return processes.stream()
+.mapToDouble(p -> 
p.getProcessCpuLoadBetweenTicks(previousProcessSnapshots.get(p.getProcessID(

Review Comment:
   No, it won't: 
https://github.com/oshi/oshi/blob/master/oshi-core/src/main/java/oshi/software/common/AbstractOSProcess.java#L52
   
   ```java
   @Override
   public double getProcessCpuLoadBetweenTicks(OSProcess priorSnapshot) {
   if (priorSnapshot != null && this.processID == 
priorSnapshot.getProcessID()
   && getUpTime() > priorSnapshot.getUpTime()) {
   return (getUserTime() - priorSnapshot.getUserTime() + 
getKernelTime() - priorSnapshot.getKernelTime())
   / (double) (getUpTime() - priorSnapshot.getUpTime());
   }
   return getProcessCpuLoadCumulative();
   }
   
   ```



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscr...@samza.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



[GitHub] [samza] alnzng commented on a diff in pull request #1636: SAMZA-2762: new cpu usage metric which counts child processes usage

2022-11-04 Thread GitBox


alnzng commented on code in PR #1636:
URL: https://github.com/apache/samza/pull/1636#discussion_r1014525707


##
samza-core/src/main/java/org/apache/samza/container/host/SystemStatisticsGetter.java:
##
@@ -30,4 +30,11 @@ public interface SystemStatisticsGetter {
* @return {@link SystemMemoryStatistics} for the Samza container
*/
   SystemMemoryStatistics getSystemMemoryStatistics();
+
+  /**
+   * Returns the {@link ProcessCPUStatistics} for the current Samza container 
process(includes its child processes)
+   *
+   * @return {@link ProcessCPUStatistics} for the Samza container process
+   */
+  ProcessCPUStatistics getProcessCPUStatistics();

Review Comment:
   Exactly that's the point why I decided to use `ProcessCPUStats` which is 
trying to avoid confusion.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscr...@samza.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



[GitHub] [samza] alnzng commented on a diff in pull request #1636: SAMZA-2762: new cpu usage metric which counts child processes usage

2022-11-04 Thread GitBox


alnzng commented on code in PR #1636:
URL: https://github.com/apache/samza/pull/1636#discussion_r1014525560


##
samza-core/src/main/java/org/apache/samza/container/host/SystemStatistics.java:
##
@@ -0,0 +1,66 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ *   http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing,
+ * software distributed under the License is distributed on an
+ * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+ * KIND, either express or implied.  See the License for the
+ * specific language governing permissions and limitations
+ * under the License.
+ */
+package org.apache.samza.container.host;
+
+import java.util.Objects;
+
+/**
+ * A {@link SystemStatistics} object represents system related information 
about the physical process that runs the
+ * {@link org.apache.samza.container.SamzaContainer}.
+ */
+public class SystemStatistics {
+
+  private final ProcessCPUStatistics cpuStatistics;
+  private final SystemMemoryStatistics memoryStatistics;
+
+  public SystemStatistics(ProcessCPUStatistics cpuStatistics, 
SystemMemoryStatistics memoryStatistics) {

Review Comment:
   The actual logic to handle the null is inside the Listener implementaiton: 
https://github.com/apache/samza/pull/1636/files#diff-f79781ad4c55ae7860829b06fd9dfd15e8069c37e64f8854d8f27ca2cd1f3ee5R653



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscr...@samza.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



[GitHub] [samza] alnzng commented on a diff in pull request #1636: SAMZA-2762: new cpu usage metric which counts child processes usage

2022-11-04 Thread GitBox


alnzng commented on code in PR #1636:
URL: https://github.com/apache/samza/pull/1636#discussion_r1014525279


##
samza-core/src/main/java/org/apache/samza/container/host/StatisticsMonitorImpl.java:
##
@@ -117,23 +118,23 @@ public void run() {
   }
 
   private void sampleStatistics() {
-SystemMemoryStatistics statistics = null;
+SystemMemoryStatistics memoryStatistics = null;
+ProcessCPUStatistics cpuStatistics = null;
 try {
-  statistics = statisticsGetter.getSystemMemoryStatistics();
+  memoryStatistics = statisticsGetter.getSystemMemoryStatistics();
+  cpuStatistics = statisticsGetter.getProcessCPUStatistics();
 } catch (Throwable e) {
   LOG.error("Error during obtaining statistics: ", e);
 }
-
+SystemStatistics systemStatistics = new SystemStatistics(cpuStatistics, 
memoryStatistics);
 for (Listener listener : listenerSet.keySet()) {
-  if (statistics != null) {

Review Comment:
   I moved the null check in the listener: 
https://github.com/apache/samza/pull/1636/files#diff-f79781ad4c55ae7860829b06fd9dfd15e8069c37e64f8854d8f27ca2cd1f3ee5R653
 



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscr...@samza.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



[GitHub] [samza] alnzng commented on a diff in pull request #1636: SAMZA-2762: new cpu usage metric which counts child processes usage

2022-11-04 Thread GitBox


alnzng commented on code in PR #1636:
URL: https://github.com/apache/samza/pull/1636#discussion_r1014524771


##
samza-core/src/main/java/org/apache/samza/container/host/OshiBasedStatisticsGetter.java:
##
@@ -0,0 +1,90 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ *   http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing,
+ * software distributed under the License is distributed on an
+ * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+ * KIND, either express or implied.  See the License for the
+ * specific language governing permissions and limitations
+ * under the License.
+ */
+package org.apache.samza.container.host;
+
+import com.google.common.annotations.VisibleForTesting;
+import java.util.ArrayList;
+import java.util.HashMap;
+import java.util.List;
+import java.util.Map;
+import org.apache.commons.lang3.NotImplementedException;
+import oshi.SystemInfo;
+import oshi.software.os.OSProcess;
+import oshi.software.os.OperatingSystem;
+
+
+/**
+ * An implementation of {@link SystemStatisticsGetter} that relies on using 
oshi framework(https://www.oshi.ooo/)
+ */
+public class OshiBasedStatisticsGetter implements SystemStatisticsGetter {
+  // the snapshots of current JVM process and its child processes
+  private final Map previousProcessSnapshots = new 
HashMap<>();
+
+  private final OperatingSystem os;
+  private final int cpuCount;
+
+  public OshiBasedStatisticsGetter() {
+this(new SystemInfo());
+  }
+
+  @VisibleForTesting
+  OshiBasedStatisticsGetter(SystemInfo si) {
+this(si.getOperatingSystem(), 
si.getHardware().getProcessor().getPhysicalProcessorCount());
+  }
+
+  @VisibleForTesting
+  OshiBasedStatisticsGetter(OperatingSystem os, int cpuCount) {
+this.os = os;
+this.cpuCount = cpuCount;
+  }
+
+  @Override
+  public SystemMemoryStatistics getSystemMemoryStatistics() {
+throw new NotImplementedException("Not implemented");
+  }
+
+  @Override
+  public ProcessCPUStatistics getProcessCPUStatistics() {
+final List currentProcessAndChildProcesses = 
getCurrentProcessAndChildProcesses();

Review Comment:
   There is already a try-catch block to catch all the exceptions: 
https://github.com/apache/samza/pull/1636/files#diff-3b72fed01e1d344575c2eaf6dd353a69ae1fcb8c304c0ebb7543ab23ded6ad9fR126,
 it seems not necessary to do it here.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscr...@samza.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



[GitHub] [samza] alnzng commented on a diff in pull request #1636: SAMZA-2762: new cpu usage metric which counts child processes usage

2022-11-04 Thread GitBox


alnzng commented on code in PR #1636:
URL: https://github.com/apache/samza/pull/1636#discussion_r1014523821


##
samza-core/src/main/java/org/apache/samza/container/host/OshiBasedStatisticsGetter.java:
##
@@ -0,0 +1,90 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ *   http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing,
+ * software distributed under the License is distributed on an
+ * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+ * KIND, either express or implied.  See the License for the
+ * specific language governing permissions and limitations
+ * under the License.
+ */
+package org.apache.samza.container.host;
+
+import com.google.common.annotations.VisibleForTesting;
+import java.util.ArrayList;
+import java.util.HashMap;
+import java.util.List;
+import java.util.Map;
+import org.apache.commons.lang3.NotImplementedException;
+import oshi.SystemInfo;
+import oshi.software.os.OSProcess;
+import oshi.software.os.OperatingSystem;
+
+
+/**
+ * An implementation of {@link SystemStatisticsGetter} that relies on using 
oshi framework(https://www.oshi.ooo/)
+ */
+public class OshiBasedStatisticsGetter implements SystemStatisticsGetter {
+  // the snapshots of current JVM process and its child processes
+  private final Map previousProcessSnapshots = new 
HashMap<>();
+
+  private final OperatingSystem os;
+  private final int cpuCount;
+
+  public OshiBasedStatisticsGetter() {
+this(new SystemInfo());
+  }
+
+  @VisibleForTesting
+  OshiBasedStatisticsGetter(SystemInfo si) {
+this(si.getOperatingSystem(), 
si.getHardware().getProcessor().getPhysicalProcessorCount());
+  }
+
+  @VisibleForTesting
+  OshiBasedStatisticsGetter(OperatingSystem os, int cpuCount) {
+this.os = os;
+this.cpuCount = cpuCount;
+  }
+
+  @Override
+  public SystemMemoryStatistics getSystemMemoryStatistics() {
+throw new NotImplementedException("Not implemented");
+  }
+
+  @Override
+  public ProcessCPUStatistics getProcessCPUStatistics() {
+final List currentProcessAndChildProcesses = 
getCurrentProcessAndChildProcesses();

Review Comment:
   The child processes might be different not only for start-up but also for 
runtime, think about a case that some child processes are died because of 
exceptions or they are completed.
   
   I think we'd better get the real-time information.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscr...@samza.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



[GitHub] [samza] alnzng commented on a diff in pull request #1636: SAMZA-2762: new cpu usage metric which counts child processes usage

2022-11-04 Thread GitBox


alnzng commented on code in PR #1636:
URL: https://github.com/apache/samza/pull/1636#discussion_r1014523336


##
samza-core/src/main/java/org/apache/samza/container/host/OshiBasedStatisticsGetter.java:
##
@@ -0,0 +1,90 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ *   http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing,
+ * software distributed under the License is distributed on an
+ * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+ * KIND, either express or implied.  See the License for the
+ * specific language governing permissions and limitations
+ * under the License.
+ */
+package org.apache.samza.container.host;
+
+import com.google.common.annotations.VisibleForTesting;
+import java.util.ArrayList;
+import java.util.HashMap;
+import java.util.List;
+import java.util.Map;
+import org.apache.commons.lang3.NotImplementedException;
+import oshi.SystemInfo;
+import oshi.software.os.OSProcess;
+import oshi.software.os.OperatingSystem;
+
+
+/**
+ * An implementation of {@link SystemStatisticsGetter} that relies on using 
oshi framework(https://www.oshi.ooo/)
+ */
+public class OshiBasedStatisticsGetter implements SystemStatisticsGetter {
+  // the snapshots of current JVM process and its child processes
+  private final Map previousProcessSnapshots = new 
HashMap<>();

Review Comment:
   My understanding is that the current Samza implementation only runs one 
thread to trigger all SystemStatisticsGetter implementations, so there won't be 
any race conditions. And in general, for such container/system/global metric, 
it doesn't make sense that we run multiple threads to collect data points.
   
   If we update the class as NotThreadSafe as you suggested in the first 
comment, will it help?
   



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscr...@samza.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



[GitHub] [samza] alnzng commented on a diff in pull request #1636: SAMZA-2762: new cpu usage metric which counts child processes usage

2022-11-04 Thread GitBox


alnzng commented on code in PR #1636:
URL: https://github.com/apache/samza/pull/1636#discussion_r1014522272


##
samza-core/src/main/java/org/apache/samza/container/host/OshiBasedStatisticsGetter.java:
##
@@ -0,0 +1,90 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ *   http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing,
+ * software distributed under the License is distributed on an
+ * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+ * KIND, either express or implied.  See the License for the
+ * specific language governing permissions and limitations
+ * under the License.
+ */
+package org.apache.samza.container.host;
+
+import com.google.common.annotations.VisibleForTesting;
+import java.util.ArrayList;
+import java.util.HashMap;
+import java.util.List;
+import java.util.Map;
+import org.apache.commons.lang3.NotImplementedException;
+import oshi.SystemInfo;
+import oshi.software.os.OSProcess;
+import oshi.software.os.OperatingSystem;
+
+
+/**
+ * An implementation of {@link SystemStatisticsGetter} that relies on using 
oshi framework(https://www.oshi.ooo/)
+ */
+public class OshiBasedStatisticsGetter implements SystemStatisticsGetter {
+  // the snapshots of current JVM process and its child processes
+  private final Map previousProcessSnapshots = new 
HashMap<>();
+
+  private final OperatingSystem os;
+  private final int cpuCount;
+
+  public OshiBasedStatisticsGetter() {
+this(new SystemInfo());
+  }
+
+  @VisibleForTesting
+  OshiBasedStatisticsGetter(SystemInfo si) {
+this(si.getOperatingSystem(), 
si.getHardware().getProcessor().getPhysicalProcessorCount());
+  }
+
+  @VisibleForTesting
+  OshiBasedStatisticsGetter(OperatingSystem os, int cpuCount) {
+this.os = os;
+this.cpuCount = cpuCount;
+  }
+
+  @Override
+  public SystemMemoryStatistics getSystemMemoryStatistics() {
+throw new NotImplementedException("Not implemented");
+  }
+
+  @Override
+  public ProcessCPUStatistics getProcessCPUStatistics() {
+final List currentProcessAndChildProcesses = 
getCurrentProcessAndChildProcesses();
+final double totalCPUUsage = 
getTotalCPUUsage(currentProcessAndChildProcesses);
+refreshProcessSnapshots(currentProcessAndChildProcesses);
+return new ProcessCPUStatistics(100d * totalCPUUsage / cpuCount);

Review Comment:
   The existing CPU reporting actually takes into account the core-count on the 
host, right? Because it leverages on Java function `OperatingSystemMXBean. 
getProcessCpuLoad ()` whose value is a double in the [0.0,1.0] interval. If 
not, the value could be larger than 1.0.
   
   
https://docs.oracle.com/javase/8/docs/jre/api/management/extension/com/sun/management/OperatingSystemMXBean.html#getProcessCpuTime--
   
   >Returns the "recent cpu usage" for the Java Virtual Machine process. This 
value is a double in the [0.0,1.0] interval. A value of 0.0 means that none of 
the CPUs were running threads from the JVM process during the recent period of 
time observed, while a value of 1.0 means that all CPUs were actively running 
threads from the JVM 100% of the time during the recent period being observed. 
Threads from the JVM include the application threads as well as the JVM 
internal threads. All values betweens 0.0 and 1.0 are possible depending of the 
activities going on in the JVM process and the whole system. If the Java 
Virtual Machine recent CPU usage is not available, the method returns a 
negative value.
   



##
samza-core/src/main/java/org/apache/samza/container/host/OshiBasedStatisticsGetter.java:
##
@@ -0,0 +1,90 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ *   http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing,
+ * software distributed under the License is distributed on an
+ * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+ * KIND, either express or implied.  See the License for the
+ * specific language governing permissions and limitations
+ * under the License.
+ */
+package org.apache.samza.container.host;
+
+import com.google.common.annotations.VisibleForTesting;
+import 

[GitHub] [samza] mynameborat commented on a diff in pull request #1636: SAMZA-2762: new cpu usage metric which counts child processes usage

2022-11-04 Thread GitBox


mynameborat commented on code in PR #1636:
URL: https://github.com/apache/samza/pull/1636#discussion_r1013440757


##
samza-core/src/main/java/org/apache/samza/container/host/DefaultSystemStatisticsGetter.java:
##
@@ -0,0 +1,53 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ *   http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing,
+ * software distributed under the License is distributed on an
+ * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+ * KIND, either express or implied.  See the License for the
+ * specific language governing permissions and limitations
+ * under the License.
+ */
+package org.apache.samza.container.host;
+
+import com.google.common.annotations.VisibleForTesting;
+
+
+/**
+ * An default implementation of {@link SystemStatisticsGetter} that relies on 
{@link PosixCommandBasedStatisticsGetter}
+ * and {@link OshiBasedStatisticsGetter} implementations
+ */
+public class DefaultSystemStatisticsGetter implements SystemStatisticsGetter {
+

Review Comment:
   nit: remove extra line



##
samza-core/src/main/java/org/apache/samza/container/host/ProcessCPUStatistics.java:
##
@@ -0,0 +1,63 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ *   http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing,
+ * software distributed under the License is distributed on an
+ * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+ * KIND, either express or implied.  See the License for the
+ * specific language governing permissions and limitations
+ * under the License.
+ */
+package org.apache.samza.container.host;
+
+import java.util.Objects;
+
+
+/**
+ * A {@link ProcessCPUStatistics} object represents recent CPU usage 
percentage about the container process(including its child processes)
+ */
+public class ProcessCPUStatistics {

Review Comment:
   why do we need to wrap this into a container object? 



##
samza-core/src/main/java/org/apache/samza/container/host/OshiBasedStatisticsGetter.java:
##
@@ -0,0 +1,90 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ *   http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing,
+ * software distributed under the License is distributed on an
+ * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+ * KIND, either express or implied.  See the License for the
+ * specific language governing permissions and limitations
+ * under the License.
+ */
+package org.apache.samza.container.host;
+
+import com.google.common.annotations.VisibleForTesting;
+import java.util.ArrayList;
+import java.util.HashMap;
+import java.util.List;
+import java.util.Map;
+import org.apache.commons.lang3.NotImplementedException;
+import oshi.SystemInfo;
+import oshi.software.os.OSProcess;
+import oshi.software.os.OperatingSystem;
+
+
+/**
+ * An implementation of {@link SystemStatisticsGetter} that relies on using 
oshi framework(https://www.oshi.ooo/)
+ */
+public class OshiBasedStatisticsGetter implements SystemStatisticsGetter {
+  // the snapshots of current JVM process and its child processes
+  private final Map previousProcessSnapshots = new 
HashMap<>();
+
+  private final OperatingSystem os;
+  private final int cpuCount;
+
+  public OshiBasedStatisticsGetter() {
+this(new SystemInfo());
+  }
+
+  @VisibleForTesting
+  OshiBasedStatisticsGetter(SystemInfo si) {
+this(si.getOperatingSystem(), 
si.getHardware().getProcessor().getPhysicalProcessorCount());
+  }
+
+  @VisibleForTesting
+  OshiBasedStatisticsGetter(OperatingSystem os, int cpuCount) {
+this.os = os;
+this.cpuCount = cpuCount;
+  }
+
+  @Override
+  public SystemMemoryStatistics getSystemMemoryStatistics() {
+throw 

[GitHub] [samza] lakshmi-manasa-g commented on a diff in pull request #1636: SAMZA-2762: new cpu usage metric which counts child processes usage

2022-11-04 Thread GitBox


lakshmi-manasa-g commented on code in PR #1636:
URL: https://github.com/apache/samza/pull/1636#discussion_r1014326963


##
samza-core/src/main/java/org/apache/samza/container/host/OshiBasedStatisticsGetter.java:
##
@@ -0,0 +1,90 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ *   http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing,
+ * software distributed under the License is distributed on an
+ * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+ * KIND, either express or implied.  See the License for the
+ * specific language governing permissions and limitations
+ * under the License.
+ */
+package org.apache.samza.container.host;
+
+import com.google.common.annotations.VisibleForTesting;
+import java.util.ArrayList;
+import java.util.HashMap;
+import java.util.List;
+import java.util.Map;
+import org.apache.commons.lang3.NotImplementedException;
+import oshi.SystemInfo;
+import oshi.software.os.OSProcess;
+import oshi.software.os.OperatingSystem;
+
+
+/**
+ * An implementation of {@link SystemStatisticsGetter} that relies on using 
oshi framework(https://www.oshi.ooo/)

Review Comment:
   would be good to more docs on how the Stat getter works - aka previous 
snapshots, 
   we should also call out if the impl is thread safe and if not, clarify why 
its not a problem



##
samza-core/src/main/java/org/apache/samza/container/host/OshiBasedStatisticsGetter.java:
##
@@ -0,0 +1,90 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ *   http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing,
+ * software distributed under the License is distributed on an
+ * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+ * KIND, either express or implied.  See the License for the
+ * specific language governing permissions and limitations
+ * under the License.
+ */
+package org.apache.samza.container.host;
+
+import com.google.common.annotations.VisibleForTesting;
+import java.util.ArrayList;
+import java.util.HashMap;
+import java.util.List;
+import java.util.Map;
+import org.apache.commons.lang3.NotImplementedException;
+import oshi.SystemInfo;
+import oshi.software.os.OSProcess;
+import oshi.software.os.OperatingSystem;
+
+
+/**
+ * An implementation of {@link SystemStatisticsGetter} that relies on using 
oshi framework(https://www.oshi.ooo/)
+ */
+public class OshiBasedStatisticsGetter implements SystemStatisticsGetter {
+  // the snapshots of current JVM process and its child processes
+  private final Map previousProcessSnapshots = new 
HashMap<>();

Review Comment:
   should we use ConcurrentHashMap?



##
samza-core/src/main/java/org/apache/samza/container/host/ProcessCPUStatistics.java:
##
@@ -0,0 +1,63 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ *   http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing,
+ * software distributed under the License is distributed on an
+ * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+ * KIND, either express or implied.  See the License for the
+ * specific language governing permissions and limitations
+ * under the License.
+ */
+package org.apache.samza.container.host;
+
+import java.util.Objects;
+
+
+/**
+ * A {@link ProcessCPUStatistics} object represents recent CPU usage 
percentage about the container process(including its child processes)

Review Comment:
   small clarification might be good to say that the percentage is against 
total cpu of host



##
samza-core/src/main/java/org/apache/samza/container/host/OshiBasedStatisticsGetter.java:
##
@@ -0,0 +1,90 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license 

[GitHub] [samza] alnzng commented on pull request #1638: SAMZA-2764: add the metric "container-active-threads" back

2022-11-04 Thread GitBox


alnzng commented on PR #1638:
URL: https://github.com/apache/samza/pull/1638#issuecomment-1303916282

   thanks for merging it @mynameborat 


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscr...@samza.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



[GitHub] [samza] mynameborat merged pull request #1638: SAMZA-2764: add the metric "container-active-threads" back

2022-11-03 Thread GitBox


mynameborat merged PR #1638:
URL: https://github.com/apache/samza/pull/1638


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscr...@samza.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



[GitHub] [samza] mynameborat commented on a diff in pull request #1637: SAMZA-2765: [Pipeline Drain] Adding config for task callback timeout during drain

2022-11-03 Thread GitBox


mynameborat commented on code in PR #1637:
URL: https://github.com/apache/samza/pull/1637#discussion_r1013403151


##
samza-core/src/main/java/org/apache/samza/task/TaskCallbackManager.java:
##
@@ -106,7 +110,10 @@ public void run() {
   callback.failure(new SamzaException(msg));
 }
   };
-  ScheduledFuture scheduledFuture = timer.schedule(timerTask, timeout, 
TimeUnit.MILLISECONDS);
+
+  final ScheduledFuture scheduledFuture = isDraining && 
(envelope.isDrain() || envelope.isWatermark())
+  ? timer.schedule(timerTask, drainCallbackTimeout, 
TimeUnit.MILLISECONDS)
+  : timer.schedule(timerTask, timeout, TimeUnit.MILLISECONDS);

Review Comment:
   Can we add unit tests to the addition to ensure we use the drainTimeout vs 
timeout appropriately?



##
samza-core/src/main/java/org/apache/samza/task/TaskCallbackManager.java:
##
@@ -106,7 +110,10 @@ public void run() {
   callback.failure(new SamzaException(msg));
 }
   };
-  ScheduledFuture scheduledFuture = timer.schedule(timerTask, timeout, 
TimeUnit.MILLISECONDS);
+
+  final ScheduledFuture scheduledFuture = isDraining && 
(envelope.isDrain() || envelope.isWatermark())

Review Comment:
   why do we need both `isDraining` and `envelope.isDrain()`? What happens if 
the runloop has not propagated its intent but the envelope is draining? Will it 
be treated as regular message processing?
   
   



##
samza-core/src/main/java/org/apache/samza/config/TaskConfig.java:
##
@@ -90,6 +90,13 @@ public class TaskConfig extends MapConfig {
   // timeout for triggering a callback
   public static final String CALLBACK_TIMEOUT_MS = "task.callback.timeout.ms";
   static final long DEFAULT_CALLBACK_TIMEOUT_MS = -1L;
+
+  // timeout for triggering a callback during drain
+  public static final String DRAIN_CALLBACK_TIMEOUT_MS = 
"task.callback.drain.timeout.ms";
+
+  // default timeout for triggering a callback during drain
+  static final long DEFAULT_DRAIN_CALLBACK_TIMEOUT_MS = -1L;

Review Comment:
   What does -1L default denote? Wait forever in case of drain message flow 
stuck is it? 



##
samza-core/src/main/java/org/apache/samza/task/TaskCallbackManager.java:
##
@@ -106,7 +110,10 @@ public void run() {
   callback.failure(new SamzaException(msg));
 }
   };
-  ScheduledFuture scheduledFuture = timer.schedule(timerTask, timeout, 
TimeUnit.MILLISECONDS);
+
+  final ScheduledFuture scheduledFuture = isDraining && 
(envelope.isDrain() || envelope.isWatermark())

Review Comment:
   Might be easier to drive this in one place as opposed to split across?
   e.g., Would adding a new method w/ `createCallback(..., timeout)` fit better 
here? That way you can keep all the logic of drain and watermark and where we 
are abstracted from this piece of code (ideally thats how it should be) and 
then have the caller determine the timeout based on its need.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscr...@samza.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



[GitHub] [samza] xinyuiscool merged pull request #1639: SAMZA-2741: [Pipeline Drain] Fix processing of Drain messages for High-Level and Low-level API

2022-11-03 Thread GitBox


xinyuiscool merged PR #1639:
URL: https://github.com/apache/samza/pull/1639


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscr...@samza.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



[GitHub] [samza] ajothomas commented on a diff in pull request #1639: SAMZA-2741: [Pipeline Drain] Fix processing of Drain messages for High-Level and Low-level API

2022-11-03 Thread GitBox


ajothomas commented on code in PR #1639:
URL: https://github.com/apache/samza/pull/1639#discussion_r1013123821


##
samza-test/src/test/java/org/apache/samza/test/drain/DrainHighLevelApiIntegrationTest.java:
##
@@ -74,20 +74,30 @@ public void describe(StreamApplicationDescriptor 
appDescriptor) {
   .map(KV::getValue)
   .partitionBy(PageView::getMemberId, pv -> pv,
   KVSerde.of(new IntegerSerde(), new 
TestTableData.PageViewJsonSerde()), "p1")
+  .map(kv -> KV.of(kv.getKey() * 31, kv.getValue()))
+  .partitionBy(KV::getKey, KV::getValue, KVSerde.of(new 
IntegerSerde(), new TestTableData.PageViewJsonSerde()), "p2")
   .sink((m, collector, coordinator) -> {
 RECEIVED.add(m.getValue());
   });
 }
   }
 
-  // The test can be occasionally flaky, so we set Ignore annotation
-  // Remove ignore annotation and run the test as follows:
-  // ./gradlew :samza-test:test --tests 
org.apache.samza.test.drain.DrainHighLevelApiIntegrationTest -PscalaSuffix=2.12
+  /**
+   * This test will test drain and consumption of some messages from the 
in-memory topic.
+   * In order to simulate the real-world behaviour of drain, the test adds 
messages to the in-memory topic buffer periodically
+   * in a delayed fashion instead of all at once. The test then writes the 
drain notification message to the in-memory
+   * metadata store to drain and stop the pipeline. This write is done shortly 
after the pipeline starts and before all
+   * the messages are written to the topic's buffer. As a result, the total 
count of the processed messages will be less
+   * than the expected count of messages.
+   * */
   @Ignore
   @Test
-  public void testPipeline() {
+  public void testDrain() {

Review Comment:
   Tests have been turned on now after a few tweaks.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscr...@samza.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



[GitHub] [samza] ajothomas commented on a diff in pull request #1639: SAMZA-2741: [Pipeline Drain] Fix processing of Drain messages for High-Level and Low-level API

2022-11-03 Thread GitBox


ajothomas commented on code in PR #1639:
URL: https://github.com/apache/samza/pull/1639#discussion_r1013123456


##
samza-core/src/main/java/org/apache/samza/container/RunLoop.java:
##
@@ -875,49 +876,40 @@ private boolean shouldDrain() {
 return false;
   }
 
-  if (!pendingEnvelopeQueue.isEmpty()) {
-PendingEnvelope pendingEnvelope = pendingEnvelopeQueue.peek();
-IncomingMessageEnvelope envelope = pendingEnvelope.envelope;
+  if (pendingEnvelopeQueue.size() > 0) {
+final PendingEnvelope pendingEnvelope = pendingEnvelopeQueue.peek();
+final IncomingMessageEnvelope envelope = pendingEnvelope.envelope;
 
 if (envelope.isDrain()) {
   final DrainMessage message = (DrainMessage) envelope.getMessage();
   if (!message.getRunId().equals(runId)) {
-// Removing the drain message from the pending queue as it doesn't 
match with the current runId
-// Removing it will ensure that it is not picked up by process()
-pendingEnvelopeQueue.remove();
+// Removing the drain message from the pending queue as it doesn't 
match with the current deployment
+final PendingEnvelope discardedDrainMessage = 
pendingEnvelopeQueue.remove();
+
consumerMultiplexer.tryUpdate(discardedDrainMessage.envelope.getSystemStreamPartition());
   } else {
+// Found drain message matching the current deployment
+
 // set the RunLoop to drain mode
 if (!isDraining) {
   drain();
 }
 
-if (elasticityFactor <= 1) {
-  SystemStreamPartition ssp = envelope.getSystemStreamPartition();
-  processingSspSetToDrain.remove(ssp);
-} else {
-  // SystemConsumers will write only one envelope (enclosing 
DrainMessage) per SSP in its buffer.
-  // This envelope doesn't have keybucket info it's SSP. With 
elasticity, the same SSP can be processed by
-  // multiple tasks. Therefore, if envelope contains drain 
message, the ssp of envelope should be removed
-  // from task's processing set irrespective of keyBucket.
-  SystemStreamPartition sspOfEnvelope = 
envelope.getSystemStreamPartition();
-  Optional ssp = 
processingSspSetToDrain.stream()
-  .filter(sspInSet -> 
sspInSet.getSystemStream().equals(sspOfEnvelope.getSystemStream())
-  && 
sspInSet.getPartition().equals(sspOfEnvelope.getPartition()))
-  .findFirst();
-  ssp.ifPresent(processingSspSetToDrain::remove);
-}
 
 if (!hasIntermediateStreams) {
-  // Don't remove from the pending queue as we want the DAG to 
pick up Drain message and propagate it to
-  // intermediate streams
+  // The flow below only applies to samza low-level API
+
+  // For high-level API, we do not remove the message from pending 
queue.
+  // It will be picked by the process flow instead of drain flow, 
as we want the drain control message
+  // to be processed by the High-level API Operator DAG.
+
+  
processingSspSetToDrain.remove(envelope.getSystemStreamPartition());
   pendingEnvelopeQueue.remove();
+  return processingSspSetToDrain.isEmpty();

Review Comment:
   Changed it.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscr...@samza.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



[GitHub] [samza] ajothomas commented on a diff in pull request #1639: SAMZA-2741: [Pipeline Drain] Fix processing of Drain messages for High-Level and Low-level API

2022-11-03 Thread GitBox


ajothomas commented on code in PR #1639:
URL: https://github.com/apache/samza/pull/1639#discussion_r1013123301


##
samza-core/src/main/java/org/apache/samza/container/RunLoop.java:
##
@@ -875,49 +876,40 @@ private boolean shouldDrain() {
 return false;
   }
 
-  if (!pendingEnvelopeQueue.isEmpty()) {
-PendingEnvelope pendingEnvelope = pendingEnvelopeQueue.peek();
-IncomingMessageEnvelope envelope = pendingEnvelope.envelope;
+  if (pendingEnvelopeQueue.size() > 0) {
+final PendingEnvelope pendingEnvelope = pendingEnvelopeQueue.peek();
+final IncomingMessageEnvelope envelope = pendingEnvelope.envelope;
 
 if (envelope.isDrain()) {
   final DrainMessage message = (DrainMessage) envelope.getMessage();
   if (!message.getRunId().equals(runId)) {
-// Removing the drain message from the pending queue as it doesn't 
match with the current runId
-// Removing it will ensure that it is not picked up by process()
-pendingEnvelopeQueue.remove();
+// Removing the drain message from the pending queue as it doesn't 
match with the current deployment
+final PendingEnvelope discardedDrainMessage = 
pendingEnvelopeQueue.remove();
+
consumerMultiplexer.tryUpdate(discardedDrainMessage.envelope.getSystemStreamPartition());
   } else {
+// Found drain message matching the current deployment
+
 // set the RunLoop to drain mode
 if (!isDraining) {
   drain();
 }
 
-if (elasticityFactor <= 1) {
-  SystemStreamPartition ssp = envelope.getSystemStreamPartition();
-  processingSspSetToDrain.remove(ssp);
-} else {
-  // SystemConsumers will write only one envelope (enclosing 
DrainMessage) per SSP in its buffer.
-  // This envelope doesn't have keybucket info it's SSP. With 
elasticity, the same SSP can be processed by
-  // multiple tasks. Therefore, if envelope contains drain 
message, the ssp of envelope should be removed
-  // from task's processing set irrespective of keyBucket.
-  SystemStreamPartition sspOfEnvelope = 
envelope.getSystemStreamPartition();
-  Optional ssp = 
processingSspSetToDrain.stream()
-  .filter(sspInSet -> 
sspInSet.getSystemStream().equals(sspOfEnvelope.getSystemStream())
-  && 
sspInSet.getPartition().equals(sspOfEnvelope.getPartition()))
-  .findFirst();
-  ssp.ifPresent(processingSspSetToDrain::remove);
-}
 
 if (!hasIntermediateStreams) {

Review Comment:
   Changed this.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscr...@samza.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



[GitHub] [samza] ajothomas commented on a diff in pull request #1637: SAMZA-2765: [Pipeline Drain] Adding config for task callback timeout during drain

2022-11-02 Thread GitBox


ajothomas commented on code in PR #1637:
URL: https://github.com/apache/samza/pull/1637#discussion_r1012008633


##
samza-core/src/main/java/org/apache/samza/task/TaskCallbackManager.java:
##
@@ -76,7 +76,7 @@ List update(TaskCallbackImpl cb) {
   private final TaskCallbacks completedCallbacks = new TaskCallbacks();
   private final ScheduledExecutorService timer;
   private final TaskCallbackListener listener;
-  private final long timeout;
+  private long timeout;

Review Comment:
   Fixed this.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscr...@samza.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



[GitHub] [samza] xinyuiscool commented on a diff in pull request #1639: SAMZA-2741: [Pipeline Drain] Fix processing of Drain messages for High-Level and Low-level API

2022-11-01 Thread GitBox


xinyuiscool commented on code in PR #1639:
URL: https://github.com/apache/samza/pull/1639#discussion_r1010901435


##
samza-test/src/test/java/org/apache/samza/test/drain/DrainHighLevelApiIntegrationTest.java:
##
@@ -74,20 +74,30 @@ public void describe(StreamApplicationDescriptor 
appDescriptor) {
   .map(KV::getValue)
   .partitionBy(PageView::getMemberId, pv -> pv,
   KVSerde.of(new IntegerSerde(), new 
TestTableData.PageViewJsonSerde()), "p1")
+  .map(kv -> KV.of(kv.getKey() * 31, kv.getValue()))
+  .partitionBy(KV::getKey, KV::getValue, KVSerde.of(new 
IntegerSerde(), new TestTableData.PageViewJsonSerde()), "p2")
   .sink((m, collector, coordinator) -> {
 RECEIVED.add(m.getValue());
   });
 }
   }
 
-  // The test can be occasionally flaky, so we set Ignore annotation
-  // Remove ignore annotation and run the test as follows:
-  // ./gradlew :samza-test:test --tests 
org.apache.samza.test.drain.DrainHighLevelApiIntegrationTest -PscalaSuffix=2.12
+  /**
+   * This test will test drain and consumption of some messages from the 
in-memory topic.
+   * In order to simulate the real-world behaviour of drain, the test adds 
messages to the in-memory topic buffer periodically
+   * in a delayed fashion instead of all at once. The test then writes the 
drain notification message to the in-memory
+   * metadata store to drain and stop the pipeline. This write is done shortly 
after the pipeline starts and before all
+   * the messages are written to the topic's buffer. As a result, the total 
count of the processed messages will be less
+   * than the expected count of messages.
+   * */
   @Ignore
   @Test
-  public void testPipeline() {
+  public void testDrain() {

Review Comment:
   Seems most of the integration tests are ignored due to relying on wall-clock 
time. Is it possible to enable a couple and we don't need to do this timed 
wait? Not sure this TestRunner support waitForFinish(). but samza runners have 
this api so in theory we should be able to use it.



##
samza-core/src/main/java/org/apache/samza/container/RunLoop.java:
##
@@ -875,49 +876,40 @@ private boolean shouldDrain() {
 return false;
   }
 
-  if (!pendingEnvelopeQueue.isEmpty()) {
-PendingEnvelope pendingEnvelope = pendingEnvelopeQueue.peek();
-IncomingMessageEnvelope envelope = pendingEnvelope.envelope;
+  if (pendingEnvelopeQueue.size() > 0) {
+final PendingEnvelope pendingEnvelope = pendingEnvelopeQueue.peek();
+final IncomingMessageEnvelope envelope = pendingEnvelope.envelope;
 
 if (envelope.isDrain()) {
   final DrainMessage message = (DrainMessage) envelope.getMessage();
   if (!message.getRunId().equals(runId)) {
-// Removing the drain message from the pending queue as it doesn't 
match with the current runId
-// Removing it will ensure that it is not picked up by process()
-pendingEnvelopeQueue.remove();
+// Removing the drain message from the pending queue as it doesn't 
match with the current deployment
+final PendingEnvelope discardedDrainMessage = 
pendingEnvelopeQueue.remove();
+
consumerMultiplexer.tryUpdate(discardedDrainMessage.envelope.getSystemStreamPartition());
   } else {
+// Found drain message matching the current deployment
+
 // set the RunLoop to drain mode
 if (!isDraining) {
   drain();
 }
 
-if (elasticityFactor <= 1) {
-  SystemStreamPartition ssp = envelope.getSystemStreamPartition();
-  processingSspSetToDrain.remove(ssp);
-} else {
-  // SystemConsumers will write only one envelope (enclosing 
DrainMessage) per SSP in its buffer.
-  // This envelope doesn't have keybucket info it's SSP. With 
elasticity, the same SSP can be processed by
-  // multiple tasks. Therefore, if envelope contains drain 
message, the ssp of envelope should be removed
-  // from task's processing set irrespective of keyBucket.
-  SystemStreamPartition sspOfEnvelope = 
envelope.getSystemStreamPartition();
-  Optional ssp = 
processingSspSetToDrain.stream()
-  .filter(sspInSet -> 
sspInSet.getSystemStream().equals(sspOfEnvelope.getSystemStream())
-  && 
sspInSet.getPartition().equals(sspOfEnvelope.getPartition()))
-  .findFirst();
-  ssp.ifPresent(processingSspSetToDrain::remove);
-}
 
 if (!hasIntermediateStreams) {
-  // Don't remove from the pending queue as we want the DAG to 
pick up Drain message and propagate it to
-  // intermediate streams
+  // The flow below only applies to samza low-level API
+
+  // For high-level API, we do not remove the 

[GitHub] [samza] ajothomas opened a new pull request, #1639: SAMZA-2741: Pipeline Drain- Fix processing of Drain messages for High-Level and Low-level API

2022-11-01 Thread GitBox


ajothomas opened a new pull request, #1639:
URL: https://github.com/apache/samza/pull/1639

   # Symptoms and Cause:
   There were a few issues with the processing of drain messages that this PR 
attempts to fix. The following issues were encountered during the end-to-end 
testing with test pipelines.
   
   1. It was observed that the `RunLoop` was running into an exception when the 
tasks were trying to consume from the intermediate stream.
   This was happening as we stop all consumers when we encounter drain. This 
also stops the consumers which ingest data from the intermediate streams. 
   
   After 1 was resolved, some additional issues were observed-
   2. The pipeline was getting stuck after partial consumption of data after 
the last shuffle stage. 
   
   This was happening as we were removing the SSPs from the processing set of 
SSPs for a task for high-level API processing. This was leading to unprocessed 
drain messages and thereby the pipeline would remain stuck.
   
   3. The pipeline was stuck if a drain message was encountered prior to the 
start of the pipeline. This is happening as we set the `SystemConsumers` to 
drain mode before it is even started by `SamzaContainer`
   
   # Changes:
   1. Changed the `SystemConsumers` code to remove the line which stopped 
registered `SystemConsumer`s on drain. 
   
   2. Restrict the logic to remove the SSP from the processing SSP set only to 
high-level API. Additionally, ask the `TaskCoordinator` to commit the task once 
all streams for a task have drained.
   
   3. To fix the issue of pipeline getting stuck if a drain control message was 
present prior to container start, we write drain and watermark control messages 
on `SystemConsumers` start if it is in drain mode.
   
   Additionally, Drain related integration tests 
(`DrainLowLevelApiIntegrationTest` and `DrainHighLevelApiIntegrationTest`) have 
been enabled as most of the flakiness and edge-cases have been fixed now. There 
is a small chance that both the tests can be flaky on rare occasions as 
`TestRunner.run` and in-memory metadata store writes are happening in separate 
threads. Despite adding a generous 5 seconds delay for metadata store writes, 
it is possible for the order to get mixed up which causes the test to fail. 
Added a test rule `RetryRule` to attempt a retry if the tests fail the first 
time.
   
   # Tests:
   - End to end tests for both high-level and low-level API
   - Unit tests for Drain in RunLoop
   - Integ test for Drain for High-level and Low-Level API- 
DrainLowLevelApiIntegrationTest & DrainHighLevelApiIntegrationTest
   
   # API changes:
   None
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscr...@samza.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



[GitHub] [samza] alnzng opened a new pull request, #1638: SAMZA-2764: add the metric "container-active-threads" back

2022-11-01 Thread GitBox


alnzng opened a new pull request, #1638:
URL: https://github.com/apache/samza/pull/1638

   # Symptom
   
   No data was emitted for the metric `container-active-threads`
   
   # Cause
   
   This PR(#1501 ) removed the logic to emit data for metric accidently: 
https://github.com/apache/samza/pull/1501/files#diff-f79781ad4c55ae7860829b06fd9dfd15e8069c37e64f8854d8f27ca2cd1f3ee5L637
 
   
   # Changes
   
   - Add the deleted logic back in a new class`SamzaContainerMonitorListener` 
   
   # Tests
   
   - Unit tests
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscr...@samza.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



[GitHub] [samza] mynameborat commented on a diff in pull request #1637: SAMZA-2765: [Pipeline Drain] Adding config for task callback timeout during drain

2022-10-31 Thread GitBox


mynameborat commented on code in PR #1637:
URL: https://github.com/apache/samza/pull/1637#discussion_r1010012333


##
samza-core/src/main/java/org/apache/samza/task/TaskCallbackManager.java:
##
@@ -76,7 +76,7 @@ List update(TaskCallbackImpl cb) {
   private final TaskCallbacks completedCallbacks = new TaskCallbacks();
   private final ScheduledExecutorService timer;
   private final TaskCallbackListener listener;
-  private final long timeout;
+  private long timeout;

Review Comment:
   In general, its anti pattern to override instance variables post 
construction. We should probably find another way to enforce timeout for drain.



##
samza-core/src/main/java/org/apache/samza/config/TaskConfig.java:
##
@@ -90,6 +90,13 @@ public class TaskConfig extends MapConfig {
   // timeout for triggering a callback
   public static final String CALLBACK_TIMEOUT_MS = "task.callback.timeout.ms";
   static final long DEFAULT_CALLBACK_TIMEOUT_MS = -1L;
+
+  // timeout for triggering a callback during drain
+  public static final String DRAIN_CALLBACK_TIMEOUT_MS = 
"task.callback.drain.timeout.ms";
+
+  // default timeout for triggering a callback during drain
+  static final long DEFAULT_DRAIN_CALLBACK_TIMEOUT_MS = 
Duration.ofMinutes(30).toMillis();

Review Comment:
   The default seems too high for applications to get notified in case of 
faults. What is the rationale behind choosing such a high value? Is there a 
systematic way to compute the value here? 
   
   Like what are the downstream dependencies of this function and what SLAs to 
account for when computing the value for this. Document all of these as part of 
the configuration documentation as well.



##
samza-core/src/main/java/org/apache/samza/task/TaskCallbackManager.java:
##
@@ -127,4 +127,14 @@ public List 
updateCallback(TaskCallbackImpl callback) {
   return ImmutableList.of(callback);
 }
   }
+
+  /**
+   * Override the timeout set in the callback manager with the given new 
timeout.
+   * This is intended to be used with pipeline drain as we want to override 
the existing timeout with a higher timeout.
+   *
+   * @param timeout new timeout for process callbacks
+   * */
+  public void updateTaskCallbackTimeout(long timeout) {

Review Comment:
   Why are we overriding the existing timeout set as part of the construction 
of the class? This is confusing and breaks the timeout originally configured 
that gets passed to the callback manager (which is `task.callback.timeout.ms`).
   
   With this change, we are changing user behavior with setting the default 
message processing timeout latency to 30 minutes.
   



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscr...@samza.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



[GitHub] [samza] ajothomas opened a new pull request, #1637: SAMZA-2765: [Pipeline Drain] Adding config for task callback timeout during drain

2022-10-31 Thread GitBox


ajothomas opened a new pull request, #1637:
URL: https://github.com/apache/samza/pull/1637

   # Improvement:
   
   This PR is a part of the Pipeline Drain work and aims to provide a way to 
specify a task callback timeout to be used for pipeline drain. The standard 
`task.callback.timeout.ms` parameter might be too small for the drain operation 
to complete. Drain operation will involve clearing of intermediate state which 
might require a larger timeout for the task callback. We need a way to provide 
a configurable timeout to be used during drain operation.
   
   # Changes:
   - Add a new config `task.callback.drain.timeout.ms` in `TaskConfig`
   - Added logic to override `TaskCallbackManager`'s timeout parameter in 
`TaskWorker`s on drain.
   
   # Tests:
   
   - Unit test changes for `TaskConfig`
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscr...@samza.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



[GitHub] [samza] Treydone commented on pull request #1633: Add GraalSystems in "powered-by"

2022-10-31 Thread GitBox


Treydone commented on PR #1633:
URL: https://github.com/apache/samza/pull/1633#issuecomment-1297695724

   Sorry, I can't build the site locally... Even from the master branch with 
the command `bundle exec jekyll build,` I got an error 
   
   ```
   jekyll 3.4.5 | Error:  No header received back.
   Traceback (most recent call last):
   49: from /usr/local/bin/jekyll:23:in `'
   48: from /usr/local/bin/jekyll:23:in `load'
   47: from /var/lib/gems/2.7.0/gems/jekyll-3.4.5/exe/jekyll:13:in 
`'
   46: from 
/var/lib/gems/2.7.0/gems/mercenary-0.3.6/lib/mercenary.rb:19:in `program'
   45: from 
/var/lib/gems/2.7.0/gems/mercenary-0.3.6/lib/mercenary/program.rb:42:in `go'
   44: from 
/var/lib/gems/2.7.0/gems/mercenary-0.3.6/lib/mercenary/command.rb:220:in 
`execute'
   43: from 
/var/lib/gems/2.7.0/gems/mercenary-0.3.6/lib/mercenary/command.rb:220:in `each'
   42: from 
/var/lib/gems/2.7.0/gems/mercenary-0.3.6/lib/mercenary/command.rb:220:in `block 
in execute'
   41: from 
/var/lib/gems/2.7.0/gems/jekyll-3.4.5/lib/jekyll/commands/build.rb:16:in `block 
(2 levels) in init_with_program'
   40: from 
/var/lib/gems/2.7.0/gems/jekyll-3.4.5/lib/jekyll/commands/build.rb:34:in 
`process'
   39: from 
/var/lib/gems/2.7.0/gems/jekyll-3.4.5/lib/jekyll/commands/build.rb:63:in `build'
   38: from 
/var/lib/gems/2.7.0/gems/jekyll-3.4.5/lib/jekyll/command.rb:26:in `process_site'
   37: from 
/var/lib/gems/2.7.0/gems/jekyll-3.4.5/lib/jekyll/site.rb:69:in `process'
   36: from 
/var/lib/gems/2.7.0/gems/jekyll-3.4.5/lib/jekyll/site.rb:190:in `render'
   35: from 
/var/lib/gems/2.7.0/gems/jekyll-3.4.5/lib/jekyll/site.rb:449:in `render_docs'
   34: from 
/var/lib/gems/2.7.0/gems/jekyll-3.4.5/lib/jekyll/site.rb:449:in `each'
   33: from 
/var/lib/gems/2.7.0/gems/jekyll-3.4.5/lib/jekyll/site.rb:450:in `block in 
render_docs'
   32: from 
/var/lib/gems/2.7.0/gems/jekyll-3.4.5/lib/jekyll/site.rb:450:in `each'
   31: from 
/var/lib/gems/2.7.0/gems/jekyll-3.4.5/lib/jekyll/site.rb:452:in `block (2 
levels) in render_docs'
   30: from 
/var/lib/gems/2.7.0/gems/jekyll-3.4.5/lib/jekyll/renderer.rb:82:in `run'
   29: from 
/var/lib/gems/2.7.0/gems/jekyll-3.4.5/lib/jekyll/renderer.rb:134:in 
`render_liquid'
   28: from 
/var/lib/gems/2.7.0/gems/jekyll-3.4.5/lib/jekyll/liquid_renderer/file.rb:26:in 
`render!'
   27: from 
/var/lib/gems/2.7.0/gems/jekyll-3.4.5/lib/jekyll/liquid_renderer/file.rb:47:in 
`measure_time'
   26: from 
/var/lib/gems/2.7.0/gems/jekyll-3.4.5/lib/jekyll/liquid_renderer/file.rb:27:in 
`block in render!'
   25: from 
/var/lib/gems/2.7.0/gems/jekyll-3.4.5/lib/jekyll/liquid_renderer/file.rb:40:in 
`measure_bytes'
   24: from 
/var/lib/gems/2.7.0/gems/jekyll-3.4.5/lib/jekyll/liquid_renderer/file.rb:28:in 
`block (2 levels) in render!'
   23: from 
/var/lib/gems/2.7.0/gems/liquid-3.0.6/lib/liquid/template.rb:222:in `render!'
   22: from 
/var/lib/gems/2.7.0/gems/liquid-3.0.6/lib/liquid/template.rb:209:in `render'
   21: from 
/var/lib/gems/2.7.0/gems/liquid-3.0.6/lib/liquid/template.rb:262:in 
`with_profiling'
   20: from 
/var/lib/gems/2.7.0/gems/liquid-3.0.6/lib/liquid/template.rb:210:in `block in 
render'
   19: from 
/var/lib/gems/2.7.0/gems/liquid-3.0.6/lib/liquid/block.rb:108:in `render'
   18: from 
/var/lib/gems/2.7.0/gems/liquid-3.0.6/lib/liquid/block.rb:122:in `render_all'
   17: from 
/var/lib/gems/2.7.0/gems/liquid-3.0.6/lib/liquid/block.rb:122:in `each'
   16: from 
/var/lib/gems/2.7.0/gems/liquid-3.0.6/lib/liquid/block.rb:135:in `block in 
render_all'
   15: from 
/var/lib/gems/2.7.0/gems/liquid-3.0.6/lib/liquid/profiler/hooks.rb:4:in 
`render_token_with_profiling'
   14: from 
/var/lib/gems/2.7.0/gems/liquid-3.0.6/lib/liquid/profiler.rb:80:in 
`profile_token_render'
   13: from 
/var/lib/gems/2.7.0/gems/liquid-3.0.6/lib/liquid/profiler/hooks.rb:5:in `block 
in render_token_with_profiling'
   12: from 
/var/lib/gems/2.7.0/gems/liquid-3.0.6/lib/liquid/block.rb:151:in `render_token'
   11: from 
/var/lib/gems/2.7.0/gems/jekyll-3.4.5/lib/jekyll/tags/highlight.rb:39:in 
`render'
   10: from 
/var/lib/gems/2.7.0/gems/jekyll-3.4.5/lib/jekyll/tags/highlight.rb:89:in 
`render_pygments'
9: from /usr/lib/ruby/2.7.0/forwardable.rb:235:in `highlight'
8: from 
/var/lib/gems/2.7.0/gems/pygments.rb-1.2.1/lib/pygments/popen.rb:236:in 
`highlight'
7: from 
/var/lib/gems/2.7.0/gems/pygments.rb-1.2.1/lib/pygments/popen.rb:254:in `mentos'
6: from /usr/lib/ruby/2.7.0/timeout.rb:110:in `timeout'
5: from /usr/lib/ruby/2.7.0/timeout.rb:33:in `catch'
4: from /usr/lib/ruby/2.7.0/timeout.rb:33:in `catch'
3: from 

[GitHub] [samza] mynameborat commented on pull request #1635: SAMZA-2763: Support worker JVM opts for Samza Beam portable mode

2022-10-26 Thread GitBox


mynameborat commented on PR #1635:
URL: https://github.com/apache/samza/pull/1635#issuecomment-1292317691

   We will revisit this when we have Samza Beam portable support in oSS


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscr...@samza.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



[GitHub] [samza] mynameborat closed pull request #1635: SAMZA-2763: Support worker JVM opts for Samza Beam portable mode

2022-10-26 Thread GitBox


mynameborat closed pull request #1635: SAMZA-2763: Support worker JVM opts for 
Samza Beam portable mode
URL: https://github.com/apache/samza/pull/1635


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscr...@samza.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



[GitHub] [samza] alnzng opened a new pull request, #1636: SAMZA-2762: new cpu usage metric which counts child processes usage

2022-10-25 Thread GitBox


alnzng opened a new pull request, #1636:
URL: https://github.com/apache/samza/pull/1636

   # Symptom
   
   We have observed that some use cases used quasar(TensorFlow framework) to do 
model inference and this framework spawn child processes(non-JVM) to run 
TensorFlow serving. These child processes were using high CPU usage(200%) 
however their CPU usage can't be captured by the existing CPU usage metric 
`process-cpu-usage`
   
   # Cause
   
   The existing metric `process-cpu-usage` metric was designed for capturing 
the [CPU usage for the JVM 
process](https://samza.apache.org/learn/documentation/1.6.0/operations/monitoring.html)
 only, it can't count the child processes(especially for non-JVM processes) 
usage.
   
   # Changes
   
   - Reply on [oshi framwork](https://www.oshi.ooo/) to capture the CPU usage 
for the JVM process and all its child processes, and create a new metric to 
display the total CPU usage.
   # API Changes
   
   - Added a new metric `total-process-cpu-usage` in `SamzaContainerMetrics` 
which is similar with [how we provided `physical-memory-mb` 
metric](https://github.com/apache/samza/pull/1530)
   
   # Tests
   
   - Unit tests
   - Tested with `samza-hello-samza` and verify the metric data points
   ![Screen Shot 2022-10-25 at 10 23 58 
PM](https://user-images.githubusercontent.com/59407935/197942249-21dae598-7e18-4bb0-88e5-6a752ca49765.png)
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscr...@samza.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



[GitHub] [samza] mynameborat opened a new pull request, #1635: SAMZA-2763: Support worker JVM opts for Samza Beam portable mode

2022-10-25 Thread GitBox


mynameborat opened a new pull request, #1635:
URL: https://github.com/apache/samza/pull/1635

   **Summary**: Support JVM options for worker process in Samza Beam portable 
mode
   **Description**: With portable mode support for Samza Beam, we want to tune 
and configure the JVM options for worker process. In this PR, we add support by 
introducing `worker.opts` configuration.
   **Changes**:
- Added `worker.opts` configuration
- Updated configuration table and website
   **API Changes**: None
   **Usage Instructions**: `worker.opts` can be used similar to other samza 
application configuration although it only applies to Samza Beam portable 
execution mode and is ignored otherwise.
   **Upgrade Instructions**: None


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscr...@samza.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



[GitHub] [samza] sborya merged pull request #1634: Update zookeeper and kazoo version

2022-10-17 Thread GitBox


sborya merged PR #1634:
URL: https://github.com/apache/samza/pull/1634


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscr...@samza.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



[GitHub] [samza] mynameborat commented on pull request #1633: Add GraalSystems in "powered-by"

2022-10-17 Thread GitBox


mynameborat commented on PR #1633:
URL: https://github.com/apache/samza/pull/1633#issuecomment-1281000897

   can you publish the website locally to make sure the changes look as 
expected? 


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscr...@samza.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



  1   2   3   4   5   6   7   8   9   10   >