Re: [PR] [QDP] Enhance numerical safety checks in preprocessor [mahout]

2026-01-14 Thread via GitHub


guan404ming merged PR #805:
URL: https://github.com/apache/mahout/pull/805


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]



Re: [PR] [QDP] Enhance numerical safety checks in preprocessor [mahout]

2026-01-14 Thread via GitHub


viiccwen commented on code in PR #805:
URL: https://github.com/apache/mahout/pull/805#discussion_r2691750983


##
qdp/qdp-core/tests/preprocessing.rs:
##
@@ -53,8 +53,8 @@ fn test_validate_input_empty_data() {
 
 #[test]
 fn test_validate_input_data_too_large() {
-let data = vec![1.0, 0.0, 0.0]; // 3 elements
-let result = Preprocessor::validate_input(&data, 1); // max size 2^1 = 2
+let data = vec![1.0, 0.0, 0.0];
+let result = Preprocessor::validate_input(&data, 1);

Review Comment:
   thx for pointing out. I'll undo it later.



##
qdp/qdp-core/tests/preprocessing.rs:
##
@@ -63,7 +63,7 @@ fn test_validate_input_data_too_large() {
 #[test]
 fn test_validate_input_allows_partial_state() {
 let data = vec![0.5, -0.5, 0.25];
-assert!(Preprocessor::validate_input(&data, 3).is_ok()); // state vector 
can hold up to 8 elements
+assert!(Preprocessor::validate_input(&data, 3).is_ok());

Review Comment:
   thx for pointing out. I'll undo it later.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]



Re: [PR] [QDP] Enhance numerical safety checks in preprocessor [mahout]

2026-01-14 Thread via GitHub


viiccwen commented on code in PR #805:
URL: https://github.com/apache/mahout/pull/805#discussion_r2691741479


##
qdp/qdp-core/tests/preprocessing.rs:
##
@@ -105,3 +105,48 @@ fn test_calculate_l2_norm_matches_sequential_sum() {
 let norm_sequential = data.iter().map(|x| x * x).sum::().sqrt();
 assert!((norm_parallel - norm_sequential).abs() < 1e-10);
 }
+
+#[test]
+fn test_calculate_l2_norm_invalid_values() {
+let cases = [
+("NaN", f64::NAN, "NaN"),
+("+Inf", f64::INFINITY, "Infinity"),
+("-Inf", f64::NEG_INFINITY, "Infinity"),
+];
+
+for (label, bad_value, expected_fragment) in cases {
+let data = vec![1.0, bad_value, 3.0];
+let result = Preprocessor::calculate_l2_norm(&data);
+assert!(
+matches!(result, Err(MahoutError::InvalidInput(msg)) if 
msg.contains(expected_fragment)),
+"case {label} did not produce expected error"
+);
+}
+}
+
+#[test]
+fn test_calculate_batch_l2_norms_invalid_values() {
+let cases = [
+("NaN", f64::NAN, "NaN"),
+("+Inf", f64::INFINITY, "Infinity"),
+("-Inf", f64::NEG_INFINITY, "Infinity"),
+];
+
+for (label, bad_value, expected_fragment) in cases {
+let batch_data = vec![1.0, 2.0, bad_value, 4.0];
+let result = Preprocessor::calculate_batch_l2_norms(&batch_data, 2, 2);
+assert!(
+matches!(result, Err(MahoutError::InvalidInput(msg)) if 
msg.contains(expected_fragment)),
+"case {label} did not produce expected error"
+);
+}
+}
+
+#[test]
+fn test_calculate_batch_l2_norms_success() {
+let batch_data = vec![3.0, 4.0, 5.0, 12.0];
+let norms = Preprocessor::calculate_batch_l2_norms(&batch_data, 2, 
2).unwrap();
+assert_eq!(norms.len(), 2);
+assert!((norms[0] - 5.0).abs() < 1e-10); // sqrt(3^2 + 4^2) = 5
+assert!((norms[1] - 13.0).abs() < 1e-10); // sqrt(5^2 + 12^2) = 13

Review Comment:
   Thx for the suggestion!
   
   I followed the existing pattern used earlier in the code, but I agree that 
using `approx::assert_relative_eq` would improve readability. 
   I’ll add `approx` as a dev-dependency and update the tests accordingly.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]



Re: [PR] [QDP] Enhance numerical safety checks in preprocessor [mahout]

2026-01-14 Thread via GitHub


ryankert01 commented on code in PR #805:
URL: https://github.com/apache/mahout/pull/805#discussion_r2691686912


##
qdp/qdp-core/tests/preprocessing.rs:
##
@@ -53,8 +53,8 @@ fn test_validate_input_empty_data() {
 
 #[test]
 fn test_validate_input_data_too_large() {
-let data = vec![1.0, 0.0, 0.0]; // 3 elements
-let result = Preprocessor::validate_input(&data, 1); // max size 2^1 = 2
+let data = vec![1.0, 0.0, 0.0];
+let result = Preprocessor::validate_input(&data, 1);

Review Comment:
   whats the reason of removing comments?



##
qdp/qdp-core/tests/preprocessing.rs:
##
@@ -105,3 +105,48 @@ fn test_calculate_l2_norm_matches_sequential_sum() {
 let norm_sequential = data.iter().map(|x| x * x).sum::().sqrt();
 assert!((norm_parallel - norm_sequential).abs() < 1e-10);
 }
+
+#[test]
+fn test_calculate_l2_norm_invalid_values() {
+let cases = [
+("NaN", f64::NAN, "NaN"),
+("+Inf", f64::INFINITY, "Infinity"),
+("-Inf", f64::NEG_INFINITY, "Infinity"),
+];
+
+for (label, bad_value, expected_fragment) in cases {
+let data = vec![1.0, bad_value, 3.0];
+let result = Preprocessor::calculate_l2_norm(&data);
+assert!(
+matches!(result, Err(MahoutError::InvalidInput(msg)) if 
msg.contains(expected_fragment)),
+"case {label} did not produce expected error"
+);
+}
+}
+
+#[test]
+fn test_calculate_batch_l2_norms_invalid_values() {
+let cases = [
+("NaN", f64::NAN, "NaN"),
+("+Inf", f64::INFINITY, "Infinity"),
+("-Inf", f64::NEG_INFINITY, "Infinity"),
+];
+
+for (label, bad_value, expected_fragment) in cases {
+let batch_data = vec![1.0, 2.0, bad_value, 4.0];
+let result = Preprocessor::calculate_batch_l2_norms(&batch_data, 2, 2);
+assert!(
+matches!(result, Err(MahoutError::InvalidInput(msg)) if 
msg.contains(expected_fragment)),
+"case {label} did not produce expected error"
+);
+}
+}
+
+#[test]
+fn test_calculate_batch_l2_norms_success() {
+let batch_data = vec![3.0, 4.0, 5.0, 12.0];
+let norms = Preprocessor::calculate_batch_l2_norms(&batch_data, 2, 
2).unwrap();
+assert_eq!(norms.len(), 2);
+assert!((norms[0] - 5.0).abs() < 1e-10); // sqrt(3^2 + 4^2) = 5
+assert!((norms[1] - 13.0).abs() < 1e-10); // sqrt(5^2 + 12^2) = 13

Review Comment:
   nit: although completely understand what its doing, we can also use 
`approx::assert_relative_eq` to enhance its readability.



##
qdp/qdp-core/tests/preprocessing.rs:
##
@@ -63,7 +63,7 @@ fn test_validate_input_data_too_large() {
 #[test]
 fn test_validate_input_allows_partial_state() {
 let data = vec![0.5, -0.5, 0.25];
-assert!(Preprocessor::validate_input(&data, 3).is_ok()); // state vector 
can hold up to 8 elements
+assert!(Preprocessor::validate_input(&data, 3).is_ok());

Review Comment:
   ditto.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]



Re: [PR] [QDP] Enhance numerical safety checks in preprocessor [mahout]

2026-01-12 Thread via GitHub


viiccwen commented on PR #805:
URL: https://github.com/apache/mahout/pull/805#issuecomment-3742346094

   @ryankert01 Thx for pointing that out, and apologies for not discussing this 
first before opening the PR.
   
   Sure, even if we don’t have a clear release/debug distinction here, an O(1) 
validation step could still make sense.
   I should have raised this for discussion earlier instead of jumping straight 
into an implementation.
   
   I’ll make sure to bring these kinds of changes up for discussion first and 
avoid wasting others time : ) 
   Thanks again for the feedback!
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]



Re: [PR] [QDP] Enhance numerical safety checks in preprocessor [mahout]

2026-01-12 Thread via GitHub


viiccwen commented on PR #805:
URL: https://github.com/apache/mahout/pull/805#issuecomment-3742327615

   > thanks for the patch @viiccwen ! Doing validate_input and then 
calculate_l2_norm in debug mode is basically two full O(N) passes. Once you get 
into ~20+ qubits, that can definitely make local runs feel sluggish. I think we 
could do with the IEEE 754 behavior approach: NaNs/Infs will naturally "poison" 
the sum during the norm calculation, so you're already getting most of that 
validation for free:
   > 
   > * Any NaN in the input → sum of squares becomes NaN
   > * Any Inf in the input → sum becomes Inf
   > 
   > So yeah, in most cases we can drop the separate check_numerical_safety 
loop and just rely on the end-of-calculate_l2_norm guard (lines 86–96) to catch:
   > 
   > 1. Invalid inputs (NaN/Inf)
   > 2. Overflow during accumulation
   > 
   > That gives you a single O(N) pass, keeps things cleaner, and speeds up 
debug builds without affecting release. WDYT?
   
   Agree. I hadn’t considered IEEE 754 propagation behavior. It'll be removed 
later.
   And also agree to keep the nan/inf guard, although we've checked nan/inf 
situation later stages:
   
   
https://github.com/apache/mahout/blob/bff0b88ef38808263a65a24a8d2ee2352bd0e1b4/qdp/qdp-kernels/src/amplitude.cu#L147-L149
   
   It should help catch invalid values early in the pipeline as well : ) 
   Thanks for the suggestion!


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]



Re: [PR] [QDP] Enhance numerical safety checks in preprocessor [mahout]

2026-01-12 Thread via GitHub


rich7420 commented on PR #805:
URL: https://github.com/apache/mahout/pull/805#issuecomment-3741575096

   Doing validate_input and then calculate_l2_norm in debug mode is basically 
two full O(N) passes. Once you get into ~20+ qubits, that can definitely make 
local runs feel sluggish.
   I think we could do with the IEEE 754 behavior approach: NaNs/Infs will 
naturally "poison" the sum during the norm calculation, so you're already 
getting most of that validation for free:
   
   - Any NaN in the input → sum of squares becomes NaN
   - Any Inf in the input → sum becomes Inf
   
   So yeah, in most cases we can drop the separate check_numerical_safety loop 
and just rely on the end-of-calculate_l2_norm guard (lines 86–96) to catch:
   
   1. Invalid inputs (NaN/Inf)
   2. Overflow during accumulation
   
   That gives you a single O(N) pass, keeps things cleaner, and speeds up debug 
builds without affecting release.
   WDYT?


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]



Re: [PR] [QDP] Enhance numerical safety checks in preprocessor [mahout]

2026-01-11 Thread via GitHub


viiccwen commented on PR #805:
URL: https://github.com/apache/mahout/pull/805#issuecomment-3736856702

   Thx! 🙌


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]



Re: [PR] [QDP] Enhance numerical safety checks in preprocessor [mahout]

2026-01-11 Thread via GitHub


ryankert01 commented on PR #805:
URL: https://github.com/apache/mahout/pull/805#issuecomment-3736840162

   Not sure if we have a release build, but validate norm is O(1), so I think 
we can have that! Not sure how it should be done though. 
   Let me cc a quantum guy @rich7420 


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]



Re: [PR] [QDP] Enhance numerical safety checks in preprocessor [mahout]

2026-01-11 Thread via GitHub


viiccwen commented on PR #805:
URL: https://github.com/apache/mahout/pull/805#issuecomment-3736832891

   Sure, what if we enable it only in dev environment? 🤔
   we get strict validation during development, but it compiles away to a 
zero-cost op in release builds.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]



Re: [PR] [QDP] Enhance numerical safety checks in preprocessor [mahout]

2026-01-11 Thread via GitHub


ryankert01 commented on PR #805:
URL: https://github.com/apache/mahout/pull/805#issuecomment-3736822479

   To my knowledge, pennylane do not validate, although it's a nice-to-have. 
It's a O(N), so I will suggest we do it later than benchmarking.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]