Fengzdadi opened a new issue, #90:
URL: https://github.com/apache/datasketches-go/issues/90

   ### Description
   
   I'd like to contribute an implementation of `ReservoirLongsSketch` to the Go 
library. This would address the ❌ status for "ReservoirLongsSketch" in the 
README.
   
   ### Proposed Implementation
   
   Based on the Java reference implementation, I've created:
   
   | File | Description |
   |------|-------------|
   | `sampling/reservoir_longs_sketch.go` | Core reservoir sampling for `int64` values |
   | `sampling/reservoir_longs_union.go` | Union for merging multiple sketches |
   | `sampling/reservoir_longs_sketch_test.go` | Unit tests (11 tests) |
   | `examples/reservoir_example_test.go` | Usage examples |
   
   ### Algorithm
   
   The classic **Reservoir Sampling** algorithm (Vitter's Algorithm R):
   
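   1. **Initial Phase** (n < k, where n is the number of items seen so far): Store every item
   2. **Steady State** (n ≥ k): Accept the next item with probability k/(n+1); an accepted item replaces a uniformly random item in the reservoir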
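
The two phases above can be sketched in Go roughly as follows. This is a minimal illustration of Algorithm R, not the proposed implementation: the `reservoir` type, its fields, and the `newReservoir` and `update` helpers are hypothetical names chosen for this example. The counter is incremented before the random draw, so the acceptance probability is k/n with n including the incoming item.

```go
package main

import (
	"fmt"
	"math/rand"
)

// reservoir is a hypothetical, simplified stand-in for the proposed sketch.
type reservoir struct {
	k    int        // maximum number of samples retained
	n    int64      // total number of items seen so far
	data []int64    // current samples, len(data) <= k
	rng  *rand.Rand // source of randomness (seeded for reproducibility)
}

// newReservoir creates an empty reservoir with capacity k.
func newReservoir(k int, seed int64) *reservoir {
	return &reservoir{
		k:    k,
		data: make([]int64, 0, k),
		rng:  rand.New(rand.NewSource(seed)),
	}
}

// update feeds one item through Algorithm R.
func (r *reservoir) update(item int64) {
	r.n++
	// Initial phase: the reservoir is not full yet, so keep every item.
	if len(r.data) < r.k {
		r.data = append(r.data, item)
		return
	}
	// Steady state: draw j uniformly from [0, n). The draw lands in [0, k)
	// with probability k/n, and in that case j is also a uniformly random
	// slot to overwrite.
	if j := r.rng.Int63n(r.n); j < int64(r.k) {
		r.data[j] = item
	}
}

func main() {
	r := newReservoir(10, 1)
	for i := int64(0); i < 1000; i++ {
		r.update(i)
	}
	fmt.Println(len(r.data), r.n) // prints "10 1000"
}
```

Using a single random draw for both the accept decision and the slot choice keeps the update to one RNG call per item; a real sketch would also need to handle serialization, unions, and validation of k.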
   
   ### API
   
   ```go
   // Create sketch with capacity k
   sketch, _ := sampling.NewReservoirLongsSketch(10)
   
   // Add items
   sketch.Update(42)
   
   // Get uniform random sample
   samples := sketch.GetSamples()
   
   // Serialization
   bytes, _ := sketch.ToByteArray()
   restored, _ := sampling.NewReservoirLongsSketchFromSlice(bytes)
   ```
   
   ### Feedback Requested
   
   I have a working implementation ready. Before submitting a PR, I'd 
appreciate feedback on:
   
   1. **Serialization Format**: I followed the general pattern from 
`PreambleUtil.java`. Should I verify cross-language compatibility with specific 
test cases?
   2. **Scope**: Should I include `ReservoirItemsSketch<T>` (generic version) 
in the same PR, or keep it as a separate contribution?
   3. **Any design concerns** with the current approach?
   
   ### Testing
   
   All tests pass locally:
   ```
   go test -v ./sampling/ ./examples/
   # 13 tests pass
   ```
   
   I'm happy to adjust the implementation based on your feedback!


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

