All,
I think I'm wrapping up the major tickets for 4.x. Nicholas is
dramatically revamping gRPC now.
I asked Claude to compare 3x with main. I have not verified these numbers
but they vibe as if correct (lol...).
● Summary: branch_3x vs main
Overall: -14,651 net lines (50,316 insertions, 64,967 deletions)
Major Wins ✅
| Area              | Change        | Notes                                     |
|-------------------|---------------|-------------------------------------------|
| tika-core         | -12,192 lines | Config system overhauled, pipes moved out |
| tika-config (XML) | -2,967 lines  | Bespoke XML serialization removed         |
| tika-fuzzing      | -3,279 lines  | Removed                                   |
| tika-batch        | -3,000+ lines | Removed (superseded by pipes)             |
| tika-eval         | -1,571 lines  | Simplified                                |
| tika-server       | -1,644 lines  | Simplified                                |
| tika-fork         | -2,475 lines  | Simplified/reorganized                    |
Additions
| Area | Change | Notes |
|--------------------|--------------|----------------------------|
| tika-serialization | +6,606 lines | New JSON config (Jackson) |
| tika-pipes-core | ~2,000 lines | Cleaner pipes architecture |
| tika-plugins-core | ~1,200 lines | PF4J plugin system |
The JSON/Jackson approach is paying off - you've reduced the config system
by ~3,000 lines while getting better tooling support. The pipes refactoring
into plugins is cleaner architecturally even if LOC is similar.
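
Since the numbers above are unverified, here is a rough sketch of how the headline figure could be rechecked locally. It assumes a checkout that has both `branch_3x` and `main` available and relies on `git diff --shortstat`; the helper names `parse_shortstat` and `net_lines` are made up for illustration and are not part of Tika.

```python
import re
import subprocess


def parse_shortstat(text):
    """Pull (insertions, deletions) out of `git diff --shortstat` output.

    Either count may be absent from git's output when it is zero,
    so a missing match is treated as 0.
    """
    ins = re.search(r"(\d+) insertions?\(\+\)", text)
    dels = re.search(r"(\d+) deletions?\(-\)", text)
    return (
        int(ins.group(1)) if ins else 0,
        int(dels.group(1)) if dels else 0,
    )


def net_lines(base, head, repo="."):
    """Return insertions minus deletions going from `base` to `head`."""
    result = subprocess.run(
        ["git", "diff", "--shortstat", base, head],
        cwd=repo,
        capture_output=True,
        text=True,
        check=True,
    )
    insertions, deletions = parse_shortstat(result.stdout)
    return insertions - deletions


if __name__ == "__main__":
    # Assumes the current directory is a clone with both refs present.
    print(net_lines("branch_3x", "main"))
```

With the figures quoted above, 50,316 insertions minus 64,967 deletions does come out to -14,651, so the summary line is at least internally consistent. The per-module rows would need a path-limited diff to check and are not covered by this sketch.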