tanmayrauth commented on PR #1131:
URL: https://github.com/apache/iceberg-go/pull/1131#issuecomment-4562558140

   @laskoviymishka — fair points across the board. Honestly I had some doubts 
about the scope of "cross-client" before I opened this and should have 
clarified up front rather than letting you spell it out in  review. Sorry for 
the round trip. Four things I want to lock down so the rewrite lands in one 
pass:
     
     1. Disposition of the three current tests. My read: drop 
TestCrossClientDVWriterProducesSpecCompliantBlob and 
TestCrossClientMultiFileDVPuffin — they're Go-writes-Go-reads and already 
covered by TestDVWriterSingleDataFile / TestDVWriterMultipleDataFiles — and 
replace them with two tests that read Java-produced Puffin fixtures 
(single-blob and multi-blob) end-to-end through ReadDV. For 
TestCrossClientGoSerializeMatchesJavaFormat, scope-down + rename rather than 
delete — keep it narrowly as "Go envelope matches Java for this 
single-array-container fixture" with a comment that explicitly calls out the 
RLE divergence. Or would you rather see Test 2 gone too in favour of a single 
Java-Puffin readback?
     2. Java generator location. I have a one-off GenerateDVFixtures.java + 
generate.sh that produces the .puffin bytes — it has to live inside an 
apache/iceberg checkout to compile against the package-private 
BitmapPositionDeleteIndex. Two options: commit it under dev/dv-fixtures/ for 
reproducibility, or keep it out and ship just the .puffin bytes + regen 
instructions in testdata/deletes/README.md. Preference?
     3. Envelope-fixture boundaries. Existing 64maphighvals.bin covers the 
>2^32 second-roaring-key case. Do you also want a fresh fixture crossing the 
4096-entry array→bitmap container threshold, or do 64mapspreadvals.bin / 
64map32bitvals.bin already cover mixed-container cases enough for your purposes?
     4. Boundary-crossing assertions. I'm planning 
read-back-and-assert-positions on the new fixtures rather than byte-equality, 
since BitmapPositionDeleteIndex.serialize()'s runLengthEncode() step makes 
byte-match brittle for any range-style fixture. Sound right?


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]


---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]

Reply via email to