apurtell commented on pull request #3787: URL: https://github.com/apache/hbase/pull/3787#issuecomment-950378049
100M chaos test completed with 8 splits in total (the test cluster has a region size limit of 1GB) and 100% of the data verified. I will contribute the new `IntegrationTestLoadSmallValues` in another PR.

Training data, dictionary file, and table schema are described in the attached file from an earlier result. The training methodology is the same. What is different is that the schema was uploaded to a real cluster and 100M rows were generated instead of 10M. I also had to add the Zstandard dictionary file header to the resulting dictionary, an internal detail of how the `zstd` utility works, which is not really important. See [IntegrationTestLoadSmallValues.pdf](https://github.com/apache/hbase/files/7406089/IntegrationTestLoadSmallValues.pdf).

The values were loaded (compressed), regions were split, compacted, and moved (recompressed), and all cells were read back and verified (decompressed).

    2021-10-24 04:10:33,554 INFO [main] mapreduce.Job: Job job_local173556728_0002 completed successfully
    2021-10-24 04:10:33,558 INFO [main] mapreduce.Job: large read operations=0
        Map-Reduce Framework
            Map input records=100000000
            Map output records=0
            Map output bytes=0
            Map output materialized bytes=6
            Input split bytes=97
            Combine input records=0
            Combine output records=0
            Reduce input groups=0
            Reduce shuffle bytes=6
            Reduce input records=0
            Reduce output records=0
            Spilled Records=0
            Shuffled Maps =1
            Failed Shuffles=0
            Merged Map outputs=1
            GC time elapsed (ms)=2331
            Total committed heap usage (bytes)=1551892480
        Shuffle Errors
            BAD_ID=0
            CONNECTION=0
            IO_ERROR=0
            WRONG_LENGTH=0
            WRONG_MAP=0
            WRONG_REDUCE=0
        org.apache.hadoop.hbase.test.IntegrationTestLoadSmallValues$Counts
            REFERENCED=100000000
        File Input Format Counters
            Bytes Read=2022567586
        File Output Format Counters
            Bytes Written=0
    2021-10-24 04:10:33,559 INFO [main] test.IntegrationTestLoadSmallValues$Verify: REFERENCED: 100000000
    2021-10-24 04:10:33,560 INFO [main] test.IntegrationTestLoadSmallValues$Verify: UNREFERENCED: 0
    2021-10-24 04:10:33,560 INFO [main] test.IntegrationTestLoadSmallValues$Verify: CORRUPT: 0

/cc @virajjasani
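
For anyone who wants to script the same pass/fail check against the job output, here is a minimal sketch in Python. It is not part of the PR or of `IntegrationTestLoadSmallValues`. The function names are hypothetical, and it assumes that "100% verified" means `REFERENCED` equals `Map input records` while `UNREFERENCED` and `CORRUPT` are both zero, which is how the output above reads.

```python
import re

# Excerpt of the job output quoted above (only the lines this check needs).
LOG = """\
        Map input records=100000000
2021-10-24 04:10:33,559 INFO [main] test.IntegrationTestLoadSmallValues$Verify: REFERENCED: 100000000
2021-10-24 04:10:33,560 INFO [main] test.IntegrationTestLoadSmallValues$Verify: UNREFERENCED: 0
2021-10-24 04:10:33,560 INFO [main] test.IntegrationTestLoadSmallValues$Verify: CORRUPT: 0
"""


def parse_verify_counts(log_text):
    """Return (rows_read, counts); counts maps each Verify counter name to its value."""
    rows = re.search(r"Map input records=(\d+)", log_text)
    counts = {
        name: int(value)
        for name, value in re.findall(r"\$Verify: (\w+): (\d+)", log_text)
    }
    return (int(rows.group(1)) if rows else None), counts


def is_fully_verified(log_text):
    """Assumed criterion: every row read is REFERENCED, none UNREFERENCED or CORRUPT."""
    rows_read, counts = parse_verify_counts(log_text)
    return (
        rows_read is not None
        and counts.get("REFERENCED") == rows_read
        and counts.get("UNREFERENCED") == 0
        and counts.get("CORRUPT") == 0
    )


if __name__ == "__main__":
    print(parse_verify_counts(LOG))
    print(is_fully_verified(LOG))
```

A missing counter line makes the check fail rather than pass, since `counts.get(...)` returns `None` in that case.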
