apurtell commented on pull request #3787:
URL: https://github.com/apache/hbase/pull/3787#issuecomment-950378049


   The 100M-row chaos test completed with 8 region splits in total (the test 
cluster caps region size at 1GB) and 100% of the data verified. I will 
contribute the new `IntegrationTestLoadSmallValues` in a separate PR. 
   
   Training data, dictionary file, and table schema are described in the 
attached file from an earlier result. The training methodology is the same. 
The differences are that the schema was uploaded to a real cluster and 100M 
rows were generated instead of 10M. I also had to add the Zstandard dictionary 
file header to the resulting dictionary, an internal detail of how the `zstd` 
utility works that is not really important here. See 
[IntegrationTestLoadSmallValues.pdf](https://github.com/apache/hbase/files/7406089/IntegrationTestLoadSmallValues.pdf)
   
   The values were loaded (compressed), regions were split, compacted, and 
moved (recompressed), and all cells were read back and verified (decompressed). 
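
As a rough illustration of why a shared dictionary helps with small values, and of the compress / decompress round trip the test exercises, here is a minimal sketch. It is not the HBase or Zstandard code path: it swaps in the preset-dictionary feature of Python's standard `zlib` module as a stand-in, and the names `DICTIONARY`, `VALUE`, `compress`, and `decompress` are hypothetical, chosen only for this example.

```python
import zlib

# Hypothetical "trained" dictionary: byte sequences assumed to recur in values.
# (Stand-in for a Zstandard dictionary; zlib's preset dictionary is used here.)
DICTIONARY = b'{"user_id": , "status": "active", "region": "us-west-2"}'

# Hypothetical small value that shares most of its bytes with the dictionary.
VALUE = b'{"user_id": 12345, "status": "active", "region": "us-west-2"}'


def compress(value, zdict=None):
    """Compress one value, optionally priming the compressor with a dictionary."""
    if zdict is None:
        comp = zlib.compressobj(level=9)
    else:
        comp = zlib.compressobj(level=9, zdict=zdict)
    return comp.compress(value) + comp.flush()


def decompress(blob, zdict=None):
    """Decompress one value; the same dictionary used to compress is required."""
    if zdict is None:
        decomp = zlib.decompressobj()
    else:
        decomp = zlib.decompressobj(zdict=zdict)
    return decomp.decompress(blob) + decomp.flush()


plain = compress(VALUE)
with_dict = compress(VALUE, DICTIONARY)

# Round trip: what was written must be read back byte for byte.
assert decompress(with_dict, DICTIONARY) == VALUE
print(len(VALUE), len(plain), len(with_dict))
```

For a value this small, the output without a dictionary is about as large as the input, while the dictionary-primed output is noticeably smaller, because most of the value can be encoded as references into the dictionary.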
   
       2021-10-24 04:10:33,554 INFO  [main] mapreduce.Job: Job job_local173556728_0002 completed successfully
       2021-10-24 04:10:33,558 INFO  [main] mapreduce.Job:  large read operations=0
        Map-Reduce Framework
                Map input records=100000000
                Map output records=0
                Map output bytes=0
                Map output materialized bytes=6
                Input split bytes=97
                Combine input records=0
                Combine output records=0
                Reduce input groups=0
                Reduce shuffle bytes=6
                Reduce input records=0
                Reduce output records=0
                Spilled Records=0
                Shuffled Maps =1
                Failed Shuffles=0
                Merged Map outputs=1
                GC time elapsed (ms)=2331
                Total committed heap usage (bytes)=1551892480
        Shuffle Errors
                BAD_ID=0
                CONNECTION=0
                IO_ERROR=0
                WRONG_LENGTH=0
                WRONG_MAP=0
                WRONG_REDUCE=0
        org.apache.hadoop.hbase.test.IntegrationTestLoadSmallValues$Counts
                REFERENCED=100000000
        File Input Format Counters 
                Bytes Read=2022567586
        File Output Format Counters 
                Bytes Written=0
       2021-10-24 04:10:33,559 INFO  [main] test.IntegrationTestLoadSmallValues$Verify: REFERENCED: 100000000
       2021-10-24 04:10:33,560 INFO  [main] test.IntegrationTestLoadSmallValues$Verify: UNREFERENCED: 0
       2021-10-24 04:10:33,560 INFO  [main] test.IntegrationTestLoadSmallValues$Verify: CORRUPT: 0
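
The pass condition implied by the log above can be sketched as a small check. The counter names come from the log; the rule itself (every generated row is `REFERENCED`, with zero `UNREFERENCED` and zero `CORRUPT`) is an assumption read off this comment, and the helper names `parse_verify_counts` and `is_fully_verified` are hypothetical.

```python
import re

# The three summary lines from the Verify step, copied from the log above.
LOG = """\
2021-10-24 04:10:33,559 INFO  [main] test.IntegrationTestLoadSmallValues$Verify: REFERENCED: 100000000
2021-10-24 04:10:33,560 INFO  [main] test.IntegrationTestLoadSmallValues$Verify: UNREFERENCED: 0
2021-10-24 04:10:33,560 INFO  [main] test.IntegrationTestLoadSmallValues$Verify: CORRUPT: 0
"""

COUNT_PATTERN = re.compile(r"\$Verify: (REFERENCED|UNREFERENCED|CORRUPT): (\d+)")


def parse_verify_counts(log_text):
    """Extract the Verify counters from log text into a name -> count dict."""
    return {name: int(value) for name, value in COUNT_PATTERN.findall(log_text)}


def is_fully_verified(counts, expected_rows):
    """Assumed rule: all rows referenced, none unreferenced, none corrupt."""
    return (
        counts.get("REFERENCED") == expected_rows
        and counts.get("UNREFERENCED") == 0
        and counts.get("CORRUPT") == 0
    )


counts = parse_verify_counts(LOG)
print(counts, is_fully_verified(counts, expected_rows=100_000_000))
```

Using `counts.get(...)` means a missing counter is treated as a failure rather than silently passing.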
   
   /cc @virajjasani 


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.


