[ https://issues.apache.org/jira/browse/HADOOP-3315?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12668759#action_12668759 ]
stack commented on HADOOP-3315:
-------------------------------

Thanks for the above, especially the bit on concurrent access being blocked on FSDIS. Thanks too for the explanation of inBlockAdvance (yes, no objects are created, just resets and closes). Pardon me.

I'm wondering how you did your random access tests above? I see nowhere near your numbers. I'm seeing more like 2 to 6 random reads a second. Profiling, "100%" of the time is being spent in inBlockAdvance: ~70% in TFile$Reader.checkKey and ~20% in TFile$Reader.compareKeys (about half of this time is down in hadoop.fs doing seeking and reading).

My test writes 1M rows whose keys are zero-padded sequence numbers. Each row is 1k of random data (the file is about 1G; TFile blocks are 64k). The random access test fetches a random key from the 1-1M range.

Looking at TestTFileSeek, I see that this test writes a 30M file by default and then does 1000 seeks. It reports seek times of about 2.5ms on average. If I increase the file written to 1G, leaving all else the same, the seek time averages 120ms.

What do you think is going on here? My test against MapFile does about 417 random reads a second.

Thanks, Hong.

> New binary file format
> ----------------------
>
>                 Key: HADOOP-3315
>                 URL: https://issues.apache.org/jira/browse/HADOOP-3315
>             Project: Hadoop Core
>          Issue Type: New Feature
>          Components: io
>            Reporter: Owen O'Malley
>            Assignee: Amir Youssefi
>             Fix For: 0.21.0
>
>         Attachments: HADOOP-3315_20080908_TFILE_PREVIEW_WITH_LZO_TESTS.patch,
> HADOOP-3315_20080915_TFILE.patch, hadoop-trunk-tfile.patch,
> hadoop-trunk-tfile.patch, TFile Specification 20081217.pdf
>
>
> SequenceFile's block compression format is too complex and requires 4 codecs
> to compress or decompress. It would be good to have a file format that only
> needs

--
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
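
As a rough cross-check of the figures quoted in the comment, the sketch below converts between average per-read latency and reads per second so the TFile and MapFile numbers can be compared on one scale. It is only arithmetic on the reported values, not a benchmark. The helper names, the key width of 10 digits, and the assumption of a single thread issuing one read at a time are illustrative and not taken from the test code; the block estimates ignore key bytes, per-record overhead, and compression.

```python
def reads_per_second(avg_latency_ms: float) -> float:
    """Reads per second for a given average latency (ms per read).

    Assumes one thread issuing reads back to back (an assumption).
    """
    return 1000.0 / avg_latency_ms


def latency_ms(reads_per_sec: float) -> float:
    """Average latency in ms per read for a given read rate."""
    return 1000.0 / reads_per_sec


def make_key(seq: int, width: int = 10) -> bytes:
    """Zero-padded sequence-number key; the width is a hypothetical choice."""
    return str(seq).zfill(width).encode("ascii")


# Figures reported in the comment.
tfile_30m_rate = reads_per_second(2.5)    # TestTFileSeek, 30M file
tfile_1g_rate = reads_per_second(120.0)   # TestTFileSeek, 1G file
mapfile_latency = latency_ms(417.0)       # MapFile random reads

# Rough layout of the 1G test file with 64k blocks and 1k values.
block_size = 64 * 1024
approx_blocks = (1024 ** 3) // block_size
approx_rows_per_block = block_size // 1024

if __name__ == "__main__":
    print(f"TFile 30M: {tfile_30m_rate:.1f} reads/s")
    print(f"TFile 1G:  {tfile_1g_rate:.1f} reads/s")
    print(f"MapFile:   {mapfile_latency:.2f} ms/read")
    print(f"~{approx_blocks} blocks, ~{approx_rows_per_block} rows per block")
    print(make_key(42))
```

On this scale the 30M TestTFileSeek result (2.5ms per seek) is in the same range as the MapFile rate, while the 1G result (120ms per seek) is close to the upper end of the 2 to 6 reads a second seen in the custom test.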