steveloughran opened a new pull request, #6995:
URL: https://github.com/apache/hadoop/pull/6995

   
   1. The class WrappedIO has been extended with more filesystem operations
   
   - openFile()
   - PathCapabilities
   - StreamCapabilities
   - ByteBufferPositionedReadable
   
   All these static methods raise UncheckedIOExceptions rather than checked 
ones.
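The checked-to-unchecked wrapping can be illustrated with a small stand-alone sketch (class and method names here are illustrative, not the actual Hadoop signatures): a checked `IOException` is rethrown as `java.io.UncheckedIOException`, so callers invoking the methods reflectively need no checked-exception handling.

```java
import java.io.IOException;
import java.io.UncheckedIOException;

public class UncheckedWrapper {

    // A checked-exception API, standing in for a filesystem call.
    static long checkedOperation(boolean fail) throws IOException {
        if (fail) {
            throw new IOException("simulated I/O failure");
        }
        return 42L;
    }

    // Wrapper in the style of WrappedIO: same result, but any
    // IOException is rethrown as an UncheckedIOException.
    public static long operation(boolean fail) {
        try {
            return checkedOperation(fail);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(operation(false)); // prints 42
        try {
            operation(true);
        } catch (UncheckedIOException e) {
            System.out.println("caught: " + e.getCause().getMessage());
        }
    }
}
```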
   
   2. The adjacent class org.apache.hadoop.io.wrappedio.WrappedStatistics 
provides similar access to IOStatistics/IOStatisticsContext classes and 
operations.
   
   Allows callers to:
   * Get a serializable IOStatisticsSnapshot from an IOStatisticsSource or 
IOStatistics instance
   * Save an IOStatisticsSnapshot to file
   * Convert an IOStatisticsSnapshot to JSON
   * Given an object which may be an IOStatisticsSource, return an object whose 
toString() value is a dynamically generated, human readable summary. This is 
for logging.
   * Separate getters to the different sections of IOStatistics.
   * Mean values are returned as a `Map.Entry<Long, Long>` of (samples, sum), from 
which the mean may be calculated.
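As a sketch of how a caller might consume such a (samples, sum) pair (the use of `Map.Entry` as the concrete pair type is an assumption here):

```java
import java.util.AbstractMap;
import java.util.Map;

public class MeanFromPair {

    // Compute the arithmetic mean from a (samples, sum) pair,
    // guarding against a zero sample count.
    public static double mean(Map.Entry<Long, Long> pair) {
        long samples = pair.getKey();
        long sum = pair.getValue();
        return samples == 0 ? 0.0 : (double) sum / samples;
    }

    public static void main(String[] args) {
        Map.Entry<Long, Long> p =
            new AbstractMap.SimpleImmutableEntry<>(4L, 10L);
        System.out.println(mean(p)); // prints 2.5
    }
}
```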
   
   There are examples of the dynamic bindings to these classes in:
   
   org.apache.hadoop.io.wrappedio.impl.DynamicWrappedIO 
org.apache.hadoop.io.wrappedio.impl.DynamicWrappedStatistics
   
   These use DynMethods and other classes in the package 
org.apache.hadoop.util.dynamic which are based on the Apache Parquet 
equivalents.
   This simplifies re-implementing these bindings in that library and in others 
which have their own fork of the classes (example: Apache Iceberg).
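The dynamic-binding idea can be sketched with plain `java.lang.reflect` (the real DynMethods helpers are richer; this class and its method names are illustrative): look a method up by name once, and report it as unavailable rather than failing when the class or method is absent.

```java
import java.lang.reflect.Method;

public class DynamicBinding {

    private final Method method;  // null when unavailable

    // Bind to className.methodName(paramTypes), or mark as unavailable.
    public DynamicBinding(String className, String methodName,
                          Class<?>... paramTypes) {
        Method m = null;
        try {
            m = Class.forName(className).getMethod(methodName, paramTypes);
        } catch (ReflectiveOperationException e) {
            // class or method missing: leave m null, degrade gracefully
        }
        this.method = m;
    }

    public boolean available() {
        return method != null;
    }

    public Object invoke(Object target, Object... args) throws Exception {
        if (method == null) {
            throw new UnsupportedOperationException("method unavailable");
        }
        return method.invoke(target, args);
    }

    public static void main(String[] args) throws Exception {
        // Present: String.length()
        DynamicBinding length = new DynamicBinding("java.lang.String", "length");
        System.out.println(length.invoke("hello"));   // prints 5
        // Absent: no linkage error, just reported as unavailable
        DynamicBinding missing = new DynamicBinding("no.such.Class", "noop");
        System.out.println(missing.available());      // prints false
    }
}
```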
   
   3. The openFile() option `fs.option.openfile.read.policy` now supports specific 
read policies for the core file formats:
   
   * avro
   * columnar
   * csv
   * hbase
   * json
   * orc
   * parquet
   
   S3A chooses the appropriate sequential/random policy for each of these formats.
   
   A policy list `parquet, columnar, vector, random, adaptive` will use the parquet 
policy on any filesystem aware of it, falling back to the first entry in the 
list which the specific version of the filesystem recognizes.
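The fallback rule can be sketched as a first-match scan over the comma-separated list (the recognized policy sets below are illustrative, standing in for what a given filesystem release knows about):

```java
import java.util.Arrays;
import java.util.Set;

public class ReadPolicyFallback {

    // Return the first policy in the comma-separated list that this
    // (hypothetical) filesystem version recognizes, else a default.
    public static String choose(String policyList, Set<String> recognized) {
        return Arrays.stream(policyList.split(","))
            .map(String::trim)
            .filter(recognized::contains)
            .findFirst()
            .orElse("adaptive");
    }

    public static void main(String[] args) {
        // An older release that predates the "parquet" and "columnar"
        // policies falls back to "vector", the first entry it knows.
        Set<String> older = Set.of("sequential", "random", "adaptive", "vector");
        System.out.println(
            choose("parquet, columnar, vector, random, adaptive", older));
        // prints vector
    }
}
```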
   
   4. New path capability `fs.capability.virtual.block.locations`
   
   Indicates that locations are generated client side and don't refer to real 
hosts.
   
   Contributed by Steve Loughran
   
   
   