luccadibe opened a new pull request, #2394:
URL: https://github.com/apache/systemds/pull/2394

   This PR aims to fix and optimize the HDF5Reader implementation in SystemDS 
so that it can correctly read the So2Sat LCZ42 dataset 
(https://mediatum.ub.tum.de/1454690).
   
   For this I added support for the filter pipeline and attribute message 
types from HDF5.
   N-dimensional matrices with n > 2 are flattened into 2D.
   I also added support for inferring the HDF5 format from the .h5 file extension.
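
   To illustrate the flattening, here is a minimal sketch. It assumes the first 
dimension is kept as the row count and all remaining dimensions are collapsed 
into columns in row-major order. The class and method names are hypothetical and 
are not taken from the PR, and the shape used in `main` is only an example.

```java
import java.util.Arrays;

public class FlattenSketch {

    // Hypothetical helper: keeps dims[0] as rows and multiplies the
    // remaining dimensions together to get the column count.
    static long[] flattenedShape(long[] dims) {
        if (dims.length == 0)
            throw new IllegalArgumentException("need at least one dimension");
        long cols = 1;
        for (int i = 1; i < dims.length; i++)
            cols = Math.multiplyExact(cols, dims[i]); // throws on overflow
        return new long[] { dims[0], cols };
    }

    // Hypothetical helper: column index in the flattened matrix for an
    // n-dimensional index tuple, assuming row-major (C-style) order.
    // idx[0] is the row and is not part of the column offset.
    static long flattenedColumn(long[] dims, long[] idx) {
        long col = 0;
        for (int i = 1; i < dims.length; i++)
            col = col * dims[i] + idx[i];
        return col;
    }

    public static void main(String[] args) {
        long[] dims = { 1000, 32, 32, 10 }; // example shape, assumed
        System.out.println(Arrays.toString(flattenedShape(dims)));
        System.out.println(flattenedColumn(dims, new long[] { 0, 1, 2, 3 }));
    }
}
```

   Under these assumptions a 1000 x 32 x 32 x 10 dataset becomes a 1000 x 10240 
matrix, and element (r, 1, 2, 3) lands in column 343 of row r.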
   
   Apologies for the massive PR :face_with_head_bandage:.
   I benchmarked the performance of the new implementation and shared the 
results in this repo:
   https://github.com/luccadibe/systemds-hdf5-reader-benchmark
   
   The code still needs some work regarding code style and formatting. I am 
not sure whether I set up the formatter correctly as described in CONTRIBUTING.md: 
in some files I was getting a huge diff, so I tried to format only what I 
touched.
   
   I am unsure how best to split this into multiple PRs, or whether that is 
even wanted.
   I would appreciate some general feedback on this.

