noobarcitect edited a comment on issue #1586:
URL: https://github.com/apache/hudi/issues/1586#issuecomment-683853156
I have run into one more problem. The Hudi dataset generated by my Spark job contains a .hoodie directory and a default directory. I want to view it in the AWS Glue Data Catalog, so I am crawling the data with a Glue crawler. The roadblock is that the crawled table includes the extra Hudi metadata columns, such as _hoodie_commit_time, _hoodie_commit_seqno, _hoodie_record_key, etc. Is there a way to get a table that contains only the data columns, without these metadata columns? I could read the dataset as a DataFrame in another Spark job and drop the extra columns, but I want to avoid that.
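For reference, the workaround I would like to avoid (a second Spark job that reads the dataset and drops the metadata columns) would look roughly like the sketch below. It assumes that every Hudi metadata column starts with the `_hoodie_` prefix. The helper names `data_columns` and `read_without_hudi_metadata`, and the `base_path` argument, are hypothetical names chosen for illustration. Depending on the Hudi version, the load path may also need a partition glob appended.

```python
# Assumption: all Hudi metadata columns share this prefix.
HOODIE_META_PREFIX = "_hoodie_"


def data_columns(columns):
    """Return the column names that are not Hudi metadata columns, in order."""
    return [c for c in columns if not c.startswith(HOODIE_META_PREFIX)]


def read_without_hudi_metadata(spark, base_path):
    """Read a Hudi dataset and keep only the data columns.

    `spark` is an existing SparkSession with the Hudi bundle on the classpath;
    `base_path` is the dataset root (hypothetical; some Hudi versions expect
    a glob over the partition directories instead).
    """
    df = spark.read.format("org.apache.hudi").load(base_path)
    return df.select(*data_columns(df.columns))


if __name__ == "__main__":
    # Pure-Python check of the column filter; no Spark session needed.
    sample = ["_hoodie_commit_time", "_hoodie_record_key", "id", "name"]
    print(data_columns(sample))
```

The resulting DataFrame would then have to be written out again (for example as plain Parquet) for the crawler to pick it up, which is the extra step I am hoping to skip.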
