yigal-rozenberg commented on issue #1582: URL: https://github.com/apache/iceberg-python/issues/1582#issuecomment-2631515059
Hi @Fokko,

File encryption is essential functionality that adds an important layer of data integrity and security at the file level. Data-centric security addresses a different challenge in data privacy and security. The idea is to protect and classify data at creation and carry those security properties across all systems, without the need to re-encrypt or reclassify it. Traditional data security solutions rely on data discovery and classification based on the location of the data (server.db.schema.table.column), and on continuously tracking and re-identifying data as it flows through the organization.

Once the engine identifies the base protected data type as protected, and not just as a structure or binary buffer, the engine layer can set min/max and even aggregation values in Puffin files to further improve performance. Additionally, these Puffin files can be extended to include bloom filters (based on the clear-text values) and even inverted indexes to allow wildcard and other operations over the clear-text data representation (this is where file encryption is important).

I am still trying to understand the code base and identify how and where I can implement supported operators that, according to the input data type, decipher the data and decode it to the original data type (e.g., binary->str, binary->date, etc.).
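
To make the decoding step (binary->str, binary->date) a bit more concrete, here is a minimal sketch in plain Python. It is not PyIceberg API: `DECODERS` and `decode_value` are hypothetical names, the bytes are assumed to be already deciphered, and the encodings (UTF-8 text, ISO 8601 date text, little-endian signed integers) are assumptions made only for illustration.

```python
from datetime import date
from typing import Any, Callable, Dict

# Hypothetical registry: logical type name -> decoder for already-deciphered bytes.
# The encodings below are assumptions, not something defined by Iceberg or PyIceberg.
DECODERS: Dict[str, Callable[[bytes], Any]] = {
    "str": lambda raw: raw.decode("utf-8"),
    "date": lambda raw: date.fromisoformat(raw.decode("ascii")),
    "int": lambda raw: int.from_bytes(raw, byteorder="little", signed=True),
}


def decode_value(raw: bytes, logical_type: str) -> Any:
    """Decode deciphered bytes back to the original data type."""
    try:
        decoder = DECODERS[logical_type]
    except KeyError:
        raise ValueError(f"no decoder registered for {logical_type!r}") from None
    return decoder(raw)


if __name__ == "__main__":
    print(decode_value(b"hello", "str"))
    print(decode_value(b"2025-02-03", "date"))
```

A lookup table keyed by the original type keeps the deciphering step separate from the type-specific decoding, so supporting a new protected type would only mean registering one more decoder.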
