Hi Devs,


Recently we worked with the Spark community to implement shuffle encryption.
While implementing it, we realized that much of the Apache Hadoop encryption
code would have to be duplicated in that implementation. This led to the idea
of creating a separate reusable library, which we named Chimera
(https://github.com/intel-hadoop/chimera). It is an optimized cryptographic
library. It provides Java APIs at both the cipher level and the Java stream
level to help developers implement high-performance AES encryption/decryption
with minimal code and effort.



We know that Java has Cipher implementations, so why do we need this optimized
cryptographic library?

1. Performance is critical for encryption and decryption. The JDK Cipher
implementations of AES are not yet optimized for modern hardware. For
example, the optimized implementation is 17x+ faster than the JDK6
implementations for some modes such as CBC decryption, CTR and GCM. Even
though some optimizations have been included in JDK7 or JDK8, there is still a
5x to 6x gap compared with the most optimized implementations.

2. Java stream-based API for cryptographic data streams. The Cipher API is
powerful, but a lot of code needs to be written for layered stream processing
applications. This design pattern is very common in modern applications such
as Hadoop or Spark.
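
To illustrate point 2, here is a minimal sketch of the layered stream pattern
written against the JDK's own javax.crypto classes only. It is not Chimera's
API; the class name LayeredCryptoStreams and its helper methods are made up
for this example, and the all-in-memory streams stand in for whatever file or
network streams an application would really wrap.

```java
import javax.crypto.Cipher;
import javax.crypto.CipherInputStream;
import javax.crypto.CipherOutputStream;
import javax.crypto.spec.IvParameterSpec;
import javax.crypto.spec.SecretKeySpec;
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.nio.charset.StandardCharsets;
import java.security.GeneralSecurityException;
import java.security.SecureRandom;

// Hypothetical helper class, for illustration only (JDK classes, not Chimera).
public class LayeredCryptoStreams {

    private static final String TRANSFORMATION = "AES/CTR/NoPadding";

    // Builds a JDK AES/CTR cipher for the given mode, key and IV.
    public static Cipher newCipher(int mode, byte[] key, byte[] iv)
            throws GeneralSecurityException {
        Cipher cipher = Cipher.getInstance(TRANSFORMATION);
        cipher.init(mode, new SecretKeySpec(key, "AES"), new IvParameterSpec(iv));
        return cipher;
    }

    // Layers an encrypting stream on top of a plain in-memory sink.
    public static byte[] encrypt(byte[] plain, byte[] key, byte[] iv)
            throws GeneralSecurityException, IOException {
        ByteArrayOutputStream sink = new ByteArrayOutputStream();
        Cipher cipher = newCipher(Cipher.ENCRYPT_MODE, key, iv);
        try (OutputStream out = new CipherOutputStream(sink, cipher)) {
            out.write(plain);
        } // closing the stream flushes the final cipher block into the sink
        return sink.toByteArray();
    }

    // Layers a decrypting stream on top of a plain in-memory source.
    public static byte[] decrypt(byte[] encrypted, byte[] key, byte[] iv)
            throws GeneralSecurityException, IOException {
        ByteArrayOutputStream plain = new ByteArrayOutputStream();
        Cipher cipher = newCipher(Cipher.DECRYPT_MODE, key, iv);
        try (InputStream in =
                 new CipherInputStream(new ByteArrayInputStream(encrypted), cipher)) {
            byte[] buffer = new byte[4096];
            int n;
            while ((n = in.read(buffer)) != -1) {
                plain.write(buffer, 0, n);
            }
        }
        return plain.toByteArray();
    }

    public static void main(String[] args) throws Exception {
        SecureRandom random = new SecureRandom();
        byte[] key = new byte[16]; // 128-bit AES key
        byte[] iv = new byte[16];  // initial counter block for CTR mode
        random.nextBytes(key);
        random.nextBytes(iv);

        byte[] message = "shuffle block".getBytes(StandardCharsets.UTF_8);
        byte[] roundTrip = decrypt(encrypt(message, key, iv), key, iv);
        System.out.println(new String(roundTrip, StandardCharsets.UTF_8));
    }
}
```

Even this small round trip needs its own key/IV wiring, buffer loop and
stream wrapping, which is the kind of code a shared stream-level library
could take off the application's hands.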



Chimera was originally based on the Hadoop crypto code but was improved and
generalized considerably to support a wider scope of data encryption needs for
more components in the community. The encryption-related code in Hadoop was
developed a year ago and so far it is running well. So we feel that part of
the code is stable enough already.



So, we propose to contribute the Chimera (optimized encryption library) code
to Apache Commons, and we would like it to have independent release cycles
like any other module in Apache Commons. The module basically provides
Java-based interfaces for encryption-based IO, and it will include native
AES-NI encryption integration code.



We have already discussed this proposal on the Apache Hadoop dev lists, and
the conclusion of the discussion was in favor of contributing this module to
Apache Commons.



We need your help and support in adopting this code as an Apache Commons
sub-module and in establishing its own development community (of course we can
discuss these factors further in this thread). Hadoop and Spark will be the
two visible projects to use it. We expect more projects will use it.



Once the Apache Commons PMC agrees to place this module under Commons, I will
work on bringing in interested developers to establish the Chimera development
community as part of the next steps. Please help with the process.



Regards,

Uma (An Apache Hadoop PMC member)
