liferoad commented on code in PR #34943:
URL: https://github.com/apache/beam/pull/34943#discussion_r2105020120
##########
website/www/site/content/en/case-studies/akvelon.md:
##########
@@ -17,3 +26,154 @@ WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
 See the License for the specific language governing permissions and
 limitations under the License.
 -->
+<div class="case-study-opinion">
+  <div class="case-study-opinion-img">
+    <img src="/images/logos/powered-by/akvelon.png"/>
+  </div>
+  <blockquote class="case-study-quote-block">
+    <p class="case-study-quote-text">
+      “To support data privacy and pipeline reusability at scale, Akvelon developed Beam-based solutions for Protegrity and a major North American credit reporting company, enabling tokenization with Dataflow Flex Templates. Akvelon also built a CDAP Connector to integrate CDAP plugins with Apache Beam, enabling plugin reuse and multi-runtime compatibility.”
+    </p>
+    <div class="case-study-quote-author">
+      <div class="case-study-quote-author-img">
+        <img src="/images/case-study/akvelon/pikle.png">
+      </div>
+      <div class="case-study-quote-author-info">
+        <div class="case-study-quote-author-name">
+          Ashley Pikle
+        </div>
+        <div class="case-study-quote-author-position">
+          Director of AI Business Development @Akvelon
+        </div>
+      </div>
+    </div>
+  </blockquote>
+</div>
+<div class="case-study-post">
+
+# Secure and Interoperable Apache Beam Pipelines by Akvelon
+
+## Background
+
+To meet growing enterprise needs for secure, scalable, and interoperable data processing pipelines, **Akvelon** developed multiple Apache Beam-powered solutions tailored for real-world production environments:
+- Data tokenization and detokenization capabilities for **Protegrity** and a leading North American credit reporting company
+- A connector layer to integrate **CDAP** plugins into Apache Beam pipelines
+
+By leveraging [Apache Beam](https://beam.apache.org/) and [Google Cloud Dataflow](https://cloud.google.com/products/dataflow?hl=en), Akvelon enabled its clients to achieve scalable data protection, regulatory compliance, and platform interoperability through reusable, open-source pipeline components.
+
+## Use Case 1: Data Tokenization for Protegrity and a Leading Credit Reporting Company
+
+### The Challenge
+
+**Protegrity**, a leading enterprise data-security vendor, sought to enhance its data protection platform with scalable tokenization support for batch and streaming data. Their goal: allow customers such as a major North American credit reporting company to tokenize sensitive data using Google Cloud Dataflow. The solution needed to be fast, secure, reusable, and compliant with privacy regulations (e.g., HIPAA, GDPR).
+
+### The Solution
+
+Akvelon designed and implemented a **Dataflow Flex Template** using Apache Beam that allows users to tokenize and detokenize sensitive data within both batch and streaming pipelines.
+
+<div class="post-scheme">
+  <a href="/images/case-study/akvelon/diagram-01.png" target="_blank" title="Click to enlarge">
+    <img src="/images/case-study/akvelon/diagram-01.png" alt="Protegrity & Equifax Tokenization Pipeline">
+  </a>
+</div>
+
+### Key features
+- **Seamless integration with Protegrity UDFs**, enabling native tokenization directly within Beam transforms without requiring external service orchestration
+- **Support for multiple data formats** such as CSV, JSON, Parquet, allowing flexible deployment across diverse data pipelines
+- **Stateful processing with `DoFn` and timers**, which improves streaming reliability and reduces overall pipeline latency
+- **Full compatibility with Google Cloud Dataflow**, ensuring autoscaling, fault tolerance, and operational simplicity through managed Apache Beam execution
+
+This design provided both Protegrity and its enterprise clients with a reusable, open-source architecture for scalable data privacy and processing.
+
+### The Results
+- Enabled data tokenization at scale for regulated industries
+- Accelerated adoption of Dataflow templates across Protegrity’s customer base
+- Delivered an open-source Flex Template that benefits the entire Apache Beam community

Review Comment:
Do we have the GitHub link for this template?